Redshift

Overview

The Redshift source supports Full Refresh syncs. That is, every time a sync is run, Airbyte will copy all rows in the tables and columns you set up for replication into the destination in a new table.

This Redshift source connector is built on top of the source-jdbc code base and is configured to rely on the JDBC 4.2 standard drivers provided by Amazon via Mulesoft, as described in the Redshift documentation.

Sync overview

Resulting schema

The Redshift source does not alter the schema present in your warehouse. Depending on the destination connected to this source, however, the schema may be altered. See the destination's documentation for more details.

Features

| Feature | Supported | Notes |
| --- | --- | --- |
| Full Refresh Sync | Yes | |
| Incremental Sync | Yes | Cursor-based, using ORDER BY on a user-defined cursor column |
| Replicate Incremental Deletes | No | Not supported in Redshift |
| Logical Replication (WAL) | No | Not supported in Redshift |
| SSL Support | Yes | |
| SSH Tunnel Connection | No | |
| Namespaces | Yes | Enabled by default |
| Schema Selection | Yes | Multiple schemas may be used at one time. Leave empty to process all existing schemas |

Incremental Sync

The Redshift source connector supports incremental syncs. To set up an incremental sync for a Redshift table in the Airbyte UI, you must define a cursor field, such as an updated_at column. The connector relies on this column to determine which records were updated since the last sync. See the incremental sync docs for more information.

Defining a cursor field allows you to run incremental-append syncs. To run incremental-dedupe syncs, you'll need to tell the connector which column(s) to use as a primary key. See the incremental-dedupe sync docs for more information.
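
To make the cursor idea concrete, here is a minimal sketch of a cursor-based read. It is not the connector's actual query: it only illustrates the pattern of filtering on a cursor column and ordering by it. The function name `fetch_incremental`, the `users` table, and the sample rows are hypothetical, and an in-memory SQLite database stands in for Redshift so the example runs anywhere.

```python
import sqlite3


def fetch_incremental(conn, table, cursor_field, last_cursor):
    """Return (rows, new_cursor) for rows whose cursor value is newer than last_cursor.

    Hypothetical helper for illustration only. Table and column names are
    interpolated directly, so pass trusted identifiers.
    """
    if last_cursor is None:
        # First sync: no saved state yet, so read everything in cursor order.
        sql = f"SELECT * FROM {table} ORDER BY {cursor_field}"
        params = ()
    else:
        # Later syncs: read only rows past the saved cursor value.
        sql = f"SELECT * FROM {table} WHERE {cursor_field} > ? ORDER BY {cursor_field}"
        params = (last_cursor,)
    cur = conn.execute(sql, params)
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]
    # Rows are ordered by the cursor, so the last row carries the highest value.
    new_cursor = rows[-1][cursor_field] if rows else last_cursor
    return rows, new_cursor


def demo():
    """Run two syncs against sample data and return what each one read."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [
            (1, "ada", "2024-01-01T00:00:00"),
            (2, "bob", "2024-01-02T00:00:00"),
        ],
    )
    first, state = fetch_incremental(conn, "users", "updated_at", None)
    conn.execute("INSERT INTO users VALUES (3, 'cy', '2024-01-03T00:00:00')")
    second, state = fetch_incremental(conn, "users", "updated_at", state)
    return first, second, state
```

In this sketch the first call reads both existing rows, and the second call reads only the row inserted afterwards. A strict `>` comparison can miss rows that share the last cursor value, which is one reason a monotonically increasing column such as updated_at is a common choice.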

Getting started

Requirements

  1. Active Redshift cluster
  2. Allow connections from Airbyte to your Redshift cluster (if they exist in separate VPCs)

Setup guide

1. Make sure your cluster is active and accessible from the machine running Airbyte

This depends on your networking setup. The easiest way to verify that Airbyte can connect to your Redshift cluster is the check connection tool in the UI. The AWS Redshift documentation includes a tutorial on how to configure access to your cluster.
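
If you want a quick check outside the UI, a plain TCP probe from the machine running Airbyte can rule out basic networking problems. The sketch below is an assumption-laden stand-in, not the check connection tool itself: `can_reach` is a hypothetical helper, and 5439 is used as the default because it is Redshift's default port. It only confirms that the port accepts connections; it does not verify credentials or SSL.

```python
import socket


def can_reach(host, port=5439, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A False result usually points to a security group, VPC routing, or public accessibility setting rather than to the connector configuration.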

2. Fill in connection info

Next, provide the information needed to connect to your cluster, such as the host. The host is the part of the cluster's connection string or Endpoint that remains after removing the port and database name. It typically includes the cluster ID and region, and ends with .redshift.amazonaws.com.
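
As an illustration of which part of the Endpoint is the host, the sketch below splits an endpoint string into host, port, and database, and builds a JDBC URL from the pieces. The helper names and the example endpoint are hypothetical; the sketch assumes the endpoint has the form host:port/database and falls back to Redshift's default port 5439 when no port is present.

```python
def split_endpoint(endpoint):
    """Split 'host:port/database' into (host, port, database).

    Hypothetical helper: port defaults to 5439 and database to None when absent.
    """
    host_port, _, database = endpoint.partition("/")
    host, _, port = host_port.partition(":")
    return host, int(port) if port else 5439, database or None


def jdbc_url(host, port, database):
    """Build a Redshift JDBC URL of the form jdbc:redshift://host:port/database."""
    return f"jdbc:redshift://{host}:{port}/{database}"


# Made-up endpoint used only to show the shape of the values.
EXAMPLE_ENDPOINT = "examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev"
```

For the example endpoint, only `examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com` goes into the host field; the port and database name are entered separately.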

Encryption

All Redshift connections are encrypted using SSL.

Changelog

| Version | Date | Pull Request | Subject |
| --- | --- | --- | --- |
| 0.5.2 | 2024-02-13 | 35223 | Adopt CDK 0.20.4 |
| 0.5.1 | 2024-01-24 | 34453 | bump CDK version |
| 0.5.0 | 2023-12-18 | 33484 | Remove LEGACY state |
| (none) | 2023-11-17 | 32616 | Improve timestamptz handling |
| 0.4.0 | 2023-06-26 | 27737 | License Update: Elv2 |
| 0.3.17 | 2023-06-20 | 27212 | Fix silent exception swallowing in StreamingJdbcDatabase |
| 0.3.16 | 2022-12-14 | 20436 | Consolidate date/time values mapping for JDBC sources |
| 0.3.15 | 2022-10-13 | 15535 | Update incremental query to avoid data missing when new data is inserted at the same time as a sync starts under non-CDC incremental mode |
| 0.3.14 | 2022-09-01 | 16258 | Emit state messages more frequently |
| 0.3.13 | 2022-05-25 | | Added JDBC URL params |
| 0.3.12 | 2022-08-18 | 14356 | DB Sources: only show a table can sync incrementally if at least one column can be used as a cursor field |
| 0.3.11 | 2022-07-14 | 14574 | Removed additionalProperties:false from JDBC source connectors |
| 0.3.10 | 2022-04-29 | 12480 | Query tables with adaptive fetch size to optimize JDBC memory consumption |
| 0.3.9 | 2022-02-21 | 9744 | List only the tables on which the user has SELECT permissions. |
| 0.3.8 | 2022-02-14 | 10256 | Add -XX:+ExitOnOutOfMemoryError JVM option |
| 0.3.7 | 2022-01-26 | 9721 | Added schema selection |
| 0.3.6 | 2022-01-20 | 8617 | Update connector fields title/description |
| 0.3.5 | 2021-12-24 | 8958 | Add support for JdbcType.ARRAY |
| 0.3.4 | 2021-10-21 | 7234 | Allow SSL traffic only |
| 0.3.3 | 2021-10-12 | 6965 | Added SSL Support |
| 0.3.2 | 2021-08-13 | 4699 | Added json config validator |