
Kafka

This page guides you through the process of setting up the Kafka source connector.

Set up guide

Step 1: Set up Kafka

To use the Kafka source connector, you'll need:

  • A Kafka cluster running version 1.0 or later.
  • An Airbyte user with permission to read messages from the topics you want to sync. The topics must exist before Airbyte reads from Kafka.

Step 2: Set up the Kafka source in Airbyte

You'll need the following information to configure the Kafka source:

  • Group ID - Identifies the consumer group that this consumer belongs to. (e.g. group.id)
  • Protocol - The protocol used to communicate with brokers.
  • Client ID - An ID string passed to the server with each request. It lets you track the source of requests beyond IP/port by including a logical application name in server-side request logs. (e.g. airbyte-consumer)
  • Test Topic - The topic used to check whether Airbyte can consume messages. (e.g. test.topic)
  • Subscription Method - Either manually assign a list of partitions, or subscribe to all topics matching a specified pattern and have partitions assigned dynamically.
  • List of topics
  • Bootstrap Servers - A list of host/port pairs used to establish the initial connection to the Kafka cluster.
  • Schema Registry - Host/port of the Schema Registry server. Note: this is supported for the AVRO format only.
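
As a rough illustration of how these values fit together, the sketch below collects them in a Python dictionary and checks two of them. The key names (group_id, bootstrap_servers, and so on), the PLAINTEXT protocol value, and both helper functions are assumptions made for this example; they are not taken from the connector's specification. The comma-separated host:port form follows the usual Kafka convention for bootstrap servers.

```python
import re
from typing import List, Tuple

# Hypothetical key names that mirror the labels listed above;
# the connector's actual configuration keys may differ.
config = {
    "group_id": "group.id",
    "protocol": "PLAINTEXT",  # assumed value, for illustration only
    "client_id": "airbyte-consumer",
    "test_topic": "test.topic",
    "subscription": {"type": "subscribe", "topic_pattern": r"test\..*"},
    "bootstrap_servers": "localhost:9092,localhost:9093",
}


def parse_bootstrap_servers(value: str) -> List[Tuple[str, int]]:
    """Split a comma-separated 'host:port' list into (host, port) pairs."""
    pairs = []
    for entry in value.split(","):
        host, sep, port = entry.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs


def match_topics(pattern: str, available: List[str]) -> List[str]:
    """Return the topics whose full name matches the subscription pattern."""
    compiled = re.compile(pattern)
    return [topic for topic in available if compiled.fullmatch(topic)]


servers = parse_bootstrap_servers(config["bootstrap_servers"])
topics = match_topics(
    config["subscription"]["topic_pattern"],
    ["test.topic", "prod.topic", "test.other"],
)
```

Checking the host/port pairs and the topic pattern locally before saving the source can catch typos that would otherwise only show up as a failed connection test.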

For Airbyte Open Source:

  1. Go to the Airbyte UI and in the left navigation bar, click Sources. In the top-right corner, click +new source.
  2. On the Set up the source page, enter the name for the Kafka connector and select Kafka from the Source type dropdown.
  3. Fill in the fields described in Step 2 above.

Supported sync modes

The Kafka source connector supports the following sync modes:

| Feature                   | Supported? (Yes/No) | Notes |
|---------------------------|---------------------|-------|
| Full Refresh Sync         | Yes                 |       |
| Incremental - Append Sync | Yes                 |       |
| Namespaces                | No                  |       |

Supported formats

JSON - JSON value messages. Schema Registry is not currently supported for this format.

AVRO - Messages are deserialized using the Confluent API. For details, see https://docs.confluent.io/platform/current/schema-registry/serdes-develop/serdes-avro.html
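
To make the difference between the two formats concrete, the sketch below decodes a JSON value directly and, for AVRO, only splits off the framing that Confluent serializers put in front of the Avro body: one zero "magic" byte followed by a 4-byte big-endian schema ID. Both function names are hypothetical, and this is not the connector's own code. Decoding the Avro body itself requires the schema fetched from the Schema Registry plus an Avro library, so that step is left out here.

```python
import json
import struct


def decode_json_value(value: bytes):
    """Decode a UTF-8 JSON message value into Python objects."""
    return json.loads(value.decode("utf-8"))


def split_confluent_frame(value: bytes):
    """Split a Confluent-framed value into (schema_id, avro_payload).

    Layout: byte 0 is the magic byte (0), bytes 1-4 hold the schema ID
    as a big-endian unsigned integer, and the rest is the Avro body.
    """
    if len(value) < 5 or value[0] != 0:
        raise ValueError("not a Confluent-framed message")
    (schema_id,) = struct.unpack(">I", value[1:5])
    return schema_id, value[5:]


record = decode_json_value(b'{"id": 1, "name": "example"}')

# Build a sample frame by hand: magic byte, schema ID 42, opaque body.
frame = b"\x00" + struct.pack(">I", 42) + b"avro-bytes"
schema_id, payload = split_confluent_frame(frame)
```

The schema ID in the frame is what ties a message to an entry in the Schema Registry, which is why the registry host/port is needed for AVRO but not for plain JSON values.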

Changelog

| Version | Date       | Pull Request | Subject |
|---------|------------|--------------|---------|
| 0.2.4   | 2024-02-13 | 35229        | Adopt CDK 0.20.4 |
| 0.2.4   | 2024-01-24 | 34453        | bump CDK version |
| 0.2.3   | 2022-12-06 | 19587        | Fix missing data before consumer is closed |
| 0.2.2   | 2022-11-04 | 18648        | Add missing record_count increment for JSON |
| 0.2.1   | 2022-11-04 |              | This version was the same as 0.2.0 and was committed so using 0.2.2 next to keep versions in order |
| 0.2.0   | 2022-08-22 | 13864        | Added AVRO format support and Support for maximum records to process |
| 0.1.7   | 2022-06-17 | 13864        | Updated stacktrace format for any trace message errors |
| 0.1.6   | 2022-05-29 | 12903        | Add Polling Time to Specification (default 100 ms) |
| 0.1.5   | 2022-04-19 | 12134        | Add PLAIN Auth |
| 0.1.4   | 2022-02-15 | 10186        | Add SCRAM-SHA-512 Auth |
| 0.1.3   | 2022-02-14 | 10256        | Add -XX:+ExitOnOutOfMemoryError JVM option |
| 0.1.2   | 2021-12-21 | 8865         | Fix SASL config read issue |
| 0.1.1   | 2021-12-06 | 8524         | Update connector fields title/description |