There are different types of nodes in a distributed Kafka Connect ecosystem. This Kafka Connector documentation uses the following terms for each type of cluster node:
- Kafka cluster nodes are called Kafka Brokers
- Kafka Connect cluster nodes are called Kafka Connect Workers
- GridGain cluster nodes are called GridGain Servers
Kafka Connector installation consists of 3 steps:
- Prepare Connector Package
- Register Connector with Kafka
- Optional: register Connector with GridGain
The Kafka Connector is part of GridGain Enterprise and GridGain Ultimate, version 8.4.9 or later. The connector is located in the integration/gridgain-kafka-connect directory of the GridGain installation directory.
Pull missing connector dependencies into the package:
cd $GRIDGAIN_HOME/integration/gridgain-kafka-connect
./copy-dependencies.sh
For every Kafka Connect Worker:
- Copy the connector package directory to where you want Kafka Connectors to be located.
- Edit the Kafka Connect Worker configuration ($KAFKA_HOME/config/connect-standalone.properties for a single-worker Kafka Connect cluster or $KAFKA_HOME/config/connect-distributed.properties for a multi-worker Kafka Connect cluster) to register the connector on the plugin path (replace CONNECTORS_PATH with the directory where you copied the connector package):
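The two per-worker steps above can be sketched as a small shell helper. This is only an illustration: it assumes the standard Kafka Connect worker property plugin.path is what registers the directory, and the helper name install_connector and its arguments are hypothetical.

```shell
# install_connector PACKAGE_DIR CONNECTORS_PATH WORKER_CONFIG
# Hypothetical helper; run it on each Kafka Connect worker.
install_connector() {
  package_dir="$1"
  connectors_path="$2"
  worker_config="$3"

  # Step 1: copy the prepared connector package under the connectors directory.
  mkdir -p "$connectors_path"
  cp -R "$package_dir" "$connectors_path/"

  # Step 2: put the connectors directory on the plugin path, unless the
  # worker configuration already contains an active plugin.path entry.
  # (Assumption: plugin.path is the property that registers the directory.)
  if ! grep -q '^plugin\.path=' "$worker_config"; then
    printf 'plugin.path=%s\n' "$connectors_path" >> "$worker_config"
  fi
}
```

For a single-worker setup it could be called as install_connector "$GRIDGAIN_HOME/integration/gridgain-kafka-connect" CONNECTORS_PATH "$KAFKA_HOME/config/connect-standalone.properties". If the file already has an active plugin.path entry, the helper leaves it alone and the entry has to be extended by hand.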
This step is optional. It is needed only if you use BACKLOG as a Failover Policy.
On every GridGain server node, copy the following JARs into
gridgain-kafka-connect-8.4.9.jar (located on GridGain nodes in the
kafka-clients-2.0.0.jar (located on Kafka Connect workers in the
The only mandatory GridGain Source connector properties are the connector's name, its class, and the path to the Ignite configuration describing how to connect to the source GridGain cluster. A minimal source connector configuration named "gridgain-kafka-connect-source" might look like this:
name=gridgain-kafka-connect-source
connector.class=org.gridgain.kafka.source.IgniteSourceConnector
igniteCfg=IGNITE_CONFIG_PATH/ignite-server-source.xml
See Source Connector Configuration for full properties list.
The only mandatory GridGain Sink connector properties are the connector's name, its class, the list of topics to stream data from, and the path to the Ignite configuration describing how to connect to the sink GridGain cluster. A minimal sink connector configuration named "gridgain-kafka-connect-sink" might look like this:
name=gridgain-kafka-connect-sink
topics=topic1,topic2,topic3
connector.class=org.gridgain.kafka.sink.IgniteSinkConnector
igniteCfg=IGNITE_CONFIG_PATH/ignite-server-sink.xml
See Sink Connector Configuration for full properties list.
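As a convenience, the two minimal configurations can be written to files from the shell. The sketch below is one possible way to do it: the helper name write_connector_configs is hypothetical, the property values are the minimal ones from this section, and the file names match the ones passed to connect-standalone.sh later in this document.

```shell
# write_connector_configs IGNITE_CONFIG_PATH [OUTPUT_DIR]
# Hypothetical helper; writes the minimal source and sink connector
# configurations into OUTPUT_DIR (current directory by default).
write_connector_configs() {
  ignite_config_path="$1"
  out_dir="${2:-.}"

  # Minimal source connector configuration.
  cat > "$out_dir/gridgain-kafka-connect-source.properties" <<EOF
name=gridgain-kafka-connect-source
connector.class=org.gridgain.kafka.source.IgniteSourceConnector
igniteCfg=$ignite_config_path/ignite-server-source.xml
EOF

  # Minimal sink connector configuration.
  cat > "$out_dir/gridgain-kafka-connect-sink.properties" <<EOF
name=gridgain-kafka-connect-sink
topics=topic1,topic2,topic3
connector.class=org.gridgain.kafka.sink.IgniteSinkConnector
igniteCfg=$ignite_config_path/ignite-server-sink.xml
EOF
}
```

The first argument replaces IGNITE_CONFIG_PATH. The topic names are the placeholder values from the sample and would normally be changed to real topics.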
See Installing and Configuring Kafka Connect for detailed documentation. In summary, you need to:
- Configure and install Kafka Connectors
- Configure and start ZooKeeper
- Configure and start Kafka brokers
- Configure and start Kafka Connect workers
Configuring and installing Kafka connectors is covered above. The shell commands below run the Kafka Connect ecosystem on a single host using the default ZooKeeper, broker and Connect worker configuration files (normally you would run each node on a separate host):
$KAFKA_HOME/bin/zookeeper-server-start.sh $KAFKA_HOME/config/zookeeper.properties
$KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties
$KAFKA_HOME/bin/connect-standalone.sh \
  $KAFKA_HOME/config/connect-standalone.properties \
  gridgain-kafka-connect-source.properties \
  gridgain-kafka-connect-sink.properties
Each Kafka Connect worker exposes a REST API for managing Kafka Connectors (available on port 8083 by default). See Kafka Connect REST Interface for how to create, remove, pause and resume connectors, and how to check the status of connectors and tasks.
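As a rough sketch of that interface, the helpers below wrap a few standard Kafka Connect REST endpoints with curl. The function names and the CONNECT_URL variable are hypothetical, and the worker is assumed to be reachable on localhost at the default port.

```shell
# Base URL of one Kafka Connect worker (assumption: local worker, default port).
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"

# List the names of all registered connectors.
list_connectors() { curl -s "$CONNECT_URL/connectors"; }

# Create a connector from a JSON file of the form
# {"name": "...", "config": {...}}.
create_connector() {
  curl -s -X POST -H "Content-Type: application/json" \
    --data @"$1" "$CONNECT_URL/connectors"
}

# Show the state of a connector and its tasks.
connector_status() { curl -s "$CONNECT_URL/connectors/$1/status"; }

# Pause and resume a connector without removing its configuration.
pause_connector() { curl -s -X PUT "$CONNECT_URL/connectors/$1/pause"; }
resume_connector() { curl -s -X PUT "$CONNECT_URL/connectors/$1/resume"; }

# Remove a connector.
remove_connector() { curl -s -X DELETE "$CONNECT_URL/connectors/$1"; }
```

For example, connector_status gridgain-kafka-connect-source returns a JSON document with the state of the connector and of each of its tasks.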