This bundle integrates SnappyData/ComputeDB with Debezium for ingesting initial data and CDC records from MySQL into a SnappyData/ComputeDB cluster via a Kafka sink connector included in the PadoGrid distribution. It supports inserts, updates, and deletes.
install_bundle -download bundle-snappydata-1-docker-debezium_kafka
❗️ If you are running this demo on WSL, make sure your workspace is on a shared folder. Otherwise, the Docker volume this demo creates will not be visible.
This use case ingests data changes made in the MySQL database into a SnappyData/ComputeDB cluster via Kafka connectors: the Debezium MySQL source connector and the snappydata-addon Debezium sink connector.
- Docker
- Maven 3.x
- jq
The demo follows the Debezium Tutorial steps shown in the link below. All the commands provided in the tutorial are wrapped in the scripts found in the bin_sh directory. We'll use these scripts to simplify the demo.
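For a quick look at these wrapper scripts, you can list the bin_sh directory after installing the bundle (the exact script names may differ slightly between bundle versions):
cd_docker debezium_kafka/bin_sh
ls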
✏️ This bundle builds the demo environment based on the SnappyData version in your workspace. Make sure your workspace has been configured with the desired versions before building the demo environment.
First, change your cluster context to a SnappyData cluster. This is required in order to configure the SnappyData Docker containers.
# Create a SnappyData cluster if it does not already exist.
create_cluster -product snappydata
# Switch context
switch_cluster mysnappy
Now, build the demo by running the build_app command as shown below. This command performs the following operations.
- Downloads the SnappyData/ComputeDB JDBC driver jar file and places it in the padogrid/lib directory, which is mounted as a Docker volume.
- Places the snappydata-addon-common and snappydata-addon-core jar files in the padogrid/lib directory so that the SnappyData/ComputeDB Debezium Kafka connector can include them in its class path.
- Places the create_inventory_*.sql files found in the bin_sh directory in the SnappyData cluster's padogrid/etc/ directory so that we can create the tables in the SnappyData cluster later in the demo.
❗️ If you created a SnappyData/ComputeDB cluster in the previous section with a cluster name other than snappy, then you must enter that name in the setenv.sh file.
cd_docker debezium_kafka/bin_sh
vi setenv.sh
In the setenv.sh file, change the cluster name as needed.
# Change snappy to your cluster name
SNAPPYDATA_CLUSTER_NAME="snappy"
Run the build_app command to build the Debezium Docker environment.
cd_docker debezium_kafka/bin_sh
./build_app
Upon successful build, the padogrid directory should have jar files similar to the following:
cd_docker debezium_kafka
tree padogrid
Output:
padogrid
└── lib
├── padogrid-common-1.0.0.jar
├── snappydata-addon-core-1.0.0.jar
└── snappydata-jdbc_2.11-1.2.0.jar
Let's create the my_network network, which all of the containers will join.
docker network create my_network
Let's create a SnappyData cluster to run on Docker containers with the my_network network we created in the previous section.
create_docker -product snappydata -cluster snappy -network my_network
cd_docker snappy
The snappy cluster's padogrid directory should look similar to the following:
cd_docker snappy
tree padogrid
Output:
padogrid
├── etc
│ ├── gemfirexd.properties
│ ├── log4j2.properties
│ └── prometheus.yml
├── lib
│ ├── bcpkix-jdk18on-1.74.jar
│ ├── bcprov-jdk18on-1.74.jar
│ ├── cache-api-1.1.1.jar
│ ├── jmx_prometheus_javaagent-0.17.2.jar
│ ├── json-20230227.jar
│ ├── kryo-4.0.0.jar
│ ├── log4j-api-2.19.0.jar
│ ├── log4j-core-2.19.0.jar
│ ├── log4j-slf4j-impl-2.19.0.jar
│ ├── org.eclipse.paho.client.mqttv3-1.2.5.jar
│ ├── org.eclipse.paho.mqttv5.client-1.2.5.jar
│ ├── padogrid-common-1.0.0.jar
│ ├── padogrid-snappydata-all-1.0.0.jar
│ ├── padogrid-tools-1.0.0.jar
│ ├── slf4j-api-1.7.25.jar
│ ├── snakeyaml-1.23.jar
│ └── snappydata-addon-core-1.0.0.jar
├── log
├── plugins
│ └── snappydata-addon-core-1.0.0-tests.jar
├── run
└── stats
There are numerous Docker containers in this demo. We'll first start the SnappyData/ComputeDB cluster containers and then proceed with the Debezium containers. By default, the provided scripts run the containers in the foreground so that you can view the log events. You will need to launch a total of eight (8) terminals. If you have a screen splitter such as Windows Terminal, it will make things easier.
You can also run some of the scripts in the background by including the '-d' option. These scripts are mentioned below.
Start the snappy SnappyData/ComputeDB cluster containers.
cd_docker snappy
docker compose up
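If you prefer not to keep a terminal attached to the cluster logs, Docker Compose can also run these containers detached; you can then follow the logs separately.
# Run the SnappyData cluster containers in the background
docker compose up -d
# Follow the container logs (Ctrl-C stops following, not the containers)
docker compose logs -f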
Once the snappy Docker containers are up, you need to create the inventory schema and tables as follows.
cd_docker debezium_kafka/bin_sh
./create_inventory_schema
./create_inventory_tables
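To verify that the schema and tables exist, you can connect with the SnappyData CLI (also used later in the demo). The show tables command below is a standard snappy shell command inherited from the ij/gfxd shell; the exact table list depends on the create_inventory_*.sql files included in this bundle.
cd_docker debezium_kafka/bin_sh
./start_snappydata_cli
snappy> connect client 'locator:1527';
snappy> show tables in inventory;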
Launch six (6) terminals and run each script in its own terminal as shown below. Each script blocks and displays log messages, so it must run in a separate terminal.
cd_docker debezium_kafka/bin_sh
# 1. Start Zookeeper (include '-d' to run it in the background)
./start_zookeeper
# 2. Start Kafka (include '-d' to run it in the background)
./start_kafka
# 3. Start MySQL database (include '-d' to run it in the background)
./start_mysql
# 4. Start Kafka Connect (include '-d' to run it in the background)
./start_kafka_connect
# 5. Start topic watcher
./watch_topic_customers
# 6. Start MySQL CLI
./start_mysql_cli
There are two (2) Kafka connectors that we must register. The MySQL connector is provided by Debezium and the SnappyData/ComputeDB connector is part of the PadoGrid distribution.
cd_docker debezium_kafka/bin_sh
./register_mysql_connector
./register_debezium_snappydata_connector
# Check status
curl -Ss -H "Accept:application/json" localhost:8083/ | jq
# List registered connectors
curl -Ss -H "Accept:application/json" localhost:8083/connectors/ | jq
The last command should display the two connectors that we just registered.
[
"debezium-snappydata-sink",
"inventory-connector"
]
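You can also inspect an individual connector using the standard Kafka Connect REST API, e.g., the inventory-connector listed above:
# Show the connector's status (connector and task states)
curl -Ss -H "Accept:application/json" localhost:8083/connectors/inventory-connector/status | jq
# Show the connector's registered configuration
curl -Ss -H "Accept:application/json" localhost:8083/connectors/inventory-connector | jq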
Using the MySQL CLI, you can change table contents. The changes you make will be captured in the form of change events by the Debezium source connector. The SnappyData/ComputeDB sink connector in turn receives the change events and updates (or deletes) the assigned table, i.e., inventory.customers.
use inventory;
SELECT * FROM customers;
UPDATE customers SET first_name='Anne Marie' WHERE id=1004;
-- Delete: first delete the addresses rows that reference this customer via the foreign key
DELETE FROM addresses WHERE customer_id=1004;
DELETE FROM customers WHERE id=1004;
INSERT INTO customers VALUES (default, "Sarah", "Thompson", "kitt@acme.com");
INSERT INTO customers VALUES (default, "Kenneth", "Anderson", "kander@acme.com");
To view the table contents in SnappyData/ComputeDB, run the start_snappydata_cli command as follows:
cd_docker debezium_kafka/bin_sh
./start_snappydata_cli
snappy> connect client 'locator:1527';
snappy> select * from inventory.customers;
Pulse URL: http://localhost:5050
# Shutdown Debezium containers
cd_docker debezium_kafka/bin_sh
./cleanup
# Shutdown SnappyData/ComputeDB containers
cd_docker snappy
docker compose down
# Remove my_network
docker network rm my_network
# Prune all stopped containers
docker container prune -f
docker: Error response from daemon: Ports are not available: listen tcp 0.0.0.0:3306: bind: Only one usage of each socket address (protocol/network address/port) is normally permitted.
ERRO[0000] error waiting for container: context canceled
The above error occurs if you already have MySQL running locally on your host OS. The mysql Docker container listens on the same port number. You must stop the local MySQL instance and prune the containers before running start_mysql again.
# First, stop your local MySQL instance running in the host OS.
# Prune the containers
docker container prune -f
# Run start_mysql again
./start_mysql
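If you are not sure which process is holding port 3306, the following checks can help identify it. The lsof command assumes a Linux or macOS host; use netstat on Windows.
# Linux/macOS: find the process listening on port 3306
sudo lsof -i :3306
# Windows (Command Prompt or PowerShell)
netstat -ano | findstr :3306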