Kafka
Understanding how it works
- Kafka Introduction  - official website
- Confluent Tutorial  - step-by-step tutorial
Documentation
Notes
- The Consumer API is not thread-safe; use one consumer instance per thread
- Always be mindful of event ordering: Kafka only guarantees ordering within a partition
- Evaluate your scalability limits early (what is the maximum number of partitions your Kafka platform team allows, and what are your alternative levers to scale)
- Records are meant to be lightweight payloads (the default max record size is 1MB); Kafka is not a temporary data store for large payloads
- Kafka is performance-oriented, but throughput can easily deteriorate if you introduce synchronous API calls in the event-processing path
Tools
- Confluent Platform  - contains all the tools to consume, produce, and manage topics, offsets, etc…
- kcat  - a more accessible tool to start consuming/producing Kafka records
Consume:
kcat -C -b $KAFKA_CLUSTER_URL \
-t $TOPIC_NAME \
-X security.protocol=SASL_SSL \
-X sasl.mechanisms=SCRAM-SHA-512 \
-X sasl.username=$SASL_USERNAME \
-X sasl.password=$SASL_PASSWORD \
-c 1 \
-e
-c 1 exits after consuming one record; -e exits once the end of the partition is reached.
To consume the last 100 messages of partition 2, add:
-p 2 \
-o-100 \
To support Avro messages, add:
-s value=avro \
-r https://$SASL_USERNAME:$SASL_PASSWORD@$SCHEMA_REGISTRY_URL \
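Putting the pieces together, a full command to read the last 100 Avro records from partition 2 would look like this (same flags as introduced above, assuming all the environment variables are exported):

```shell
# Consume the last 100 Avro-encoded records from partition 2, then exit.
kcat -C -b $KAFKA_CLUSTER_URL \
-t $TOPIC_NAME \
-X security.protocol=SASL_SSL \
-X sasl.mechanisms=SCRAM-SHA-512 \
-X sasl.username=$SASL_USERNAME \
-X sasl.password=$SASL_PASSWORD \
-s value=avro \
-r https://$SASL_USERNAME:$SASL_PASSWORD@$SCHEMA_REGISTRY_URL \
-p 2 \
-o -100 \
-e
```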
Produce:
kcat -P -F $HOME/path/to/producer.properties -t $TOPIC_NAME -l $HOME/path/to/data
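With -l, each line of the data file is produced as a separate record. An illustrative data file (the payloads below are made up, not from these notes):

```
{"id": 1, "status": "created"}
{"id": 2, "status": "shipped"}
```

To produce keyed records instead, kcat's -K flag sets a key delimiter, e.g. -K: with lines of the form order-1:{"id": 1, "status": "created"}.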
Configuration file - producer.properties
bootstrap.servers=$KAFKA_CLUSTER_URL
sasl.mechanism=SCRAM-SHA-512
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="$SASL_USERNAME" password="$SASL_PASSWORD";
# kcat is built on librdkafka, which doesn't understand sasl.jaas.config;
# when using this file with kcat -F, replace the line above with:
# sasl.username=$SASL_USERNAME
# sasl.password=$SASL_PASSWORD
client.id=my-test-kafka-producer
# schema.registry.url=https://schema-registry.endpoint.com
# basic.auth.credentials.source=SASL_INHERIT
# auto.register.schemas=false
Reset offset
Poison pill: skip the bad data by moving all of the group's offsets to latest (the consumer group must be stopped first):
kafka-consumer-groups \
--bootstrap-server $KAFKA_CLUSTER_URL \
--command-config $HOME/path/to/admin.properties \
--group $CONSUMER_GROUP_NAME \
--topic $TOPIC_NAME \
--reset-offsets --to-latest --execute
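Before running any reset with --execute, it's worth previewing the outcome: swapping --execute for --dry-run prints the target offsets without applying them.

```shell
# Preview the reset without changing any offsets.
kafka-consumer-groups \
--bootstrap-server $KAFKA_CLUSTER_URL \
--command-config $HOME/path/to/admin.properties \
--group $CONSUMER_GROUP_NAME \
--topic $TOPIC_NAME \
--reset-offsets --to-latest --dry-run
```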
Move your consumer group back to a specific offset $OFFSET for partition $PARTITION_NUM:
kafka-consumer-groups.sh \
--command-config $HOME/path/to/admin.properties \
--bootstrap-server $KAFKA_CLUSTER_URL \
--group $CONSUMER_GROUP_NAME \
--topic $TOPIC_NAME:$PARTITION_NUM \
--reset-offsets --to-offset $OFFSET --execute
admin.properties is similar to producer.properties above, except it doesn't require a client ID.
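A minimal sketch of that file, assuming the same SCRAM-SHA-512 setup as above (kafka-consumer-groups is a Java tool, so sasl.jaas.config applies here):

```properties
bootstrap.servers=$KAFKA_CLUSTER_URL
sasl.mechanism=SCRAM-SHA-512
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="$SASL_USERNAME" password="$SASL_PASSWORD";
```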