
Kafka

Understanding how it works

Documentation

Notes

  • The Consumer API is not thread-safe
  • Always be mindful of event ordering
  • Evaluate your scaling options early (what is the maximum number of partitions your Kafka platform team allows, and what alternative levers do you have to scale)
  • Records are meant to be lightweight payloads; the default maximum size is 1MB. Kafka is not a temporary data store for large payloads
  • Kafka is performance-oriented, but throughput can easily be degraded by introducing API calls into the event-processing path
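On the record-size point, you can check whether a topic overrides the 1MB default before assuming it. A sketch using the stock `kafka-configs` tool, reusing the admin.properties and the `$KAFKA_CLUSTER_URL` / `$TOPIC_NAME` placeholders from the commands further down:

```shell
# Show per-topic config overrides (e.g. max.message.bytes).
# The effective record size limit comes from the broker's message.max.bytes
# unless the topic sets its own max.message.bytes override.
kafka-configs --bootstrap-server $KAFKA_CLUSTER_URL \
  --command-config $HOME/path/to/admin.properties \
  --entity-type topics --entity-name $TOPIC_NAME \
  --describe
```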

Tools

  • Confluent Platform - contains all the tools to consume, produce, and manage topics, offsets, etc.
  • kcat - a more accessible tool to start consuming/producing Kafka records
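As a first connectivity check, kcat can also list cluster metadata (brokers, topics, partitions) with `-L`; the broker URL and SASL settings below are the same placeholders used in the consume example that follows:

```shell
# List brokers, topics, and partitions to verify connectivity and credentials
kcat -L -b $KAFKA_CLUSTER_URL \
  -X security.protocol=SASL_SSL \
  -X sasl.mechanisms=SCRAM-SHA-512 \
  -X sasl.username=$SASL_USERNAME \
  -X sasl.password=$SASL_PASSWORD
```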

Consume:

kcat -C -b $KAFKA_CLUSTER_URL \
  -t $TOPIC_NAME \
  -X security.protocol=SASL_SSL \
  -X sasl.mechanisms=SCRAM-SHA-512 \
  -X sasl.username=$SASL_USERNAME \
  -X sasl.password=$SASL_PASSWORD \
  -c 1 \
  -e

To consume the last 100 messages for partition 2:

  -p 2 \
  -o -100 \

To support Avro messages, add:

  -s value=avro \
  -r https://$SASL_USERNAME:$SASL_PASSWORD@$SCHEMA_REGISTRY_URL \
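Putting the pieces together — the last 100 Avro-encoded messages from partition 2 — the assembled command looks like this (same placeholders as above; `-c 1` is dropped since we want 100 records, and `-e` exits at end of partition):

```shell
# Read the last 100 Avro records from partition 2, then exit
kcat -C -b $KAFKA_CLUSTER_URL \
  -t $TOPIC_NAME \
  -X security.protocol=SASL_SSL \
  -X sasl.mechanisms=SCRAM-SHA-512 \
  -X sasl.username=$SASL_USERNAME \
  -X sasl.password=$SASL_PASSWORD \
  -p 2 \
  -o -100 \
  -s value=avro \
  -r https://$SASL_USERNAME:$SASL_PASSWORD@$SCHEMA_REGISTRY_URL \
  -e
```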

Produce:

kcat -P -F $HOME/path/to/producer.properties -t $TOPIC_NAME -l $HOME/path/to/data

Configuration file - producer.properties

bootstrap.servers=$KAFKA_CLUSTER_URL
sasl.mechanism=SCRAM-SHA-512
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="$SASL_USERNAME" password="$SASL_PASSWORD";
client.id=my-test-kafka-producer
# schema.registry.url=https://schema-registry.endpoint.com
# basic.auth.credentials.source=SASL_INHERIT
# auto.register.schemas=false

Reset offset

Poison pill: skip the offending records by moving all offsets to the latest:

kafka-consumer-groups \
  --bootstrap-server $KAFKA_CLUSTER_URL \
  --command-config $HOME/path/to/admin.properties \
  --group $CONSUMER_GROUP_NAME \
  --topic $TOPIC_NAME \
  --reset-offsets --to-latest --execute
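Offset resets are destructive, so it is worth previewing first: `kafka-consumer-groups` accepts `--dry-run` in place of `--execute`, which prints the target offsets per partition without applying them. Same placeholders as above:

```shell
# Preview the reset: shows the new offsets per partition, applies nothing
kafka-consumer-groups \
  --bootstrap-server $KAFKA_CLUSTER_URL \
  --command-config $HOME/path/to/admin.properties \
  --group $CONSUMER_GROUP_NAME \
  --topic $TOPIC_NAME \
  --reset-offsets --to-latest --dry-run
```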

Move your consumer group back to a specific offset $OFFSET for partition $PARTITION_NUM:

kafka-consumer-groups.sh \
  --command-config $HOME/path/to/admin.properties \
  --bootstrap-server $KAFKA_CLUSTER_URL \
  --group $CONSUMER_GROUP_NAME \
  --topic $TOPIC_NAME:$PARTITION_NUM \
  --reset-offsets --to-offset $OFFSET --execute

admin.properties is similar to producer.properties above, except it doesn't require a client ID.
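For reference, an admin.properties along those lines would be the producer.properties entries minus `client.id` and the schema-registry lines (same placeholders as above):

```
bootstrap.servers=$KAFKA_CLUSTER_URL
sasl.mechanism=SCRAM-SHA-512
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="$SASL_USERNAME" password="$SASL_PASSWORD";
```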
