How does a publisher publish messages to a topic in Apache Kafka?

I am new to Apache Kafka. I do not understand the anatomy of topics and partitions in Apache Kafka, or the way a producer pushes data to a partition.

Consider a scenario: I have two producers, PR1 and PR2, and three brokers, B1, B2 and B3, plus one topic T1 with three partitions, P1, P2 and P3, spread across the three brokers. The first producer PR1 coordinates with ZooKeeper, finds the broker, and pushes a message (say, a log server pushing its log data at 1 record per second) to T1-P1, where it gets offset 0. My doubt is how the second record is pushed. Will it go to partition P2 or P3? Or is the first record itself pushed to all three partitions in parallel?

Now the second producer PR2 joins and starts publishing messages. Where are its messages pushed? Will they go to P1? If so, PR1 is already pushing messages to P1; will PR1 and PR2 both append messages to P1 simultaneously, back to back, creating offsets 0, 1, 2, 3, 4, 5, …?


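A record is written to exactly one partition, never to all of them in parallel. Which partition it goes to is decided on the producer side, according to the following criteria.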

1. Message with Key

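When you create a Kafka message with a key, as below, the default partitioner chooses the partition: it hashes the serialized key (murmur2) and takes the result modulo the number of partitions in the topic. Records with the same key therefore always land in the same partition, as long as the partition count does not change.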

new ProducerRecord<String, String>("my-topic", "message key", "message")
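The key-to-partition mapping can be illustrated with a small plain-Java sketch that does not need the Kafka client on the classpath. The class and method names are hypothetical, and `Arrays.hashCode` is only a stand-in for the murmur2 hash that Kafka's default partitioner applies to the serialized key, so the partition numbers printed here will not match what a real producer picks.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of "hash the key, then take it modulo the partition count".
final class KeyedPartitionSketch {

    // Stand-in hash: Kafka itself uses murmur2 over the serialized key bytes.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);
        // Clear the sign bit so the remainder is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // The same key is always mapped to the same partition.
        for (String key : new String[] {"server-a", "server-b", "server-a"}) {
            System.out.println(key + " -> partition " + partitionFor(key, 3));
        }
    }
}
```

Because the mapping depends only on the key and the partition count, two different producers sending the same key end up appending to the same partition.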

2. Message without message key

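Without a key, the default partitioner still chooses the partition, but it spreads records across the partitions instead of hashing. Older Java clients rotate through the partitions round-robin; clients from Kafka 2.4 onward stick to one partition until the current batch is sent and then switch to another.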

new ProducerRecord<String, String>("my-topic", "message")
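For keyless records, one spreading strategy is plain round-robin, which is what older Java clients did. The sketch below is a minimal illustration of that idea only; the class name is hypothetical, the counter starts at 0 for readability, and it does not model the sticky batching used by newer clients.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of round-robin partition selection for records without a key.
final class RoundRobinSketch {
    private final AtomicInteger counter = new AtomicInteger(0);
    private final int numPartitions;

    RoundRobinSketch(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // Each call returns the next partition index, wrapping around at the end.
    int nextPartition() {
        // Clear the sign bit so the result stays non-negative after counter overflow.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        RoundRobinSketch chooser = new RoundRobinSketch(3);
        for (int i = 0; i < 4; i++) {
            System.out.println("record " + i + " -> partition " + chooser.nextPartition());
        }
    }
}
```

With three partitions, successive keyless records from one producer go to partition indexes 0, 1, 2 and then back to 0, so the second record does not follow the first into the same partition.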

3. Message with partition number

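When you create a message you can also pass the partition number explicitly, in which case the message goes to exactly that partition and no partitioner is consulted. ProducerRecord has constructors that take the partition number right after the topic name.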

4. Using Custom Partitioner

You can also write a bespoke Partitioner class that decides which partition each message goes to.
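As a rough idea of what such a class might decide, here is the routing logic only, written as a plain static method so that it runs without the Kafka client library. In a real producer this logic would sit inside the `partition` method of a class implementing Kafka's `Partitioner` interface, registered through the `partitioner.class` producer setting. The class name, the `priority-` key prefix and the routing rule are all assumptions made up for illustration, and the hash is a stand-in rather than murmur2.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical routing rule: reserve partition 0 for "priority-" keys,
// hash every other key across the remaining partitions.
final class PriorityRoutingSketch {

    static int partitionFor(String key, int numPartitions) {
        if (numPartitions < 2) {
            return 0; // nothing to reserve with a single partition
        }
        if (key != null && key.startsWith("priority-")) {
            return 0; // reserved partition
        }
        // Stand-in hash; keyless records all fall into partition 1 in this sketch.
        int hash = (key == null) ? 0 : Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return 1 + (hash & 0x7fffffff) % (numPartitions - 1);
    }

    public static void main(String[] args) {
        System.out.println("priority-alert -> " + partitionFor("priority-alert", 3));
        System.out.println("server-a       -> " + partitionFor("server-a", 3));
    }
}
```

A rule like this is only useful when some records need isolation or ordering that key hashing alone does not give; otherwise the default partitioner is usually enough.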

For more information on the Producer API, see this.

This article provides detailed information on how the default hash partitioner works and how to create a custom partitioner.
