Message topics

If you work in software development and services, you will have heard terms such as database, tables, and records. A database holds multiple tables, say Items, Price, Sales, Inventory, Purchase, and many more, and each table contains data of a specific category. An application built on it has two parts: one inserts records into these tables and the other reads records from them. By analogy, tables are the topics in Kafka, applications that insert data into tables are producers, and applications that read data are consumers.

In a messaging system, messages need to be stored somewhere. In Kafka, we store messages in topics. Each topic belongs to a category, which means that you may have one topic storing item information and another storing sales information. A producer that wants to send a message sends it to the topic for that message's category. A consumer that wants to read these messages simply subscribes to the topics it is interested in and consumes from them. Here are a few terms that we need to know:
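
The relationship between producers, topics, and consumers can be sketched with a small in-memory model. This is only an illustration: `ToyBroker`, `send`, and `read` are hypothetical names invented for this example and are not part of any Kafka client API.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a Kafka cluster (hypothetical, not the Kafka API)."""

    def __init__(self):
        # Each topic name maps to an append-only list of messages.
        self.topics = defaultdict(list)

    def send(self, topic, message):
        # Producer side: append the message to the topic for its category.
        self.topics[topic].append(message)

    def read(self, topic, start=0):
        # Consumer side: read the messages of one topic from a given position.
        return self.topics[topic][start:]


broker = ToyBroker()
broker.send("items", {"item_id": 1, "name": "pen"})
broker.send("sales", {"item_id": 1, "qty": 3})
broker.send("items", {"item_id": 2, "name": "ink"})

# A consumer interested only in item information reads only the "items" topic.
for message in broker.read("items"):
    print(message)
```

The consumer of `items` never sees the sales message, which is the same separation that tables give the two parts of a database application.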

  • Retention Period: Messages in a topic are stored for a defined period of time so that disk space can be reclaimed, irrespective of throughput. The retention period is seven days by default, and we can configure it to whatever duration we choose. Kafka keeps messages for the defined period and then deletes them.
  • Space Retention Policy: We can also configure Kafka topics to clear messages when their size reaches a threshold set in the configuration. In practice, this limit should only come into play if you haven't done enough capacity planning before deploying Kafka in your organization.
  • Offset: Each message in Kafka is assigned a number called an offset. Topics consist of one or more partitions, and each partition stores messages in the order in which they arrive, so the offset identifies a message's position within its partition. Consumers acknowledge messages by offset, which means that all the messages before that offset have been received by the consumer.
  • Partition: Each Kafka topic consists of a fixed number of partitions, which you configure when you create the topic. Partitions are distributed across the cluster and help in achieving high throughput.
  • Compaction: Topic compaction was introduced in Kafka 0.8. There is no way to change previous messages in Kafka; messages only get deleted when the retention period is over. Sometimes, you may get new Kafka messages with the same key that include a few changes, and on the consumer side, you only want to process the latest data. Compaction helps you achieve this by building a map from each key to the offset of its latest message and keeping only that message for each key. This removes outdated duplicates from a large number of messages.
  • Leader: Partitions are replicated across the Kafka cluster based on the specified replication factor. Each partition has a leader broker and follower brokers, and all read and write requests to the partition go through the leader only. If the leader fails, another leader is elected and processing resumes.
  • Buffering: Kafka buffers messages on both the producer and the consumer side to increase throughput and reduce input/output (I/O). We will talk about this in detail later.
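
Partitions, offsets, and compaction from the list above can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyTopic`, `append`, `unread`, and `compact` are hypothetical names, and the CRC32-based key hashing merely stands in for whatever partitioner a real Kafka client uses.

```python
import zlib


class ToyTopic:
    """Toy partitioned log (hypothetical, not the Kafka API)."""

    def __init__(self, num_partitions):
        # The partition count is fixed when the topic is created.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Assumed partitioner: the same key always maps to the same partition.
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        # The offset is the message's position within its partition.
        offset = len(self.partitions[p])
        self.partitions[p].append((offset, key, value))
        return p, offset


def unread(partition, acknowledged):
    # Everything before the acknowledged offset counts as already received.
    return [rec for rec in partition if rec[0] >= acknowledged]


def compact(partition):
    # Build the key -> latest offset map, then keep only the latest record
    # for each key. Surviving records keep their original offsets.
    latest = {}
    for offset, key, _value in partition:
        latest[key] = offset
    return [rec for rec in partition if latest[rec[1]] == rec[0]]


topic = ToyTopic(num_partitions=3)
p1, o1 = topic.append("item-1", "price=10")
p2, o2 = topic.append("item-1", "price=12")  # same key, newer value
topic.append("item-2", "price=7")

compacted = compact(topic.partitions[p1])
print(p1 == p2, o1, o2)
print(compacted)
```

Both `item-1` messages land in the same partition with consecutive offsets, and after compaction only the newer one remains for that key. The sketch returns a new list rather than rewriting the partition in place, which keeps the original log available for comparison.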