(Our) Three Biggest Fails in Apache Kafka

Miroslav Juhos

How you learn from your mistakes.

Message brokers are a hit that has rocked the world of application development. Data once saved in a simple table has gained a time dimension. The change in our thinking was a nice bonus, because applications are no longer just pipelines trying to shovel data from point A into a database as fast as possible. Our team fell in love with Apache Kafka, and I want to tell you how we tried to break it on three separate occasions.

The First Fail: Don’t Use Kafka with Default Settings

I like technologies that work when you turn them on for the first time. I like languages that have a short Hello world. I like microwave ovens with two buttons: temperature and time. Unfortunately, Kafka is none of these.

A rather fundamental problem for us was the default topic retention. You spend a week playing with Kafka, sending it messages; then you come to work on Monday and a few messages seem to be missing. And if you don't send any more messages during the week, Kafka is completely empty. You guessed it: message retention is set to seven days by default.
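For illustration, this is how retention can be raised on a single topic with the CLI tools that ship with Kafka (a sketch; the broker address and topic name are placeholders, and this requires a running broker):

```shell
# Raise retention on one topic to 30 days (2592000000 ms).
# localhost:9092 and my-topic are placeholders.
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic \
  --alter --add-config retention.ms=2592000000

# Verify the change took effect:
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic --describe
```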

Hello world: A demonstration program with "Hello world" as its output that gives developers the feeling they understand the language.

Retention: The length of time a message is kept in Kafka before it disappears into the void.

Default: A pre-set value, sometimes set to a nonsense value. This is to heighten people’s vigilance.

Fail Number 2: How Kafkas Unite

Yes! We’re putting Kafka on stage, everything is working and the world is our oyster.

But wait a minute: our messages are being sent, but they aren’t in Kafka. Not only that, but topics from other teams are appearing. We panicked, because our hopes placed in a modern technological solution were slowly vanishing. It was working, but not the way we wanted. Meanwhile, unbeknownst to us, the mood was much the same in the other team. Once the argument about “why are you in our part of Kafka” subsided, it was time to find a solution.

Then we discovered a feature we previously knew nothing about. We had deployed Kafka into Kubernetes and it ran flawlessly. The other team did the same, with the same result. Each Kafka had its own broker, and both successfully received messages.

But then someone redeployed them onto the same Kubernetes node, where Kafka merged the brokers into a single cluster, and both teams were suddenly sending messages into one box. We were lucky that both teams were still testing, which is what allowed this behavior to rear its head.

I still shudder with terror when I think that could have happened a year later on the production version.
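The article doesn't say which fix we settled on, but one common way to keep two deployments from ever discovering each other is to give each its own ZooKeeper chroot path (a sketch; the hostnames and paths below are made up, and KRaft-based clusters would use distinct cluster IDs instead):

```
# Team A's broker configuration (server.properties):
zookeeper.connect=zookeeper:2181/kafka-team-a

# Team B's broker configuration:
zookeeper.connect=zookeeper:2181/kafka-team-b

# With separate chroot paths the brokers register in different
# ZooKeeper subtrees, so they can never merge into one cluster.
```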

Stage: An environment where developers test applications, and they hope they act just like they will in the production environment.

Topic: Kafka sorts messages into channels called topics.

Broker: The Kafka access point; it gives us the URL that applications working with Kafka connect to.

Kubernetes: The beast that our applications run on.

Cluster: A group of computers or applications working together (mostly in the same direction).

Kubernetes node: A physical machine in the server cluster (which could also be virtual). 

Fail Number Three: Select and Verify

If there is an entity in Kafka that consumes messages, transforms them, and then sends them on (to other topics), it’s called a processor. Since we code in Python, we looked into the Kafka Streams Java library (Streams is ideal for processing messages from Kafka), but that was it: no Streams library exists in Python.

The idea was to write a consumer, process the data, and publish the results as a producer. Then we came across a library called Faust that promised stream processing like Kafka Streams. We thought that at least the Java developers wouldn’t laugh at us.
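The consume-transform-produce loop we had in mind can be sketched in plain Python (a sketch assuming the third-party kafka-python package; the topic names, broker address, and the transformation itself are made up for illustration):

```python
def transform(value: bytes) -> bytes:
    """Example transformation: upper-case the payload."""
    return value.upper()


def run() -> None:
    # Imported here so the sketch can be read (and the transform
    # tested) without a running broker.
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(
        "input-topic",                       # made-up topic name
        bootstrap_servers="localhost:9092",  # placeholder address
        group_id="our-processor",
    )
    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    # Consume, transform, produce: a "processor" by hand.
    for record in consumer:
        producer.send("output-topic", transform(record.value))


# run()  # uncomment with a broker available
```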

And here we’ll list all the problems we caused ourselves:

  • If you put "Kafka Faust" into Google, you’ll most probably end up in a literature forum.
  • Faust currently has 80 contributors on GitHub – and Kafka has 820.
  • The documentation is such that you read it a couple of times, try something, then go back to the documentation and search and search; in other words, launching anything beyond the example took an enormous amount of effort.
  • The Faust project’s blog went dark, and so did new releases.

What did we learn?

  • Verify the libraries and frameworks you’re adding to your project.
  • A library is useless without comprehensible documentation.

Sorry Faust, but after what we went through together, we’ll manage without you.

Producer: The application that sends messages to Kafka

Consumer: The application that waits for Kafka to deliver a message to process it

Release: An application that a developer thinks works flawlessly

Java: A coding language that the vast majority of developers believe they know, but only a minority make their living from it.


Miroslav Juhos

Miroslav "Juhy" Juhos is a professional developer and amateur musician. The two are sometimes interchangeable. He's been working at Heureka for over four years and deals with app development for payment systems.

<We are social too/>

Interested in our work, technology, team, or anything else?
Contact our CTO Lukáš Putna.