Inspired by the entertaining http://www.damnyouautocorrect.com/ web site, here are some thoughts on the benefits and challenges of having auto-create enabled for Kafka topics. The relevant broker setting, documented at https://kafka.apache.org/documentation/#brokerconfigs, is:

`auto.create.topics.enable`

At first, auto-create seems like a convenience, a blessing, as it means developers don’t need to write code to explicitly create topics. On a given project the developers can focus on using the system as a service to share user-specified sets of data rather than writing extra code to interact with ZooKeeper, etc. (newer releases of Kafka include the AdminClient API, which deals with the ZooKeeper aspects).

Effects of relying on auto-create: topics are created with the broker's configured defaults for partition count and replication factor (`num.partitions` and `default.replication.factor`). These may not be ideal for a given topic and its intended use(s).

Adverse impacts of using auto-create

Deleting topics: The project uses Confluent Replicator to replicate data from Kafka Cluster to Kafka Cluster. As part of our testing, lots of topics were created. We wanted to delete some of these topics but discovered they were virtually impossible to kill, as the combination of Confluent Replicator and the Kafka Clusters resurrected the topics before they could be fully expunged. This caused almost endless frustration and adversely affected our testing, as we couldn’t get the environment sufficiently clean to run tests in controlled circumstances (Replicator was busy servicing the defunct topics, which limited its ability to focus on the topics we wanted to replicate in particular tests).

Coping with delays and problems creating topics: At a less complex level, auto-creation takes a while to complete and seems to happen in the background. When the tests (and the application software) tried to write to the topic immediately, various problems occurred from time to time. Knowing that such problems can occur is useful when reasoning about performance, reliability, etc.; however, it complicates the operational aspects of the system, especially as the errors surface when producing data (what the developers and users think is happening) rather than in the orthogonal step of creating a topic so that data can be produced.

Lack of clarity or traceability on who (what) created topics: Topics could be auto-created when code tried to write (produce), which was more-or-less what we expected. However, they could also be auto-created by trying to read (consume). The Replicator duly set up replication for that topic. For various reasons, topics with the same name could be created on one or more clusters, and replication happened both locally (within a Kafka Cluster) and to another cluster. We ended up with a mess of topics on various clusters, which was compounded by the challenges of cleaning up (deleting) the various topics. It ended up feeling like we were living through the after-effects of the Sorcerer’s Apprentice!

From a testing perspective

From a testing perspective we ended up adding code to our consumers that checked and waited for the topic to appear in Zookeeper before trying to read from it. This, at least, reduced some of the confusion and enabled us to unambiguously measure the propagation time for Confluent Replicator for topics it needed to replicate.
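
The check-and-wait idea can be sketched roughly as follows. This is not the code we used on the project; it is a minimal illustration in Python, and `wait_for_topic` and the caller-supplied `topic_exists` function are hypothetical names. How `topic_exists` actually looks the topic up (Zookeeper, client metadata, etc.) is deliberately left open.

```python
import time
from typing import Callable


def wait_for_topic(
    topic: str,
    topic_exists: Callable[[str], bool],  # hypothetical lookup supplied by the caller
    timeout_s: float = 30.0,
    poll_interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll until topic_exists(topic) is true and return the seconds waited.

    Raises TimeoutError if the topic is still missing after timeout_s.
    """
    start = time.monotonic()
    while True:
        if topic_exists(topic):
            return time.monotonic() - start
        elapsed = time.monotonic() - start
        if elapsed >= timeout_s:
            raise TimeoutError(
                f"topic {topic!r} not visible after {elapsed:.1f}s"
            )
        # Sleep for one poll interval, but never past the deadline.
        sleep(min(poll_interval_s, timeout_s - elapsed))
```

The returned wait time is what makes a helper like this useful for measuring propagation delay: a test can record it per topic instead of failing on the first read attempt.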

To determine how much effort was needed to remove the dependency on auto-create being enabled and used, we also wrote some code that explicitly created topics rather than relying on auto-create. That code amounted to fewer than 10 lines in the proof-of-concept. Production-quality code may need more in order to audit and log the creation, and to report problems and any run-time failures.
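
As an illustration of what the extra auditing and logging might look like, here is a rough sketch in Python. It is not our proof-of-concept code. `TopicSpec`, `TopicAdmin`, `ensure_topic` and `InMemoryAdmin` are all hypothetical names, and the `TopicAdmin` interface is an assumption standing in for whichever Kafka admin client is actually in use.

```python
import logging
from dataclasses import dataclass
from typing import Dict, Protocol, Set

log = logging.getLogger("topic_admin")


@dataclass(frozen=True)
class TopicSpec:
    """Settings chosen deliberately instead of inherited from broker defaults."""
    name: str
    partitions: int
    replication_factor: int


class TopicAdmin(Protocol):
    """Hypothetical minimal interface; adapt it to the real admin client."""
    def list_topics(self) -> Set[str]: ...
    def create_topic(self, spec: TopicSpec) -> None: ...


def ensure_topic(admin: TopicAdmin, spec: TopicSpec, requested_by: str) -> bool:
    """Create the topic if it is missing; return True if this call created it."""
    if spec.name in admin.list_topics():
        log.info("topic %s already exists (requested by %s)", spec.name, requested_by)
        return False
    try:
        admin.create_topic(spec)
    except Exception:
        # Log the failure with its traceback, then let the caller decide what to do.
        log.exception("failed to create topic %s (requested by %s)", spec.name, requested_by)
        raise
    log.info(
        "created topic %s partitions=%d replication_factor=%d (requested by %s)",
        spec.name, spec.partitions, spec.replication_factor, requested_by,
    )
    return True


class InMemoryAdmin:
    """Stand-in used only to exercise the sketch without a broker."""
    def __init__(self) -> None:
        self.topics: Dict[str, TopicSpec] = {}

    def list_topics(self) -> Set[str]:
        return set(self.topics)

    def create_topic(self, spec: TopicSpec) -> None:
        self.topics[spec.name] = spec
```

With a real cluster, `create_topic` would wrap the admin client's own create-topics call. Recording `requested_by` in the log addresses the traceability problem described earlier: every topic has a known creator and explicit partition and replication settings.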

Better Software Testing Blog

Seeking ways to improve the efficiency and effectiveness of our craft

Monthly Archives: March 2018

Damn you auto-create

