
The book "Designing Data Intensify Applications" is probably a good read and would be hard to summarize in a reddit reply. It's a fundamental problem not really to do with Kakfa specifically. Message ordering can get very complicated for high throughput system. The cancellation token is there for graceful shutdown when you downsize your pool.
#MASSTRANSIT RETRY CONSUMER CODE#
It's faster to reconnect in your code than wait for the consumer instance to exit. ECS/docker deploy of your consumer you can set concurrency via rules that are set outside your consumer code). I recommend you use something like Polly to recreate your connections in case they die, though you could potentially just have the consumer Close() and let your infrastructure spawn another one (i.e. You can use the context outside the while to hold on to things like datastore connections that are reused instead of recreating them per message. Write it like you would a Lambda or Function. You don't need to worry about what it is buffering in the background. This is pretty 90% of what you need to know:Ī typical Kafka consumer application is centered around a consume loop, which repeatedly calls the Consume method to retrieve records one-by-one that have been efficiently pre-fetched by the consumer in background threads.

Like if I have a consumer that's assigned 3 partitions presumably I'll get up to three messages to process. I'm asking for inside my consumer, what's the best practice for concurrent processing of messages. Normally, you produce a message in kafka from some outside process, usually web service. If you would like to scale you consumation, then you should share the same Consumer Group id, which will make a group of services do the same job in parallel, but managing different messages (disjunt).Īnd have it push messages into like a concurrent bag or queue Rebalance in Kafka you do not have to worry about, that's done automatically by Kafka. Consumer takes message by message, processes it and marks done. Like I said, you should have separate process for processing the messages, which has one long running thread for consumer. What's the best practice for building something like that with Kafka? Do I run the consumer is a separate thread/long running task and have it push messages into like a concurrent bag or queue? In that case, how do I handle a rebalance? Maybe it's that I got thrown into the deep end and just feel underwater trying to make sense of this lol Our existing asynchronous job setup uses an in house thing, for not very good reasons, and patching Kafka into that seems like a bad idea tbh since it's asynchronous to process messages concurrently. The consumer side is more where my concern lies because there's already some confusion with our existing SNS to SQS setup (I've found out more than one team has been sharing queues between applications and then complaining they don't get all the messages 🙃🙃🙃). The actual topic creation with partition and replication settings will need some thought as well as figuring out the right ACK and retry settings for each service but I'm not overly worried about that. Yeah the producer side I'm not really worried about other than figuring out things like serializes and I'm hoping we get the go ahead to just throw something like AVRO at it. I'm not surprised by that since Kafka has a different model than traditional queue systems. So my hopes of pulling a toy of the shelf seem diminished. It seems NServiceBus doesn't support Kafka at all, outside of a community contribution that's been untouched for several years even in it's forks. It looks like Kafka gets layered on top of an actual queue and while I'm not sure of the benefits of this this might be my path forward since we use currently use SQS. I took a peek at mass transit but they're concept of "riders" (and the kind of cutesy "they board the bus when it starts" language) confuses me.

I found a non-confluent package (kafka-sharp?, I think) but it appeared to also use the same blocking IO but I need to look it over again. Kind of feels like they went "well the Java threading model works, so C# can just do that to" without thinking of cultural differences between the two languages. Worse, it appears that Confluent doesn't seem interested in supporting an asynchronous model, at least in the short to midterm. I'm just getting started with Kafka and I'm not exactly enthused to introduce blocking synchronous IO into our asynchronous jobs. Are you using something off the shelf, do you have your own in house framework? Any kind of gotchas or pitfalls or neat tricks I should be aware of? Did you implement things like DLQs (which to my understanding means writing the message to a different topic so it unblocks the one it was originally written to)? I don't mean "what did you build on top of Kafka" but literally your code.
