We often hear from Kafka users that it took them nine months to implement production-ready Kafka-based data pipelines. We see customers with 50 teams relying on a single Kafka cluster managed by one person. Data engineers cannot easily simulate a production environment without a complex initial setup, and data scientists struggle with data integration when building an offline ML pipeline to experiment, reproduce models, and debug them locally. Let's explore how to skip the headache of creating compute clusters and managing partitions, shards, and worker setup.
This talk demonstrates a different way of implementing a cost-efficient serverless streaming pipeline in pure Python. We will use KEDA (Kubernetes Event-Driven Autoscaling) integrated with NATS JetStream to distribute real-time events, and the Fission serverless framework to run lambda-like functions on Kubernetes.
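To make the idea of a "lambda-like function" concrete, here is a minimal sketch of a stateless event handler in pure Python, together with a local loop that stands in for the message stream so the logic can be tried and debugged on a laptop. It deliberately uses only the standard library and does not call any Fission, KEDA, or NATS API. The names `handle_event` and `run_locally` and the `sensor` / `temp_c` / `temp_f` fields are hypothetical, chosen only for illustration.

```python
import json
from typing import Iterable, Iterator


def handle_event(payload: bytes) -> bytes:
    """Stateless handler: decode one JSON event, enrich it, return JSON bytes."""
    event = json.loads(payload)
    # Hypothetical enrichment step: convert a Celsius reading to Fahrenheit.
    event["temp_f"] = event["temp_c"] * 9 / 5 + 32
    return json.dumps(event, sort_keys=True).encode("utf-8")


def run_locally(messages: Iterable[bytes]) -> Iterator[bytes]:
    """Local stand-in for the event stream: pass each message to the handler."""
    for message in messages:
        yield handle_event(message)


if __name__ == "__main__":
    # Sample event invented for this sketch.
    sample = [json.dumps({"sensor": "s1", "temp_c": 20}).encode("utf-8")]
    for result in run_locally(sample):
        print(result.decode("utf-8"))
```

Because the handler keeps no state between calls, the same function body could in principle be run as many parallel instances as the event volume requires, which is the property an event-driven autoscaler relies on.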
Bobur is a developer advocate and speaker specializing in software and data engineering. With over 10 years of experience in IT, he blogs about open-source technologies and the community around them. Bobur works with companies...