About the job
Join our team as a Junior Data Engineer and play a key role in building out our real-time data processing capabilities. You will assist in migrating existing data and ML workflows from batch processing to scalable streaming solutions. The position involves hands-on engineering, close collaboration with Data Scientists, and operational oversight of production data pipelines.
Technology Environment
- Use real-time data streaming technologies for ML model inference.
- Work with distributed data processing frameworks that enable scalable, low-latency pipelines.
- Work with containerized workloads orchestrated in cloud-native environments.
- Employ monitoring and observability tools to ensure the reliability and performance of data pipelines.
- Engage with a Python-based ecosystem that supports ML model integration and lifecycle management.
Key Responsibilities
- Transform batch inference workflows into efficient streaming pipelines.
- Establish streaming semantics to replace batch windows, including micro-batching, windowing, and state management.
- Design Kafka topic structures, partitioning strategies, and consumer group patterns tailored for prediction workloads.
- Implement strategies for checkpointing, backpressure handling, and delivery guarantees.
- Package and version ML model artifacts for streaming jobs to facilitate safe rollouts and rollbacks.
- Optimize performance for throughput and latency through effective batching strategies and resource allocation.
- Deploy and manage streaming jobs with comprehensive monitoring and alerting.
- Integrate streaming outputs into downstream ETL/BI systems.
- Collaborate with Data Scientists on CI/CD for streaming models while monitoring model performance and drift.
Team & Collaboration
- Work in a distributed delivery model, coordinating closely with the central AI/BI team in Germany.
- Collaborate daily through MS Teams, Jira, and Confluence.
- Utilize Agile methodologies (Scrum/Kanban) within cross-functional squads.

