About the job
Tiger Analytics is on the lookout for a talented and forward-thinking Machine Learning Engineer with practical experience in Google Cloud Platform (GCP) and Vertex AI. In this role, you will be pivotal in designing, developing, and implementing robust ML solutions. You will oversee the end-to-end ML lifecycle, encompassing data ingestion, model serving, and monitoring, ensuring seamless operationalization of machine learning models.
Key Responsibilities:
- Develop, train, and refine ML models utilizing Vertex AI, including Vertex Pipelines, AutoML, and custom model training.
- Design and build scalable ML pipelines for feature engineering, training, evaluation, and deployment.
- Deploy production models via Vertex AI endpoints and ensure integration with downstream applications or APIs.
- Collaborate effectively with data scientists, data engineers, and MLOps teams to create reproducible and reliable ML workflows.
- Monitor model performance and set up alerting mechanisms, retraining triggers, and drift detection.
- Employ GCP services such as BigQuery, Dataflow, Cloud Functions, Pub/Sub, and GCS within ML workflows.
- Implement CI/CD best practices for ML models utilizing Vertex AI Pipelines, Cloud Build, and GitOps.
- Enforce model governance, versioning, explainability, and security best practices in Vertex AI.
- Thoroughly document architecture decisions, workflows, and model lifecycles for internal stakeholders.
Requirements:
1. Advanced Generative AI
- Advanced RAG, including graph-based hybrid retrieval
- Multimodal agents
- In-depth knowledge of agentic frameworks such as ADK and LangChain
- Experience in fine-tuning and distillation
2. Proficiency in Python
- Expert in Python with strong OOP and functional programming skills
- Proficient in ML/DL libraries: TensorFlow, PyTorch, scikit-learn, pandas, NumPy, PySpark
- Experience with production-grade code, testing, and performance optimization
3. GCP Cloud Architecture & Services
- Proficiency in GCP services such as:
  - Vertex AI
  - BigQuery
  - Cloud Storage
  - Cloud Run
  - Cloud Functions
  - Pub/Sub
  - Dataproc
  - Dataflow
- Understanding of IAM and VPC
4. API Development & Integration
- Design and build RESTful APIs using FastAPI or Flask
- Integrate ML models into these APIs to serve predictions

