About the Team You'll Join

The ML Platform Team Leader will oversee the pivotal ML Platform team responsible for the enterprise's AI/ML development and operational environment. This team comprises the 'ML Ops Part', which builds and operates a stable MLOps environment based on Kubernetes, and the 'LLM Part', which leads the development of cutting-edge LLM applications along with the necessary LLM Ops environment.

Upon joining, you will oversee both Parts, establishing and executing a technical strategy that provides a scalable and efficient platform, allowing data scientists and ML engineers to focus solely on model development and experimentation. You will play a crucial role in presenting the vision for the ML Platform, fostering team growth, and solidifying the foundation of the company's AI technology capabilities.

Your Responsibilities

Establish and lead the technical vision and long-term roadmap for the ML Platform team (ML Ops Part, LLM Part).
Mentor team members, conduct code reviews, and manage performance to drive team growth.
Design and oversee the operation of a scalable and reliable ML Platform (model training, deployment, serving, monitoring) based on Kubernetes.
Lead the technical development of various LLM applications and establish the LLM Ops environment for fine-tuning, serving, and evaluating large-scale models.
Collaborate closely with cross-functional teams, including data scientists, ML engineers, and product teams, to resolve bottlenecks in AI/ML development and maximize developer experience (DX).
Continuously research the latest MLOps and LLM technology trends, design the architecture of the enterprise AI tech stack, and provide technical direction.
Define and continuously improve metrics for platform stability, cost efficiency, and performance optimization.

Ideal Candidate

We are looking for someone with proven experience in leading engineering teams (ML, Infrastructure, Platform, etc.), successfully presenting technical visions, establishing roadmaps, and nurturing team members.
Deep practical experience in building and operating MLOps platforms (e.g., Kubeflow, MLflow, CI/CD, model serving) in a Kubernetes environment is essential.
A strong understanding of, and practical experience in, developing LLM applications (RAG, fine-tuning, agents, etc.) or LLM Ops (large-scale model serving, Vector DB, evaluation pipelines) is required.
Experience in designing and optimizing system architecture capable of handling large-scale traffic and data is preferred.
You should be able to define complex technical problems, communicate clearly with various stakeholders, and solve issues strategically.
A passion for designing and advancing the platform with business impact and developer experience (DX) in mind, beyond mere technical adoption, is essential.

Resume Recommendations

Detail your contributions to leading ML platform or system projects (technical leadership, architecture design, team management, strategic planning).
For each project, clearly outline how you addressed the technical or organizational challenges, including your proposed solutions (architecture, adopted technologies), collaboration processes with team members and stakeholders, and final results (platform performance improvements, faster development, cost reductions).
Instead of simply listing tasks, include the insights and lessons you gained as a technical leader, along with the criteria you used for making technical trade-offs.
Emphasize quantifiable achievements that you have led.
If you have successfully built or operated large-scale systems by collaborating with multiple engineering teams to resolve complex technical problems, please share that experience.

Journey to Joining Toss Bank

Application submission > Job interview > Cultural fit interview > Leadership interview > Reference check > Compensation negotiation > Final acceptance

Please Note

Any false information found in the resume or submitted documents may lead to disqualification.

Mar 9, 2026