About the job
Special Notice
This is a full-time position at webAI in Austin, TX. Employment is not tied to project awards or external funding.
Role Overview
webAI seeks a Senior Machine Learning Engineer to strengthen its Public Sector work. The focus: building and refining production-ready AI systems designed for secure, distributed environments. This role centers on moving models from prototype to reliable, scalable deployment, supporting government cloud platforms and edge devices, even where connectivity is limited or absent.
What You Will Do
- Design and implement agentic workflows for multi-step reasoning, tool use, and automated decision-making in production systems.
- Convert research AI models into scalable systems used in real-world applications.
- Develop adaptive machine learning solutions using parameter-efficient fine-tuning (PEFT) methods such as LoRA, along with on-device inference. Work with frameworks like PyTorch, TensorFlow, and Hugging Face Transformers to create, fine-tune, and optimize models.
- Apply model optimization methods such as quantization, pruning, distillation, and hardware-specific acceleration.
- Build and manage Retrieval-Augmented Generation (RAG) pipelines, including integration with vector databases for contextual retrieval.
- Work on multimodal AI systems spanning computer vision, audio, and natural language processing.
- Improve model performance for distributed and resource-constrained environments, ensuring systems remain dependable under intermittent or absent connectivity.
Requirements
- Active U.S. security clearance.
- At least 4 years of experience in applied AI, machine learning engineering, or production AI systems.
- Strong proficiency with PyTorch, TensorFlow, or Hugging Face Transformers.
- Hands-on experience deploying AI models to cloud, edge, and mobile hardware.
- Expertise in model compression and optimization (quantization, pruning, distillation).
- Background in building RAG pipelines and integrating vector databases.

