About the job
Fortytwo is a decentralized AI protocol built on Monad that uses underused consumer hardware for swarm inference. This approach allows Small Language Models to carry out complex multi-step reasoning at reduced cost, surpassing existing leading models in capability and scalability.
Key Responsibilities:
Deploy robust, scalable machine learning services with optimized infrastructure and automated Kubernetes clusters.
Enhance GPU resource utilization through Multi-Instance GPU (MIG) and Node Offloading System (NOS).
Oversee cloud storage solutions (e.g., S3) to guarantee availability and performance.
Incorporate state-of-the-art ML methods, including Low-Rank Adaptation (LoRA) and model merging, into operational workflows:
Adapt leading ML codebases to align with organizational requirements.
Implement LoRA methodologies and model merging processes.
Manage and deploy large language models (LLMs), small language models (SLMs), and large multimodal models (LMMs).
Utilize technologies such as Triton Inference Server for model serving.
Leverage advanced serving frameworks like vLLM and Text Generation Inference (TGI).
Optimize models using ONNX and TensorRT to ensure efficient deployment.
Develop Retrieval-Augmented Generation (RAG) systems that integrate spreadsheet, mathematics, and compiler processing.
Establish monitoring and logging systems with tools such as Grafana, Prometheus, Loki, Elasticsearch, and OpenSearch.
Create and maintain CI/CD pipelines via GitHub Actions for streamlined deployment.
Develop Helm templates for swift Kubernetes node deployment.
Automate workflows using cron jobs and Airflow Directed Acyclic Graphs (DAGs).
Qualifications:
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related discipline.
Expertise in Kubernetes, Helm, and containerization technologies.
Experience with GPU optimization (MIG, NOS) and familiarity with cloud platforms (AWS, GCP, Azure).
Proficiency in monitoring tools (Grafana, Prometheus) and scripting languages (Python, Bash, etc.).
