AI Reliability Engineer

Palona
Seattle, Washington, United States
On-site, Full-time

Qualifications

Required:

  • A minimum of 2 years of professional software engineering experience.
  • Strong proficiency in Python programming.
  • Experience with cloud platforms such as AWS, Azure, or GCP.
  • Familiarity with monitoring and observability tools like Datadog, CloudWatch, Grafana, or similar.
  • Understanding of CI/CD pipelines and infrastructure as code.
  • Experience with APIs and distributed systems.
  • A willingness to learn new AI/LLM concepts, frameworks, and technologies.

Preferred:

  • Experience in writing test frameworks or automated evaluation systems.
  • Background in building internal tools or developer platforms.
  • Exposure to LLMs, prompt engineering, or AI agent systems.
  • Startup experience or the ability to thrive in fast-paced environments.
  • A background in NLP, computer vision, or AI agent systems.

About the job

At Palona, we are pioneering the integration of cutting-edge generative and multimodal AI into the hospitality sector. Our engineering team moves quickly, using generative AI models to build products that continually adapt and improve. In this fast-evolving landscape, traditional software engineering practices need to evolve, because the ways AI outputs go wrong differ significantly from conventional software failures. This position is crucial in establishing an engineering discipline that identifies potential issues before they impact our customers.

In this role, you will engage with evaluation pipelines, observability, cloud infrastructure, and CI/CD processes to enhance Palona's AI agent platform. You will blend DevOps and AI reliability, overseeing production infrastructure while developing tools that ensure optimal AI agent performance.

Responsibilities

As an AI Reliability Engineer, your key responsibilities will include:

  • Creating and implementing observability systems to identify quality degradation, latency issues, and system anomalies in production, including the development of instrumentation, dashboards, and alerting mechanisms.
  • Writing and maintaining automated tests to assess agent output quality, incorporating deterministic checks and LLM-as-judge evaluations.
  • Developing automated release and validation systems to streamline deployments across different environments and enforce quality gates for AI-driven products.
  • Building and refining platform infrastructure using infrastructure as code, with a strong emphasis on reliability, scalability, and cost efficiency.
  • Enhancing evaluation pipelines that gauge AI agent conversation quality, accuracy, and safety, collaborating with product and engineering teams to refine evaluation criteria.
  • Designing and developing internal tools and services that bolster AI reliability, evaluation, and operational workflows.
  • Architecting new systems to tackle emerging reliability and quality challenges within the AI agent platform.
  • Producing production-grade code for reliability and evaluation infrastructure, contributing as a software engineer rather than merely an operator.

About Palona

Palona is at the forefront of applying innovative generative and multimodal AI technologies to transform the hospitality industry. Our team is dedicated to delivering rapid advancements that enhance customer experiences and operational efficiencies.
