About Gecko Robotics
At Gecko Robotics, we empower the world’s most vital organizations to maintain the availability, reliability, and sustainability of critical infrastructure. Our innovative and interconnected solutions integrate wall-climbing robots, advanced sensors, and an AI-driven data platform, offering clients unparalleled insights into the current and future conditions of their physical assets. This capability enables real-time decision-making, boosts operational efficiency and safety, enhances mission readiness, and safeguards the environment and society against the repercussions of infrastructure failures.
Your Role
As a Data Engineer, you will play a pivotal role in developing and enhancing the data infrastructure for an AI-centric product that encompasses document intelligence, time-series IoT data, and AI-driven systems. This is a hands-on, end-to-end position ideal for individuals who excel in ambiguous environments, are comfortable working closely with data models and customers, and want to directly shape how data and AI are applied in practice.
This position will primarily focus on the power industry, where your contributions will directly support the dependable operation of power plants that serve millions of people daily.
Your Responsibilities
You will be responsible for designing, implementing, and managing data systems throughout their lifecycle—from raw data ingestion to AI-driven outputs utilized by clients in real-world scenarios. Collaborating with customers and internal teams, you will identify real challenges, convert them into technical solutions, and iterate swiftly. You will create pipelines supporting document processing, sensor data, and machine learning workflows, engage in feature engineering and model experimentation as necessary, and take ownership of production systems. You will make practical architectural decisions, enhance reliability over time, and help establish best practices as our team and product grow.
Technologies We Leverage
- Python, SQL
- Cloud-native data and GenAI tools (GCP, Vertex AI, etc.)
- Streaming and messaging systems
- Distributed processing frameworks
- Data warehouses, lakes, and object storage
- Time-series and NoSQL databases
- ML and AI tools (feature stores, vector databases, model pipelines)
- Docker, Kubernetes, and infrastructure-as-code tools