About the job
We are looking for a meticulous and analytical Cluster Duty Engineer to join our team in Dubai, United Arab Emirates. This role covers the management and maintenance of our server cluster infrastructure, ensuring our systems operate reliably and at peak performance. The ideal candidate will have exceptional organizational skills and the ability to resolve technical challenges quickly in a dynamic environment.
Responsibilities
- Monitor the performance of cluster systems and infrastructure utilizing industry-standard monitoring tools and dashboards.
- Respond efficiently to system alerts and incidents, conducting root cause analyses and implementing effective solutions.
- Conduct routine maintenance tasks including system updates, patches, and configuration management.
- Troubleshoot hardware and software issues impacting cluster operations, documenting findings meticulously.
- Maintain comprehensive technical documentation regarding cluster configurations, procedures, and incident resolutions.
- Collaborate with cross-functional teams to enhance system performance and facilitate capacity planning.
- Execute backup and disaster recovery plans to ensure data integrity and business continuity.
- Perform system diagnostics and performance analysis to uncover optimization opportunities.
- Escalate critical issues to senior engineering staff as necessary, providing detailed incident reports.
- Adhere to all established protocols and standard operating procedures for cluster management activities.

