Head of Infrastructure
Job Overview
Date Posted: April 13, 2026
Expiration date: June 18, 2026
Job Description
411_3422108
55,000–60,000 AED per month + bonus + family medical and insurance
We are seeking an exceptional Head of Infrastructure to lead and evolve a global technology platform supporting large-scale distributed operations.
This role owns the reliability, scalability, security, and performance of production systems across cloud, edge, and microservices environments.
Reporting to the CTO and part of the Technology Leadership Team, you will inherit a mature, complex ecosystem and drive its evolution for global scale across 50+ markets.
Requirements
- 12+ years in infrastructure, platform, or systems engineering
- 5+ years in senior leadership roles such as Head of Infrastructure, SRE, or Platform Engineering
- Proven experience managing large-scale distributed systems across cloud and edge
- Strong background in microservices and real-time data processing
- Deep Kubernetes expertise in production, including multi-cluster environments
- Experience across multi-cloud environments such as AWS, Azure, and GCP
- Track record managing high-throughput messaging systems such as Kafka or RabbitMQ
- Experience with stream processing frameworks such as Flink or Spark
- Exposure to AI and ML infrastructure including GPU environments and model deployment
- Experience operating across multiple regions, ideally APAC, Middle East, and Europe
Key Responsibilities
Platform Architecture and Deployment
- Own end-to-end infrastructure across cloud and edge environments
- Lead Kubernetes and container orchestration strategy across hybrid deployments
- Define and execute multi-cloud strategy across AWS, Azure, GCP, and regional providers
- Build infrastructure as code and automated deployment pipelines
- Ensure compliance with global data residency and sovereignty requirements
Database and Storage
- Manage a polyglot database environment across relational, time-series, graph, and cache layers
- Design scaling, replication, backup, and disaster recovery strategies
- Drive zero-downtime migrations and performance optimisation
Messaging and Microservices
- Scale high-throughput messaging systems handling large IoT data volumes
- Own service governance, API management, and microservices communication
- Implement monitoring, alerting, and capacity planning
Real-Time and Data Processing
- Lead stream processing infrastructure for real-time data and analytics
- Optimise latency and throughput across distributed systems
- Manage batch processing and distributed task scheduling
AI and Compute Infrastructure
- Own infrastructure for AI model training and real-time inference
- Design GPU and accelerator strategy for cost and performance
- Support LLM deployment and simulation workloads
Frontend and Edge Delivery
- Optimise global content delivery and WebGL performance
- Manage CDN and caching strategies
IoT and Connectivity
- Scale infrastructure supporting millions of connected devices
- Manage edge gateways and protocol integrations
Observability and Reliability
- Own monitoring, logging, tracing, and performance management
- Establish incident response, on-call, and SLA frameworks
- Improve system reliability through proactive alerting
Security and Compliance
- Implement secure infrastructure across all environments
- Ensure compliance with global data protection regulations
- Lead vulnerability management, disaster recovery, and audit readiness
AI-Driven Operations
- Embed AI into infrastructure operations and automation
- Enable predictive monitoring, cost optimisation, and remediation
Leadership and Team Building
- Build and lead a global infrastructure team across SRE, DevOps, Data, and Security
- Create scalable team structures and clear ownership models
- Drive a culture of reliability and continuous improvement
Stakeholder Collaboration
- Partner with leadership on strategy, cost, and global expansion
- Align infrastructure with business, legal, and commercial priorities
Technical Expertise
- Kubernetes, container orchestration, and cluster management
- Polyglot databases across relational, time-series, graph, and distributed storage
- Messaging and streaming systems focused on scale, latency, and reliability
- Service governance, API management, and microservices architecture
- Observability across metrics, logging, and distributed tracing
- Infrastructure as code using Terraform, Ansible, or similar
- Security including zero trust, network segmentation, and compliance frameworks
- Strong understanding of IoT and edge computing environments
Preferred Background
- Experience within IoT, PropTech, energy, or complex platform environments
- Exposure to real-time sensor data platforms and edge deployments
- Experience with Chinese cloud platforms or global infrastructure expansion
- Background supporting AI platforms including LLM deployment
- Experience scaling infrastructure through high-growth or international expansion
- Exposure to 3D visualisation or simulation platforms
2026-03-28 08:10:45