Location: UAE
About Us:
RFS HR Consultancy connects global tech leaders with top-tier infrastructure and AI talent. We specialize in AI/ML engineering, computing hardware systems, and enterprise architecture roles across the UAE.
About Our Client:
Our client is advancing AI research and development with cutting-edge computing infrastructure. This role is central to managing high-performance systems that support large-scale AI workloads across enterprise applications.
Responsibilities:
- Support and manage GPU-based AI infrastructure in data center environments.
- Maintain HPC/AI clusters, network configurations, and compute performance.
- Collaborate with data scientists and AI teams to align system specs with ML workloads.
- Troubleshoot hardware and software issues affecting AI frameworks and the workloads that run on them.
- Automate deployment processes using DevOps and infrastructure-as-code tools.
- Ensure system reliability, scalability, and compliance with enterprise security standards.
Requirements:
- 3–5 years of experience in AI infrastructure, HPC systems, or large-scale computing environments.
- Strong hands-on knowledge of NVIDIA GPUs, CUDA, and Linux-based environments.
- Familiarity with AI/ML frameworks such as TensorFlow or PyTorch.
- Experience with DevOps tools like Docker, Kubernetes, Ansible, or Terraform.
- Solid understanding of storage and network architecture for distributed systems.
- Excellent problem-solving and collaboration skills.
What our client offers:
Competitive salary and a comprehensive benefits package.