
AI / LLM Deployment Engineer

Walker Lovell · Abu Dhabi Emirate, United Arab Emirates

Location: Remote (GST time zone preferred) with occasional travel to Abu Dhabi if required
Travel: Occasional international travel
Compensation: Exceptional package reflecting seniority, technical expertise and impact

What's in it for you?
This isn't another AI application role. You'll lead the deployment of large language models including DeepSeek, Kimi and Qwen into sovereign, air-gapped environments where GPU performance, inference optimisation and security are business critical. If you're passionate about high-performance AI infrastructure, this is an opportunity to solve problems that very few engineers get to tackle.

Package / Benefits
- Exceptional package reflecting seniority and specialist expertise
- Fully remote initially with flexibility for future relocation if desired
- Visa sponsorship available where applicable
- Work with cutting-edge open-weight LLMs and enterprise GPU infrastructure
- Influence deployment architecture from the ground up

Why this business
Join a globally focused technology business developing sovereign AI and intelligence platforms for highly regulated environments across multiple international markets. Working at the forefront of secure AI deployment, the organisation is investing heavily in advanced infrastructure and offers the opportunity to solve technically demanding challenges alongside a highly experienced engineering team.

What you'll be doing
- Architect and deploy LLMs including DeepSeek, Kimi, Qwen and LLaMA into secure, air-gapped production environments
- Configure and optimise NVIDIA H100/H200 GPU clusters, NVLink and InfiniBand infrastructure for high-performance inference
- Apply GPTQ, AWQ and GGUF quantisation techniques to maximise deployment efficiency without compromising model performance
- Deploy and optimise inference runtimes including vLLM, TGI and Ollama within Kubernetes environments, delivering target throughput and latency SLAs

What you'll bring
- Proven commercial experience deploying production LLMs using vLLM, TGI, Ollama or equivalent inference platforms
- Expert knowledge of Kubernetes, NVIDIA GPU infrastructure, GPU memory optimisation and high-performance computing
- Hands-on experience with model quantisation techniques including GPTQ, AWQ or GGUF
- Experience delivering on-premise or air-gapped AI deployments. Experience within government, defence, cyber security or other highly regulated environments would be advantageous.

Who this suits
You're an infrastructure engineer who thrives on solving complex deployment challenges rather than building AI applications. You understand what it takes to run large language models reliably at scale, enjoy optimising GPU performance and want to work on technically demanding projects where security, performance and engineering excellence are non-negotiable.

Apply now for a confidential conversation with Walker Lovell.