Quality Assurance Engineer
Arcus Search · Abu Dhabi, Abu Dhabi Emirate, United Arab Emirates
We are seeking a Senior QA Automation Engineer to lead the validation and verification strategies for our client's AI transformation. In this role, you will define "what good looks like" for non-deterministic AI systems, ensuring that Large Language Models (LLMs) and predictive engines meet the strict reliability standards required for the defense and enterprise sectors.

You will act as the bridge between Agile development and formal Systems Engineering. Your mandate is to build automated testing frameworks that validate AI behaviors against "Ground Truth" datasets and ensure our AI agents pass rigorous Test Readiness Reviews (TRR) and Functional Configuration Audits (FCA).

Key Responsibilities

1. AI & LLM Validation

Non-Deterministic Testing: Architect automated frameworks to evaluate Generative AI outputs for hallucination, consistency, and factual accuracy against "Gold Standard" datasets.

RAG Evaluation: Implement automated metrics (e.g., RAGAS, faithfulness, answer relevance) to verify that Retrieval-Augmented Generation pipelines accurately cite technical and regulatory documentation.

Prompt Regression: Design regression suites to monitor "prompt drift," ensuring model updates do not degrade the quality of AI-generated engineering documents.

2. Integration & System Verification

Enterprise Integration: Build robust tests to validate data consistency between AI agents and critical systems (e.g., SAP S/4HANA, Ariba), ensuring the integrity of Bill of Materials (BOM) and financial data.

Performance Benchmarking: Design tests to validate latency and throughput for forecasting models and risk-scoring engines using tools like Locust, JMeter, or K6.

API & Security Validation: Automate testing of secure API gateways, verifying Role-Based Access Control (RBAC) and PII redaction logic before data reaches AI models.

3. Governance & Traceability

V-Model Alignment: Map automated test cases to "System Requirements" to create digital evidence for formal Verification and Validation (V&V) reports.

Stage Gate Compliance: Prepare "Test Readiness" packages for formal reviews, providing quantitative evidence that systems are stable enough to move from MVP to Production.

Defect Lifecycle Management: Manage the feedback loop between Requirements Quality Assistants and development teams, tracing AI logic defects back to specific model versions.

What You'll Bring

Technical Requirements

Core Automation: Expert proficiency in Python (Pytest) and standard libraries (Selenium/Playwright, Requests).

AI Evaluation: Hands-on experience with LLM evaluation frameworks (e.g., DeepEval, TruLens) and "Ground Truth" dataset management.

Performance Engineering: Proficiency in crafting Performance Test Plans and implementations (Locust, K6, etc.).

Data Validation: Expertise in SQL and data quality tools (e.g., Great Expectations) for Data Lakehouses and Vector Databases.

CI/CD & DevOps: Strong experience integrating quality gates into GitLab CI/CD pipelines.

Engineering Practices: Deep understanding of modern QE practices, including Shift Left, Test Pyramid, and Mono-repo architectures.

Professional Qualifications

Experience: 5+ years in QA Automation, with 2+ years focused on complex data-driven applications, ML models, or AI agents.

Domain Expertise: Background in Defense, Aerospace, or highly regulated industries is a strong plus. Familiarity with IV&V processes is highly desirable.

Analytical Mindset: Ability to define pass/fail criteria for probabilistic systems and communicate "Confidence Levels" to engineering leadership.

This is a Fixed Term Contract role.