Software Engineer, LLM Evaluation

Blue Lynx · Netherlands

Our client’s mission is to empower people, build community, and bring the world closer together. Through their apps and services, they are building a different kind of company that connects people worldwide and provides meaningful ways to share what matters most.

We are looking for a Software Engineer, LLM Evaluation, to join their team remotely in the Netherlands.

The successful candidate will work closely with researchers and engineers to evaluate model performance, conduct data-driven analyses, and contribute to research initiatives related to LLM pretraining and evaluation.

Job Profile for Software Engineer, LLM Evaluation

Responsibilities will include, but not be limited to:

- Analyse and evaluate large language models and their performance across various tasks and benchmarks
- Execute benchmark evaluations and generate performance metrics and insights
- Conduct quantitative and qualitative data analysis to support research objectives
- Contribute to the design, implementation, and validation of new evaluation methodologies
- Support research initiatives related to LLM pretraining and model evaluation
- Develop and maintain machine learning and deep learning systems and tooling
- Build research workflows and experimental frameworks using Python and PyTorch
- Enable rapid experimentation and support the execution of research initiatives
- Collaborate with researchers and engineers within a multidisciplinary team environment

Candidate Profile for Software Engineer, LLM Evaluation

- Must be fluent in English, both written and spoken
- Bachelor's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related technical discipline
- Master's degree in Computer Science, Machine Learning, AI, or a related discipline is desirable; a PhD in a relevant field would be considered a strong advantage
- 3–5 years of experience working with large language models, including pretraining and evaluation
- Strong hands-on programming experience in Python
- Proven experience building and maintaining ML/DL systems and research infrastructure
- Experience developing machine learning and deep learning solutions using PyTorch
- Publications in machine learning, natural language processing, artificial intelligence, or related fields are a plus
- Experience working with transformer architectures, large language models, and/or multimodal models is an advantage
- Ability to write clean, efficient, and production-quality code
- Strong interest in model evaluation, experimentation, and data analysis
- Demonstrated scientific curiosity and problem-solving skills

What Our Client Offers

- 25 holidays per annum
- Pension plan
- Opportunity to work alongside experienced researchers and engineers on cutting-edge AI initiatives
- Exposure to state-of-the-art methodologies and technologies within the LLM ecosystem
- Collaborative and research-driven environment in a technologically advanced office