Senior Data Engineer

Global Software Solutions Group · Dubai, Dubai, United Arab Emirates

We are looking for a highly skilled Senior Data Engineer with strong expertise in PySpark, Python, and Big Data technologies to design, build, and optimize scalable data platforms and pipelines. The ideal candidate will work closely with Data Scientists, Business Analysts, and Analytics Delivery teams to develop high-performance analytics solutions and support advanced data science initiatives.

Requirements

Key Responsibilities
- Collaborate with Analytics Delivery Leads and Lead Data Engineers to understand business requirements and deliver impactful data solutions
- Work closely with Data Scientists and cross-functional teams to solve complex business problems through data engineering solutions
- Manage data onboarding, data access, and stakeholder coordination for analytics initiatives
- Design, develop, and maintain scalable, secure, and high-performance data pipelines
- Acquire, ingest, process, and transform large-scale structured and unstructured datasets
- Implement data engineering best practices for building reliable and production-ready data platforms
- Perform data wrangling, cleansing, transformation, and feature engineering for machine learning and analytics use cases
- Design and develop modular data pipelines to generate reusable features and modeling datasets
- Build and optimize data architectures supporting advanced analytics and machine learning workloads
- Contribute to data platform design by selecting appropriate technologies across Big Data, SQL, and NoSQL ecosystems
- Ensure data quality, integrity, governance, security, and scalability across the data lifecycle
- Participate in Agile squads and collaborate effectively with stakeholders across business and technology teams
- Contribute to enterprise data architecture strategy and roadmap aligned with business objectives

Required Technical Skills

Core Data Engineering
- Strong experience in Data Engineering and Big Data solutions
- Expert-level proficiency in PySpark
- Strong programming experience in Python
- Experience building large-scale distributed data processing pipelines

Data Processing & ETL
- Data Ingestion
- Data Transformation
- Data Wrangling
- Data Preparation
- Data Modeling
- Feature Engineering
- ETL/ELT Development
- Batch Processing
- Data Pipeline Development

Big Data Technologies
- Apache Spark / PySpark
- Distributed Data Processing Frameworks
- Big Data Ecosystems

Database Technologies
- Strong SQL expertise
- Experience with Relational Databases
- Experience with NoSQL Databases
- Data Warehousing Concepts

Cloud & Analytics Platforms (Preferred)
- Azure Data Platform
- AWS Data Services
- Google Cloud Data Services
- Databricks
- Hadoop Ecosystem

Machine Learning Data Support
- Feature Engineering for ML Models
- Data Preparation for Analytics and AI Use Cases
- Building Modeling and Feature Tables
- Supporting Data Science Workloads

Engineering Best Practices
- Software Engineering Principles
- Data Pipeline Optimization
- Data Quality Management
- Performance Tuning
- Scalability & Security
- Version Control (Git)
- CI/CD Concepts
- Documentation Standards

Required Soft Skills
- Strong stakeholder management and communication skills
- Experience working with Business Analysts, Data Scientists, and Product Teams
- Ability to translate business requirements into scalable data solutions
- Strong analytical and problem-solving skills
- Experience working in Agile/Scrum environments
- Ability to work independently and within cross-functional teams

Preferred Qualifications
- Experience working in enterprise-scale data platforms
- Experience supporting machine learning and advanced analytics initiatives
- Exposure to cloud-native data engineering solutions
- Experience designing modern data architectures and data lake environments