Data Engineer

The Data Engineer will play a key role in the Digital Transformation.
You'll initially work alongside our implementation partner during the build of the new data lake, then take full ownership of run-and-maintain and project delivery as the platform evolves.
The role requires hands-on data architecture and delivery experience, alongside the ability to support business growth through technology and innovation.

Data Lake Build & Pipeline Development

Work alongside the implementation partner to design and build the data lake — ingestion, storage zones (raw / curated / consumption), and access patterns.
Work closely with internal stakeholders (Finance, Operations, IT, Commercial Business Leads) during the development phase.
Build and test data pipelines (batch and near-real-time) using Apache Spark on Azure, sourcing from the ERP, legacy databases, SharePoint, and other operational systems (e.g. CRM).
Translate business requirements into data models and pipeline designs; critically evaluate the partner’s designs.
Ensure pipelines are repeatable, monitored, and recoverable, with clear logging and lineage.

Data Architecture & Modelling

Maintain and evolve the data architecture as new domains come on board (Finance, Sales, Parts, Service, HR).
Apply data architecture patterns appropriate to each use case.
Propose and implement a plan to upload historical data to maintain report history.
Define and document data models, KPI definitions, and metric contracts so report consumers can rely on the numbers.
Structure and shape consumption-layer datasets so they are directly fit for Power BI dashboards and reports — including star/snowflake modelling, conformed dimensions, semantic-friendly naming, and the right granularity and aggregations to keep reports performant. Partner with report developers and business users to translate analytical requirements into reusable, well-governed datasets.
Own technical documentation — architecture diagrams, data lineage, runbooks — to ensure continuity beyond the implementation phase.

Data Quality, Governance & Audit

Embed data quality checks, cataloguing, and lineage tooling (Microsoft Purview or equivalent) so the lake remains trustworthy as it grows.
Identify, document, and resolve data discrepancies, gaps, and integrity issues across source and target systems.
Support data audit, compliance, and access-governance requirements across the platform.
Maintain auditable logs of pipeline runs, integration outcomes, and data quality results.

Platform Operations & BAU Ownership

Take operational ownership of the data lake post go-live: monitoring, incident response, performance tuning, cost control, and capacity planning.
Extend the platform with new sources and use cases as the business evolves.
Define and meet platform service levels, including availability, data freshness, and issue resolution times.
Continuously improve platform standards, documentation, and engineering practices based on operational feedback.

System & Cloud Integration

Design and deliver integrations between the data lake and source systems — REST APIs, webhooks, SFTP, and SaaS connectors.
Operationalise integrations across Microsoft Azure, Google Workspace, and third-party cloud platforms used by the business.
Maintain integration security, authentication, and credential management aligned with Group IT standards.
Diagnose and resolve integration failures with structured root-cause analysis.

Technical & Analytical Skills

Strong working knowledge of Microsoft Azure data services, including Azure Data Lake Storage, Azure Data Factory, Synapse / Fabric, and Azure SQL.
Solid Apache Spark experience (PySpark or Scala), including performance tuning, partitioning, and Delta Lake or equivalent table formats.
Working knowledge of all core data lake components — ingestion frameworks, orchestration, schema evolution, file formats, and governance layers.
Practical experience with Google Workspace as a data source and admin environment, plus Google Cloud familiarity (BigQuery, GCS) at a working level.
Strong system and cloud integration experience — REST APIs, webhooks, message queues, file-based integrations, and SaaS connectors.
Strong SQL and Python; clean, readable, version-controlled code.
Exposure to Microsoft Purview or similar data audit / catalogue / lineage tools is advantageous.
Exposure to Microsoft Dynamics 365 data extraction or other large ERP environments is preferred.
Demonstrated ability to design and structure consumption-layer datasets for Power BI. Comfortable working hand-in-hand with report developers and business users to shape the data the way the dashboards need it.

Qualifications & Experience

Bachelor’s degree in a relevant field (Computer Science, Information Systems, Data Engineering, Software Engineering, Mathematics), or equivalent practical experience.
5–7 years’ experience as a Data Engineer, with at least one end-to-end data lake or lakehouse build delivered.
Demonstrated experience designing and operating data pipelines on Azure, using Spark / Delta Lake at production scale.
Experience supporting or extracting data from Microsoft Dynamics 365 or another large ERP is preferred.
Exposure to Microsoft Purview or similar data governance / audit tools is advantageous.
Hands-on experience structuring data for Power BI on the consumption side of a data lake or warehouse. Exposure to other BI tooling is a plus, but Power BI experience is required.
