Description
We are looking for a skilled and motivated Data Engineer to join our team. In this role, you will design, build, and maintain robust data pipelines and scalable data infrastructure, enabling the efficient and secure flow of data across our organization. You will work closely with data scientists, analysts, and software engineers to ensure reliable data processing and availability.
A typical day might include the following:
- Develop, automate, and maintain data ingestion and transformation pipelines using AWS Glue, PySpark, and Apache Airflow, ensuring reliable and high-performance integration from diverse sources (S3, RDS, DynamoDB, external APIs).
- Manage and optimize data storage in AWS solutions (S3, Redshift, Aurora, DynamoDB), ensuring data is structured, secured (IAM, encryption), and governed across its lifecycle.
- Design and deploy robust ETL workflows, including the cleaning, transformation, and enrichment of data to make it usable for business teams and analytics tools (Qlik, SAP BO, Athena).
- Collaborate with Data and IT teams to provide, test, and validate datasets, SQL queries, and visualizations while maintaining the quality, performance, and security of all data processes.
- Ensure monitoring, versioning, and continuous improvement of data processes using CI/CD tools (GitLab, Jenkins) and monitoring solutions (Datadog); contribute to technical documentation and share best practices within the team.