This AWS/Azure Data Engineer role involves designing, implementing, and managing scalable ETL/ELT pipelines using AWS services together with Databricks. The position focuses on data integration, transformation, and pipeline optimization, as well as collaboration with engineering teams to deliver data-driven solutions.
Responsibilities
- Design, implement, and manage scalable ETL/ELT pipelines using AWS services and Databricks.
- Ingest and process structured, semi-structured, and unstructured data from multiple sources into a data lake on AWS or Databricks.
- Develop advanced data processing workflows using PySpark, Databricks SQL, or Scala to enable analytics and reporting.
- Configure and optimize Databricks clusters, notebooks, and jobs for performance and cost efficiency.
- Design and implement solutions leveraging AWS-native services like S3, Glue, Redshift, EMR, Lambda, Kinesis, and Athena.
- Ensure data pipelines are secure, robust, and monitored using Amazon CloudWatch, Datadog, or equivalent tools.
- Maintain clear and concise documentation for data pipelines, workflows, and architecture.