Fusemachines, a leading AI strategy, talent, and education services provider, is seeking a Lead Data Engineer to design, build, test, optimize, and maintain the infrastructure and code for data integration, storage, processing, pipelines, and analytics. The ideal candidate will have a strong background in Python, SQL, PySpark, Redshift, and large-scale, cloud-based data solutions on AWS, along with a passion for data quality, performance, and cost optimization.
Requirements
- 5+ years of real-world data engineering development experience in AWS and GCP
- Strong expertise in Python, SQL, PySpark, and AWS in an Agile environment
- Strong programming skills in one or more languages such as Python or Scala
- Good understanding of Data Modeling and Database Design Principles
- Strong SQL skills and experience working with complex data sets
- Skilled in Data Integration from different sources such as APIs, databases, and flat files
- Strong experience in implementing data pipelines and efficient ELT/ETL processes
- Strong experience with scalable and distributed Data Technologies such as Spark/PySpark, dbt, and Kafka
- Expert in Cloud Computing on AWS, with deep knowledge of a variety of AWS services
- Good understanding of Data Quality and Governance
- Good understanding of BI solutions including Looker and LookML
- Strong knowledge of and hands-on experience with DevOps principles, tools, and technologies
Benefits
- Generous Paid Time Off
- 401(k) Matching
- Retirement Plan
- Visa Sponsorship
- Four Day Work Week
- Generous Parental Leave
- Tuition Reimbursement
- Relocation Assistance