Ohalo is seeking an experienced Platform/Data Engineer to join our team. This role involves building and maintaining data pipelines to support our machine learning engineering activities and managing data related to plant phenotypes and genotypes.
Responsibilities
- Design and implement robust data architectures.
- Build and maintain scalable data pipelines using technologies such as GCP (or AWS), BigQuery, Python, and Spark.
- Develop and manage service-oriented and event-driven architectures, utilizing tools like Pub/Sub and Kafka.
- Collaborate with machine learning engineers on model development and deployment.
- Work closely with automation engineers to automate data collection from robotics systems.
- Ensure data integrity and security across all pipelines and processes.
- Optimize data workflows and storage solutions for performance and scalability.
- Continuously monitor, troubleshoot, and improve data systems and processes.
Requirements
- Bachelor's degree in a technical field (e.g., Computer Science, Engineering, Information Technology).
- Minimum of 5 years of experience in a similar role.
- Proficiency with cloud platforms, preferably GCP (or AWS).
- Strong experience with data processing and warehousing technologies such as Spark and BigQuery; experience with Nextflow is a plus.
- Proficiency in Python and frameworks like FastAPI.
- Experience with service-oriented and event-driven architectures.
- Knowledge of data streaming and messaging systems, including Pub/Sub and Kafka.
- Strong problem-solving skills and attention to detail.
- Ability to work independently and as part of a geographically distributed team.
- Excellent communication and collaboration skills.