Apple is seeking experienced machine learning engineers to build a next-generation platform for training deep learning models at scale. The role involves designing and developing components, pushing the limits of existing solutions, and deploying novel techniques for large-scale ML workloads. It offers a respectful work environment, flexible responsibilities, and access to world-class experts and growth opportunities.
Requirements
- Experience building large-scale deep learning infrastructure or platforms for distributed model training
- Experience with large-scale AI training infra components, such as accelerators, network fabrics, CUDA, NCCL, RDMA
- Strong programming skills in Python or Go
- Understanding of data structures, software design principles, and algorithms
- Experience building large-scale distributed systems with tools such as Kubernetes, Kafka, Prometheus, etc.
- Experience with deep learning frameworks, such as PyTorch, or JAX
Benefits
- Comprehensive medical and dental coverage
- Retirement benefits
- Discounted products and free services
- Tuition Reimbursement