We're looking for a Researcher: Multimodal to conduct cutting-edge research at the intersection of machine learning, multimodal data, and generative modeling, advancing the state of AI across audio, text, vision, and other modalities.
Requirements
- Expertise in machine learning, multimodal learning, and generative modeling, with a strong research track record in top-tier conferences (e.g., CVPR, ICML, NeurIPS, ICCV)
- Proficiency in deep learning frameworks such as PyTorch or TensorFlow, with experience in handling diverse data modalities (e.g., audio, video, text)
- Strong understanding of state-of-the-art techniques for multimodal modeling, such as autoregressive and diffusion modeling, and deep understanding of architectural tradeoffs
Benefits
- Lunch, dinner, and snacks at the office
- Fully covered medical, dental, and vision insurance for employees
- 401(k)
- Relocation and immigration support
- Your own personal Yoshi