Machine Learning Researcher at Bland.com, working on foundational research and development across the voice stack: speech-to-text, large language models, neural audio codecs, and text-to-speech. The role involves building and scaling next-generation TTS systems, advancing speech-to-text modeling, pioneering neural audio codecs, and developing scalable training pipelines.
Requirements
- Experience with self-supervised learning, multimodal modeling, or generative modeling.
- Hands-on experience building or scaling TTS, STT, or neural audio codec systems.
- Experience training and serving large models on modern accelerators.
- Knowledge of inference optimization techniques, including quantization, kernel optimization, and memory efficiency.
- Strong intuition for audio quality, prosody, and conversational dynamics.
- Track record of designing controlled experiments and meaningful ablations.
- Experience with large-scale distributed training.
- Research publications or open-source contributions in speech or language AI.
- Background in real-time speech systems or telephony.
- PhD in ML, AI, or a related field, or equivalent research impact.
Benefits
- Healthcare
- Dental
- Vision
- Meaningful equity in a fast-growing company
- Every tool you need to succeed
- Beautiful office in Jackson Square, SF with rooftop views