About the Role

We’re looking for an Applied ML Intern (Speech) who wants to work on real-world systems under real production constraints, not just experiments in notebooks.

You’ll take ownership of improving speech models used in production across Southeast Asian languages, accents, and noisy environments. This means dealing with messy data, evolving requirements, and tight constraints on latency, cost, and reliability.

You’ll work closely with engineers and founders to ship models and improvements that directly impact users. Your work won’t stay experimental. It will go live, face real-world conditions, and continuously evolve.

If you enjoy debugging hard problems, thinking beyond metrics, and shipping meaningful ML improvements, this role will push you in the right ways.

What You Will Do

Experiment with and improve speech/ASR models across Southeast Asian (SEA) languages and accents
Design and run experiments under real-world constraints (latency, cost, memory)
Identify failure modes and edge cases in production speech data
Optimize inference performance and GPU utilisation
Develop strategies for multilingual and code-switching scenarios
Work with engineering to deploy models into production pipelines
Build evaluation datasets and tracking systems for model performance
Document experiments, trade-offs, and learnings clearly

What We’re Looking For

Strong fundamentals in Python and PyTorch
Understanding of speech/ASR basics
Experience with model training, fine-tuning, and evaluation
Familiarity with inference optimisation and GPU workflows
Ability to work with messy, multilingual, real-world data
Comfort making decisions with incomplete signals and evolving requirements

Founding Mindset

You think in terms of shipped improvements, not just metrics
You ask “how will this behave in production?” before trying something new
You take ownership of speech quality and system outcomes
You balance research depth with speed of execution
You proactively find model failures instead of waiting for them to surface

Bonus

Experience with multilingual or low-resource speech systems
Exposure to low-latency or on-device inference
Experience deploying ML models into production systems

What Success Looks Like

Within 4–6 weeks, you should be able to:

Own improvements for a specific speech use case or language
Ship at least one measurable gain in accuracy, robustness, or latency
Identify and document key failure modes and mitigation strategies
Contribute to evaluation, monitoring, and model diagnostics

What You’ll Get

Hands-on experience with applied ML under real production constraints
Direct collaboration with founders and experienced engineers
A portfolio of shipped improvements, not just experiments
Exposure to real-world speech challenges across languages and environments
A strong foundation for applied ML or speech-focused engineering roles

Who This Is Not For

People who only want to work on clean datasets and offline benchmarks
People who avoid messy data or complex debugging
People who prefer purely research environments disconnected from production
People looking for a low-intensity internship

Who Will Thrive Here

Builders who enjoy shipping ML systems to production
Engineers who think beyond models and understand full pipelines
Calm, methodical debuggers of unpredictable system behaviour
High-agency individuals who care about real-world impact

About the Company

We’re building the speech intelligence layer for Southeast Asia: turning real-world, accented, code-switched speech into structured, usable outputs for businesses.

Speech / Applied ML Engineer (Intern)