Senior Machine Learning Engineer
Engineering · Remote (India / Europe time zones) · Full-time
About the role
You'll turn research into a reliable product. As a Senior ML Engineer, you own the path from a trained model to a fast, cheap, observable inference service that powers every GabForge conversation.
What you'll do
- Build and optimise our inference stack (batching, KV-cache, quantisation, speculative decoding).
- Own the model-routing layer that sends easy requests to small models and escalates hard ones.
- Instrument everything (latency, cost per request, quality regressions) and act on what the metrics show.
- Work with research to productionise new checkpoints with zero-drama rollouts.
Why it matters
Every millisecond and every rupee of inference cost is multiplied across all our free users. Your optimisations are the reason the Free Covenant is affordable.
Requirements
What we're looking for
- 5+ years building production ML systems.
- Hands-on with an inference engine (vLLM, TGI, TensorRT-LLM, llama.cpp or similar).
- Strong Python; comfortable in systems-level performance work.
- You measure before you optimise and you ship behind metrics.
Nice to have
- GPU kernel / CUDA experience.
- Experience running inference on owned hardware rather than a managed API.