The Challenge: AI That Understands India
From Use Case Capital to IP Producer
Ganesh Ramakrishnan’s Vision at Cypher 2025
On September 19, 2025, Prof. Ganesh Ramakrishnan of IIT Bombay delivered a talk at the KTPO Trade Centre in Bengaluru on the final day of Cypher 2025. His session marked a pivotal moment for the BharatGen initiative, articulating a shift in which India moves from being a global “use case capital” to a primary producer of intellectual property.
The Significance of Cypher
Cypher 2025, held from September 17 to September 19, 2025, is India’s largest AI and data science conference. This year’s theme, “Make AI in India,” brought together over 5,000 daily attendees, including researchers, policymakers, and entrepreneurs, to build solutions that shape both India’s and the world’s future. The event serves as a critical platform for integrating AI into enterprise stacks and fostering national AI sovereignty.
The BharatGen Model Stack
During his talk, Ramakrishnan showcased a range of indigenous foundational models, from 500 million to 7 billion parameters, specifically designed for India’s linguistic and cultural diversity.
Param One
A 2.9 billion parameter model pre-trained from scratch with 33% Indian data and 25% Hindi content.
Patram
India’s first vision-language document model (7 billion parameters) that understands Indian English documents via text or voice.
Shrutam
A speech-to-text model providing accurate transcriptions of Hindi spoken in various regional accents.
Sooktam
A family of text-to-speech models generating speech in five languages with diverse Indian accents and personal styles.
Overcoming Western Model Limitations
Prof. Ganesh Ramakrishnan highlighted that Western text-to-speech models often fail in the Indian context because they rely on duration prediction, which struggles with Indian prosody and accents. BharatGen uses the F5-TTS architecture (named in the paper “F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching”), which does not need a separate duration model, to bypass these limitations. Although this requires 3x the training time, it produces speech that accurately reflects Indian emotional nuances.
Multilingual Mixture-of-Experts (MoE)
To handle linguistic diversity frugally, BharatGen employs a Mixture-of-Experts approach. This system routes languages to specialized or shared experts based on phonetic similarities. Tamil is routed to a dedicated expert because of its distinct phonetic patterns. Telugu draws on experts shared among South Indian languages. Hindi and Marathi consistently share experts because of their linguistic overlap.
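The routing pattern described above can be illustrated with a toy sketch. This is not BharatGen's implementation: the expert names and gate scores below are invented for illustration, and a real Mixture-of-Experts gate is typically a learned layer that scores each token's representation rather than a fixed per-language table. The sketch only shows the mechanics of turning gate scores into a top-k choice of experts.

```python
import math

# Hypothetical expert names and gate scores, invented for illustration only;
# they are not BharatGen's actual experts or learned weights.
EXPERTS = ["tamil_expert", "south_shared", "indo_aryan_shared"]

GATE_LOGITS = {
    "tamil":   [4.0, 1.0, -2.0],
    "telugu":  [0.5, 3.5, -1.0],
    "hindi":   [-2.0, -1.0, 4.0],
    "marathi": [-2.0, -0.5, 3.8],
}


def softmax(logits):
    """Convert raw gate scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(language, top_k=1):
    """Return the top_k (expert, weight) pairs, with weights renormalized to sum to 1."""
    probs = softmax(GATE_LOGITS[language])
    ranked = sorted(zip(EXPERTS, probs), key=lambda pair: pair[1], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(weight for _, weight in chosen)
    return [(name, weight / norm) for name, weight in chosen]


if __name__ == "__main__":
    for lang in GATE_LOGITS:
        print(lang, route(lang, top_k=2))
```

With these made-up scores, Tamil's top expert is the dedicated one, Telugu's is the shared South Indian one, and Hindi and Marathi land on the same shared expert, which mirrors the behaviour described in the talk. Selecting only the top-k experts per input is what keeps the approach frugal: most experts stay inactive for any given input.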
Real-World Impact: Krishi Sathi
The session featured Krishi Sathi, a hyper-local AI application delivering personalized agricultural advice to farmers in their native language. Tailored to specific soil and climate conditions, it demonstrates how sovereign AI directly empowers citizens and improves governance.
Scaling to the Trillion Mark
Backed by a recent ₹900+ crore grant through the IndiaAI Mission, BharatGen is now scaling its efforts toward models with one trillion parameters. This “whole-of-government” approach ensures that India maintains technological sovereignty while building a scalable, indigenous AI ecosystem.



Source: PM Modi LinkedIn


