Yann Lechelle is attempting something that has bedeviled French technology for decades: transforming world-class academic research into a commercial powerhouse.
His startup Probabl, which just closed an €18.5 million seed round, represents both France's AI ambitions and the persistent challenges of bridging the gap between university labs and market success.
The company's origin story reads like a case study in French technology transfer complications.
Scikit-learn is one of the world's most popular machine learning libraries. It's been downloaded over 2.5 billion times. Yet despite this global reach, few outside the data science community even knew about its French origins. It sat hidden within INRIA, France's national computer science research institute.
"People not in data science had no clue this was French, even though it is at top of the world in AI," Lechelle said.

The Backwards Startup
Probabl's path to commercialization defies Silicon Valley playbooks.
Most startups raise money to build something new. Probabl received funding to commercialize something that already existed and was freely available.
Scikit-learn has long been a pillar of the modern data science ecosystem, shaping how researchers and engineers build and apply machine learning in Python.
It would be hard to overstate scikit-learn's impact. The library democratized access to machine learning by making powerful algorithms available to anyone with a basic understanding of Python. Before its emergence, implementing models for classification, regression, or clustering required deep mathematical and programming expertise. Scikit-learn changed that by providing a consistent, intuitive, and well-documented API, enabling researchers, students, and engineers to experiment, prototype, and deploy models with just a few lines of code. It turned machine learning from a specialized research discipline into a practical engineering tool, laying the groundwork for the data science revolution that followed.
Today, scikit-learn is a foundation of the global AI ecosystem. It standardizes how machine learning is taught, evaluated, and applied, influencing countless other tools and frameworks. It is used by companies like Spotify, Airbnb, Netflix, and Google. With more than 140 million monthly downloads and a global contributor base, it remains not only a technical cornerstone of AI development but also a symbol of open collaboration and the enduring value of community-driven innovation.
Not bad for something that began in 2007 as a Google Summer of Code project by David Cournapeau, a French PhD student at INRIA.
His idea was to design an accessible, unified interface for machine learning algorithms built atop the scientific Python stack: NumPy, SciPy, and Matplotlib. The library's first public release, in 2010, was led by INRIA researchers Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Olivier Grisel, and Bertrand Thirion.
Fast forward to 2023, and the French government, through its France 2030 program, allocated €32 million to maximize scikit-learn's potential. But there was a catch: INRIA was told to "please break even."
"The researchers were like, 'Okay, what is break even?'" Lechelle recalled, highlighting the chasm between academic and commercial mindsets.
When INRIA approached Lechelle in January 2023, shortly after he had left cloud computing company Scaleway, the institute had a problem. The government wanted to invest in securing this open-source technology, but the research institute lacked the commercial expertise. Lechelle pitched a proper spin-off in which the government would be a shareholder but not a majority owner, with private capital driving competitive growth.
Making that happen was another story.
"I pitched 220 times," Lechelle said of his year-long fundraising marathon. "It's a sprint-marathon."
The challenge wasn't just finding investors willing to back an open-source project. It was creating an entirely new corporate structure that satisfied both public sector requirements and private investor expectations. For every euro of private investment, the company would unlock one euro of public funding. It's a model that required delicate negotiations and legal innovation.
The difficulties Probabl faced illuminate deeper issues in French technology transfer. France produces world-class AI researchers. For instance, Arthur Mensch of Mistral AI was a student of Probabl co-founder Varoquaux.
But the country has historically struggled to commercialize its innovations.
"The French are pretty bad at transferring IP to the private market," Lechelle said. "They've been slow. INRIA should be all over the place. It is, but not in value capture or transfer."

Building a Business Model After the Fact
Unlike typical startups that develop a business model before seeking funding, Probabl had to work backwards.
The company started with a massive asset: Lechelle describes Probabl as "the exclusive operator of the Coca-Cola brand for machine learning." But the recipe was already public. There was no intellectual property to protect or license.
The solution came through developing Skore, an enterprise platform that sits atop scikit-learn. The product addresses a crucial gap in the data science workflow: the chaotic journey from experimental models to production systems.
"Data scientists are given subsets of data sets. They come up with models and say, 'Hey, boss, I've got a model.' And the boss says, 'Okay, what do I do with it?'" Lechelle said. The disconnect between technical teams and business units often leads to wasted resources and abandoned projects.
Skore aims to be "the collaboration suite for machine learning R&D," helping companies bridge the pre-MLOps gap where many AI initiatives fail. While tech giants have mature data infrastructure through platforms like Databricks and Snowflake, much of the business world remains in experimental mode.

A Different Kind of AI Play
While the tech world obsesses over generative AI and large language models, Lechelle is betting on a different narrative. He sees a "winter coming" for GenAI, with disappointments mounting as return on investment fails to materialize for many companies.
"We are not tackling typical GenAI problems. We're tackling predictive AI problems," he said, positioning Probabl in the less glamorous but potentially more practical realm of traditional machine learning.
This approach aligns with the company's tagline: "Own Your Data Science." In an era of black-box AI systems, Probabl advocates for transparency and control. These values resonate particularly strongly in Europe, where AI regulation emphasizes explainability and accountability.

The Path Forward
The recent funding round, co-led by Serena and Capital Fund Management (CFM), validates this vision. Serena has developed a specific thesis around commercial open-source software, which is rare among French VCs. CFM, a quantitative hedge fund that has used scikit-learn for over a decade, sees strategic value in supporting the technology's evolution.
With seed funding secured, Probabl has finally achieved what Lechelle calls "the canonical form of a startup." The complex public-private structure has been streamlined, making the company attractive to traditional venture capital. The goal is to achieve product-market fit in 2026 and position the company for a Series A round. Probabl must now build a sales culture within an organization born from research labs and compete globally while maintaining its French and European identity.
Still, Probabl considers itself a mission-driven company. Lechelle sees open source as perhaps Europe's best hope to compete globally. And the very existence of Probabl underscores the willingness of the French government and academia to find new models of technology transfer that amplify research impact.
In that sense, Probabl is more than a business opportunity. For Lechelle, Probabl is part of a larger vision for European technological sovereignty to compete with American and Chinese tech giants.
"We need to seize the moment," Lechelle said. "We should be sparring partners, not junior partners."
