Introducing Numo: A New Way to Contribute to AI

Story

29 April 2026

AI is running into a data problem.

Text data helped build the first generation of AI. Improving the next one will require far more real-world data: how people speak, move, respond, and behave across different environments, languages, and conditions. That data is harder to find, harder to collect, and much harder to use commercially at scale.

Today, Poseidon is launching Numo, a consumer app built to help solve that problem, in early access. Poseidon’s first beta app experiment already showed what distributed data collection can look like at scale: 33,000+ hours collected in three weeks. Numo builds on those learnings and gives people a simple way to contribute training data for AI and get rewarded for it. Open the app, choose a category, complete a task, and earn.

At launch, Numo will collect voice data across Bengali, Hindi, Tamil, and Telugu, with other languages to be added incrementally soon.

These are languages spoken by over a billion people worldwide, but they remain underrepresented in many of today’s AI voice systems. Not because the speakers are few, but because the pipelines still lack enough high-quality, rights-cleared data.

While voice is the first focus for Numo, the underlying data bottleneck extends far beyond audio. The same challenge exists across video, image, movement, sensor, and other real-world modalities where high-quality, rights-cleared data remains scarce. Over time, Numo is designed to expand into new task types and formats, building a broader pipeline of human-generated data needed to train the next generation of AI systems.

The data AI needs is not sitting on the open internet

AI is moving into environments where approximation stops working. As AI becomes embedded across everyday life, voice will no doubt be one of the defining interfaces, if not the defining one. It is how people naturally communicate. We speak before we type, and in many contexts speaking is faster, easier, and more intuitive than using a screen.

Voice systems need to handle accents, interruptions, and background noise. Physical AI systems need to operate in messy, unpredictable settings. Autonomous systems need to make decisions in rain, fog, and low visibility. Models that work in controlled settings often break down in the real world.

The missing ingredient is not always more compute or better architectures. Increasingly, it is better data. What matters now is data collected from real people, in real environments, under real conditions. That kind of data cannot simply be scraped at the volume or quality needed to deploy AI systems at scale. It has to be collected deliberately.

Solving the supply problem task by task, market by market

The data that makes AI work in the real world is distributed by nature. It lives across billions of people, in different languages, geographies, and everyday contexts. That means the supply problem has to be solved task by task, modality by modality, and market by market. Numo is built for exactly that. Anyone with a phone should be able to contribute to building AI. The process should be simple enough to use at scale, but structured enough to produce data that is actually useful to AI builders.

From early validation to broader scale

Earlier this year, Poseidon launched its first contributor-facing app for its community. The goal was to test whether a distributed, contributor-driven model could generate meaningful volumes of high-quality, real-world data. The response exceeded expectations: 33,000+ hours of rights-cleared audio collected in three weeks across 17 languages. In several languages, the dataset exceeded years of prior public collection efforts.

That early product validated a simple idea: when people understand what they are contributing to, and when the experience is straightforward, they participate. Top contributors from Poseidon’s first app will receive reward multipliers in Numo, reflecting the quality of their earlier participation. Numo builds on that proof point and brings the model to a much broader audience.

Numo is designed for everyone to contribute

Numo makes AI data contribution straightforward: users open the app, join an active campaign, complete tasks, and receive rewards for eligible contributions. At launch, campaigns focus on voice data in four languages where demand is growing and supply remains limited:

Bengali
Hindi
Tamil
Telugu

Each campaign is built around the needs of AI teams looking for better training data in real-world conditions. That means contributors are not completing generic tasks. They are helping fill specific gaps that affect model performance. Over time, the app will expand across more modalities, more task types, and more markets.

Rights-cleared data from the start

Collecting rare data from around the world is an impressive coordination problem, but deploying it is the hardest part. A dataset is only useful if it can be used commercially, without uncertainty around provenance or rights. Legislation around the world is already moving towards requiring more transparency and visibility around the data AI models are trained on. In India, for example, the recent standalone personal data protection law marks a pivotal shift for global businesses processing the personal data of individuals in the country. That is why every contribution made through Numo is registered and licensed on Story from the start. Provenance is tracked at the point of creation. Usage rights are defined upfront.

Rather than treating compliance as something to sort out later, Poseidon has built it into the data collection process itself, ensuring the data is usable and commercially viable from the start. That matters for the teams training models, and it matters for the long-term value of the data being created.

Building the full pipeline

Most data businesses sit in the middle. They source data, broker data, or process data. Poseidon is building across the full pipeline: collection, processing, and rights clearing with provenance via Story.

Everything from the moment someone contributes data on their phone to the moment that data is ready for an AI team to use in production will run on Poseidon and Story. Numo is one of the consumer-facing entry points into that system.

A broader bet on who gets to build AI

AI will only work well in the real world if the real world is reflected in the data behind it. That requires more than better models and better participation. Numo is built on a simple idea: the people whose voices, languages, and lived environments will shape AI systems should be rewarded for their contributions. Today, that starts with voice.

Join Numo early access and start contributing today at numolabs.ai

Sources: https://www.americanbar.org/groups/business_law/resources/business-law-today/2025-may/india-data-protection-law/
