30+ Open Source LLM Statistics & Trends (2026)

Explore 30+ open source LLM statistics for 2026, covering ecosystem growth, model usage, performance, inference costs, and leading developers.

Written by Sherlock Xu

Last updated on Jul. 2, 2026

Bar chart comparing OpenRouter token usage for DeepSeek, Qwen, and Meta Llama

Open source LLMs went from a research curiosity to the backbone of real production systems in under three years. They now power coding assistants, agents, and enterprise pipelines, and in some fields they’ve overtaken the proprietary models that defined the early days of the boom.

So how big is the open source LLM ecosystem today? And how fast is it actually growing?

I pulled together the most useful open source LLM statistics I could find, all from primary sources like the Stanford AI Index, Hugging Face, Meta, OpenRouter, and the Stack Overflow Developer Survey. Each section links to the original sources so you can cite them directly.

Top open source LLM statistics

This article prioritizes first-party reports, research papers, official documentation, and original datasets.

Statistic	Value	Scope and date	Source
Public models hosted on Hugging Face	More than 2 million	All public model repositories, not only LLMs; 2025	Hugging Face
Downloads captured by the top 200 models	49.6%	Hugging Face downloads; 2025	Hugging Face
Direct derivative models in the Qwen family	More than 113,000	Hugging Face repositories; March 2026	Hugging Face
Llama downloads	More than 1 billion	Cumulative downloads reported in March 2025	Meta
Open source share of token usage	Roughly one-third	OpenRouter, late 2025	OpenRouter
Tokens processed by DeepSeek models	14.37 trillion	OpenRouter, November 2024-November 2025	OpenRouter
Open-vs.-closed performance gap	3.3%	Top Arena models, March 2026	Stanford AI Index 2026
OpenRouter annualized token run rate	About 1.5 quadrillion	All models on OpenRouter, May 2026	Menlo Ventures
Historical inference cost decline	About 10x per year	Equivalent MMLU performance in a 2024 analysis	Andreessen Horowitz

What is an open source LLM?

An open source LLM generally refers to a language model that people can download, run, and modify. Compared with proprietary models that are only accessible through an API, open source LLMs give developers much greater control over deployment, customization, and infrastructure.

The term open source is often used loosely. Many models described as open source are actually released as open-weight models. Their weights are publicly available but the license may include restrictions that differ from a traditional open source software license. Because the industry commonly uses “open source LLM” to refer to both categories, this article follows that convention for simplicity.

How many open source LLMs are there?

There is no authoritative global count of open source LLMs. Hugging Face hosted more than 2 million public model repositories in 2025, but that total includes models for text, image, audio, robotics, and other tasks, as well as fine-tunes, adapters, quantizations, and derivatives.

The broader Hugging Face ecosystem is still useful for measuring the scale and direction of open model development:

Hugging Face statistic	Value	What it measures
Registered users	13 million	Platform community size
Public model repositories	More than 2 million	All model types and derivatives
Public datasets	More than 500,000	All dataset categories
Models with fewer than 200 downloads	About 50%	The long tail of model repositories
Share of downloads captured by the top 200 models	49.6%	Concentration among the most-used repositories
Fortune 500 companies with verified accounts	More than 30%	Organizational presence, not confirmed production adoption
Industry share of model development	37%	Down from roughly 70% before 2022
Downloads attributed to unaffiliated developers	39%	Up from 17% before 2022
Mean size of a downloaded model	20.8B parameters	Up from 827M in 2023
Median size of a downloaded model	406M parameters	Up from 326M in 2023
Mean engagement period after release	About 6 weeks	How long models typically sustain attention

Source: Hugging Face

The mean downloaded model size grew by about 25x between 2023 and 2025, while the median grew by only about 25%. That difference suggests a relatively small number of large models are pulling up the average while small models remain common.

Open source LLM adoption and usage

Open models have moved well past experimentation. By late 2025, they reached roughly one-third of all token volume on OpenRouter, a unified API platform that gives developers access to hundreds of AI models. The underlying study analyzed more than 100 trillion tokens over a year, covering November 2024 through November 2025.

Open model developer	Tokens processed on OpenRouter
DeepSeek	14.37 trillion
Qwen	5.59 trillion
Meta Llama	3.96 trillion
Mistral AI	2.92 trillion
OpenAI	1.65 trillion
MiniMax	1.26 trillion
Z.ai	1.18 trillion
TNGTech	1.13 trillion
Moonshot AI	0.92 trillion
Google	0.82 trillion

Source: OpenRouter

DeepSeek processed about 2.6 times as many tokens as Qwen, the second-largest open source family in the study. By late 2025, however, no individual model consistently accounted for more than 20%-25% of open source model tokens. Usage had spread across five to seven competitive models.

The same study found that:

Models developed in China rose from as little as 1.2% of weekly token volume in late 2024 to nearly 30% in some weeks.
Chinese open source models averaged 13.0% of weekly OpenRouter token volume over the full period.
Open source models developed outside China averaged 13.7%.
Proprietary models developed outside China retained an average share of about 70%.
Roleplay represented about 52% of open source model tokens, while programming was the second-largest category at roughly 15%-20%.

OpenRouter itself continued to grow after the study period. Menlo Ventures reported that the platform increased from 2.5 million to more than 8 million developers in roughly one year and reached an annualized run rate of about 1.5 quadrillion tokens in May 2026.

Developer adoption of AI tooling is now near-universal. 84% of respondents to the 2025 Stack Overflow Developer Survey use or plan to use AI tools, up from 76% the year before, and 51% of professional developers use them daily.

Sources: OpenRouter, Menlo Ventures, Stack Overflow

The most popular open source LLMs

Popularity depends on the metric. Downloads measure distribution, derivative repositories measure how often developers build on a model, and routed tokens measure hosted API usage.

Meta reported that the Llama family passed 1 billion cumulative downloads in March 2025. On Hugging Face, however, Qwen became the largest ecosystem for derivative work: the family had more than 113,000 direct derivative models by March 2026 and more than 200,000 repositories when every Qwen-tagged model was included.

These figures count downloads and repositories, not unique users or production deployments. Automated downloads, mirrors, quantizations, and repeated pulls can all affect the totals.

You can compare current model releases, licenses, context windows, and hardware requirements on the OpenLLMStack models page.

Sources: Meta, Hugging Face

Open vs. closed model performance

The leading open model trailed the leading closed model by 3.3% on the Arena leaderboard in March 2026, according to the Stanford AI Index. The gap had been only 0.5% in August 2024 before reopening during 2025.

Date	Top closed-vs-open performance gap
August 2024	0.5%
March 2026	3.3%

The correct conclusion is not that open models have permanently reached parity. The gap is small enough to remain competitive, but it changes as new model generations arrive. Six of the top ten Arena models were closed as of March 2026.

Source: Stanford AI Index 2026

Where open source models come from

Models developed in China made up 41% of Hugging Face downloads in 2025, the largest share attributed to a single country. China surpassed the United States in both monthly and overall downloads during the year.

The contributor mix changed at the same time. The share of model development attributed to industry fell from roughly 70% before 2022 to 37% in 2025, while unaffiliated developers grew from 17% to 39% of downloads.

OpenRouter recorded a similar geographic shift, although through a different metric. Chinese open source models rose from 1.2% of weekly token usage in late 2024 to nearly 30% in some weeks of 2025.

Sources: Hugging Face, OpenRouter

The falling cost of LLM inference

An Andreessen Horowitz analysis found that the price of inference at a fixed level of MMLU performance fell by roughly 10x per year. At an MMLU score of 42, the cheapest observed price dropped from $60 per million tokens in 2021 to $0.06 in 2024, a 1,000-fold decline.

That historical estimate has limitations: it relies on MMLU, averages input and output pricing, and covers selected models from OpenAI, Anthropic, and Meta. It is useful as a directional cost trend, not a law that guarantees another 10x decline every year.

Current API pricing shows how large the spread can be. As of July 2, 2026:

Model	Input, cache miss	Cached input	Output
DeepSeek-V4-Flash	$0.14	$0.0028	$0.28
DeepSeek-V4-Pro	$0.435	$0.003625	$0.87
GPT-5.5	$5.00	$0.50	$30.00

Prices are per 1 million tokens at standard rates. On those published prices, DeepSeek-V4-Flash input is 97.2% cheaper and output is 99.1% cheaper than GPT-5.5.

Serving efficiency comes from more than cheaper hardware. Quantization, batching, caching, sparse architectures, and optimized kernels all contribute. OpenLLMStack tracks major techniques on the inference optimizations page and the engines that implement them in the inference directory.

Sources: Andreessen Horowitz, DeepSeek API pricing, OpenAI API pricing

Top open source LLM API providers

You do not need your own GPUs to run an open model. A growing set of serverless inference providers host the leading open weights behind a single API, billed per token, so you can switch models without managing infrastructure.

Provider	Known for
Together AI	Broad catalog of 200+ open models behind one unified API
Fireworks AI	Fast, production-grade serving of popular open models
Groq	Custom LPU silicon for very low latency and high throughput
DeepInfra	Low-cost, pay-per-token hosting of open models
Baseten	Custom deployment and autoscaling for open weights
Modular	Shared endpoints and reserved dedicated GPU capacity, optimized by MAX across GPU vendors
Hugging Face	Hub-native inference endpoints next to the models

For the cheapest access to a single model, the first-party API from the model maker is often the lowest-priced option, such as the APIs from DeepSeek and Alibaba for their own models. Aggregators like OpenRouter route one request across many of these providers so you can compare price and speed.

DeepSeek R1: the breakout moment for open source LLMs

No single release did more for the profile of open source LLMs than DeepSeek R1. The model launched on January 20, 2025, and within days the DeepSeek app climbed to No. 1 on the U.S. Apple App Store, displacing ChatGPT and topping the charts in more than 50 countries.

The download surge was almost vertical. The app reached 2.6 million downloads across the App Store and Google Play by the Monday after launch, with more than 80% of all downloads coming in the previous seven days, and Appfigures data ranked the app No. 1 worldwide.

The market reaction was just as dramatic. On January 27, 2025, Nvidia lost about $589 billion in market value, the largest single-day loss for any company in history, after DeepSeek showed that a frontier-grade open model could reportedly be trained for around $5.6 million.

You can trace these milestones on the OpenLLMStack timeline.

Sources: TechCrunch, Bloomberg, Forbes

Frequently asked questions

How many open source LLMs are there?

No authoritative organization maintains a global count of open source LLMs. Hugging Face hosted more than 2 million public model repositories in 2025, but that figure covers all model types and includes derivatives, adapters, and quantizations. It is not a count of unique LLMs.

Which open source LLM is the most popular?

It depends on the metric.

Meta reported more than 1 billion cumulative Llama downloads in March 2025.
Qwen had the largest derivative ecosystem reported by Hugging Face in March 2026, with more than 113,000 direct derivative models.
DeepSeek led open source usage on OpenRouter with 14.37 trillion tokens processed from November 2024 through November 2025.

What is the best open source LLM in 2026?

There is no single best open source LLM in 2026. Some models are good at coding, others at reasoning, long-context processing, or multilingual tasks. The right model depends on your workload, hardware, latency requirements, and budget.

If you self-host an open source LLM, you can also adapt it to your domain by fine-tuning the model on proprietary data. This can significantly improve performance for specialized tasks, such as legal analysis, healthcare, finance, or customer support, helping the model outperform a general-purpose foundation model in your specific domain.

By mid-2026, several open families compete at or near the proprietary frontier, each with a different strength.

Model	Maker	Strongest at
GLM-5.2	Z.ai	State-of-the-art coding and agentic engineering, 1M token context
MiniMax-M3	MiniMax	Frontier coding and agentic work, native multimodal and computer use
DeepSeek-V4-Pro	DeepSeek	Reasoning and coding with adaptive effort modes and strong world knowledge
Kimi-K2.6	Moonshot AI	Long-horizon coding and agent swarm orchestration
Qwen3.5-397B-A17B	Alibaba	Multimodal reasoning across 200+ languages, very long context
MiMo-V2.5-Pro	Xiaomi	Token-efficient coding agents with long-context reasoning
Gemma 4	Google	Top-tier reasoning and coding

A common thread runs through the 2026 leaders: state-of-the-art coding, agentic tool use, and context windows that now stretch to 1 million tokens or more.

For the full list with parameters, licenses, context windows, and recommended GPUs, see the OpenLLMStack models page.

How much open source LLM usage is there?

Open source models reached roughly one-third of token volume on OpenRouter by late 2025. This is strong evidence of adoption on that platform, but self-hosted usage and traffic on other providers are not included.

Conclusion

Open source AI in 2025 and 2026 is defined by three trends: models that now rival closed systems on quality, inference costs that have collapsed, and a center of gravity shifting toward Chinese and independent developers. Open source models are no longer the budget option. For a growing share of teams, they’re the default.

If you’re building with open models, OpenLLMStack tracks current releases, inference engines, optimization techniques, and agent frameworks in one place.