This analysis uses StarScout, an open-source tool from the ICSE '26 research paper "Six Million (Suspected) Fake Stars on GitHub."
Two complementary heuristics identify suspected fake stars:
| Heuristic | What It Catches | Signal Strength |
|---|---|---|
| Low-Activity | Throwaway accounts (single-day activity, ≤2 actions) | High confidence for intentional fraud |
| Clustered/Lockstep | Coordinated starring campaigns (many users starring same repos in tight timeframes) | Catches artificial amplification; some false positives from viral organic growth |
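The two heuristics in the table above can be illustrated with a minimal sketch. This is not StarScout's code: the `(account, repo, day)` event shape, the function names, and the `window_days` and `min_shared` thresholds are all assumptions made for illustration, and the pairwise lockstep check is a much-simplified stand-in for whatever clustering the tool actually performs. Only the "single-day activity, ≤2 actions" rule is taken from the table.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations


def low_activity_accounts(events, max_actions=2):
    """Accounts whose whole recorded history is one day and <= max_actions events.

    `events` is an iterable of (account, repo, day) tuples (hypothetical schema).
    """
    days_by_account = defaultdict(list)
    for account, _repo, day in events:
        days_by_account[account].append(day)
    return {
        account
        for account, days in days_by_account.items()
        if len(set(days)) == 1 and len(days) <= max_actions
    }


def lockstep_pairs(events, window_days=1, min_shared=2):
    """Account pairs that starred at least `min_shared` of the same repos
    within `window_days` of each other (simplified illustration only)."""
    by_repo = defaultdict(list)
    for account, repo, day in events:
        by_repo[repo].append((account, day))

    shared_repos = defaultdict(set)
    for repo, entries in by_repo.items():
        for (a, day_a), (b, day_b) in combinations(entries, 2):
            if a != b and abs((day_a - day_b).days) <= window_days:
                shared_repos[tuple(sorted((a, b)))].add(repo)
    return {pair for pair, repos in shared_repos.items() if len(repos) >= min_shared}


# Made-up events, not real detection data.
events = [
    ("bot_a", "x/one", date(2025, 1, 5)),
    ("bot_a", "x/two", date(2025, 1, 5)),
    ("bot_b", "x/one", date(2025, 1, 5)),
    ("bot_b", "x/two", date(2025, 1, 6)),
    ("dev", "x/one", date(2025, 1, 5)),
    ("dev", "y/lib", date(2025, 1, 20)),
    ("dev", "z/tool", date(2025, 2, 2)),
]
low = low_activity_accounts(events)
pairs = lockstep_pairs(events)
```

In this toy data, `dev` stars `x/one` on the same day as the two bots but shares no second repo with them, so it is not paired. A popular repo attracts many same-day stars from unrelated users in the same way, which is where the clustered heuristic's false positives come from.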
Intentional Fraud (Low-Activity Heuristic)
- Crypto/trading bots, MEV extractors
- Game hacks, cheats, exploits
- Darkweb/tor directories
- "Predictor" and gambling signal tools
- Pirated software, cracks
Artificial Amplification (Clustered Heuristic)
- AI/LLM projects riding hype cycles
- Frontend component libraries seeking visibility
- "Awesome" curated lists gaming discoverability
- Blockchain/Web3 projects
| Fake Star % | Repos | Interpretation |
|---|---|---|
| 90-100% | 63 | Almost certainly fraudulent |
| 70-89% | 77 | Highly suspicious |
| 50-69% | 134 | Likely boosted |
| <50% | 3,068 | Mixed signals; may include organic growth |
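The interpretation bands in the table above amount to a simple threshold lookup, sketched below. The function name is hypothetical, and treating each band's lower bound as inclusive (so that 89.5% falls in "Highly suspicious") is an assumption, since the table does not say how fractional values between bands are handled.

```python
def interpret_fake_share(fake_pct):
    """Map a repo's suspected-fake-star percentage to the table's interpretation."""
    if not 0 <= fake_pct <= 100:
        raise ValueError(f"percentage out of range: {fake_pct}")
    if fake_pct >= 90:
        return "Almost certainly fraudulent"
    if fake_pct >= 70:
        return "Highly suspicious"
    if fake_pct >= 50:
        return "Likely boosted"
    return "Mixed signals; may include organic growth"


labels = [interpret_fake_share(p) for p in (95, 82, 57, 47.4)]
```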
From repos with >50% fake stars:
| Category | Fake Stars | Description |
|---|---|---|
| Bots/Automation | 6,741 | Trading bots, scrapers, automation tools |
| AI/LLM | 5,041 | AI wrappers, prompt tools |
| Hacks/Cheats | 3,169 | Game exploits, cracks |
| Predictors | 2,092 | Gambling/trading "signal" tools |
| Darkweb | 1,986 | Tor/onion link directories |
Cross-referencing the most-starred repos in January 2025 with our detection data:
| Project | Jan Stars | Flagged Stars | Fake % | Concern Level |
|---|---|---|---|---|
| unionlabs/union | 19,942 | 65,309 | 47.4% | Very High |
| langflow-ai/langflow | 4,635 | 31,515 | 47.9% | Very High |
| raga-ai-hub/RagaAI-Catalyst | 2,892 | 5,522 | 49.5% | Very High |
| raga-ai-hub/AgentNeo | 3,567 | 1,398 | 42.3% | Very High |
| shardeum/shardeum | 4,137 | 6,077 | 42.2% | Very High |
| DigitalPlatDev/FreeDomain | 2,554 | 90,091 | 32.1% | High |
| anoma/anoma | 4,569 | 31,553 | 23.0% | High |
| linera-io/linera-protocol | 11,389 | 35,584 | 21.4% | High |
These projects show enormous clustered starring activity. While some may be organic viral growth (especially established AI tools), the scale is notable:
| Project | Jan Stars | Clustered Stars | Notes |
|---|---|---|---|
| deepseek-ai/* | 120k+ | 600k+ total | AI hype cycle; likely mix of organic + amplified |
| open-webui/open-webui | 9,719 | 124,526 | Popular LLM UI |
| huggingface/open-r1 | 12,739 | 121,231 | DeepSeek-R1 replication |
| browser-use/browser-use | 13,369 | 119,278 | Browser automation AI |
| ollama/ollama | 10,704 | 117,344 | Local LLM runner |
| inkonchain/* | 98k+ | 89k+ total | Blockchain project; coordinated campaign |
| Sector | Avg Fake % | Total Fake Stars | Profile |
|---|---|---|---|
| Bots/Automation | 50.2% | 9,873 | Highest fraud rate; trading bots, scrapers |
| Blockchain/Crypto | 35.8% | 8,727 | Second highest; investor-driven visibility |
| Hacks/Exploits | 26.2% | 6,663 | Game cheats, cracks |
| AI/ML | 11.8% | 177,153 | Lower rate but massive volume due to hype |
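The sector table reports two different quantities, a rate and a volume, and a sector can rank low on one and high on the other. The sketch below shows one way such a summary could be computed from per-repo counts. It assumes "Avg Fake %" is the unweighted mean of per-repo percentages, which the source does not confirm, and the input rows are made-up numbers rather than the analysis data.

```python
from collections import defaultdict


def summarize_sectors(repos):
    """Return {sector: (mean per-repo fake %, total fake stars)}.

    `repos` is an iterable of (sector, total_stars, fake_stars) tuples
    (hypothetical schema). Repos with no stars are skipped.
    """
    percentages = defaultdict(list)
    fake_totals = defaultdict(int)
    for sector, total_stars, fake_stars in repos:
        if total_stars <= 0:
            continue
        percentages[sector].append(100 * fake_stars / total_stars)
        fake_totals[sector] += fake_stars
    return {
        sector: (sum(values) / len(values), fake_totals[sector])
        for sector, values in percentages.items()
    }


# Illustrative rows only: two large AI repos, two small bot repos.
sample = [
    ("AI/ML", 100_000, 10_000),
    ("AI/ML", 50_000, 7_000),
    ("Bots/Automation", 400, 300),
    ("Bots/Automation", 1_000, 250),
]
summary = summarize_sectors(sample)
```

With these sample rows the bot sector has the higher average rate while the AI sector has far more fake stars in absolute terms, mirroring the shape of the table above.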
Blockchain/crypto has the second-highest average fraud rate of any sector (35.8%), behind only Bots/Automation (50.2%). These projects are disproportionately represented:
- unionlabs/union (47% fake)
- shardeum/shardeum (42% fake)
- anoma/anoma (23% fake)
- linera-io/linera-protocol (21% fake)
- inkonchain/* (massive clustered activity)
Star-buying appears endemic in crypto/Web3, likely driven by investor relations and token launch visibility.
AI projects have a lower fraud rate (11.8%) but the highest absolute volume (177k fake stars) due to the sector's explosive growth. The AI hype cycle creates pressure for visibility.
Worst AI offenders:
- langflow-ai/langflow (48% fake) - AI workflow builder
- raga-ai-hub/RagaAI-Catalyst (50% fake) - AI testing platform
- raga-ai-hub/AgentNeo (42% fake) - AI agent framework
- openai/openai-fm (57% fake) - Voice model demo
- sidetrip-ai/ici-core (82% fake) - AI assistant
Many smaller AI projects show 70-99% fake stars, suggesting a cottage industry of AI wrappers and "agents" using fake stars to appear legitimate.
Projects Popular Today Specifically Due to Fraud:
| Project | Sector | Fake % | Verdict |
|---|---|---|---|
| unionlabs/union | Blockchain | 47% | Nearly half fake |
| langflow-ai/langflow | AI | 48% | Nearly half fake; major AI tool |
| raga-ai-hub/* | AI | 42-50% | Both repos heavily inflated |
| shardeum/shardeum | Blockchain | 42% | Investor-driven fraud |
| DigitalPlatDev/FreeDomain | Utility | 32% | 90k flagged stars |
| openai/openai-fm | AI | 57% | Even OpenAI projects targeted |
The pattern is clear:
- Bots/automation has the highest average fraud rate (~50%), with blockchain/crypto second (~36%)
- AI/ML has a lower rate (~12%) but by far the largest volume; the hype cycle creates pressure to game visibility
- The two hottest sectors in tech, crypto and AI, are also among the most fraudulent on GitHub
These projects' current visibility is substantially inflated by purchased or manufactured engagement.
- Low-Activity Detection Results (Jan 2025)
- Clustered/Lockstep Detection Results (Jan 2025)
- Detection tool: StarScout
- Research paper: "Six Million (Suspected) Fake Stars on GitHub" (ICSE '26)