
March 29, 2026 · 9 min read

How last30days-skill Built a 14K-Star Research Engine in One File

14,449 stars. 10 platforms. One SKILL.md file. Here's the architecture that made last30days-skill the most popular Claude Code skill on GitHub.

Published by GitIntel Research

TLDR

14,449 stars · 1,153 forks · 10+ platforms · 30+ modules · 455+ tests

The Two-Layer Architecture

Most AI tools try to do everything in one layer. last30days-skill splits the problem cleanly in two:

Layer 1: SKILL.md (the AI layer) — A single markdown file with YAML frontmatter that defines the skill's name, version, allowed tools, and behavioral instructions. Claude reads this file and knows exactly what it can do, how to parse user queries, and how to synthesize results. No Python orchestration, no LangChain, no agent framework. Just a markdown file that Claude treats as its system prompt.

Layer 2: Python pipeline (the data layer) — All the expensive operations — API calls, scoring algorithms, deduplication, caching — live in Python scripts that Claude invokes via the Bash tool. A ThreadPoolExecutor fetches from 10+ sources in parallel. The pipeline returns structured JSON that Claude synthesizes into a grounded report.

This separation is the core insight. Claude handles intent parsing and natural language synthesis (what it's good at). Python handles parallel I/O, math, and data manipulation (what it's good at). Neither layer tries to do the other's job.

10 Patterns That Made It Viral

1. Query Type Detection (No LLM Required)

Every incoming query gets classified into one of 7 types using regex patterns — no LLM call needed. This classification drives everything downstream: which sources to hit, how to score results, and how to structure the output.

# query_type.py — regex-based classification
QUERY_TYPES = {
    "product":       r"(tool|app|software|platform|service|product)",
    "concept":       r"(what is|explain|define|meaning of|concept)",
    "opinion":       r"(think|opinion|take on|hot take|controversial)",
    "how_to":        r"(how to|tutorial|guide|step.by.step|setup)",
    "comparison":    r"(vs|versus|compared to|better than|or\b)",
    "breaking_news": r"(just happened|breaking|announced|launched today)",
    "prediction":    r"(will|future|predict|forecast|expect|2027)"
}

Each type maps to a source tier configuration. A “breaking news” query prioritizes X and HN (tier 1), while a “how to” query prioritizes YouTube and Reddit (tier 1). This means the skill never wastes API calls on irrelevant sources.
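Stitched together, classification is just a first-match loop over those patterns. A minimal sketch, assuming a `classify` helper and a default fallback that the actual module may not use (the table is repeated so the snippet runs standalone):

```python
import re

# Same pattern table as above, repeated so this snippet is self-contained
QUERY_TYPES = {
    "product":       r"(tool|app|software|platform|service|product)",
    "concept":       r"(what is|explain|define|meaning of|concept)",
    "opinion":       r"(think|opinion|take on|hot take|controversial)",
    "how_to":        r"(how to|tutorial|guide|step.by.step|setup)",
    "comparison":    r"(vs|versus|compared to|better than|or\b)",
    "breaking_news": r"(just happened|breaking|announced|launched today)",
    "prediction":    r"(will|future|predict|forecast|expect|2027)",
}

def classify(query: str, default: str = "concept") -> str:
    """Return the first query type whose pattern matches, else a default."""
    q = query.lower()
    for qtype, pattern in QUERY_TYPES.items():
        if re.search(pattern, q):
            return qtype
    return default
```

In a first-match scheme like this, dictionary order doubles as precedence: a query matching both the "product" and "comparison" patterns classifies as "product".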

2. Source Tiering

Not all sources are equal for every query. The tiering system assigns each source a priority based on query type, then allocates timeout budgets accordingly.

# Source tiers for "product" query type
PRODUCT_TIERS = {
    "tier1": ["reddit", "hackernews", "youtube"],   # Always fetch
    "tier2": ["x", "bluesky", "brave_search"],      # Fetch if time allows
    "tier3": ["tiktok", "instagram", "polymarket"]   # Fetch on deep mode only
}

# Source tiers for "breaking_news" query type
BREAKING_NEWS_TIERS = {
    "tier1": ["x", "hackernews", "reddit"],
    "tier2": ["bluesky", "youtube", "brave_search"],
    "tier3": ["polymarket", "tiktok", "instagram"]
}

Tier 1 sources always get fetched. Tier 2 sources run in parallel but get killed if the timeout budget is exhausted. Tier 3 sources only activate in “deep” mode. This graduated approach means quick queries return in 2 minutes while deep research gets up to 8 minutes of coverage.
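Under simplified assumptions (a stand-in `fetch_source`, no per-source timeouts), the tiered fetch with a global budget can be sketched with `concurrent.futures`:

```python
from concurrent.futures import ThreadPoolExecutor, wait

def fetch_source(name: str) -> dict:
    """Stand-in for a real source module call (illustrative only)."""
    return {"source": name, "items": []}

def fetch_tiered(tiers: dict, total_budget: float, deep: bool = False) -> list:
    """Fetch tier 1 and 2 in parallel; add tier 3 in deep mode; drop stragglers."""
    names = tiers["tier1"] + tiers["tier2"] + (tiers["tier3"] if deep else [])
    pool = ThreadPoolExecutor(max_workers=len(names))
    futures = [pool.submit(fetch_source, n) for n in names]
    # Whatever hasn't finished when the budget expires is abandoned
    done, _pending = wait(futures, timeout=total_budget)
    pool.shutdown(wait=False, cancel_futures=True)  # Python 3.9+
    return [f.result() for f in done]
```

Note that `cancel_futures=True` only cancels tasks that haven't started; truly killing an in-flight HTTP call requires per-request timeouts, which is presumably why the real profiles carry a per-source budget as well.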

3. Engagement-Weighted Scoring

Raw search results are noise. The scoring system transforms them into signal using platform-specific engagement formulas, then normalizes everything to a 0-100 scale.

# Platform-specific engagement scoring
REDDIT_WEIGHTS = {
    "score": 0.50,        # Upvotes minus downvotes
    "comments": 0.35,     # Discussion depth
    "upvote_ratio": 0.05, # Community agreement
    "top_comment": 0.10   # Quality of top reply
}

X_WEIGHTS = {
    "likes": 0.55,
    "reposts": 0.25,
    "replies": 0.15,
    "quotes": 0.05
}

# Final composite score
composite = (
    relevance * 0.45 +    # Text similarity to query
    recency * 0.25 +      # How fresh is this
    engagement * 0.30     # Platform engagement score
)

The weights reflect real information quality signals. Reddit's comment count matters more than raw score because discussion depth indicates substance. X's like count dominates because reposts and quotes are noisier signals. Polymarket gets its own formula entirely: 30% text relevance, 30% volume, 15% liquidity, 15% velocity, 10% competitiveness.
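The engagement formulas reduce to a weighted sum followed by normalization. A minimal sketch, where the weights come from the tables above but the min-max `normalize` helper is an assumption about how scores reach the 0-100 scale:

```python
def engagement_score(item: dict, weights: dict) -> float:
    """Weighted sum of raw engagement fields; missing fields count as 0."""
    return sum(item.get(field, 0) * w for field, w in weights.items())

def normalize(scores: list) -> list:
    """Min-max normalize a batch of scores onto 0-100 (assumed approach)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [50.0] * len(scores)  # Degenerate batch: call everything average
    return [100.0 * (s - lo) / (hi - lo) for s in scores]

def composite(relevance: float, recency: float, engagement: float) -> float:
    """Final blend using the 45/25/30 weights described above."""
    return relevance * 0.45 + recency * 0.25 + engagement * 0.30
```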

4. Cross-Source Deduplication

The same story appears on Reddit, HN, and X with different headlines. Naive dedup misses these. last30days-skill uses a hybrid similarity approach that catches cross-platform duplicates while preserving genuinely unique perspectives.

# Hybrid similarity: max of two approaches
def ngrams(text: str, n: int) -> set:
    """Character n-grams, e.g. 'cats' -> {'cat', 'ats'} for n=3."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a or b) else 0.0

def similarity(text_a: str, text_b: str) -> float:
    trigram_sim = jaccard(ngrams(text_a, 3), ngrams(text_b, 3))
    token_sim = jaccard(set(text_a.lower().split()), set(text_b.lower().split()))
    return max(trigram_sim, token_sim)

# Cross-source linking: bidirectional refs
if similarity(item_a.text, item_b.text) > THRESHOLD:
    item_a.cross_refs.append(item_b.source)
    item_b.cross_refs.append(item_a.source)

Taking the max of character trigram and token-level Jaccard is clever — trigrams catch paraphrased content while token matching catches reformatted headlines. Items that appear on multiple platforms get bidirectional cross-references, which Claude uses to boost confidence in its synthesis.

5. Multi-Emit Output Formats

One research pipeline, five output modes. The --emit flag controls how results are delivered:

--emit=compact   # Raw data for Claude to synthesize
--emit=json      # Structured JSON for programmatic use
--emit=md        # Formatted markdown report
--emit=context   # Reusable snippet other skills can import
--emit=path      # Just save to file, return the path

The context mode is particularly interesting — it produces a condensed research summary that other Claude Code skills can consume as input. This turns last30days-skill into a research primitive that composes with the broader skill ecosystem.
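A dispatch table is the natural shape for a multi-emit flag. A minimal sketch covering three of the five modes (renderer internals are illustrative, not the repo's actual output shapes):

```python
import json

def render_compact(r: dict) -> str:
    return " | ".join(str(i) for i in r["items"])

def render_json(r: dict) -> str:
    return json.dumps(r, indent=2)

def render_md(r: dict) -> str:
    return "\n".join([f"# {r['topic']}"] + [f"- {i}" for i in r["items"]])

RENDERERS = {"compact": render_compact, "json": render_json, "md": render_md}

def emit(results: dict, mode: str) -> str:
    """Render results in the requested mode; reject unknown modes loudly."""
    try:
        return RENDERERS[mode](results)
    except KeyError:
        raise ValueError(f"unknown emit mode: {mode}") from None
```

Adding a new mode means one renderer function and one dict entry, which is plausibly how `context` and `path` slot in alongside the first three.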

6. Tiered Timeout Profiles

Different queries need different time budgets. The timeout system has three profiles with per-source budget allocation and a global watchdog that kills everything if exceeded.

TIMEOUT_PROFILES = {
    "quick":   { "total": 120,  "per_source": 15  },  # 2 min
    "default": { "total": 300,  "per_source": 30  },  # 5 min
    "deep":    { "total": 480,  "per_source": 60  },  # 8 min
}

# Global watchdog — no source can outlive the total budget
def watchdog(profile: str):
    # Intended to run in a background thread alongside the fetch pool
    budget = TIMEOUT_PROFILES[profile]["total"]
    time.sleep(budget)
    kill_all_pending_sources()

This is defensive engineering at its best. API calls are inherently unreliable — one slow source shouldn't block the entire research pipeline. The watchdog ensures the user always gets results within the promised time window.

7. Config Hierarchy with Graceful Degradation

API keys and preferences cascade through three layers, with the skill working even when some keys are missing.

import os
from dotenv import dotenv_values  # python-dotenv

# Priority: env vars > project config > global config
CONFIG_SOURCES = [
    os.environ,                                                   # Highest priority
    dotenv_values(".claude/last30days.env"),                      # Per-project
    dotenv_values(os.path.expanduser("~/.config/last30days/.env")),  # Global fallback
]

# Missing API key? Skip that source, don't crash
for source in enabled_sources:
    if not has_api_key(source):
        log.warning(f"Skipping {source}: no API key")
        continue

This matters for adoption. A user with only a Reddit API key can start using the skill immediately. They don't need to configure 10 API keys before getting value. Each additional key unlocks more sources, creating a natural upgrade path.

8. Comparison Mode (3 Parallel Research Passes)

When a user asks “React vs Svelte”, the skill doesn't just search for that phrase. It runs three parallel research passes and synthesizes the results.

# Comparison mode: 3 parallel passes
import asyncio

async def compare(topic_a: str, topic_b: str):
    results = await asyncio.gather(
        research(topic_a),                    # Pass 1: Topic A alone
        research(topic_b),                    # Pass 2: Topic B alone
        research(f"{topic_a} vs {topic_b}"),  # Pass 3: Direct comparison
    )
    return synthesize_comparison(results)

This produces dramatically better comparisons than a single search. Pass 1 and 2 surface each topic's strengths and weaknesses independently. Pass 3 catches head-to-head discussions that already exist. Claude merges all three into a balanced analysis.

9. File-Based Caching (24h TTL)

Every research run gets cached locally with a 24-hour TTL. The cache key is a SHA256 hash of the topic, date range, and source configuration.

# Cache key: SHA256(topic + dates + sources)
cache_dir = Path.home() / ".cache" / "last30days"
key_material = f"{topic}|{start}|{end}|{sources}".encode()
cache_key = hashlib.sha256(key_material).hexdigest()
cache_file = cache_dir / f"{cache_key}.json"

# 24h TTL for research, 7-day TTL for model selection
if cache_file.exists():
    age = time.time() - cache_file.stat().st_mtime
    if age < 86400:  # 24 hours
        return json.loads(cache_file.read_text())

No database. No Redis. Just JSON files in ~/.cache/last30days/. Model selection caching (7-day TTL) is stored separately because model availability changes less frequently than content. Simple, debuggable, zero dependencies.

10. Auto-Save Research Library

Every research run automatically saves a complete markdown briefing to ~/Documents/Last30Days/. Over time, this builds a personal research library that's searchable, version-controlled, and independent of any platform.

This is a retention mechanic disguised as a feature. Users accumulate research artifacts that increase switching costs. The more you use the skill, the more valuable your local library becomes — and the less likely you are to switch to a competitor.

The SKILL.md-as-Prompt Pattern

The most counterintuitive decision in the entire architecture is using a single markdown file as the AI's instruction set. No LangChain. No agent framework. No prompt management system. Just a file with YAML frontmatter that Claude reads directly.

# SKILL.md frontmatter
---
name: last30days
version: 2.4.0
description: Research any topic across 10+ platforms
allowed-tools:
  - Bash
  - Read
  - Write
  - Glob
metadata:
  author: mvanhorn
  license: MIT
---

## Instructions
You are a research analyst. When the user asks about
a topic, run the Python pipeline and synthesize results...

This works because Claude Code already has a robust tool-calling interface. The SKILL.md doesn't need to implement tool routing — it just describes what tools to use and when. The Claude runtime handles execution. The result is a skill that's readable by humans, editable without redeployment, and portable across Claude Code, Gemini CLI, and Codex variants.

Repository Structure

last30days-skill/
  SKILL.md                    # Full skill prompt (YAML frontmatter)
  scripts/
    last30days.py             # Main orchestrator (ThreadPoolExecutor)
    lib/
      query_type.py           # 7-type regex classifier
      score.py                # Platform-specific engagement scoring
      dedupe.py               # Hybrid trigram-token Jaccard
      cache.py                # SHA256 key, 24h TTL, JSON files
      env.py                  # Config hierarchy + graceful degradation
      render.py               # Multi-emit output + freshness warnings
      normalize.py            # Cross-platform schema normalization
      schema.py               # Dataclass definitions
      openai_reddit.py        # Reddit source module
      xai_x.py                # X/Twitter source module
      hackernews.py           # Hacker News source module
      youtube_yt.py           # YouTube source module
      polymarket.py           # Polymarket source module
      bluesky.py              # Bluesky source module
      tiktok.py               # TikTok source module
      ... (30+ modules total)
  .claude-plugin/
    plugin.json               # Claude Code plugin manifest
    marketplace.json          # Plugin marketplace metadata
  variants/
    open/SKILL.md             # Open variant (watchlist, SQLite)
  agents/
    openai.yaml               # Codex CLI compatibility

30+ modules, 455+ tests, and the entire AI behavior defined in a single markdown file. The separation between SKILL.md (what to do) and the Python pipeline (how to do it) is maintained rigorously throughout.

What Builders Can Learn

Separate AI reasoning from data fetching. Let the LLM do what it's good at (synthesis, judgment, natural language) and let deterministic code handle I/O, math, and data manipulation. Don't use an LLM to route API calls when regex works.

Design for graceful degradation. A skill that requires 10 API keys before it works will never get adopted. Start with zero config and let each additional key unlock more capability.

Cache aggressively, expire honestly. File-based caching with honest TTLs (24h for content, 7d for model selection) eliminates redundant API calls without serving stale data. The render module even warns users when results aren't actually from the last 30 days.

Build retention mechanics into the product. Auto-saving research to a local library isn't just convenient — it creates switching costs that compound with usage. The more research you run, the more valuable your local archive becomes.

Analyze your own repo's architecture

GitIntel scans git history to surface AI-generated code patterns, contributor workflows, and architecture decisions.

# Install
curl -fsSL https://gitintel.com/install.sh | sh

# Scan any repo
cd your-repo
gitintel scan

Open source (MIT) · Local-first · No data leaves your machine

Analysis based on mvanhorn/last30days-skill repository as of March 29, 2026. Stats verified at time of publication.