How to Produce AI Tool Insights from Raw Data

Technical Guide

A step-by-step walkthrough of the Vibe Data API, analysis techniques, and insight production

Every week, our blog posts surface insights like "GitHub Copilot surged 27%" or "v0 leads engagement at 7.7x" - but where does that data come from, and how do we turn raw numbers into actionable analysis? This guide walks through the entire process: calling the Vibe Data API, analyzing the results, identifying trends, and producing the kind of data-driven insights you see in our weekly reports. The same API endpoints are available to anyone with an API key.

What You'll Learn

  • The API: 22 endpoints covering 12 data sources - from Reddit mentions to NPM downloads
  • Querying data: How to call each endpoint and what the response looks like
  • Analysis techniques: Week-over-week comparisons, engagement ratios, and cross-source correlation
  • Insight patterns: The formulas behind every metric in our weekly blog posts
  • Pitfalls: Common mistakes that lead to misleading conclusions
22 API Endpoints • 12 Data Sources • 50+ Tools Tracked

Step 1: Understand the Data Sources

Vibe Data collects metrics from 12 platforms, each offering a different signal about developer tool adoption. No single source tells the full story - the power comes from cross-referencing them.

| Source | What It Measures | API Endpoint | Update Frequency |
| --- | --- | --- | --- |
| Reddit | Developer discussion volume & sentiment | /api/reddit | Daily |
| HackerNews | Technical community interest | /api/hackernews | Daily |
| Stack Overflow | Developer support demand | /api/stackoverflow | Daily |
| NPM | JavaScript SDK adoption | /api/npm | Daily |
| PyPI | Python package adoption | /api/pypi | Daily |
| GitHub | Open source popularity (stars, forks) | /api/github | Daily |
| Docker Hub | Container image pulls | /api/docker | Daily |
| VS Code | IDE extension installs | /api/vscode-extensions | Daily |
| JetBrains | IDE plugin downloads | /api/jetbrains-plugins | Daily |
| HuggingFace | Model downloads & likes | /api/huggingface/models | Daily |
| Homebrew | CLI tool installs (macOS/Linux) | /api/homebrew | Daily |
| Chrome Web Store | Browser extension users | /api/chrome-extensions | Daily |

Discussion platforms (Reddit, HackerNews, Stack Overflow) tell you what developers are talking about. Package managers (NPM, PyPI, Docker) tell you what they're actually using. The gap between the two is where the most interesting insights live.

Step 2: Query the API

Every endpoint returns JSON in the same format: { success: true, count: N, data: [...] }. All API requests require an API key for authentication.

Authentication

Include your API key with every request using either method:

# Authorization header (Bearer token)
curl -H "Authorization: Bearer YOUR_API_KEY" "https://vibe-data.com/api/npm?limit=10"

API keys are available through the Pricing page. Requests without a valid key return a 401 error. Rate limits are applied per key and shown in the X-RateLimit-Remaining response header.
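
Since limits are applied per key, it helps to track the remaining quota as you page through endpoints. A minimal sketch, assuming Node 18+ for the built-in fetch (the fetchWithQuota wrapper is our own helper, not part of the API):

```javascript
// Read the remaining-quota header from a response (null if absent).
function quotaRemaining(headers) {
  const raw = headers.get('X-RateLimit-Remaining');
  return raw == null ? null : Number(raw);
}

// Hypothetical wrapper: call an endpoint, fail loudly on 401,
// and log how much quota the key has left.
async function fetchWithQuota(url, apiKey) {
  const res = await fetch(url, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (res.status === 401) throw new Error('Invalid or missing API key (401)');
  console.log(`Rate limit remaining: ${quotaRemaining(res.headers)}`);
  return res.json();
}
```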

Basic Request

# Get the top 10 NPM packages by downloads
curl -H "Authorization: Bearer YOUR_API_KEY" "https://vibe-data.com/api/npm?limit=10"

# Response:
{
  "success": true,
  "count": 10,
  "data": [
    {
      "package_name": "openai",
      "downloads": 12369203,
      "weekly_downloads": 12369203,
      "monthly_downloads": 48245832,
      "version": "4.82.0",
      "scraped_at": "2026-02-11T..."
    },
    ...
  ]
}

Filtering by Tool

# Get Reddit mentions for a specific tool
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://vibe-data.com/api/reddit?tool_name=cursor&limit=50"

# Get Stack Overflow questions about Claude
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://vibe-data.com/api/stackoverflow?tool_name=claude&limit=20"

# Get HackerNews posts about GitHub Copilot
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://vibe-data.com/api/hackernews?tool_name=copilot&limit=30"

Time-Series Data

# Get 90 days of GitHub star history for a specific repo
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://vibe-data.com/api/github/history/anthropics/claude-code?days=90"

# Response:
{
  "success": true,
  "repo": "anthropics/claude-code",
  "count": 90,
  "data": [
    { "date": "2025-11-14", "stars": 18200, "forks": 1340, "open_issues": 89 },
    { "date": "2025-11-15", "stars": 18450, "forks": 1355, "open_issues": 91 },
    ...
  ]
}
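
Time-series responses like this lend themselves to simple velocity calculations. A sketch that computes star growth from the data array above (assumes one entry per day, as the endpoint returns):

```javascript
// Total and per-day star growth across a /api/github/history `data` array.
function starGrowth(points) {
  const first = points[0].stars;
  const last = points[points.length - 1].stars;
  const days = points.length - 1;
  return {
    totalGained: last - first,
    perDay: ((last - first) / days).toFixed(1),
  };
}
```

Applied to the two sample rows above, this reports 250 stars gained at 250.0 per day.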

Specialized Endpoints

# Get AI benchmark scores (HumanEval, SWE-Bench, etc.)
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://vibe-data.com/api/benchmarks?benchmark_name=HumanEval"

# Get tool pricing snapshots
curl -H "Authorization: Bearer YOUR_API_KEY" "https://vibe-data.com/api/pricing"

# Get HuggingFace text generation models
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://vibe-data.com/api/huggingface/models?pipeline_tag=text-generation&limit=20"

# Get global statistics summary
curl -H "Authorization: Bearer YOUR_API_KEY" "https://vibe-data.com/api/metrics/latest"

Full Endpoint Reference

  • /api/tools - All tracked tools with metadata
  • /api/tools/:slug - Single tool with latest metrics
  • /api/metrics/latest - Global statistics
  • /api/npm, /api/pypi, /api/docker, /api/homebrew - Package managers
  • /api/github, /api/github/history/:repo - GitHub repos & time-series
  • /api/reddit, /api/hackernews, /api/stackoverflow, /api/twitter - Discussion platforms
  • /api/vscode-extensions, /api/jetbrains-plugins, /api/chrome-extensions - IDE & browser
  • /api/huggingface/models, /api/huggingface/datasets - ML models
  • /api/benchmarks - AI benchmark scores
  • /api/pricing - Tool pricing data
  • /api/jobs/recent, /api/jobs/stats - Job market data
  • /api/chinese-ai - Chinese AI ecosystem
  • /api/github-marketplace - GitHub Marketplace apps

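Because every endpoint shares the { success, count, data } envelope, one small helper covers all of them. A sketch, assuming Node 18+ for built-in fetch (the getData helper is ours, not part of an official SDK):

```javascript
const BASE = 'https://vibe-data.com';

// Build a full endpoint URL from a path and optional query parameters.
function buildUrl(path, params = {}) {
  const qs = new URLSearchParams(params).toString();
  return `${BASE}${path}${qs ? `?${qs}` : ''}`;
}

// Fetch any endpoint and unwrap the standard { success, count, data } envelope.
async function getData(path, params, apiKey) {
  const res = await fetch(buildUrl(path, params), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  const json = await res.json();
  if (!json.success) throw new Error(`Request failed for ${path}`);
  return json.data;
}

// Usage: const pkgs = await getData('/api/npm', { limit: 10 }, 'YOUR_API_KEY');
```
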
Step 3: Analyze the Data

Raw API data is just numbers. The analysis techniques below are exactly what we use to produce the insights in our weekly blog posts.

Technique 1: Week-over-Week Comparison

This is the backbone of every weekly report. Pull data for two consecutive periods, then calculate the percentage change:

// Fetch recent Reddit mentions (one call covers both weekly windows)
const headers = { 'Authorization': 'Bearer YOUR_API_KEY' };
const thisWeek = await fetch('/api/reddit?tool_name=cursor&limit=500', { headers });
const thisWeekData = await thisWeek.json();

// Filter by date range
const thisWeekStart = new Date('2026-02-04');
const thisWeekEnd = new Date('2026-02-11');
const lastWeekStart = new Date('2026-01-28');
const lastWeekEnd = new Date('2026-02-04');

const thisWeekMentions = thisWeekData.data.filter(m => {
  const d = new Date(m.created_utc || m.scraped_at);
  return d >= thisWeekStart && d < thisWeekEnd;
});

const lastWeekMentions = thisWeekData.data.filter(m => {
  const d = new Date(m.created_utc || m.scraped_at);
  return d >= lastWeekStart && d < lastWeekEnd;
});

// Calculate week-over-week change (guard against an empty baseline week)
const thisCount = thisWeekMentions.length;
const lastCount = lastWeekMentions.length;
const wowChange = lastCount > 0
  ? ((thisCount - lastCount) / lastCount * 100).toFixed(1)
  : 'N/A';

console.log(`Cursor: ${thisCount} mentions (${wowChange}% WoW)`);

When we report "GitHub Copilot surged 26.7%", this is the exact calculation: (403 - 318) / 318 * 100 = 26.7%.

Technique 2: Engagement Ratios

Mention count alone doesn't capture intensity. The engagement ratio measures how much community interaction each mention generates:

// Calculate engagement ratio from Reddit data
function engagementRatio(mentions) {
  const totalUpvotes = mentions.reduce((sum, m) => sum + (m.score || 0), 0);
  const mentionCount = mentions.length;
  return (totalUpvotes / mentionCount).toFixed(1);
}

// Example: v0 with 7,506 upvotes from 978 mentions = 7.7x
// In other words, the average v0 discussion earns 7.7 upvotes -
// far more community interaction per mention than most tools

When our blog says "v0 leads engagement at 7.7x", that's total upvotes divided by mention count. Tools with high engagement ratios have smaller but more passionate communities - a leading indicator of growth.

Technique 3: Cross-Source Correlation

The most powerful insights come from combining data across sources. When Reddit mentions spike but NPM downloads stay flat, that's hype. When both move together, that's real adoption.

// Compare discussion volume vs actual SDK adoption
const headers = { 'Authorization': 'Bearer YOUR_API_KEY' };
const redditData = await fetch('/api/reddit?tool_name=claude&limit=100', { headers });
const npmData = await fetch('/api/npm?limit=20', { headers });

const reddit = await redditData.json();
const npm = await npmData.json();

// Find the Anthropic SDK in the NPM data (find() returns undefined
// if the package isn't in the top 20, so guard before reading fields)
const anthropicSDK = npm.data.find(p =>
  p.package_name === '@anthropic-ai/sdk'
);

console.log(`Claude Reddit mentions: ${reddit.count}`);
console.log(`Anthropic SDK weekly downloads: ${anthropicSDK?.weekly_downloads ?? 'n/a'}`);
// Cross-reference: high discussion + high downloads = confirmed adoption
// High discussion + low downloads = hype or early-stage interest

Technique 4: Market Share Calculation

To compare tools in the same category, calculate their share of total activity:

// NPM market share for AI SDKs
const headers = { 'Authorization': 'Bearer YOUR_API_KEY' };
const npm = await fetch('/api/npm?limit=10', { headers });
const data = await npm.json();

const totalDownloads = data.data.reduce(
  (sum, pkg) => sum + pkg.weekly_downloads, 0
);

data.data.forEach(pkg => {
  const share = (pkg.weekly_downloads / totalDownloads * 100).toFixed(1);
  console.log(`${pkg.package_name}: ${share}% of total SDK downloads`);
});

// Output:
// openai: 38.2% of total SDK downloads
// ai (Vercel): 24.2% of total SDK downloads
// @anthropic-ai/sdk: 18.4% of total SDK downloads
// ...

Step 4: Identify Trends

Individual data points are facts. Trends are insights. Here are the patterns we look for when producing each weekly report.

Pattern: Momentum Shifts

Leading Indicator

Look for tools that reverse direction after multiple weeks of decline (or growth). A tool that drops for 3 weeks then rebounds often signals a fundamental change - a product update, competitor stumble, or community event. Example: GitHub Copilot's 26.7% surge after weeks of flat activity.

Pattern: Discussion-Adoption Divergence

Cross-Source

When Reddit mentions grow but NPM downloads don't follow within 2-4 weeks, the hype isn't converting. Conversely, tools with growing downloads but flat discussion may be quietly winning through word-of-mouth. The Vercel AI SDK at 7.83M weekly downloads with moderate Reddit presence is a textbook example of silent adoption.
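
The divergence check reduces to a small classifier once you have week-over-week growth numbers for both signals. A sketch with illustrative thresholds (the 10% and 2% cutoffs are our assumptions, not part of the report methodology):

```javascript
// Classify a tool from discussion growth vs download growth (both % WoW).
function classifyDivergence(mentionGrowthPct, downloadGrowthPct) {
  const GROWING = 10; // illustrative: treat >=10% as real growth
  const FLAT = 2;     // illustrative: treat within +/-2% as flat
  if (mentionGrowthPct >= GROWING && Math.abs(downloadGrowthPct) <= FLAT) return 'hype';
  if (downloadGrowthPct >= GROWING && Math.abs(mentionGrowthPct) <= FLAT) return 'silent adoption';
  if (mentionGrowthPct >= GROWING && downloadGrowthPct >= GROWING) return 'confirmed adoption';
  return 'no clear signal';
}
```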

Pattern: Engagement Decay

Warning Signal

Rising mentions with declining engagement ratio means a tool is generating more noise but less passion. This often precedes a mention decline by 2-3 weeks. Track the engagement ratio weekly and flag any tool where mentions grow 10%+ but engagement drops 15%+.
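
Those two thresholds come straight from the pattern above, so the flag is a one-liner:

```javascript
// Flag engagement decay: mentions up 10%+ while the engagement ratio drops 15%+.
function isEngagementDecay(mentionChangePct, ratioChangePct) {
  return mentionChangePct >= 10 && ratioChangePct <= -15;
}
```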

Pattern: Stack Overflow as Lagging Indicator

Confirmation

Stack Overflow questions typically lag Reddit and NPM trends by 3-6 weeks. When a tool starts appearing in Stack Overflow questions, developers are past the exploration phase and into production integration. A spike in Stack Overflow activity for a tool confirms that the earlier discussion hype was real.

Step 5: Produce the Insight

Here's the complete workflow we use to go from API call to published insight, using a real example from the weekly report.

1 Pull the Numbers

// Query Reddit mentions for this week and last week
const tools = ['cursor', 'bolt', 'chatgpt', 'v0', 'claude',
               'openai', 'lovable', 'anthropic', 'copilot', 'replit'];

for (const tool of tools) {
  const res = await fetch(`/api/reddit?tool_name=${tool}&limit=500`, {
    headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
  });
  const data = await res.json();
  // Filter into weekly windows and count...
}

// Query NPM for SDK rankings
const npm = await fetch('/api/npm?limit=10', {
  headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
});

// Query Stack Overflow for support demand
const so = await fetch('/api/stackoverflow?limit=100', {
  headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
});

2 Calculate the Metrics

// For each tool, compute:
// 1. This week's mention count
// 2. Last week's mention count
// 3. Week-over-week % change
// 4. Total upvotes (engagement)
// 5. Engagement ratio (upvotes / mentions)

const results = tools.map(tool => ({
  name: tool,
  thisWeek: countThisWeek(tool),
  lastWeek: countLastWeek(tool),
  change: calculateWoW(tool),
  engagement: totalUpvotes(tool),
  ratio: engagementRatio(tool)
}));

// Sort by mention count for rankings
results.sort((a, b) => b.thisWeek - a.thisWeek);

3 Find the Story

Scan the calculated metrics for the most newsworthy patterns: the largest week-over-week swing, a reversal after a multi-week trend, an engagement-ratio outlier, or a divergence between discussion and downloads (see Step 4).
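
A first pass of that scan can be automated. A sketch that surfaces the biggest mover and the engagement leader from the results array computed in the previous step (assumes every change value is numeric, i.e. no 'N/A' baselines):

```javascript
// Surface story candidates: the largest absolute WoW swing and the top
// engagement ratio. `results` holds { name, change, ratio, ... } per tool.
function findStoryCandidates(results) {
  const byChange = [...results].sort(
    (a, b) => Math.abs(Number(b.change)) - Math.abs(Number(a.change))
  );
  const byRatio = [...results].sort((a, b) => Number(b.ratio) - Number(a.ratio));
  return { biggestMover: byChange[0], engagementLeader: byRatio[0] };
}
```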

4 Write the Lead

Always lead with the most compelling stat. The formula is: [Tool] [action verb] [specific number] [context].

Lead Formula Examples

  • "GitHub Copilot surged 26.7% week-over-week to 403 Reddit mentions"
  • "Total AI tool discussions rebounded 2.6% to 9,780 mentions"
  • "Anthropic's SDK crossed 6M weekly NPM downloads"

Each lead includes: the tool name, the direction, the exact number, and the time context. No hedging, no qualifiers - just data.

5 Validate Before Publishing

Before publishing any insight, run these checks:

  1. Sample size: Does the metric have at least 100 data points? Small samples produce misleading percentages.
  2. Date accuracy: Are you comparing equivalent time windows? A 6-day week vs a 7-day week will skew results.
  3. Outlier check: Is a single viral post responsible for the entire change? If one post with 500 upvotes drove 80% of the engagement, note it.
  4. Cross-reference: Does at least one other data source support the claim? A Reddit spike that doesn't show up in HackerNews or NPM may be an anomaly.
  5. Historical context: Is the change significant relative to the tool's normal volatility? A 10% swing for a tool that regularly swings 15% isn't a story.
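
Check 3 is easy to mechanize. A sketch of the viral-post test, flagging when a single post carries more than half the total engagement (the 50% default is our choice; adjust to taste):

```javascript
// Outlier check: does a single post account for more than `threshold`
// of the total upvotes? Default 0.5 (one post carrying half the signal).
function dominatedByOnePost(mentions, threshold = 0.5) {
  const total = mentions.reduce((s, m) => s + (m.score || 0), 0);
  if (total === 0) return false;
  const top = Math.max(...mentions.map((m) => m.score || 0));
  return top / total > threshold;
}
```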

Common Pitfalls

After more than a year of producing weekly insights, these are the mistakes we've learned to avoid.

1. Treating All Mentions as Equal

A Reddit post with 500 upvotes carries more signal than 50 posts with 1 upvote each. Always pair mention counts with engagement metrics. A tool with fewer mentions but higher engagement is often in a stronger position than one with high volume and low engagement.

2. Ignoring Seasonality

Developer discussions drop predictably during holidays, conference weeks, and major product launches (when everyone is testing instead of discussing). Week-over-week comparisons across these boundaries will show misleading declines. Compare to the same week in prior months when possible.

3. Confusing Correlation with Causation

When Cursor mentions rise the same week OpenAI launches GPT-5, it doesn't mean GPT-5 caused the Cursor spike. Look for direct evidence in the discussion content before attributing causation. The API gives you counts, not reasons.

4. Cherry-Picking Time Windows

Choosing a "bad" week as your baseline will make any tool look like it's surging. Always use consistent 7-day windows with a fixed start day. Our reports run Tuesday to Tuesday for consistency.
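
That fixed window can be computed rather than hand-entered. A sketch that returns the most recent completed Tuesday-to-Tuesday window in UTC (the helper name and the UTC convention are our assumptions, not the report pipeline's):

```javascript
// Most recent completed Tuesday-to-Tuesday window, midnight UTC.
// Returns { start, end } where the window covers [start, end).
function lastTuesdayWindow(now = new Date()) {
  const TUESDAY = 2; // Date#getUTCDay: 0 = Sunday
  const end = new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()
  ));
  const back = (end.getUTCDay() - TUESDAY + 7) % 7; // days since last Tuesday
  end.setUTCDate(end.getUTCDate() - back);
  const start = new Date(end);
  start.setUTCDate(start.getUTCDate() - 7);
  return { start, end };
}
```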

Putting It All Together

Here's a complete Node.js script that pulls data from the API and produces a basic weekly summary:

const BASE = 'https://vibe-data.com';
const API_KEY = 'YOUR_API_KEY'; // Replace with your key
const headers = { 'Authorization': `Bearer ${API_KEY}` };

async function weeklyInsights() {
  // 1. Get Reddit mentions
  const tools = ['cursor', 'bolt', 'chatgpt', 'v0', 'claude',
                 'openai', 'lovable', 'anthropic', 'copilot', 'replit'];

  const results = [];

  for (const tool of tools) {
    const res = await fetch(`${BASE}/api/reddit?tool_name=${tool}&limit=500`, { headers });
    const { data } = await res.json();

    const now = new Date();
    const weekAgo = new Date(now - 7 * 24 * 60 * 60 * 1000);
    const twoWeeksAgo = new Date(now - 14 * 24 * 60 * 60 * 1000);

    // Use the same created_utc/scraped_at fallback as the earlier examples
    const thisWeek = data.filter(m =>
      new Date(m.created_utc || m.scraped_at) >= weekAgo
    );
    const lastWeek = data.filter(m => {
      const d = new Date(m.created_utc || m.scraped_at);
      return d >= twoWeeksAgo && d < weekAgo;
    });

    const thisCount = thisWeek.length;
    const lastCount = lastWeek.length;
    const change = lastCount > 0
      ? ((thisCount - lastCount) / lastCount * 100).toFixed(1)
      : 'N/A';

    const upvotes = thisWeek.reduce((s, m) => s + (m.score || 0), 0);
    const ratio = thisCount > 0
      ? (upvotes / thisCount).toFixed(1) : 0;

    results.push({ tool, thisCount, lastCount, change, upvotes, ratio });
  }

  // 2. Sort and display
  results.sort((a, b) => b.thisCount - a.thisCount);
  const total = results.reduce((s, r) => s + r.thisCount, 0);
  const lastTotal = results.reduce((s, r) => s + r.lastCount, 0);
  const marketChange = ((total - lastTotal) / lastTotal * 100).toFixed(1);

  console.log(`Total mentions: ${total} (${marketChange}% WoW)\n`);
  console.log('Rank | Tool         | Mentions | Change | Engagement');
  console.log('-----|--------------|----------|--------|----------');

  results.forEach((r, i) => {
    console.log(
      `${i + 1}.   | ${r.tool.padEnd(12)} | ${String(r.thisCount).padStart(8)} | ${String(r.change + '%').padStart(6)} | ${r.ratio}x`
    );
  });

  // 3. Get NPM data for cross-reference
  const npm = await fetch(`${BASE}/api/npm?limit=5`, { headers });
  const npmData = await npm.json();
  console.log('\nTop NPM packages:');
  npmData.data.forEach(p => {
    console.log(`  ${p.package_name}: ${(p.weekly_downloads / 1e6).toFixed(2)}M/week`);
  });
}

weeklyInsights();

This script produces the same rankings table and market summary that appear in every weekly blog post. Extend it with Stack Overflow queries, GitHub star tracking, and engagement analysis to build a full reporting pipeline.

Ready-to-Use AI Prompt

Want to skip the coding and go straight to insights? Copy the prompt below into any AI tool (Claude, ChatGPT, Gemini, etc.), paste in API response data, and get a formatted analysis. This is the same methodology described above, packaged as a single instruction.

How to use it

  1. Call the Vibe Data API endpoints listed in the prompt (or use curl from your terminal)
  2. Copy the prompt below into your AI tool
  3. Paste the JSON responses from the API where indicated
  4. The AI will produce a formatted weekly insight report
You are a data analyst producing a weekly report on AI development tool adoption.
You will be given raw JSON data from the Vibe Data API (vibe-data.com). Analyze it
and produce a formatted insight report.

## Data Sources Available

Call these endpoints (replace YOUR_API_KEY with your key) and paste the JSON
responses below. Authenticate with: -H "Authorization: Bearer YOUR_API_KEY"

- https://vibe-data.com/api/reddit?limit=500 (developer discussions)
- https://vibe-data.com/api/npm?limit=10 (JavaScript SDK downloads)
- https://vibe-data.com/api/stackoverflow?limit=100 (support questions)
- https://vibe-data.com/api/hackernews?limit=100 (technical community)
- https://vibe-data.com/api/github?limit=20 (open source repos)
- https://vibe-data.com/api/pypi?limit=10 (Python package downloads)
- https://vibe-data.com/api/vscode-extensions?limit=20 (IDE adoption)

Example:
curl -H "Authorization: Bearer YOUR_API_KEY" "https://vibe-data.com/api/npm?limit=10"

## Analysis Instructions

For each tool in the data, calculate:

1. **Mention count**: Total mentions in the current 7-day window
2. **Week-over-week change**: ((this_week - last_week) / last_week) * 100
3. **Engagement ratio**: total_upvotes / mention_count (Reddit data)
4. **Market share**: tool_mentions / total_mentions * 100 (for discussion data)
   OR tool_downloads / total_downloads * 100 (for package manager data)

## Trend Patterns to Identify

Scan the data for these specific patterns:

- **Momentum shift**: A tool reverses direction after 2+ weeks of decline or growth
- **Discussion-adoption divergence**: Reddit mentions rising but NPM downloads flat
  (hype), or downloads rising but discussion flat (silent adoption)
- **Engagement decay**: Mentions up 10%+ but engagement ratio down 15%+
  (warning signal - usually precedes a mention decline in 2-3 weeks)
- **Cross-source confirmation**: When Reddit, HackerNews, AND NPM all move in the
  same direction for a tool, the signal is strong

## Validation Rules

Before including any claim in the report:
- Minimum 100 data points for statistical claims
- Compare equal time windows only (7 days vs 7 days)
- Flag if a single viral post accounts for >50% of engagement
- Note if a trend is only visible in one data source

## Output Format

Structure the report as follows:

**Opening paragraph**: Lead with the single most compelling stat.
Format: "[Tool] [action verb] [specific number] [context]"
Example: "GitHub Copilot surged 26.7% week-over-week to 403 Reddit mentions"

**Key findings box**: 5-6 bullet points of the most important metrics

**Top 10 rankings table** with columns:
Rank | Tool | This Week | Last Week | Change (%) | Engagement

**2-3 deep-dive sections** on the most interesting tools or trends,
each with specific numbers and context

**NPM/package download rankings table** with columns:
Rank | Package | Weekly Downloads | Monthly Downloads

**"What This Data Tells Us" section**: 3-4 numbered insights that explain
WHY the numbers matter, not just what they are

## Style Rules

- Use exact numbers, not approximations
- Always specify the date range for the data
- Attribute every claim to a specific data source
- No hedging language - state findings directly
- Explain what metrics mean for developers making tool choices

## API Data

Paste the JSON responses from the API endpoints here:

[PASTE REDDIT DATA HERE]

[PASTE NPM DATA HERE]

[PASTE STACKOVERFLOW DATA HERE]

[PASTE ANY ADDITIONAL DATA HERE]

The prompt works with as little as one data source (Reddit alone produces useful insights) or as many as all seven. More sources means stronger cross-referencing and higher-confidence conclusions.

Start Exploring the Data

The same data that powers our weekly insights is available through the API. View the live dashboard or start querying the endpoints directly.
