ScatterAI

About ScatterAI

ScatterAI is a daily intelligence service for AI practitioners. We publish two products: Brief, covering AI research papers, and Signal, covering AI industry news.

The problem we solve: there is more signal than any individual can track. arXiv publishes 200–400 AI papers per day. The AI industry generates dozens of significant moves per week. The result is that most practitioners are flying partially blind — catching fragments from Twitter, Slack links, and the occasional newsletter that doesn't explain the mechanism behind the news.

ScatterAI's editorial philosophy: know what's real, know what matters. We don't celebrate every model release. We don't speculate without evidence. We explain the mechanism, state the number, name the caveat, and give you the practitioner-level implication.

How Brief Works

Brief is a daily curation of 3–5 AI research papers, written for practitioners — founders, PMs, engineers, and analysts who don't have time to read arXiv but need to know what's changing and why it matters.

Paper Selection

Each morning, our pipeline collects 200–400 papers from arXiv (categories cs.AI, cs.CL, cs.LG, cs.CV, cs.MA, cs.IR) plus community signals from HuggingFace Daily Papers. Papers are scored on 8 signals, for a maximum of 21 points:

Signal              Points  Criteria
S1 Institution      0–3     Tier 1 labs (Google/OpenAI/Anthropic/Meta/DeepMind/MSR): +3; top universities: +2; other research labs: +1
S2 HF Pick          0–4     Appears in HuggingFace Daily Papers feed: +4
S3 HF Upvotes       0–3     >100 upvotes: +3 / 30–100: +2 / 10–30: +1
S4 Venue            0–3     ICLR/NeurIPS/ICML/CVPR/ACL/EMNLP/ICCV accepted: +3
S5 Code             0–2     GitHub repo linked in paper: +2; mentioned but not linked: +1
S6 Keywords         0–2     Practitioner keywords in title/abstract (inference, agent, benchmark, efficiency, etc.): +1 each, capped at 2
S7 Citations        0–2     >50 citations: +2 / 10–50: +1
S8 GitHub Trending  0–2     Associated repo in GitHub trending: +2

Papers scoring 12+ become Featured entries (3–5 per issue). Papers scoring 6–11 become Also Worth Noting (8–12 per issue). If fewer than 3 papers reach 12, the top 3 by score are Featured regardless.
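
The scoring table and thresholds can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production pipeline: the Paper fields and function names are hypothetical, an upvote count of exactly 30 (which the table lists in two bands) is assumed to earn +2, and the per-issue caps of 5 Featured and 12 Also Worth Noting entries are assumed to be applied by simple truncation.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # All field names are hypothetical; they mirror signals S1-S8.
    title: str
    institution_tier: int = 0   # 1 = Tier 1 lab, 2 = top university, 3 = other lab, 0 = none
    hf_pick: bool = False
    hf_upvotes: int = 0
    venue_accepted: bool = False
    code: str = "none"          # "linked", "mentioned", or "none"
    keyword_hits: int = 0
    citations: int = 0
    github_trending: bool = False


def score(p: Paper) -> int:
    """Sum the eight signals; the maximum is 21."""
    s = {1: 3, 2: 2, 3: 1}.get(p.institution_tier, 0)        # S1
    s += 4 if p.hf_pick else 0                                # S2
    # S3: assumption, exactly 30 upvotes falls in the +2 band
    if p.hf_upvotes > 100:
        s += 3
    elif p.hf_upvotes >= 30:
        s += 2
    elif p.hf_upvotes >= 10:
        s += 1
    s += 3 if p.venue_accepted else 0                         # S4
    s += {"linked": 2, "mentioned": 1}.get(p.code, 0)         # S5
    s += min(p.keyword_hits, 2)                               # S6
    s += 2 if p.citations > 50 else 1 if p.citations >= 10 else 0  # S7
    s += 2 if p.github_trending else 0                        # S8
    return s


def assign_tiers(papers):
    """Return (featured, also_worth_noting) lists, highest score first."""
    ranked = sorted(papers, key=score, reverse=True)
    featured = [p for p in ranked if score(p) >= 12][:5]
    if len(featured) < 3:
        # Fallback from the text: top 3 by score are Featured regardless
        featured = ranked[:3]
    # featured is always a prefix of ranked, so the rest starts right after it
    rest = ranked[len(featured):]
    also = [p for p in rest if 6 <= score(p) <= 11][:12]
    return featured, also
```

A paper from a Tier 1 lab that is an HF pick with more than 100 upvotes and a linked repo scores 3 + 4 + 3 + 2 = 12, which is exactly the Featured threshold.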

Issue Structure

Each Brief issue follows a fixed structure:

  • Today's Overview — 3–5 bullets, one per featured or notable paper, stating the core finding and practical implication
  • Featured entries (01–05) — Full write-ups: setup and surprise, mechanism, practical implication, three key takeaways, source link
  • Also Worth Noting (06–14) — One-liner entries: bold insight statement, topic tag, SO WHAT consequence
  • Today's Observation — 3-paragraph synthesis connecting 2+ featured papers at a structural level

Writing Standards

Featured entries are written by Claude Sonnet, and Also Worth Noting entries by Gemini Flash. Both run against a strict style guide that enforces a practitioner-first voice, specific numbers over vague claims, and zero academic hedging. Featured entries get human editorial review before publication.

Forbidden patterns include "This paper proposes...", "The authors suggest...", "breakthrough", "revolutionary", "improved performance" without numbers, and "Furthermore" / "Additionally" filler transitions.
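
A style check of this kind could be approximated with a small linter. The sketch below is hypothetical: the function name and the rule that "improved performance" is acceptable only when the same sentence contains a digit are assumptions made for illustration, not a description of the actual style-guide enforcement.

```python
import re

# Phrases taken from the forbidden-pattern list above
FORBIDDEN_PHRASES = [
    "This paper proposes",
    "The authors suggest",
    "breakthrough",
    "revolutionary",
    "Furthermore",
    "Additionally",
]


def find_violations(text: str) -> list[str]:
    """Return the forbidden patterns found in text, in list order."""
    hits = [
        phrase
        for phrase in FORBIDDEN_PHRASES
        if re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE)
    ]
    # Assumption: "improved performance" is flagged only when its sentence
    # contains no digit at all (a rough proxy for "without numbers").
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if re.search(r"improved performance", sentence, re.IGNORECASE) and not re.search(r"\d", sentence):
            hits.append("improved performance (no number)")
            break
    return hits
```

For example, find_violations("The method improved performance by 12% on MMLU.") returns an empty list, while the same sentence without the number is flagged.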

Timing

Collection runs at 06:00 UTC, targeting papers published 2–3 days prior — enough time for HuggingFace upvotes and citation signals to accumulate. Issues publish by 08:00 UTC.
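
The 2–3 day lookback can be expressed as a date window relative to the collection run. This is a minimal sketch; the function name is hypothetical, and it assumes the window is counted in whole UTC calendar days.

```python
from datetime import date, datetime, timedelta, timezone


def target_publication_dates(run_time: datetime) -> list[date]:
    """Publication dates targeted by a run: 3 days prior and 2 days prior (UTC)."""
    run_day = run_time.astimezone(timezone.utc).date()
    return [run_day - timedelta(days=3), run_day - timedelta(days=2)]
```

Under these assumptions, a run at 06:00 UTC on 2025-01-10 would target papers published on 2025-01-07 and 2025-01-08.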

How Signal Works

Signal is a daily briefing on AI industry news. Each issue covers 3 main stories with competitive and structural analysis, plus a News Roundup of 5–8 shorter items.

Source Collection

Signal monitors three tiers of sources, collected every 2 hours:

Tier 1 — Core AI News

  • TechCrunch AI
  • The Verge (AI filter)
  • Ars Technica (AI filter)
  • VentureBeat AI
  • MIT Technology Review

Tier 2 — Company Blogs

  • OpenAI News
  • Anthropic Blog
  • Google DeepMind Blog
  • Meta AI Blog
  • Mistral News
  • HuggingFace Blog

Tier 3 — Community Signal

  • Hacker News (front page, AI-related filter)
  • X/Twitter (10–15 key accounts, including @sama, @demishassabis, @ylecun, @karpathy, @OpenAI, @AnthropicAI, @GoogleDeepMind)

Story Selection

The pipeline clusters collected items by topic using semantic similarity. Clusters with 3+ sources from different tiers receive priority. The editorial engine selects 3 main stories and 5–8 roundup items based on: number of sources, source tier weight, topic novelty (not covered in the past 7 days), and practitioner relevance.
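
The selection criteria can be sketched as a ranking over already-clustered items. Everything here is an assumption for illustration: the Cluster fields, the tier weights, the way source count, tier weight, and relevance are combined, the reading of "3+ sources from different tiers" as at least three sources spanning at least two tiers, and the treatment of novelty as a hard filter. The semantic-similarity clustering step itself is not shown.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Cluster:
    # Hypothetical structure for one topic cluster
    topic: str
    source_tiers: List[int]          # one entry per source item, e.g. [1, 1, 2]
    last_covered: Optional[date]     # None if the topic was never covered
    relevance: float                 # hypothetical 0..1 practitioner-relevance score


# Hypothetical weights: Tier 1 sources count most
TIER_WEIGHT = {1: 3.0, 2: 2.0, 3: 1.0}


def has_priority(c: Cluster) -> bool:
    # Assumed reading: 3+ sources that span at least two tiers
    return len(c.source_tiers) >= 3 and len(set(c.source_tiers)) >= 2


def is_novel(c: Cluster, today: date) -> bool:
    # Assumed reading: novel if not covered within the last 7 days, inclusive
    return c.last_covered is None or (today - c.last_covered) > timedelta(days=7)


def story_score(c: Cluster) -> float:
    # Tier-weighted source count, scaled by relevance (an assumed combination)
    return sum(TIER_WEIGHT[t] for t in c.source_tiers) * c.relevance


def select_stories(clusters, today: date):
    """Return (main_stories, roundup_items): up to 3 and up to 8 clusters."""
    candidates = [c for c in clusters if is_novel(c, today)]
    ranked = sorted(
        candidates,
        key=lambda c: (has_priority(c), story_score(c)),
        reverse=True,
    )
    return ranked[:3], ranked[3:11]
```

Sorting on the tuple (has_priority, story_score) puts every priority cluster ahead of every non-priority one, then orders by score within each group.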

Issue Structure

Each Signal issue follows a fixed structure:

  • Main Stories (1–3) — Full analysis: 5 paragraphs (facts, competitive dynamics, historical context, connecting signals, synthesis flywheel) + Why it matters bullets + sources
  • News Roundup — 5–8 short items, 2–4 sentences each, prose format, source link at end

Writing Standards

Main stories are written in journalist + analyst voice: first-principles business reasoning, specific numbers and named sources, structural thinking (flywheels, incentives), and dry wit where irony is real.

"Why it matters" bullets are forward projections, not summaries. They name specific stakeholders (pure-model companies, open-weight ecosystems, cloud providers, agent startups) and state second-order consequences.

Forbidden patterns: "Company Announces X" headlines, "According to reports..." sourcing, "businesses" or "people" as stakeholders, roundup items with full analysis.

Timing

Collection runs every 2 hours, around the clock. The daily issue is compiled at 07:00 UTC and published by 09:00 UTC.