The Next Step After Karpathy's Wiki Idea

100% open-source and runs locally!

Apr 08, 2026

The next step after Karpathy’s wiki idea

Karpathy’s LLM Wiki compiles raw sources into a persistent MD wiki with backlinks and cross-references.

The LLM reads papers, extracts concepts, writes encyclopedia-style articles, and maintains an index. The knowledge is compiled once and kept current, so the LLM never re-derives context from scratch at query time.

This works because research is mostly about concepts and their relationships, which are relatively stable.

But this pattern breaks when you apply it to actual work, where context such as deadlines, plans, and meetings shifts constantly from one conversation to the next.

A compiled wiki would have a page about the project, but it wouldn’t reliably track the current ground truth: what was decided, who committed to what, and which deadlines have moved.

Tracking this requires a different data structure altogether: not a wiki of summaries, but a knowledge graph of typed entities, where people, decisions, commitments, and deadlines are separate nodes linked across conversations.

Rowboat is an open-source implementation of exactly this, built on top of the same Markdown-and-Obsidian foundation that Karpathy uses, but extended into a work context.

It ingests conversations from Gmail, Granola, and Fireflies. Instead of writing a summary page per topic, it extracts each decision, commitment, and deadline as its own MD file with backlinks to the people and projects involved.
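To make the "one MD file per entity" idea concrete, here is a minimal sketch. Rowboat's actual file layout isn't shown in this post, so the `type:` front matter, the "Related" section, and the helper names `render_note` and `backlinks` are all assumptions made for illustration. The only thing borrowed from Obsidian is its `[[wikilink]]` syntax.

```python
import re
import tempfile
from pathlib import Path

# Captures the target of an Obsidian-style [[wikilink]], ignoring aliases and anchors.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def render_note(kind: str, title: str, body: str, links: list[str]) -> str:
    """Render one entity as a Markdown note (front matter layout is hypothetical)."""
    front_matter = f"---\ntype: {kind}\n---\n"
    related = "\n".join(f"- [[{name}]]" for name in links)
    return f"{front_matter}# {title}\n\n{body}\n\n## Related\n{related}\n"


def backlinks(vault: Path, target: str) -> list[str]:
    """List the notes in the vault that link to `target`."""
    found = []
    for path in sorted(vault.glob("*.md")):
        if target in WIKILINK.findall(path.read_text(encoding="utf-8")):
            found.append(path.stem)
    return found


# Illustrative data only: a throwaway vault with one decision, one commitment, one person.
vault = Path(tempfile.mkdtemp())
(vault / "Use Postgres.md").write_text(
    render_note("decision", "Use Postgres", "Agreed in the design review.", ["Alice", "Project X"]),
    encoding="utf-8",
)
(vault / "Ship the beta.md").write_text(
    render_note("commitment", "Ship the beta", "Alice will ship by April 20.", ["Alice", "Project X"]),
    encoding="utf-8",
)
(vault / "Alice.md").write_text(render_note("person", "Alice", "Engineer.", []), encoding="utf-8")

project_backlinks = backlinks(vault, "Project X")
print(project_backlinks)
```

Because every note is plain Markdown, the backlinks can be recomputed from the files at any time, so there is no separate database to keep in sync.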

That’s structurally different from a wiki, because a wiki page about “Project X” gives you a summary of what was discussed.

A knowledge graph gives you every decision made, who made it, what was promised, when it was promised, and whether anything has shifted since.
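A rough sketch of what "typed entities as separate nodes" can look like in code is below. This is not Rowboat's implementation: the `Node` and `Graph` classes, the field names, and the sample data are hypothetical, chosen only to show how a typed graph answers questions a summary page cannot.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Node:
    """One typed entity in the graph (type names are illustrative)."""
    id: str
    kind: str                  # e.g. "person", "project", "decision", "commitment"
    title: str
    source: str = ""           # conversation the node was extracted from
    due: Optional[date] = None
    links: set[str] = field(default_factory=set)   # ids of related nodes


class Graph:
    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}

    def add(self, node: Node) -> Node:
        self.nodes[node.id] = node
        return node

    def link(self, a: str, b: str) -> None:
        """Record an undirected link so either node can be reached from the other."""
        self.nodes[a].links.add(b)
        self.nodes[b].links.add(a)

    def neighbors(self, node_id: str, kind: Optional[str] = None) -> list[Node]:
        """Return linked nodes, optionally filtered by type, sorted by id."""
        linked = (self.nodes[i] for i in self.nodes[node_id].links)
        return sorted((n for n in linked if kind is None or n.kind == kind), key=lambda n: n.id)


g = Graph()
g.add(Node("alice", "person", "Alice"))
g.add(Node("project-x", "project", "Project X"))
g.add(Node("ship-beta", "commitment", "Ship the beta", source="standup-notes", due=date(2026, 4, 20)))
g.add(Node("use-postgres", "decision", "Use Postgres", source="design-review"))
g.link("ship-beta", "alice")
g.link("ship-beta", "project-x")
g.link("use-postgres", "project-x")

# What was promised on Project X, and by whom?
commitments = g.neighbors("project-x", kind="commitment")
owners = g.neighbors("ship-beta", kind="person")
print([c.title for c in commitments], [o.title for o in owners])
```

The point of the typed `kind` field is that "all commitments on Project X" becomes a filter over links rather than something an LLM has to re-read a summary to find.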

It also runs background agents on a schedule, so something like a daily briefing gets assembled automatically from whatever shifted in your graph overnight. You control what runs and what gets written back into the vault.
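As a sketch of how a briefing could be assembled from "whatever shifted overnight", the snippet below filters a list of change records by timestamp and groups them by entity type. The `Change` record and `build_briefing` function are hypothetical, the sample data is made up, and scheduling is left out: in practice something like cron or the agent runner would call this function once a day.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Change:
    """A hypothetical record of one update to a node in the vault."""
    node: str
    kind: str
    summary: str
    changed_at: datetime


def build_briefing(changes: list[Change], since: datetime) -> str:
    """Group changes newer than `since` by entity type into a plain-text briefing."""
    recent = sorted((c for c in changes if c.changed_at > since), key=lambda c: c.changed_at)
    if not recent:
        return "No changes since the last briefing."
    sections: dict[str, list[str]] = {}
    for c in recent:
        sections.setdefault(c.kind, []).append(f"- {c.node}: {c.summary}")
    lines = []
    for kind in sorted(sections):
        lines.append(kind.capitalize())
        lines.extend(sections[kind])
    return "\n".join(lines)


# Illustrative data: only the two changes after the last run should appear.
last_run = datetime(2026, 4, 7, 18, 0)
changes = [
    Change("Ship the beta", "deadline", "moved from April 20 to April 27", datetime(2026, 4, 7, 21, 30)),
    Change("Use Postgres", "decision", "confirmed in design review", datetime(2026, 4, 6, 10, 0)),
    Change("Send pricing draft", "commitment", "Alice promised it by Friday", datetime(2026, 4, 8, 7, 15)),
]
briefing = build_briefing(changes, since=last_run)
print(briefing)
```

Keeping the briefing as a pure function of the change list makes it easy to review what would be written before anything goes back into the vault.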

You bring your own model through Ollama, LM Studio, or any hosted API, and everything is stored as plain Markdown you can open in Obsidian, edit, or delete.

You can find the code in the Rowboat GitHub repo.

We are working on a hands-on demo for this and will share that in the coming week!

TL;DR: Karpathy’s LLM Wiki compiles research into a persistent Markdown wiki. It works well for concepts and their relationships, but breaks down for real work where the context evolves over time. Rowboat builds a knowledge graph instead of a wiki, extracts typed entities with backlinks, and runs background agents that act on that accumulated context. Open-source, local-first, bring your own model.

Karpathy nailed the foundation. The next layer is here.

16 AI Agent Skills for AI Engineers

Claude Code’s .claude/ skills system lets you package reusable instructions, workflows, and tool configurations into portable folders that any agent session can pick up. The ecosystem around this has grown fast.

Here are 16 agent skills for AI engineers:

Superpowers: A structured dev workflow that forces Claude to brainstorm, plan, and test before writing any code. Useful when you want rigor over speed.
InsForge: Semantic backend layer that exposes auth, database, storage, and functions through one agent-friendly API. Think of it as a unified backend for agents.
Bright Data Skills: Teaches Claude to orchestrate 60+ MCP tools for web scraping and structured data extraction. Handles the messy parts of live web access.
Context7: MCP server that feeds live, version-specific library docs directly into Claude’s context. No more hallucinated APIs from outdated training data.
Claude-Mem: Persistent memory plugin that auto-captures sessions and reinjects relevant context into future ones. Solves the “Claude forgot everything” problem between sessions.
Everything Claude Code: Curated skills and rules collection with smart token-saving compaction at logical breakpoints. A good starting point if you’re building your own .claude/ setup.
Planning with Files: Persistent markdown files for planning, progress tracking, and knowledge storage across sessions. Simple approach, surprisingly effective for multi-session projects.
Sentry Security Review: Security review skill built on 15 years of real Sentry patches and Django ORM pitfalls. Catches the kind of bugs that only show up in production.
Frontend Design: Official Anthropic skill for distinctive, non-generic UI output with bold design choices. Ships with Claude Code and pushes past the default “looks like every other AI-generated UI” problem.
Web Quality Skills: Lighthouse and Core Web Vitals optimization for performance, accessibility, and SEO. Bakes web quality checks directly into the agent loop.
n8n-MCP: MCP server with docs and schemas for all 1,396 n8n automation nodes. If you’re building automations with n8n, this gives Claude full visibility into the node catalog.
Claude-Reflect: Captures your repeated corrections and turns them into reusable commands with human review. The agent learns your preferences over time instead of making the same mistakes.
cc-DevOps Skills: Generator and validator loops for Terraform, Kubernetes, Docker, and CI/CD configs. Generates infra code, then validates it before you apply.
Agent Sandbox: Isolated E2B cloud sandboxes for building, hosting, and testing apps without touching local files. Good for when you want the agent to experiment freely without risk.
Agile Workflow: Full agile delivery pipeline with multi-model parallel review via Codex and Gemini agents. Brings structured software delivery practices into the agent workflow.
Claude Code Plugins+: Plugin directory with a CLI package manager for searching and installing niche skills. Think npm but for Claude Code skills.

The .claude/ skills folder is becoming the package manager layer for agent behavior. Each of these skills is a self-contained instruction set that shapes how Claude approaches a specific type of work.

Skills aren’t just prompts. They combine instructions, file templates, tool configurations, and validation loops into composable units. The best ones encode real practitioner knowledge (like Sentry’s 15 years of security patches) into something an agent can apply consistently.
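For a sense of what a "self-contained instruction set" looks like on disk, here is a small sketch that scaffolds a skill folder. It assumes the common layout of a `SKILL.md` file with `name` and `description` front matter under `.claude/skills/<skill-name>/`, so check the current Claude Code docs before relying on it. The `scaffold_skill` helper and the `terraform-check` skill are made up for this example.

```python
import tempfile
from pathlib import Path


def scaffold_skill(root: Path, name: str, description: str, instructions: str) -> Path:
    """Create <root>/.claude/skills/<name>/SKILL.md (assumed layout) and return its path."""
    skill_dir = root / ".claude" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text(
        f"---\nname: {name}\ndescription: {description}\n---\n\n{instructions}\n",
        encoding="utf-8",
    )
    return skill_file


# A throwaway project directory so nothing in a real repo is touched.
project = Path(tempfile.mkdtemp())
skill_path = scaffold_skill(
    project,
    name="terraform-check",  # hypothetical skill name
    description="Validate Terraform changes before they are applied.",
    instructions="1. Run `terraform validate`.\n2. Summarize any errors before proposing a fix.",
)
print(skill_path.read_text(encoding="utf-8"))
```

Templates, scripts, or validation checks would sit next to `SKILL.md` in the same folder, which is what makes a skill portable between projects.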

👉 Over to you: Which skills are you using with Claude Code, and have you built any custom ones for your workflow?

Thanks for reading!

Daily Dose of Data Science
