diff --git a/skills/codebase-onboarding/SKILL.md b/skills/codebase-onboarding/SKILL.md new file mode 100644 index 00000000..a008205c --- /dev/null +++ b/skills/codebase-onboarding/SKILL.md @@ -0,0 +1,233 @@ +--- +name: codebase-onboarding +description: Analyze an unfamiliar codebase and generate a structured onboarding guide with architecture map, key entry points, conventions, and a starter CLAUDE.md. Use when joining a new project or setting up Claude Code for the first time in a repo. +origin: ECC +--- + +# Codebase Onboarding + +Systematically analyze an unfamiliar codebase and produce a structured onboarding guide. Designed for developers joining a new project or setting up Claude Code in an existing repo for the first time. + +## When to Use + +- First time opening a project with Claude Code +- Joining a new team or repository +- User asks "help me understand this codebase" +- User asks to generate a CLAUDE.md for a project +- User says "onboard me" or "walk me through this repo" + +## How It Works + +### Phase 1: Reconnaissance + +Gather raw signals about the project without reading every file. Run these checks in parallel: + +``` +1. Package manifest detection + → package.json, go.mod, Cargo.toml, pyproject.toml, pom.xml, build.gradle, + Gemfile, composer.json, mix.exs, pubspec.yaml + +2. Framework fingerprinting + → next.config.*, nuxt.config.*, angular.json, vite.config.*, + django settings, flask app factory, fastapi main, rails config + +3. Entry point identification + → main.*, index.*, app.*, server.*, cmd/, src/main/ + +4. Directory structure snapshot + → Top 2 levels of the directory tree, ignoring node_modules, vendor, + .git, dist, build, __pycache__, .next + +5. Config and tooling detection + → .eslintrc*, .prettierrc*, tsconfig.json, Makefile, Dockerfile, + docker-compose*, .github/workflows/, .env.example, CI configs + +6. Test structure detection + → tests/, test/, __tests__/, *_test.go, *.spec.ts, *.test.js, + pytest.ini, jest.config.*, vitest.config.* +``` + +### Phase 2: Architecture Mapping + +From the reconnaissance data, identify: + +**Tech Stack** +- Language(s) and version constraints +- Framework(s) and major libraries +- Database(s) and ORMs +- Build tools and bundlers +- CI/CD platform + +**Architecture Pattern** +- Monolith, monorepo, microservices, or serverless +- Frontend/backend split or full-stack +- API style: REST, GraphQL, gRPC, tRPC + +**Key Directories** +Map the top-level directories to their purpose: + + +``` +src/components/ → React UI components +src/api/ → API route handlers +src/lib/ → Shared utilities +src/db/ → Database models and migrations +tests/ → Test suites +scripts/ → Build and deployment scripts +``` + +**Data Flow** +Trace one request from entry to response: +- Where does a request enter? (router, handler, controller) +- How is it validated? (middleware, schemas, guards) +- Where is business logic? (services, models, use cases) +- How does it reach the database? (ORM, raw queries, repositories) + +### Phase 3: Convention Detection + +Identify patterns the codebase already follows: + +**Naming Conventions** +- File naming: kebab-case, camelCase, PascalCase, snake_case +- Component/class naming patterns +- Test file naming: `*.test.ts`, `*.spec.ts`, `*_test.go` + +**Code Patterns** +- Error handling style: try/catch, Result types, error codes +- Dependency injection or direct imports +- State management approach +- Async patterns: callbacks, promises, async/await, channels + +**Git Conventions** +- Branch naming from recent branches +- Commit message style from recent commits +- PR workflow (squash, merge, rebase) +- If the repo has no commits yet or only a shallow history (e.g. `git clone --depth 1`), skip this section and note "Git history unavailable or too shallow to detect conventions" + +### Phase 4: Generate Onboarding Artifacts + +Produce two outputs: + +#### Output 1: Onboarding Guide + +```markdown +# Onboarding Guide: [Project Name] + +## Overview +[2-3 sentences: what this project does and who it serves] + +## Tech Stack + +| Layer | Technology | Version | +|-------|-----------|---------| +| Language | TypeScript | 5.x | +| Framework | Next.js | 14.x | +| Database | PostgreSQL | 16 | +| ORM | Prisma | 5.x | +| Testing | Jest + Playwright | - | + +## Architecture +[Diagram or description of how components connect] + +## Key Entry Points + +- **API routes**: `src/app/api/` — Next.js route handlers +- **UI pages**: `src/app/(dashboard)/` — authenticated pages +- **Database**: `prisma/schema.prisma` — data model source of truth +- **Config**: `next.config.ts` — build and runtime config + +## Directory Map +[Top-level directory → purpose mapping] + +## Request Lifecycle +[Trace one API request from entry to response] + +## Conventions +- [File naming pattern] +- [Error handling approach] +- [Testing patterns] +- [Git workflow] + +## Common Tasks + +- **Run dev server**: `npm run dev` +- **Run tests**: `npm test` +- **Run linter**: `npm run lint` +- **Database migrations**: `npx prisma migrate dev` +- **Build for production**: `npm run build` + +## Where to Look + +| I want to... | Look at... | +|--------------|-----------| +| Add an API endpoint | `src/app/api/` | +| Add a UI page | `src/app/(dashboard)/` | +| Add a database table | `prisma/schema.prisma` | +| Add a test | `tests/` matching the source path | +| Change build config | `next.config.ts` | +``` + +#### Output 2: Starter CLAUDE.md + +Generate or update a project-specific CLAUDE.md based on detected conventions. If `CLAUDE.md` already exists, read it first and enhance it — preserve existing project-specific instructions and clearly call out what was added or changed. + +```markdown +# Project Instructions + +## Tech Stack +[Detected stack summary] + +## Code Style +- [Detected naming conventions] +- [Detected patterns to follow] + +## Testing +- Run tests: `[detected test command]` +- Test pattern: [detected test file convention] +- Coverage: [if configured, the coverage command] + +## Build & Run +- Dev: `[detected dev command]` +- Build: `[detected build command]` +- Lint: `[detected lint command]` + +## Project Structure +[Key directory → purpose map] + +## Conventions +- [Commit style if detectable] +- [PR workflow if detectable] +- [Error handling patterns] +``` + +## Best Practices + +1. **Don't read everything** — reconnaissance should use Glob and Grep, not Read on every file. Read selectively only for ambiguous signals. +2. **Verify, don't guess** — if a framework is detected from config but the actual code uses something different, trust the code. +3. **Respect existing CLAUDE.md** — if one already exists, enhance it rather than replacing it. Call out what's new vs existing. +4. **Stay concise** — the onboarding guide should be scannable in 2 minutes. Details belong in the code, not the guide. +5. **Flag unknowns** — if a convention can't be confidently detected, say so rather than guessing. "Could not determine test runner" is better than a wrong answer. + +## Anti-Patterns to Avoid + +- Generating a CLAUDE.md that's longer than 100 lines — keep it focused +- Listing every dependency — highlight only the ones that shape how you write code +- Describing obvious directory names — `src/` doesn't need an explanation +- Copying the README — the onboarding guide adds structural insight the README lacks + +## Examples + +### Example 1: First time in a new repo +**User**: "Onboard me to this codebase" +**Action**: Run full 4-phase workflow → produce Onboarding Guide + Starter CLAUDE.md +**Output**: Onboarding Guide printed directly to the conversation, plus a `CLAUDE.md` written to the project root + +### Example 2: Generate CLAUDE.md for existing project +**User**: "Generate a CLAUDE.md for this project" +**Action**: Run Phases 1-3, skip Onboarding Guide, produce only CLAUDE.md +**Output**: Project-specific `CLAUDE.md` with detected conventions + +### Example 3: Enhance existing CLAUDE.md +**User**: "Update the CLAUDE.md with current project conventions" +**Action**: Read existing CLAUDE.md, run Phases 1-3, merge new findings +**Output**: Updated `CLAUDE.md` with additions clearly marked