Xingbo Wang 77a0e347bb Add Claude code review CI workflow (#14480 )

Summary:
Add GitHub Actions workflows for AI-powered code review on PRs using Claude (Anthropic). Follows the clang-tidy security pattern with two separate workflows for privilege separation.

## Trigger Modes

**1. Auto** — Runs after `pr-jobs` workflow completes successfully via `workflow_run`. Safe for fork PRs (runs from default branch, never executes PR code).

**2. Manual** — Maintainers comment `/claude-review [focus area]` or `/claude-query <question>` on any PR. Restricted to 15 authorized team members.

**3. workflow_dispatch** — For manual testing.

## Security Model (Two-Workflow Separation)

Same pattern as clang-tidy:

**`claude-review.yml`** (analysis):
- Runs Claude with `ANTHROPIC_API_KEY`
- Has ONLY `contents: read` — no PR write, no issue write
- Saves review markdown + metadata as artifact

**`claude-review-comment.yml`** (posting):
- Triggers on `workflow_run` completion
- Downloads artifact and posts/updates PR comment
- Has `pull-requests: write` but never runs AI

This separation prevents a crafted PR from tricking Claude into exfiltrating write tokens.

## Review Methodology

Review prompt in `claude_md/code_review.md` (shared with local Claude Code reviews). Five perspectives:
- Call-chain analysis (3-5 levels up/down)
- Correctness & edge cases
- Cross-component & adversarial (10 execution contexts)
- Performance
- API compatibility & test coverage

## Shared Scripts

- `.github/scripts/post-pr-comment.js` — Create-or-update PR comment with marker-based dedup. Now used by both clang-tidy and Claude review.
- `.github/scripts/parse-claude-review.js` — Parses `claude-code-base-action` execution log into markdown.

## Files Changed

| File | Description |
|------|-------------|
| `.github/workflows/claude-review.yml` | Analysis workflow (476 lines) |
| `.github/workflows/claude-review-comment.yml` | Comment posting workflow (146 lines) |
| `.github/scripts/post-pr-comment.js` | Shared PR comment utility (57 lines) |
| `.github/scripts/parse-claude-review.js` | Execution log parser (78 lines) |
| `.github/workflows/clang-tidy-comment.yml` | Updated to use shared script |
| `claude_md/code_review.md` | Review methodology (104 lines) |

## Setup Required

Add `ANTHROPIC_API_KEY` secret to the repo settings.

## Testing

Tested end-to-end on `xingbowang/rocksdb` fork — both auto and manual triggers, artifact upload/download, comment posting, and duplicate detection all verified working.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/14480

Reviewed By: omkarhgawde

Differential Revision: D97832666

Pulled By: xingbowang

fbshipit-source-id: f80c7d8683ac980614dc4ca66c1e545deb3be504

2026-03-23 16:20:01 -07:00

3.9 KiB

Raw Permalink Blame History

Code Review Skill

This document defines the methodology for performing thorough, multi-perspective code reviews on RocksDB changes. It is used both by CI (GitHub Actions) and local review workflows.

Prerequisites

Before starting a review:

Read CLAUDE.md in the repository root for project-specific guidelines
Read any files in claude_md/ for architecture context
Identify the commit/diff/PR to review and parse the changed files

Review Methodology

Conduct the review from multiple perspectives, then synthesize findings into a single report. Each perspective catches different classes of bugs.

Perspective 1: Codebase Context & Call-Chain Analysis

Before reviewing the diff itself, build deep context:

For each changed function/method, trace the call chain UPWARD 3-5 levels. Who consumes the changed behavior? What invariants do callers rely on?
Trace DOWNWARD into callees. What side effects do they have? What shared state do they mutate (counters, sequence numbers, flags, metadata)?
Identify the "sibling" or "reference" implementation that handles the same scenario the standard way. Compare behaviors.
For data written to shared structures (memtable, SST block, cache entry), find ALL readers and verify their visibility rules match the writer.

Perspective 2: Correctness & Edge Cases

Thread safety and concurrency (lock ordering, data races)
Error handling and Status propagation
Edge cases: empty inputs, overflow, boundary conditions
Data corruption scenarios
Behavioral contract changes: do return value semantics match callers?

Perspective 3: Cross-Component & Adversarial Analysis

Check the change in ALL execution contexts:

Context	Key difference
WritePreparedTxnDB	read_callback_ controls visibility
ReadOnly DB / SecondaryInstance	No mutable memtable
CompactionService / Remote compaction	Different process
User-defined timestamps	Extra key comparison dimension
MemPurge	Memtable-to-memtable path
BlobDB	Values in blob files, not inline
Old snapshots	Seqno far behind current
Concurrent writers	Lock-free vs locked paths
FIFO / Universal compaction	Different invariants
Prefix seek / total order seek	Different iterator behavior

When the change claims a property ("logically redundant", "safe because X"), systematically try to break it: list preconditions, construct counterexamples.

Perspective 4: Performance

RocksDB is a high-performance storage engine.

Memory allocation on hot paths
Unnecessary copies (prefer move semantics, Slice, string_view)
Cache efficiency and data locality
Loop optimization, branch prediction (LIKELY/UNLIKELY)

Perspective 5: API, Compatibility & Testing

Public API backwards compatibility
Serialization format correctness and versioning
Test coverage for edge cases, failure modes, and system-level interactions

How to Explore the Codebase

Use available tools to explore deeply — do NOT just review the diff in isolation. The most critical bugs hide at component boundaries.

Read the header files for changed implementations
Search for callers of changed functions (3-5 levels up)
Read existing tests to understand guaranteed behaviors
Check related implementations for conventions and patterns

Output Format

Summary

Brief overall assessment (1-2 sentences).

Issues Found

Categorize by severity:

🔴 Critical: Must fix (correctness, data corruption, security)
🟡 Suggestion: Should consider (performance, edge cases)
🟢 Nitpick: Minor style issues

For each issue include:

File and line reference
Which review perspective found it
Description with root cause analysis
Suggested fix

Cross-Component Analysis

Execution context checks and assumption stress-test results.

Positive Observations

Good patterns, clever optimizations, or improvements.

3.9 KiB Raw Permalink Blame History