
Pattern Memory Evolution

The Pattern Memory Moat is a system for capturing, storing, and sharing code patterns across teams. It transforms individual learning into collective intelligence.

Table of Contents

  1. Auto-Extraction
  2. Team Sync
  3. Weight Fusion
  4. API Reference
  5. Example Workflow
  6. Files Reference

Auto-Extraction

Extract patterns automatically from your Git history. The system analyzes commit messages, identifies bug fixes and features, and stores the before/after code changes.

How It Works

The extract_patterns_from_git.py script:

  1. Scans Git History: Reads through commit messages and diffs
  2. Identifies Patterns: Uses keywords to classify commits as bug fixes or features
  3. Extracts Context: Captures before/after code with metadata
  4. Stores in JSONL: Outputs structured data suitable for training
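The "Extracts Context" step can be illustrated with a minimal sketch that splits a unified diff into its before and after code. The function name `split_diff` is hypothetical and this is not necessarily how extract_patterns_from_git.py does it; it also assumes standard `git diff` output and simplifies header detection (a removed line that itself begins with `-- ` would be mistaken for a file header).

```python
def split_diff(diff_text: str) -> tuple[str, str]:
    """Split a unified diff into (before_code, after_code).

    Hypothetical helper for illustration only.
    """
    before, after = [], []
    for line in diff_text.splitlines():
        # Skip file headers and hunk markers.
        if line.startswith(("diff ", "index ", "--- ", "+++ ", "@@")):
            continue
        if line.startswith("-"):
            before.append(line[1:])      # removed line: old version only
        elif line.startswith("+"):
            after.append(line[1:])       # added line: new version only
        elif line.startswith(" "):
            before.append(line[1:])      # context line: present in both
            after.append(line[1:])
    return "\n".join(before), "\n".join(after)


sample = """\
--- a/calc.py
+++ b/calc.py
@@ -1,2 +1,2 @@
 def total(x):
-    return None + x
+    return x
"""
before_code, after_code = split_diff(sample)
print(before_code)
print(after_code)
```

Keeping the context lines in both versions means each side remains a readable snippet rather than an isolated changed line.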

Usage

# Extract patterns from all commits
python scripts/extract_patterns_from_git.py \
    --repo-path /path/to/repo \
    --output patterns.jsonl

# Only recent commits
python scripts/extract_patterns_from_git.py \
    --repo-path /path/to/repo \
    --output patterns.jsonl \
    --since-date "2024-01-01"

Output Format

Each line in the JSONL output:

{
  "pattern_id": "a1b2c3d4e5f6g7h8",
  "problem_type": "bug_fix",
  "before_code": "def buggy_function():\n    return None + 1",
  "after_code": "def fixed_function():\n    return 1",
  "commit_msg": "fix: handle None case in function",
  "author": "developer@example.com",
  "date": "2024-03-15 10:30:00",
  "confidence": 0.85
}
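A consumer of this file might look like the sketch below, which reads records line by line and keeps those above a confidence threshold. The helper name `iter_patterns` and the assumption that every field shown above is always present are illustrative, not confirmed behaviour of the extractor.

```python
import io
import json

# Assumption: every record carries all fields shown in the example above.
EXPECTED_FIELDS = {
    "pattern_id", "problem_type", "before_code", "after_code",
    "commit_msg", "author", "date", "confidence",
}


def iter_patterns(lines, min_confidence=0.0):
    """Yield parsed pattern records whose confidence is >= min_confidence."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = EXPECTED_FIELDS - record.keys()
        if missing:
            raise ValueError(f"line {lineno}: missing fields {sorted(missing)}")
        if record["confidence"] >= min_confidence:
            yield record


# In-memory stand-in for open("patterns.jsonl").
sample = io.StringIO(
    '{"pattern_id": "a1", "problem_type": "bug_fix", "before_code": "x", '
    '"after_code": "y", "commit_msg": "fix: y", "author": "dev@example.com", '
    '"date": "2024-03-15 10:30:00", "confidence": 0.85}\n'
)
for pattern in iter_patterns(sample, min_confidence=0.8):
    print(pattern["pattern_id"], pattern["problem_type"], pattern["confidence"])
```

Any iterable of lines works as input, so the same function can be passed an open file handle.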

Problem Types

  • bug_fix: Commits that resolve issues (keywords: fix, bug, hotfix, patch, resolve)
  • feature_addition: Commits that add new functionality (keywords: feat, add, implement, enhance)
  • unknown: Other commits (typically skipped)
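The keyword rules above could be sketched as follows. The matching strategy is an assumption: this version does a case-insensitive prefix match on each word (so "fixes" and "added" count) and checks bug-fix keywords first; the real script may differ. Prefix matching also produces false positives such as "address" matching "add".

```python
import re

# Keyword lists taken from the Problem Types list above.
BUG_KEYWORDS = ("fix", "bug", "hotfix", "patch", "resolve")
FEATURE_KEYWORDS = ("feat", "add", "implement", "enhance")


def classify_commit(message: str) -> str:
    """Classify a commit message as bug_fix, feature_addition, or unknown.

    Hypothetical helper; precedence and prefix matching are assumptions.
    """
    words = re.findall(r"[a-z]+", message.lower())
    if any(word.startswith(key) for word in words for key in BUG_KEYWORDS):
        return "bug_fix"
    if any(word.startswith(key) for word in words for key in FEATURE_KEYWORDS):
        return "feature_addition"
    return "unknown"


for msg in ("fix: handle None case in function",
            "feat: add pagination",
            "docs: update README"):
    print(msg, "->", classify_commit(msg))
```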

Confidence Scoring

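The confidence score (0.0-1.0) reflects pattern quality. It starts from a base value and adds a bonus for each signal present. Because the bonuses below can sum past 1.0 (a bug fix that earns all of them reaches 1.05), the total is presumably capped at 1.0: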

  • Base: 0.5
  • +0.2 for clear bug fix keywords
  • +0.15 for clear feature keywords
  • +0.15 for having both before and after code
  • +0.1 for substantial changes (>100 chars)
  • +0.1 for large changes (>500 chars)
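The scoring rules can be written out as a short function. This is a sketch under several assumptions not stated above: change size is taken as the combined length of the before and after code, the >100 and >500 bonuses stack, and the result is clamped to 1.0. The name `score_confidence` is hypothetical.

```python
def score_confidence(problem_type: str, before_code: str, after_code: str) -> float:
    """Apply the bonuses listed above to a base score of 0.5."""
    score = 0.5
    if problem_type == "bug_fix":
        score += 0.2
    elif problem_type == "feature_addition":
        score += 0.15
    if before_code and after_code:
        score += 0.15
    # Assumption: "change size" is the combined length of both snippets.
    change_size = len(before_code) + len(after_code)
    if change_size > 100:
        score += 0.1
    if change_size > 500:  # assumption: stacks with the >100 bonus
        score += 0.1
    # Assumption: clamp to the documented 0.0-1.0 range.
    return round(min(score, 1.0), 2)


before = "def buggy_function():\n    return None + 1"
after = "def fixed_function():\n    return 1"
print(score_confidence("bug_fix", before, after))  # 0.85
```

Under these assumptions the record shown in Output Format scores 0.85: base 0.5, plus 0.2 for the bug-fix keyword, plus 0.15 for having both snippets, with no size bonus because the change is under 100 characters.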

Team Sync

Share and sync patterns across your team using a shared PostgreSQL database.

PostgreSQL Schema

CREATE TABLE patterns (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    problem_type VARCHAR(50) NOT NULL,
    solution_hash VARCHAR(64) NOT NULL,
    code_before TEXT NOT NULL,
    code_after TEXT NOT NULL,
    success_count INTEGER DEFAULT 0,
    last_used TIMESTAMP,
    created_by VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
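    CONSTRAINT unique_solution UNIQUE (solution_hash)
);

-- Indexes (PostgreSQL does not accept inline INDEX clauses in CREATE TABLE)
CREATE INDEX idx_problem_type ON patterns (problem_type);
CREATE INDEX idx_success_count ON patterns (success_count DESC);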

CREATE TABLE pattern_feedback (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    pattern_id UUID REFERENCES patterns(id),
    user_id VARCHAR(255) NOT NULL,
    helpful BOOLEAN NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE adapter_versions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    version_name VARCHAR(100) NOT NULL,
    adapter_path VARCHAR(500) NOT NULL,
    created_by VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    is_active BOOLEAN DEFAULT FALSE
);

FastAPI Endpoints

GET /patterns

List patterns with filtering and pagination.

curl -H "X-API-Key: your-api-key" \
     "http://localhost:8000/patterns?problem_type=bug_fix&limit=20"

Response:

{
  "patterns": [...],
  "total": 150,
  "page": 1,
  "per_page": 20
}
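From Python, the same request can be built with the standard library alone. The helper names `build_list_request` and `page_count` are hypothetical; only the `problem_type` and `limit` query parameters and the `total`/`per_page` response fields come from the examples above.

```python
import math
from urllib.parse import urlencode
from urllib.request import Request


def build_list_request(base_url, api_key, problem_type=None, limit=20):
    """Build (but do not send) a GET /patterns request."""
    params = {"limit": limit}
    if problem_type is not None:
        params["problem_type"] = problem_type
    url = f"{base_url.rstrip('/')}/patterns?{urlencode(params)}"
    return Request(url, headers={"X-API-Key": api_key}, method="GET")


def page_count(response_body: dict) -> int:
    """Number of pages implied by the total and per_page response fields."""
    return math.ceil(response_body["total"] / response_body["per_page"])


req = build_list_request("http://localhost:8000", "your-api-key",
                         problem_type="bug_fix")
print(req.full_url)  # http://localhost:8000/patterns?limit=20&problem_type=bug_fix
print(page_count({"patterns": [], "total": 150, "page": 1, "per_page": 20}))  # 8
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that so it runs without a server.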

POST /patterns

Add a new pattern.

curl -X POST -H "X-API-Key: your-api-key" \
     -H "Content-Type: application/json" \
     -d '{"problem_type": "bug_fix", "code_before": "...", "code_after": "..."}' \
     "http://localhost:8000/patterns"

POST /patterns/{id}/feedback

Submit feedback on a pattern.

curl -X POST -H "X-API-Key: your-api-key" \
     -H "Content-Type: application/json" \
     -d '{"helpful": true}' \
     "http://localhost:8000/patterns/123e4567-e89b-12d3-a456-426614174000/feedback"

Authentication

API key authentication via X-API-Key header:

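# Server-side middleware (register with @app.middleware("http"))
from fastapi import Request
from fastapi.responses import JSONResponse

async def verify_api_key(request: Request, call_next):
    api_key = request.headers.get("X-API-Key")
    if not api_key or api_key != settings.API_KEY:
        # Return the response directly: an HTTPException raised inside
        # middleware bypasses FastAPI's exception handlers.
        return JSONResponse(status_code=401, content={"detail": "Invalid API key"})
    return await call_next(request)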

Conflict Resolution

When multiple team members contribute similar patterns:

  1. Pattern Similarity Detection: Hash-based deduplication on solution_hash
  2. Merge Strategy: Patterns with the same solution_hash are merged into one row (the unique_solution constraint allows only one row per hash)
  3. Success Rate Tracking: success_count increases with positive feedback
  4. Priority: Patterns with higher success_count rank higher in queries
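The four steps above can be sketched in memory as follows. How solution_hash is actually computed is not specified here; this sketch assumes a SHA-256 digest of the whitespace-normalised code_after (64 hex characters, which fits the VARCHAR(64) column) and assumes that merging sums the success_count values. Both function names are hypothetical.

```python
import hashlib


def solution_hash(code_after: str) -> str:
    """Assumed hashing scheme: SHA-256 over code with trailing whitespace removed."""
    normalised = "\n".join(line.rstrip() for line in code_after.strip().splitlines())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def merge_patterns(patterns):
    """Deduplicate by solution_hash and rank by success_count, highest first."""
    merged = {}
    for pattern in patterns:
        key = solution_hash(pattern["code_after"])
        count = pattern.get("success_count", 0)
        if key in merged:
            # Assumption: duplicates pool their success counts.
            merged[key]["success_count"] += count
        else:
            merged[key] = {**pattern, "solution_hash": key, "success_count": count}
    return sorted(merged.values(), key=lambda p: p["success_count"], reverse=True)


team = [
    {"code_after": "return 1", "success_count": 3, "created_by": "alice"},
    {"code_after": "return 1  \n", "success_count": 2, "created_by": "bob"},
    {"code_after": "return x or 0", "success_count": 4, "created_by": "carol"},
]
for pattern in merge_patterns(team):
    print(pattern["created_by"], pattern["success_count"])
```

Alice's and Bob's entries differ only in trailing whitespace, so they collapse into one pattern with a count of 5, which then ranks above Carol's.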

Weight Fusion

Combine LoRA adapters from multiple users using weighted averaging based on success rates.

Algorithm

merged_weight = Σ(adapter_i.weight * adapter_i.success_rate) / Σ(success_rate)

Here adapter_i.weight denotes adapter i's parameter tensors, so adapters that have shown better results contribute more to the final merged adapter.
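As a worked example of the formula, the sketch below normalises success rates into fusion weights and averages two toy tensors element-wise. Plain Python lists stand in for real adapter tensors, and the function names are hypothetical; merge_lora_adapters.py may be implemented differently.

```python
def fusion_weights(success_rates):
    """Normalise success rates so that the resulting weights sum to 1."""
    total = sum(success_rates)
    if total <= 0:
        raise ValueError("at least one success rate must be positive")
    return [rate / total for rate in success_rates]


def merge_tensors(tensors, success_rates):
    """Element-wise weighted average of equally sized flat tensors."""
    if len({len(t) for t in tensors}) != 1:
        raise ValueError("all tensors must have the same length")
    weights = fusion_weights(success_rates)
    return [
        sum(w * t[i] for w, t in zip(weights, tensors))
        for i in range(len(tensors[0]))
    ]


rates = [0.85, 0.65]
print([round(w, 4) for w in fusion_weights(rates)])  # [0.5667, 0.4333]

alice = [1.0, 2.0]
bob = [4.0, 8.0]
print([round(v, 4) for v in merge_tensors([alice, bob], rates)])  # [2.3, 4.6]
```

One caveat when applying this to real LoRA adapters: averaging the low-rank factors A and B separately is not the same as averaging their products BA, so the result depends on which of the two the merge operates on.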

Merge Script Usage

# Basic merge with manual weights
python scripts/merge_lora_adapters.py \
    --adapters user1_adapter.safetensors user2_adapter.safetensors \
    --weights 0.6 0.4 \
    --output merged_adapter.safetensors

# Merge using success rates (auto-computes proportional weights)
python scripts/merge_lora_adapters.py \
    --adapters alice_adapter.safetensors bob_adapter.safetensors \
    --success-rates 0.85 0.65 \
    --output team_adapter.safetensors

# Equal weights (default)
python scripts/merge_lora_adapters.py \
    --adapters adapter1.safetensors adapter2.safetensors \
    --output merged.safetensors

Versioning

Each merge creates a version record:

{
  "version_name": "v2.1-team-merge",
  "adapter_path": "/adapters/merged_v2.1.safetensors",
  "created_by": "alice@example.com",
  "created_at": "2024-03-15T10:30:00Z",
  "parent_versions": ["v2.0", "user-alice-v3", "user-bob-v2"]
}

Rollback

To revert to a previous merged adapter:

# List available versions
ls -la adapters/versions/

# Restore previous version
cp adapters/versions/v2.0.safetensors adapters/merged.safetensors

Or via API:

curl -X POST -H "X-API-Key: your-api-key" \
     -H "Content-Type: application/json" \
     -d '{"version_id": "123e4567-e89b-12d3-a456-426614174000"}' \
     "http://localhost:8000/adapters/rollback"

API Reference

Patterns API

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /patterns | List patterns |
| GET | /patterns/{id} | Get pattern by ID |
| POST | /patterns | Create pattern |
| POST | /patterns/{id}/feedback | Submit feedback |
| DELETE | /patterns/{id} | Delete pattern |

Adapter API

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /adapters | List adapter versions |
| POST | /adapters/merge | Merge multiple adapters |
| POST | /adapters/{id}/activate | Set as active adapter |
| POST | /adapters/rollback | Rollback to previous version |

Health Check

curl "http://localhost:8000/health"

Response:

{
  "status": "healthy",
  "version": "1.0.0",
  "database": "connected"
}

Example Workflow

1. Extract Patterns from Project

# Extract patterns from your codebase
python scripts/extract_patterns_from_git.py \
    --repo-path ./my-project \
    --output patterns.jsonl \
    --since-date "2024-01-01"

2. Upload to Team Database

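import json

import requests

with open("patterns.jsonl") as f:
    for line in f:
        pattern = json.loads(line)
        # The extractor writes before_code/after_code; POST /patterns
        # expects code_before/code_after, so rename the fields.
        payload = {
            "problem_type": pattern["problem_type"],
            "code_before": pattern["before_code"],
            "code_after": pattern["after_code"],
        }
        response = requests.post(
            "http://team-patterns.example.com/patterns",
            headers={"X-API-Key": "your-key"},
            json=payload,
        )
        response.raise_for_status()  # stop on the first rejected upload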

3. Merge Team Patterns

# Merge adapters from team members
python scripts/merge_lora_adapters.py \
    --adapters alice_adapter.safetensors bob_adapter.safetensors carol_adapter.safetensors \
    --success-rates 0.90 0.75 0.85 \
    --output team_merged.safetensors

4. Activate for Team Use

Once activated through POST /adapters/{id}/activate, the merged adapter becomes the new team baseline.


Files Reference

| File | Description |
|------|-------------|
| scripts/extract_patterns_from_git.py | Git history pattern extractor |
| scripts/merge_lora_adapters.py | LoRA adapter merger |
| docs/pattern-moat.md | This documentation |