# AI Intelligence Engine System

## Table of Contents
1. [System Overview](#system-overview)
2. [Architecture](#architecture)
3. [Marcus Ecosystem Integration](#marcus-ecosystem-integration)
4. [Workflow Integration](#workflow-integration)
5. [Core Features](#core-features)
6. [Technical Implementation](#technical-implementation)
7. [Pros and Cons](#pros-and-cons)
8. [Design Rationale](#design-rationale)
9. [Future Evolution](#future-evolution)
10. [Task Complexity Handling](#task-complexity-handling)
11. [Board-Specific Considerations](#board-specific-considerations)
12. [Cato Integration](#cato-integration)

## System Overview

The AI Intelligence Engine is Marcus's AI-powered analysis and decision engine. It provides intelligent task assignment, blocker resolution, and project risk analysis using the Claude API. When AI is unavailable, the system falls back to rule-based approaches, so Marcus keeps functioning without AI access.

**Primary Location**: `src/integrations/ai_analysis_engine.py`

### Core Principle
**AI-enhanced decision making with graceful degradation.**

The system operates on a pragmatic principle: use AI when available for intelligent analysis, fall back to deterministic rules when AI is unavailable. This ensures reliability while leveraging AI's semantic understanding capabilities when possible.

### Key Capabilities
- **Intelligent Task Assignment**: AI-powered task-to-agent matching based on skills, capacity, and project context
- **Task Instruction Generation**: Context-aware, detailed task instructions tailored to developer experience
- **Blocker Analysis**: Root cause analysis and resolution suggestions for task blockers
- **Project Risk Assessment**: Proactive risk identification with mitigation strategies
- **Project Health Analysis**: Overall project health monitoring and recommendations
- **Feature Request Analysis**: Intelligent analysis of feature requests and implementation planning
- **Graceful Fallback**: Automatic fallback to rule-based logic when AI unavailable

## Architecture

### Single-Class Design

The AI Intelligence Engine is implemented as a single, cohesive class:

```mermaid
graph TB
    subgraph "AIAnalysisEngine"
        Init[Initialization]
        Claude[Claude API Client]
        Prompts[Prompt Templates]
        Analysis[Analysis Methods]
        Fallback[Fallback Logic]
        Tracking[Token Tracking]
    end

    Init --> Claude
    Init --> Prompts
    Analysis --> Claude
    Analysis --> Fallback
    Claude --> Tracking

    style Claude fill:#e1f5fe
    style Fallback fill:#fff3e0
    style Analysis fill:#e8f5e9
```

**Component Structure:**
- **AIAnalysisEngine**: Single class providing all AI analysis functionality
  - Claude API client (Anthropic SDK)
  - Prompt template system
  - Analysis methods for different use cases
  - Fallback methods for when AI unavailable
  - Token usage tracking integration

### Core Class Structure

```python
class AIAnalysisEngine:
    """
    AI-powered analysis and decision engine using Claude API.

    Provides intelligent analysis for project management decisions with
    graceful fallback to rule-based approaches when AI is unavailable.
    """

    def __init__(self) -> None:
        # Initialize Anthropic client
        self.client: Optional[anthropic.Anthropic] = None
        self.current_project_id: Optional[str] = None
        self.current_agent_id: Optional[str] = None

        # Load API key from config or environment
        # (config uses attribute access, not .get())
        api_key = config.ai.anthropic_api_key or os.getenv("ANTHROPIC_API_KEY")
        if api_key:
            self.client = anthropic.Anthropic(api_key=api_key)

        # Model configuration
        self.model: str = "claude-3-5-sonnet-20241022"

        # Prompt templates (4 keys)
        self.prompts: Dict[str, str] = {
            "task_assignment": "...",
            "task_instructions": "...",
            "blocker_analysis": "...",
            "project_risk": "..."
        }
        # Note: project_health and feature_analysis methods build prompts inline
        # and are NOT stored as keys in self.prompts
```

## Marcus Ecosystem Integration

### Position in Marcus Architecture

The AI Intelligence Engine provides cognitive capabilities across the Marcus ecosystem:

```mermaid
graph TB
    subgraph "Marcus Ecosystem"
        MCP[MCP Tools]
        TM[Task Management]
        AIE[AI Intelligence Engine]
        Memory[Memory System]
        Kanban[Kanban Integration]
        Tokens[Token Tracking]
    end

    MCP --> AIE
    TM --> AIE
    AIE --> Memory
    AIE --> Kanban
    AIE --> Tokens

    style AIE fill:#e8f5e8,stroke:#4caf50,stroke-width:3px
```

### Integration Points

**Consumers of AI Engine:**
- **MCP Tools**: `create_project`, `request_next_task`, `report_blocker`
- **Task Management**: Task assignment optimization
- **Project Analysis**: Risk assessment and health monitoring

**Dependencies:**
- **Anthropic SDK**: Direct Claude API integration
- **Token Tracking**: Cost monitoring via `ai_usage_middleware`
- **Config System**: API key and model configuration
- **Memory System**: Historical context for analysis

### Data Flow

1. **Input**: Task data, agent profiles, project state, blocker descriptions
2. **Processing**:
   - Check if Claude API available
   - If available: AI analysis via Claude
   - If unavailable: Fallback to rule-based logic
3. **Output**: Structured decisions, instructions, risk assessments
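
The processing step above can be sketched as a small helper. This is a minimal illustration, not the engine's actual code: `analyze_with_fallback`, the `source` key, and the stub functions are hypothetical names introduced here, and the AI path is simulated rather than calling Claude.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Optional


async def analyze_with_fallback(
    ai_call: Optional[Callable[[], Awaitable[Dict[str, Any]]]],
    fallback: Callable[[], Dict[str, Any]],
) -> Dict[str, Any]:
    """Use the AI path when available; otherwise, or on failure, use rules."""
    if ai_call is None:  # no client configured
        return {"source": "fallback", **fallback()}
    try:
        return {"source": "ai", **(await ai_call())}
    except Exception:  # API error, malformed JSON, and so on
        return {"source": "fallback", **fallback()}


def rule_based() -> Dict[str, Any]:
    """Deterministic stand-in for a rule-based method."""
    return {"decision": "assign-highest-priority"}


async def working_ai() -> Dict[str, Any]:
    """Simulated successful AI analysis."""
    return {"decision": "assign-task-42"}


async def failing_ai() -> Dict[str, Any]:
    """Simulated API outage."""
    raise RuntimeError("simulated API outage")
```

Tagging each result with where it came from (the assumed `source` key) is optional, but it makes it easy to see in logs how often the fallback path is taken.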

## Workflow Integration

The AI Intelligence Engine is invoked at key points in the Marcus workflow:

### 1. Project Creation (`create_project`)

**MCP Tool Integration:**
```python
# Called from create_project MCP tool
# Handles PRD parsing and task generation
# See: src/intelligence/prd_parser.py and intelligent_task_generator.py
```

**Note**: PRD parsing is handled by separate intelligence modules (`PRDParser`, `IntelligentTaskGenerator`), not directly by `AIAnalysisEngine`. See System 23 (Task Management Intelligence) for details.

### 2. Task Assignment (`request_next_task`)

**AI Engine Method:**
```python
async def match_task_to_agent(
    self,
    available_tasks: List[Task],
    agent: WorkerStatus,
    project_state: ProjectState
) -> Optional[Task]:
    """Find optimal task for agent using AI analysis."""

    if not self.client:
        return self._fallback_task_matching(available_tasks, agent)

    # Prepare data for AI (limit to 10 tasks to avoid context limits)
    tasks_data = [serialize(t) for t in available_tasks[:10]]

    # Call Claude with task assignment prompt
    prompt = self.prompts["task_assignment"].format(
        tasks=json.dumps(tasks_data),
        agent=json.dumps(serialize(agent)),
        project_state=project_state.value
    )

    response = await self._call_claude(prompt)
    result = json.loads(response)

    # Find and return the recommended task (None if the ID is not in the list)
    return find_task_by_id(available_tasks, result["recommended_task_id"])
```
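
Model output is not guaranteed to be bare JSON, and the recommended ID is not guaranteed to exist. The sketch below shows one way the parsing and lookup steps could be hardened. `extract_json` is a hypothetical helper, and `find_task_by_id` is written here with an explicit task list and plain dicts in place of `Task` objects; both are assumptions for illustration.

```python
import json
import re
from typing import Any, Dict, List, Optional

# Built programmatically so this example can sit inside a Markdown code block.
FENCE = "`" * 3


def extract_json(text: str) -> Dict[str, Any]:
    """Parse a JSON object from model output, with or without a code fence."""
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)  # raises json.JSONDecodeError on bad output


def find_task_by_id(
    tasks: List[Dict[str, Any]], task_id: Optional[str]
) -> Optional[Dict[str, Any]]:
    """Return the task with the given ID, or None if it is not in the list."""
    return next((t for t in tasks if t["id"] == task_id), None)


tasks = [{"id": "t1"}, {"id": "t2"}]
fenced_reply = FENCE + 'json\n{"recommended_task_id": "t2"}\n' + FENCE
chosen = find_task_by_id(tasks, extract_json(fenced_reply)["recommended_task_id"])
```

A `None` result or a `json.JSONDecodeError` can then be treated like any other AI failure and routed to the rule-based path.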

**Fallback Logic:**
```python
def _fallback_task_matching(
    self, tasks: List[Task], agent: WorkerStatus
) -> Optional[Task]:
    """Rule-based task matching when AI unavailable."""

    # Score tasks by priority and skill match
    priority_scores = {
        Priority.URGENT: 10,
        Priority.HIGH: 3,
        Priority.MEDIUM: 2,
        Priority.LOW: 1,
    }

    # Nothing to assign: max() would raise ValueError on an empty list
    if not tasks:
        return None

    # Find best scoring task
    best_task = max(tasks, key=lambda t: (
        priority_scores.get(t.priority, 0) +
        skill_match_score(t, agent)
    ))

    return best_task
```
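
To make the scoring concrete, here is a small worked example. The `Task` and `Priority` types are simplified stand-ins, and `skill_match_score` is a hypothetical implementation (one point per task label that matches an agent skill); the real helper may score differently.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Priority(Enum):
    URGENT = "urgent"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class Task:
    id: str
    priority: Priority
    labels: List[str] = field(default_factory=list)


# Same weights as the fallback logic above.
PRIORITY_SCORES = {
    Priority.URGENT: 10,
    Priority.HIGH: 3,
    Priority.MEDIUM: 2,
    Priority.LOW: 1,
}


def skill_match_score(task: Task, skills: List[str]) -> int:
    """Assumed scoring: one point per label that matches an agent skill."""
    wanted = {label.lower() for label in task.labels}
    return len(wanted & {skill.lower() for skill in skills})


def pick_task(tasks: List[Task], skills: List[str]) -> Optional[Task]:
    """Return the highest-scoring task, or None when there are no tasks."""
    if not tasks:
        return None
    return max(
        tasks,
        key=lambda t: PRIORITY_SCORES.get(t.priority, 0) + skill_match_score(t, skills),
    )


tasks = [
    Task("a", Priority.HIGH, ["python", "api"]),  # 3 + 2 matching labels = 5
    Task("b", Priority.MEDIUM, ["frontend"]),     # 2 + 0 = 2
    Task("c", Priority.URGENT, ["devops"]),       # 10 + 0 = 10
]
best = pick_task(tasks, ["Python", "API"])
```

Under these assumptions the urgent task wins even though the agent has no matching skills, because the weight of 10 for `URGENT` outweighs a two-label skill match on a `HIGH` task.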

### 3. Task Instructions (`generate_task_instructions`)

**AI Engine Method:**
```python
async def generate_task_instructions(
    self, task: Task, agent: WorkerStatus
) -> str:
    """Generate detailed instructions for a task."""

    if not self.client:
        return self._generate_fallback_instructions(task, agent)

    # AI-generated instructions tailored to task type
    prompt = self.prompts["task_instructions"].format(
        task=json.dumps(serialize(task)),
        agent=json.dumps(serialize(agent))
    )

    instructions = await self._call_claude(prompt)
    return instructions
```

**Fallback Logic:**
```python
def _generate_fallback_instructions(
    self, task: Task, agent: WorkerStatus
) -> str:
    """Template-based instructions when AI unavailable."""

    instructions = f"""Task: {task.name}

Description:
{task.description}

Steps:
1. Review the task requirements
2. Implement the necessary changes
3. Test your implementation
4. Submit for review

Acceptance Criteria:
{task.acceptance_criteria or 'Complete the task as described'}
"""
    return instructions
```

### 4. Blocker Analysis (`report_blocker`)

**AI Engine Method:**
```python
async def analyze_blocker(
    self,
    task_id: str,
    description: str,
    severity: str,
    agent: Optional["WorkerStatus"] = None,
    task: Optional["Task"] = None
) -> Dict[str, Any]:
    """Analyze blocker and suggest resolution steps."""

    if not self.client:
        return self._generate_fallback_blocker_analysis(...)

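    # AI analysis with agent and task context. Both arguments are optional,
    # so guard against None ("unknown" is a placeholder chosen here).
    agent_context = (
        f"Agent: {agent.name}, Skills: {agent.skills}" if agent else "Agent: unknown"
    )
    task_context = f"Task: {task.name}" if task else "Task: unknown"
    prompt = self.prompts["blocker_analysis"].format(
        task_id=task_id,
        description=description,
        severity=severity,
        agent_context=agent_context,
        task_context=task_context
    )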

    response = await self._call_claude(prompt)
    return json.loads(response)
```

**Fallback Logic:**
```python
def _generate_fallback_blocker_analysis(
    self, description: str, severity: str, ...
) -> Dict[str, Any]:
    """Rule-based blocker analysis."""

    return {
        "root_cause": f"Blocker: {description}",
        "impact_assessment": f"Severity: {severity}",
        "resolution_steps": [
            "Review the error message or issue",
            "Check documentation for similar issues",
            "Consider asking team for assistance"
        ],
        "estimated_hours": 2.0,
        "escalation_needed": severity == "high"
    }
```

### 5. Project Risk Analysis (`analyze_project_risks`)

**AI Engine Method:**
```python
async def analyze_project_risks(
    self,
    project_state: ProjectState,
    recent_blockers: List[BlockerReport],
    team_status: List[WorkerStatus]
) -> List[ProjectRisk]:
    """Analyze and identify project risks."""

    if not self.client:
        return self._generate_fallback_risk_analysis(project_state)

    # AI-powered risk identification
    prompt = self.prompts["project_risk"].format(
        project_state=project_state.value,
        recent_blockers=json.dumps([serialize(b) for b in recent_blockers[:5]]),
        team_status=json.dumps([serialize(a) for a in team_status])
    )

    response = await self._call_claude(prompt)
    risks_data = json.loads(response)

    # Convert to ProjectRisk objects
    return [
        ProjectRisk(
            risk_level=RiskLevel[r["impact"].upper()],
            description=r["description"],
            mitigation_strategy=r["mitigation"]
        )
        for r in risks_data["risks"]
    ]
```
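
The enum lookup `RiskLevel[r["impact"].upper()]` raises `KeyError` if the model returns an impact value that is not a member name. A more tolerant conversion might look like the sketch below. The `RiskLevel` members, the `parse_risks` helper, and the choice of `MEDIUM` as the default are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List


class RiskLevel(Enum):
    # Stand-in members; the real enum may differ.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass
class ProjectRisk:
    risk_level: RiskLevel
    description: str
    mitigation_strategy: str


def parse_risks(risks_data: Dict[str, Any]) -> List[ProjectRisk]:
    """Convert parsed JSON into ProjectRisk objects without raising on bad input."""
    risks = []
    for r in risks_data.get("risks", []):
        name = str(r.get("impact", "")).upper()
        # Unknown or missing impact values fall back to an assumed default.
        level = RiskLevel.__members__.get(name, RiskLevel.MEDIUM)
        risks.append(
            ProjectRisk(level, r.get("description", ""), r.get("mitigation", ""))
        )
    return risks


parsed = parse_risks(
    {
        "risks": [
            {"impact": "high", "description": "Key dependency unowned", "mitigation": "Assign owner"},
            {"impact": "severe", "description": "Unclear scope", "mitigation": "Clarify PRD"},
        ]
    }
)
```

Here the unrecognized value `"severe"` is mapped to the default level instead of aborting the whole analysis.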

## Core Features

### 1. AI-Enhanced Decision Making

**What makes it special**: When the Claude API is available, the engine applies semantic understanding of tasks, agents, and context to its decisions.

**Capabilities:**
- Semantic task-agent matching (not just keyword matching)
- Context-aware instruction generation
- Intelligent blocker root cause analysis
- Risk pattern identification across project history
- Recommendations tailored to agent skill level

**Example AI Analysis:**
```python
# AI considers:
# - Agent's Python skills match backend task requirements
# - Agent has bandwidth (40% capacity remaining)
# - Task is on critical path for milestone
# - Agent previously succeeded on similar tasks
# → Confidence: 0.85, Recommended: Yes
```

### 2. Graceful Fallback System

**What makes it special**: Marcus continues functioning even when AI is completely unavailable.

**Fallback Strategy:**
```python
if not self.client:  # AI unavailable
    return self._fallback_method(...)  # Use rules instead
```

**Fallback Methods:**
- `_fallback_task_matching()` - Priority + skill-based scoring
- `_generate_fallback_instructions()` - Template-based instructions
- `_generate_fallback_blocker_analysis()` - Rule-based analysis
- `_generate_fallback_risk_analysis()` - Metric-based assessment

**Fallback Quality:**
- Basic but functional - allows project to continue
- Deterministic and predictable
- No API costs
- Instant response (no network latency)

### 3. Prompt Template System

**What makes it special**: Maintainable, version-controlled prompts for different analysis types.

**Template Categories:**
```python
self.prompts = {
    "task_assignment": "...",      # Task-to-agent matching
    "task_instructions": "...",     # Detailed task guidance
    "blocker_analysis": "...",      # Problem resolution
    "project_risk": "...",          # Risk identification
}
# Note: project_health and feature_analysis build prompts inline — not stored in self.prompts
```

**Template Design:**
- Clear instructions to Claude
- Structured JSON output format
- Context-specific guidance
- Examples where helpful
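
As an illustration of these design points, here is a hypothetical template in the same `str.format` style. The wording and the `reasoning` field are invented for this example; the actual prompts are elided in this document. Because the prompts are filled with `.format()`, any literal JSON braces in the requested output shape have to be doubled, otherwise `.format()` would try to interpret them as placeholders.

```python
import json

# Illustrative only: not the engine's real task_assignment prompt.
TASK_ASSIGNMENT_TEMPLATE = """You are assigning work to an agent.

Agent: {agent}
Tasks: {tasks}

Respond with JSON only, in this shape:
{{"recommended_task_id": "<id>", "confidence": <0.0-1.0>, "reasoning": "<text>"}}
"""

prompt = TASK_ASSIGNMENT_TEMPLATE.format(
    agent=json.dumps({"name": "dev-1", "skills": ["python"]}),
    tasks=json.dumps([{"id": "t1", "name": "Add login endpoint"}]),
)
```

After formatting, the doubled braces become single braces, so the model sees a normal JSON example, while the braces inside the substituted values are left untouched.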

### 4. Token Usage Tracking

**What makes it special**: Automatic cost monitoring for all Claude API calls.

**Implementation:**
```python
async def _call_claude(self, prompt: str) -> str:
    """Call Claude API; token usage is recorded by middleware (error handling omitted here)."""

    response = self.client.messages.create(
        model=self.model,
        max_tokens=2000,
        temperature=0.7,
        messages=[{"role": "user", "content": prompt}]
    )

    # Token tracking happens automatically via middleware
    # See: ai_usage_middleware.wrap_ai_provider()

    return str(response.content[0].text)
```

**Tracking Benefits:**
- Cost per project/task/agent
- Usage trends over time
- Budget alerts
- Optimization opportunities
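
To show how per-project cost attribution could work, here is a minimal in-memory tally. It is not the `ai_usage_middleware` implementation; `Usage` and `UsageLedger` are hypothetical names. The token counts would come from the `usage.input_tokens` and `usage.output_tokens` fields that the Anthropic Messages API returns with each response.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, Optional


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total(self) -> int:
        return self.input_tokens + self.output_tokens


class UsageLedger:
    """Hypothetical tally of token usage keyed by project ID."""

    def __init__(self) -> None:
        self._by_project: DefaultDict[str, Usage] = defaultdict(Usage)

    def record(
        self, project_id: Optional[str], input_tokens: int, output_tokens: int
    ) -> None:
        # Calls made without a project context are grouped under one key.
        entry = self._by_project[project_id or "unassigned"]
        entry.input_tokens += input_tokens
        entry.output_tokens += output_tokens

    def total_for(self, project_id: str) -> int:
        return self._by_project[project_id].total


ledger = UsageLedger()
ledger.record("proj-1", input_tokens=1200, output_tokens=300)
ledger.record("proj-1", input_tokens=800, output_tokens=200)
ledger.record(None, input_tokens=10, output_tokens=5)
```

Keeping input and output tokens separate matters because they are usually priced differently; converting counts to cost is left out here since it depends on the model and current pricing.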

## Technical Implementation

### Initialization

```python
class AIAnalysisEngine:
    def __init__(self) -> None:
        """Initialize with Anthropic client."""

        # Load API key
        from src.config.marcus_config import get_config
        config = get_config()
        api_key = config.ai.anthropic_api_key or os.getenv("ANTHROPIC_API_KEY")

        # Initialize client (None if no key)
        if api_key:
            self.client = anthropic.Anthropic(api_key=api_key)
        else:
            print("⚠️  No API key - AI features will use fallback mode")
            self.client = None

        # Model selection (config uses attribute access, not .get())
        self.model = getattr(getattr(config, "ai", None), "model", None) or "claude-3-5-sonnet-20241022"
```

### Claude API Integration

```python
async def _call_claude(self, prompt: str) -> str:
    """
    Call Claude API with error handling.

    Parameters
    ----------
    prompt : str
        The prompt to send to Claude

    Returns
    -------
    str
        Claude's response text

    Raises
    ------
    Exception
        If API call fails after retries
    """
    try:
        response = self.client.messages.create(
            model=self.model,
            max_tokens=2000,
            temperature=0.7,
            messages=[{"role": "user", "content": prompt}]
        )
        return str(response.content[0].text)

    except Exception as e:
        print(f"Claude API error: {e}", file=sys.stderr)
        raise
```
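
Note that `_call_claude` is declared `async` but, as shown, calls the synchronous client, which would block the event loop for the duration of the request. One possible remedy is to run the call in a worker thread with `asyncio.to_thread` (Python 3.9+). The sketch below uses a stand-in `blocking_call` instead of the real SDK, and the function names are hypothetical. The Anthropic SDK also provides an `AsyncAnthropic` client, which may be the simpler option.

```python
import asyncio
import time
from typing import List


def blocking_call(prompt: str) -> str:
    """Stand-in for a synchronous SDK request."""
    time.sleep(0.05)  # simulate network latency
    return f"echo: {prompt}"


async def call_without_blocking(prompt: str) -> str:
    # Run the synchronous call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_call, prompt)


async def main() -> List[str]:
    # The two calls overlap instead of running back to back.
    return await asyncio.gather(
        call_without_blocking("a"),
        call_without_blocking("b"),
    )


results = asyncio.run(main())
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of which call finishes first.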

### Error Handling

**Approach**: When an AI call fails, fall back to the rule-based methods rather than crash.

```python
async def match_task_to_agent(self, tasks, agent, project_state):
    """AI task matching with fallback."""

    # Check if AI available
    if not self.client:
        return self._fallback_task_matching(tasks, agent)

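    try:
        # Try AI analysis (prompt arguments as shown under Task Assignment)
        prompt = self.prompts["task_assignment"].format(...)
        response = await self._call_claude(prompt)
        return parse_and_find_task(response)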

    except Exception as e:
        # AI failed - use fallback
        print(f"AI analysis failed: {e}", file=sys.stderr)
        return self._fallback_task_matching(tasks, agent)
```

### Context Serialization

**Challenge**: Convert Marcus domain objects to JSON for Claude.

**Solution**: Serialization helpers that extract relevant fields.

```python
def serialize(obj: Any) -> Dict[str, Any]:
    """Convert domain object to JSON-serializable dict."""

    if isinstance(obj, Task):
        return {
            "id": obj.id,
            "name": obj.name,
            "description": obj.description,
            # Note: Task has no 'type' or 'skills_required' fields
            "priority": obj.priority.value,
            "estimated_hours": obj.estimated_hours,
            "labels": obj.labels,
        }
    elif isinstance(obj, WorkerStatus):
        return {
            "name": obj.name,
            "skills": obj.skills,
            "capacity": obj.capacity,
            # Note: WorkerStatus has no 'current_load' or 'experience_level' fields
        }
    # ... other types
```

### Undocumented Methods

The following methods exist in `src/integrations/ai_analysis_engine.py` but are not covered in the sections above:

- `generate_clarification(task, questions)` — generates clarifying questions for ambiguous tasks
- `analyze_integration_points(tasks)` — identifies integration points between tasks
- `generate_structured_response(prompt, schema)` — calls Claude and parses the response against a given JSON schema
- `initialize()` — async initialization hook (sets up client and prompt templates after construction)

## Pros and Cons

### Advantages

1. **Simple Architecture**: Single class, easy to understand and maintain
2. **Direct Integration**: No abstraction layers, minimal overhead
3. **Graceful Degradation**: Works without AI (fallback mode)
4. **Cost Tracking**: Built-in token usage monitoring
5. **Fast Task Assignment**: Typically < 1 second with AI, instant with fallback
6. **Flexible Prompts**: Easy to update and version control
7. **No Complex Dependencies**: Just Anthropic SDK + Marcus core
8. **Battle-Tested**: Claude 3.5 Sonnet proven for this use case

### Limitations

1. **Single AI Provider**: Anthropic/Claude only (no OpenAI, local models, etc.)
2. **No Rule-Based Safety Layer**: AI decisions are used as returned (no validation layer checks them)
3. **No Learning/Pattern Storage**: Doesn't persist insights across sessions
4. **No Provider Abstraction**: Tightly coupled to Anthropic SDK
5. **Limited Context**: Max 10 tasks per analysis (context window limits)
6. **No Hybrid Confidence**: Simple AI-or-fallback, no weighted combination
7. **Basic Fallback**: Rule-based fallbacks are functional but not sophisticated
8. **No Streaming**: Waits for complete response (no real-time output)

### Trade-offs Made

**Chose Simplicity Over Sophistication:**
- Single provider vs multi-provider abstraction
- Direct API calls vs complex framework
- Simple fallbacks vs advanced rule engine
- Template prompts vs dynamic prompt generation

**Reasoning**: Get Marcus working with AI quickly, iterate based on real usage.

## Design Rationale

### Why This Approach Was Chosen

#### 1. Single Class Design

**Decision**: Implement as one cohesive `AIAnalysisEngine` class.

**Rationale**:
- Easier to understand and maintain
- Faster development iteration
- Sufficient for current needs
- Can refactor later if needed

**Trade-off**: Less modular than multi-class architecture, harder to extend certain features.

#### 2. Claude API Direct Integration

**Decision**: Use Anthropic SDK directly, no abstraction layer.

**Rationale**:
- Claude 3.5 Sonnet excellent for these tasks
- Anthropic SDK stable and well-documented
- No need for multi-provider support yet
- Faster to implement

**Trade-off**: Vendor lock-in, can't easily switch to OpenAI or local models.

#### 3. Graceful Fallback

**Decision**: Every AI method has a fallback version.

**Rationale**:
- Marcus must work even if API down or key missing
- Fallbacks provide basic functionality
- Users can use Marcus without AI budget
- Development/testing doesn't require API access

**Trade-off**: Maintaining two code paths for each feature.

#### 4. Template-Based Prompts

**Decision**: Store prompts as string templates in code.

**Rationale**:
- Easy to version control
- Quick to iterate and improve
- No external prompt management needed
- Reviewable in code reviews

**Trade-off**: Less flexible than dynamic prompt construction.

## Future Evolution

### Planned Enhancements

See `docs/source/systems/intelligence/07-ai-intelligence-engine-FUTURE.md` for the aspirational multi-component hybrid architecture.

**Near-term improvements (Current System):**
1. **Streaming Responses**: Real-time output for long analyses
2. **Better Fallbacks**: More sophisticated rule-based logic
3. **Prompt Versioning**: A/B test different prompt styles
4. **Context Expansion**: Handle more tasks per analysis

**Medium-term improvements:**
5. **Learning System**: Persist AI insights across sessions
6. **Multi-Model Support**: Use different models for different tasks
7. **Confidence Calibration**: Better confidence score accuracy
8. **Cost Optimization**: Smarter decisions about when to use AI

**Long-term vision** (see aspirational doc):
- Multi-provider LLM abstraction
- Hybrid rule-based + AI architecture
- Contextual learning system
- Advanced PRD parsing with dependency inference

## Task Complexity Handling

### Simple Tasks

**Approach**: Fast AI analysis or instant fallback.

```python
# For simple tasks:
# - AI response typically < 1 second
# - Fallback response instant
# - Simple prompt, concise analysis
```

**Example**: "Implement login button"
- AI quickly identifies: frontend task, junior-friendly, 2 hours
- Fallback: Check labels, match to frontend dev, assign

### Complex Tasks

**Approach**: More detailed AI analysis with richer context.

```python
# For complex tasks:
# - Include project history
# - Reference similar past tasks
# - Detailed prompt with examples
# - Longer max_tokens for thorough analysis
```

**Example**: "Design and implement payment system with Stripe"
- AI analyzes: security implications, integration complexity, testing needs
- Considers: agent experience level, needs senior dev, estimates 40 hours
- Fallback: Priority-based assignment, generic instructions

## Board-Specific Considerations

### Kanban Integration

**Data Sources:**
- Task metadata from Kanban cards
- Agent availability from board state
- Project progress from board metrics

**Adaptations:**
- Different task types → different analysis depth
- Board structure affects priority scoring
- Team size impacts assignment strategy

### Board Quality Impact

**High-Quality Boards** (clear descriptions, good labels):
- AI analysis more accurate
- Better task-agent matching
- Richer instructions possible

**Low-Quality Boards** (sparse descriptions, poor labels):
- AI does the best it can with limited data
- Fallback methods work the same regardless
- Recommendations to improve board quality

## Cato Integration

### Current Integration

**Marcus Role**: Provides AI analysis capabilities via MCP tools.

**Cato Role**: Consumes analysis results for visualization and UI.

**Data Exchange**: JSON-formatted analysis results via MCP protocol.

### API Patterns

```python
# Marcus MCP tool calls AI Engine
result = await ai_engine.analyze_blocker(...)

# Returns structured data
{
    "root_cause": "...",
    "resolution_steps": [...],
    "estimated_hours": 2.0
}

# Cato displays in UI
```

### Shared Context

- Project state
- Team information
- Historical data
- Analysis results

---

**This documentation represents the current state of the AI Intelligence Engine as implemented in `src/integrations/ai_analysis_engine.py`. For the aspirational multi-component hybrid architecture, see `07-ai-intelligence-engine-FUTURE.md`.**