src.core.assignment_lease module
Assignment Lease System for automatic task recovery.
This module implements a lease-based assignment system where tasks are assigned with time-limited leases that must be renewed through progress reports. Tasks with expired leases are automatically returned to the TODO state for reassignment.
Key features:
- Automatic lease renewal on progress reports
- Configurable lease durations based on task complexity
- Escalation for tasks with excessive renewals
- Integration with assignment persistence
- class src.core.assignment_lease.LeaseStatus
Bases: Enum
Status of an assignment lease.
- ACTIVE = 'active'
- EXPIRING_SOON = 'expiring_soon'
- EXPIRED = 'expired'
- RENEWED = 'renewed'
- class src.core.assignment_lease.AssignmentLease
Bases: object
Represents a time-limited assignment lease.
- property median_update_interval: float | None
Calculate median seconds between progress updates.
- Returns:
Median interval in seconds, or None if fewer than 2 timestamps.
- Return type:
Optional[float]
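The median-interval calculation described above can be sketched as follows. This is a minimal illustration, not the module's implementation: the free function and its argument are hypothetical stand-ins for the property and the lease's update_timestamps field, and sorting the timestamps first is an assumption.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


def median_update_interval(update_timestamps: List[datetime]) -> Optional[float]:
    """Median seconds between consecutive timestamps, or None if fewer than 2."""
    if len(update_timestamps) < 2:
        return None
    # Assumption: timestamps may arrive unordered, so sort before diffing.
    ordered = sorted(update_timestamps)
    gaps = [
        (later - earlier).total_seconds()
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return median(gaps)


start = datetime(2026, 1, 1, 12, 0, 0)
# Progress reports at 0s, 40s, 90s and 150s give gaps of 40s, 50s and 60s.
stamps = [start + timedelta(seconds=s) for s in (0, 40, 90, 150)]
print(median_update_interval(stamps))
print(median_update_interval([start]))
```

A median is less sensitive than a mean to a single long pause, which may be why it is used as the cadence baseline here.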
- property status: LeaseStatus
Get current lease status.
- calculate_renewal_duration(lease_manager=None)
Calculate renewal duration based on progress and history.
- Parameters:
lease_manager (Optional[AssignmentLeaseManager]) – Optional reference to lease manager for config.
- Return type:
- Returns:
Renewal duration (adaptive based on multiple factors)
- __init__(task_id, agent_id, assigned_at, lease_expires, last_renewed, renewal_count=0, estimated_hours=4.0, progress_percentage=0, last_progress_message='', grace_period_seconds=None, update_timestamps=<factory>, merge_conflict_extensions=0)
- class src.core.assignment_lease.AssignmentLeaseManager
Bases: object
Manages assignment leases with automatic expiration and renewal.
- __init__(kanban_client, assignment_persistence, default_lease_hours=0.0667, max_renewals=10, warning_threshold_hours=0.0167, priority_multipliers=None, complexity_multipliers=None, grace_period_minutes=1.0, renewal_decay_factor=0.9, min_lease_hours=0.05, max_lease_hours=0.1, stuck_task_threshold_renewals=5, enable_adaptive_leases=True, task_list=None, silence_multiplier=5.0)
Initialize the lease manager.
- Parameters:
kanban_client (KanbanInterface) – Interface to kanban board.
assignment_persistence (AssignmentPersistence) – Assignment persistence layer.
default_lease_hours (float) – Default lease duration in hours.
max_renewals (int) – Maximum allowed renewals before escalation.
warning_threshold_hours (float) – Hours before expiry to warn.
priority_multipliers (Optional[Dict[str, float]]) – Lease duration multipliers by priority.
complexity_multipliers (Optional[Dict[str, float]]) – Lease duration multipliers by label/type.
grace_period_minutes (float) – Grace period in minutes (float) after expiry before recovery.
renewal_decay_factor (float) – Factor to reduce renewal duration over time.
min_lease_hours (float) – Minimum allowed lease duration.
max_lease_hours (float) – Maximum allowed lease duration.
stuck_task_threshold_renewals (int) – Renewals before considering task stuck.
enable_adaptive_leases (bool) – Enable smart lease duration adjustments.
task_list (Optional[List[Task]]) – Optional reference to project tasks for recovery info updates.
silence_multiplier (float)
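The hour-based defaults in the signature are small fractions that are easier to read in seconds. The sketch below only converts the default values listed in the signature; the helper name hours_to_seconds and the dictionary are illustrative and not part of the module.

```python
def hours_to_seconds(hours: float) -> int:
    """Convert fractional hours to whole seconds (rounded)."""
    return round(hours * 3600)


# Default values copied from the __init__ signature.
defaults_hours = {
    "default_lease_hours": 0.0667,
    "warning_threshold_hours": 0.0167,
    "min_lease_hours": 0.05,
    "max_lease_hours": 0.1,
}

defaults_seconds = {
    name: hours_to_seconds(value) for name, value in defaults_hours.items()
}
# grace_period_minutes is given in minutes, so it converts separately.
grace_seconds = round(1.0 * 60)

for name, seconds in defaults_seconds.items():
    print(f"{name}: {seconds}s")
print(f"grace_period_minutes: {grace_seconds}s")
```

Read this way, the defaults correspond to a lease of about 4 minutes, clamped between 3 and 6 minutes, with a warning about 1 minute before expiry and a 1 minute grace period.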
- active_leases: Dict[str, AssignmentLease]
- property lease_lock: Lock
Get lease lock for the current event loop.
- update_task_list(task_list)
Update the task list reference.
Called by MarcusServer when project_tasks is refreshed.
- async create_lease(task_id, agent_id, task=None)
Create a new assignment lease.
- Parameters:
- Return type:
- Returns:
Created assignment lease
- async renew_lease(task_id, progress, message='')
Renew an existing lease based on progress report.
Uses progressive timeout strategy to adapt lease duration based on task progress and agent reliability.
- Parameters:
- Return type:
- Returns:
Renewed lease or None if not found/expired
- async touch_lease(agent_id)
Extend an agent’s lease without changing progress.
Called on any MCP tool activity to prove the agent is alive. This is a lightweight alternative to renew_lease that doesn’t require progress data or update cadence tracking.
- async check_expired_leases()
Check for expired leases that need recovery.
The check runs in two phases to avoid holding lease_lock during git subprocess I/O (Codex P2 on PR #350). Holding the global lock during per-lease git status calls would serialize every concurrent renew_lease and touch_lease for the duration of the slowest probe, and could cause active agents’ renewals to starve and look expired in the next cycle.
- Phase 1 (lock held, no I/O):
Snapshot leases that have crossed the grace deadline.
- Phase 2 (lock released, may do I/O):
For each candidate, try the merge-conflict extension. The extension helper does a git probe outside the lock and briefly re-acquires the lock only to mutate and persist the lease atomically.
Before returning a lease as expired, the merge-conflict extension may grant up to MAX_MERGE_CONFLICT_EXTENSIONS extensions of MERGE_CONFLICT_EXTENSION_SECONDS each when the agent’s worktree has unresolved git conflicts. See the constants at the top of this module for the rationale.
- Return type:
- Returns:
List of expired leases (considering grace period)
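The two-phase pattern can be sketched roughly as below. This is an illustration under stated assumptions, not the module's code: the Lease dataclass, has_conflicts probe, check_expired function, and the MAX_EXTENSIONS and EXTENSION_SECONDS values are all hypothetical stand-ins, and the real git probe is replaced by a short sleep.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List

MAX_EXTENSIONS = 2          # hypothetical value, for illustration only
EXTENSION_SECONDS = 120.0   # hypothetical value, for illustration only


@dataclass
class Lease:
    """Hypothetical stand-in for AssignmentLease."""
    task_id: str
    grace_deadline: float
    extensions: int = 0


async def has_conflicts(lease: Lease) -> bool:
    """Stand-in for the slow git probe; runs without the lock held."""
    await asyncio.sleep(0.01)
    return lease.task_id.startswith("conflict")


async def check_expired(
    leases: Dict[str, Lease], lock: asyncio.Lock, now: float
) -> List[Lease]:
    # Phase 1: hold the lock only long enough to snapshot candidates.
    async with lock:
        candidates = [ls for ls in leases.values() if now > ls.grace_deadline]

    expired: List[Lease] = []
    # Phase 2: do the slow probe outside the lock.
    for lease in candidates:
        if lease.extensions < MAX_EXTENSIONS and await has_conflicts(lease):
            # Briefly re-acquire the lock just to mutate the lease.
            async with lock:
                lease.extensions += 1
                lease.grace_deadline = now + EXTENSION_SECONDS
            continue
        expired.append(lease)
    return expired


async def demo():
    lock = asyncio.Lock()
    leases = {
        "task-a": Lease("task-a", grace_deadline=50.0),
        "conflict-1": Lease("conflict-1", grace_deadline=60.0),
        "task-b": Lease("task-b", grace_deadline=200.0),
    }
    expired = await check_expired(leases, lock, now=100.0)
    return [ls.task_id for ls in expired], leases["conflict-1"].extensions


result = asyncio.run(demo())
print(result)
```

Because the probe runs between the two lock acquisitions, concurrent renewals are blocked only for the snapshot and the short mutation, not for the duration of the probe.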
- async recover_expired_lease(lease)
Recover a task with an expired lease.
Implements dual-write pattern:
1. Updates task model with structured RecoveryInfo (source of truth)
2. Posts to Kanban comments for audit trail (observability)
- Parameters:
lease (AssignmentLease) – The expired lease to recover.
- Return type:
- Returns:
True if recovery successful
- async get_expiring_leases()
Get leases that are expiring soon.
- Return type:
- Returns:
List of leases expiring within warning threshold
- calculate_adaptive_timeout(progress, update_count, has_recent_activity)
Calculate adaptive timeout based on task state (progressive timeout).
- Parameters:
- Returns:
(lease_seconds, grace_seconds) timeout configuration
- Return type:
Notes
Progressive timeout phases (widened 2026-04-12 after experiment 66 evidence showed agents routinely go 2+ minutes between progress reports during implementation bursts — the previous 90-120s timeouts caused leases to expire mid-implementation, recovering in-progress tasks and reassigning them to other agents):
Phase 1 (Unproven): No updates yet → 180s + 60s = 240s total
Phase 2 (Working): First update → 240s + 60s = 300s total
Phase 3 (Proven): 25-75% progress → 300s + 60s = 360s total
Phase 4 (Finishing): >75% progress → 360s + 90s = 450s total
Phase 4 was widened 2026-04-25 (snake_game-v1 cascade). The original “near completion = faster recovery” intuition was backwards: tail-phase activities (test runs, builds, commits, push, conflict resolution) take LONGER between progress reports than the middle phase, not shorter. Empirical evidence showed 161-215s gaps during the final 25% routinely tripped the old 210s window, causing recovery on tasks that were actually completing successfully. Phase 4 is now the longest window, not the shortest. The 90s grace also covers silent validator LLM calls (60-120s each) that run after 100% is reported; touch_lease is called before each attempt.
These tolerances accommodate the observed 116-120s gap between progress reports during contract-first implementation work, plus a comfortable buffer for agents reading contract files and running tests locally without touching MCP tools.
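The four phases listed above can be read as a simple lookup. The sketch below is one possible reading, not the module's implementation: the function name adaptive_timeout is hypothetical, the exact boundary handling (25 and 75 inclusive in Phase 3, progress below 25 with at least one update treated as Phase 2) is an assumption, and has_recent_activity is omitted because the notes do not say how it affects the result.

```python
from typing import Tuple


def adaptive_timeout(progress: int, update_count: int) -> Tuple[int, int]:
    """Return (lease_seconds, grace_seconds) for the phases in the notes."""
    if update_count == 0:
        return 180, 60   # Phase 1 (Unproven): no updates yet
    if progress > 75:
        return 360, 90   # Phase 4 (Finishing): longest window
    if progress >= 25:
        return 300, 60   # Phase 3 (Proven)
    return 240, 60       # Phase 2 (Working): at least one update


for progress, updates in [(0, 0), (10, 1), (50, 3), (90, 6)]:
    lease_s, grace_s = adaptive_timeout(progress, updates)
    print(progress, updates, lease_s, grace_s, lease_s + grace_s)
```

The totals grow monotonically across phases (240s, 300s, 360s, 450s), which matches the note that Phase 4 is now the longest window.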
- async should_recover_expired_lease(lease)
Determine if expired lease should be recovered using cadence detection.
Compares time since last progress update against the agent’s own median update interval * silence_multiplier. If the agent has been silent for longer than expected based on its established cadence, it’s considered dead and the task should be recovered.
Defense-in-depth guard (Simon decision 011b3fad): if the task is already in a terminal state (DONE/BLOCKED) on the board, skip recovery regardless of cadence. The lease is stale bookkeeping at that point — recovering a finished task only causes a fresh agent to redo work that’s already complete (snake_game-v1 cascade). The lease will be cleared on the next monitor pass; we just don’t reassign.
- Parameters:
lease (AssignmentLease) – The expired lease to evaluate
- Returns:
True if task should be recovered, False to give more time or because the task is already terminal.
- Return type:
Notes
Real data from logs: median progress interval ~47s, mean ~60s. The constructor signature sets the default silence_multiplier to 5.0; the value is configurable via the constructor.
Fallback: if fewer than 2 progress updates exist (can’t compute median), always recover since the agent has no established cadence.
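The decision described above combines three checks: the terminal-state guard, the no-cadence fallback, and the silence threshold. A minimal sketch follows, assuming the inputs are already available as plain values; the function name should_recover, its parameters, and the TERMINAL_STATES set are hypothetical, and the multiplier of 2.0 in the example calls is purely illustrative.

```python
from typing import Optional

TERMINAL_STATES = {"DONE", "BLOCKED"}  # states named in the description


def should_recover(
    seconds_since_last_update: float,
    median_interval: Optional[float],
    silence_multiplier: float,
    board_state: str,
) -> bool:
    # Guard: never recover a task that is already terminal on the board.
    if board_state in TERMINAL_STATES:
        return False
    # Fallback: with no established cadence, always recover.
    if median_interval is None:
        return True
    # Cadence check: silent for longer than the agent's own rhythm allows.
    return seconds_since_last_update > median_interval * silence_multiplier


# With a 47s median and an illustrative multiplier of 2.0, the threshold is 94s.
print(should_recover(100.0, 47.0, 2.0, "IN_PROGRESS"))
print(should_recover(60.0, 47.0, 2.0, "IN_PROGRESS"))
print(should_recover(1000.0, 47.0, 2.0, "DONE"))
print(should_recover(10.0, None, 2.0, "IN_PROGRESS"))
```

Checking the terminal state first keeps the guard independent of cadence data, so a stale lease on a finished task is never reassigned regardless of how long the agent has been silent.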
- class src.core.assignment_lease.LeaseMonitor
Bases: object
Background monitor for lease expiration and recovery.
- __init__(lease_manager, check_interval_seconds=60)
Initialize the lease monitor.
- Parameters:
lease_manager (AssignmentLeaseManager) – The lease manager instance.
check_interval_seconds (int) – How often to check for expired leases.