src.core.assignment_lease module#

Assignment Lease System for automatic task recovery.

This module implements a lease-based assignment system where tasks are assigned with time-limited leases that must be renewed through progress reports. Tasks with expired leases are automatically returned to the TODO state for reassignment.

Key features: - Automatic lease renewal on progress reports - Configurable lease durations based on task complexity - Escalation for tasks with excessive renewals - Integration with assignment persistence

class src.core.assignment_lease.LeaseStatus[source]#

Bases: Enum

Status of an assignment lease.

ACTIVE = 'active'#
EXPIRING_SOON = 'expiring_soon'#
EXPIRED = 'expired'#
RENEWED = 'renewed'#
class src.core.assignment_lease.AssignmentLease[source]#

Bases: object

Represents a time-limited assignment lease.

task_id: str#
agent_id: str#
assigned_at: datetime#
lease_expires: datetime#
last_renewed: datetime#
renewal_count: int = 0#
estimated_hours: float = 4.0#
progress_percentage: int = 0#
last_progress_message: str = ''#
grace_period_seconds: float | None = None#
update_timestamps: list[datetime]#
merge_conflict_extensions: int = 0#
property median_update_interval: float | None#

Calculate median seconds between progress updates.

Returns:

Median interval in seconds, or None if fewer than 2 timestamps.

Return type:

Optional[float]

property time_remaining: timedelta#

Calculate time remaining on lease.

property is_expired: bool#

Check if lease has expired.

property is_expiring_soon: bool#

Check if lease expires within 1 hour.

property status: LeaseStatus#

Get current lease status.

calculate_renewal_duration(lease_manager=None)[source]#

Calculate renewal duration based on progress and history.

Parameters:

lease_manager (Optional[AssignmentLeaseManager]) – Optional reference to lease manager for config.

Return type:

timedelta

Returns:

Renewal duration (adaptive based on multiple factors)

__init__(task_id, agent_id, assigned_at, lease_expires, last_renewed, renewal_count=0, estimated_hours=4.0, progress_percentage=0, last_progress_message='', grace_period_seconds=None, update_timestamps=<factory>, merge_conflict_extensions=0)#
Parameters:
Return type:

None

class src.core.assignment_lease.AssignmentLeaseManager[source]#

Bases: object

Manages assignment leases with automatic expiration and renewal.

__init__(kanban_client, assignment_persistence, default_lease_hours=0.0667, max_renewals=10, warning_threshold_hours=0.0167, priority_multipliers=None, complexity_multipliers=None, grace_period_minutes=1.0, renewal_decay_factor=0.9, min_lease_hours=0.05, max_lease_hours=0.1, stuck_task_threshold_renewals=5, enable_adaptive_leases=True, task_list=None, silence_multiplier=5.0)[source]#

Initialize the lease manager.

Parameters:
  • kanban_client (KanbanInterface) – Interface to kanban board.

  • assignment_persistence (AssignmentPersistence) – Assignment persistence layer.

  • default_lease_hours (float) – Default lease duration in hours.

  • max_renewals (int) – Maximum allowed renewals before escalation.

  • warning_threshold_hours (float) – Hours before expiry to warn.

  • priority_multipliers (Optional[Dict[str, float]]) – Lease duration multipliers by priority.

  • complexity_multipliers (Optional[Dict[str, float]]) – Lease duration multipliers by label/type.

  • grace_period_minutes (float) – Grace period in minutes (float) after expiry before recovery.

  • renewal_decay_factor (float) – Factor to reduce renewal duration over time.

  • min_lease_hours (float) – Minimum allowed lease duration.

  • max_lease_hours (float) – Maximum allowed lease duration.

  • stuck_task_threshold_renewals (int) – Renewals before considering task stuck.

  • enable_adaptive_leases (bool) – Enable smart lease duration adjustments.

  • task_list (Optional[List[Task]]) – Optional reference to project tasks for recovery info updates.

  • silence_multiplier (float)

active_leases: Dict[str, AssignmentLease]#
recoveries_skipped_terminal_status: int#
on_recovery_callback: Callable[[str, str], None] | None#
lease_history: Deque[Dict[str, Any]]#
property lease_lock: Lock#

Get lease lock for the current event loop.

update_task_list(task_list)[source]#

Update the task list reference.

Called by MarcusServer when project_tasks is refreshed.

Parameters:

task_list (List[Task]) – Updated list of project tasks

Return type:

None

async create_lease(task_id, agent_id, task=None)[source]#

Create a new assignment lease.

Parameters:
  • task_id (str) – ID of the task being assigned.

  • agent_id (str) – ID of the agent receiving assignment.

  • task (Optional[Task]) – Optional task object for additional context.

Return type:

AssignmentLease

Returns:

Created assignment lease

async renew_lease(task_id, progress, message='')[source]#

Renew an existing lease based on progress report.

Uses progressive timeout strategy to adapt lease duration based on task progress and agent reliability.

Parameters:
  • task_id (str) – ID of the task.

  • progress (int) – Current progress percentage.

  • message (str) – Progress message.

Return type:

Optional[AssignmentLease]

Returns:

Renewed lease or None if not found/expired

async touch_lease(agent_id)[source]#

Extend an agent’s lease without changing progress.

Called on any MCP tool activity to prove the agent is alive. This is a lightweight alternative to renew_lease that doesn’t require progress data or update cadence tracking.

Parameters:

agent_id (str) – ID of the agent whose lease to extend.

Returns:

True if a lease was touched, False if no active lease found.

Return type:

bool

async check_expired_leases()[source]#

Check for expired leases that need recovery.

Two-phase to avoid holding lease_lock during git subprocess I/O (Codex P2 on PR #350). Holding the global lock during per-lease git status calls would serialize every concurrent renew_lease and touch_lease for the duration of the slowest probe, and could cause active agents’ renewals to starve and look expired in the next cycle.

Phase 1 (lock held, no I/O):

Snapshot leases that have crossed the grace deadline.

Phase 2 (lock released, may do I/O):

For each candidate, try the merge-conflict extension. The extension helper does a git probe outside the lock and briefly re-acquires the lock only to mutate + persist the lease atomically.

Before returning a lease as expired, the merge-conflict extension may grant up to MAX_MERGE_CONFLICT_EXTENSIONS extensions of MERGE_CONFLICT_EXTENSION_SECONDS each when the agent’s worktree has unresolved git conflicts. See the constants at the top of this module for the rationale.

Return type:

List[AssignmentLease]

Returns:

List of expired leases (considering grace period)

async recover_expired_lease(lease)[source]#

Recover a task with an expired lease.

Implements dual-write pattern: 1. Updates task model with structured RecoveryInfo (source of truth) 2. Posts to Kanban comments for audit trail (observability)

Parameters:

lease (AssignmentLease) – The expired lease to recover.

Return type:

bool

Returns:

True if recovery successful

async get_expiring_leases()[source]#

Get leases that are expiring soon.

Return type:

List[AssignmentLease]

Returns:

List of leases expiring within warning threshold

async load_active_leases()[source]#

Load active leases from persistence on startup.

Return type:

None

get_lease_statistics()[source]#

Get statistics about current leases.

Return type:

Dict[str, Any]

calculate_adaptive_timeout(progress, update_count, has_recent_activity)[source]#

Calculate adaptive timeout based on task state (progressive timeout).

Parameters:
  • progress (int) – Current progress percentage (0-100)

  • update_count (int) – Number of progress updates received

  • has_recent_activity (bool) – Whether task shows recent activity

Returns:

(lease_seconds, grace_seconds) timeout configuration

Return type:

tuple[int, int]

Notes

Progressive timeout phases (widened 2026-04-12 after experiment 66 evidence showed agents routinely go 2+ minutes between progress reports during implementation bursts — the previous 90-120s timeouts caused leases to expire mid-implementation, recovering in-progress tasks and reassigning them to other agents):

  • Phase 1 (Unproven): No updates yet → 180s + 60s = 240s total

  • Phase 2 (Working): First update → 240s + 60s = 300s total

  • Phase 3 (Proven): 25-75% progress → 300s + 60s = 360s total

  • Phase 4 (Finishing): >75% progress → 360s + 90s = 450s total

Phase 4 was widened 2026-04-25 (snake_game-v1 cascade). The original “near completion = faster recovery” intuition was backwards: tail-phase activities (test runs, builds, commits, push, conflict resolution) take LONGER between progress reports than the middle phase, not shorter. Empirical evidence showed 161-215s gaps during the final 25% routinely tripped the old 210s window, causing recovery on tasks that were actually completing successfully. Phase 4 is now the longest window, not the shortest. The 90s grace also covers silent validator LLM calls (60-120s each) that run after 100% is reported; touch_lease is called before each attempt.

These tolerances accommodate the observed 116-120s gap between progress reports during contract-first implementation work, plus a comfortable buffer for agents reading contract files and running tests locally without touching MCP tools.

async should_recover_expired_lease(lease)[source]#

Determine if expired lease should be recovered using cadence detection.

Compares time since last progress update against the agent’s own median update interval * silence_multiplier. If the agent has been silent for longer than expected based on its established cadence, it’s considered dead and the task should be recovered.

Defense-in-depth guard (Simon decision 011b3fad): if the task is already in a terminal state (DONE/BLOCKED) on the board, skip recovery regardless of cadence. The lease is stale bookkeeping at that point — recovering a finished task only causes a fresh agent to redo work that’s already complete (snake_game-v1 cascade). The lease will be cleared on the next monitor pass; we just don’t reassign.

Parameters:

lease (AssignmentLease) – The expired lease to evaluate

Returns:

True if task should be recovered, False to give more time or because the task is already terminal.

Return type:

bool

Notes

Real data from logs: median progress interval ~47s, mean ~60s. Default silence_multiplier is 1.5x — configurable via constructor.

Fallback: if fewer than 2 progress updates exist (can’t compute median), always recover since the agent has no established cadence.

class src.core.assignment_lease.LeaseMonitor[source]#

Bases: object

Background monitor for lease expiration and recovery.

__init__(lease_manager, check_interval_seconds=60)[source]#

Initialize the lease monitor.

Parameters:
  • lease_manager (AssignmentLeaseManager) – The lease manager instance.

  • check_interval_seconds (int) – How often to check for expired leases.

async start()[source]#

Start monitoring for expired leases.

Return type:

None

async stop()[source]#

Stop the lease monitor.

Return type:

None