Service Registry System#
Overview#
The Marcus Service Registry is a lightweight, filesystem-based service discovery mechanism that enables multiple clients (like Cato, Claude Desktop, and other integrations) to automatically discover and connect to running Marcus instances without manual configuration.
What the System Does#
The Service Registry provides a decentralized approach to service discovery by:
Service Advertisement: Each Marcus instance registers itself in a discoverable location when it starts
Automatic Discovery: Clients can find available Marcus instances without knowing connection details beforehand
Health Monitoring: Tracks service health and automatically cleans up stale registrations
Connection Information: Provides MCP command strings and metadata for establishing connections
Multi-Instance Support: Handles multiple concurrent Marcus instances across different projects
Architecture#
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Marcus        │    │   Service        │    │   Client        │
│   Instance 1    │───▶│   Registry       │◀───│   (Cato)        │
│                 │    │   (~/.marcus/    │    │                 │
└─────────────────┘    │    services/)    │    └─────────────────┘
                       │                  │
┌─────────────────┐    │ ┌─────────────┐  │    ┌─────────────────┐
│   Marcus        │───▶│ │ marcus_1234 │  │    │   Client        │
│   Instance 2    │    │ │ .json       │  │    │   (Claude       │
│                 │    │ └─────────────┘  │    │    Desktop)     │
└─────────────────┘    │                  │    │                 │
                       │ ┌─────────────┐  │    └─────────────────┘
┌─────────────────┐    │ │ marcus_5678 │  │
│   Marcus        │───▶│ │ .json       │  │
│   Instance 3    │    │ └─────────────┘  │
│                 │    │                  │
└─────────────────┘    └──────────────────┘
Core Components#
MarcusServiceRegistry Class: Main registry management
Service Files: JSON files in the ~/.marcus/services/ directory
Global Registry Instance: Singleton pattern for process-wide access
Convenience Functions: Simplified API for common operations
Marcus Ecosystem Integration#
The Service Registry serves as the discovery backbone for the Marcus ecosystem:
Marcus Server: Registers itself on startup, updates heartbeat periodically
Cato: Discovers available Marcus instances for GUI connections
Claude Desktop: Can auto-connect to running Marcus without manual MCP configuration
CLI Tools: Development tools can find running instances for debugging
Monitoring Systems: External monitoring can discover and health-check Marcus instances
Workflow Integration#
In the typical Marcus workflow, the Service Registry operates in parallel with the main task flow:
create_project → register_agent → request_next_task → report_progress → report_blocker → finish_task
      ↓                ↓                  ↓                  ↓                ↓               ↓
[Service Registration] ────────── [Heartbeat Updates] ──────────────── [Cleanup]
When It's Invoked#
Startup Registration: When the Marcus MCP server starts (src/marcus_mcp/server.py)
Heartbeat Updates: Periodic updates during operation (optional)
Shutdown Cleanup: Automatic cleanup via an atexit handler
Client Discovery: When external clients need to find Marcus instances
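The shutdown cleanup step can be sketched with the standard library's atexit module. This is a minimal illustration rather than Marcus's actual handler; `install_cleanup` is a hypothetical helper name, and the caller supplies the path to the service file.

```python
import atexit
from pathlib import Path
from typing import Callable


def install_cleanup(registry_file: Path) -> Callable[[], None]:
    """Register an exit hook that removes this instance's service file.

    Hypothetical helper for illustration; not part of the Marcus API.
    """

    def _cleanup() -> None:
        # missing_ok avoids an error if discovery already removed the file.
        registry_file.unlink(missing_ok=True)

    atexit.register(_cleanup)
    return _cleanup
```

Handlers registered with atexit do not run when the process is killed by an unhandled signal or exits through os._exit(), which is why discovery also has to prune files left behind by dead processes.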
What Makes This System Special#
1. Zero-Configuration Discovery#
Unlike traditional service discovery, which requires a separate registry server or hand-maintained configuration files, this system works automatically:
No external services to run (Redis, Consul, etc.)
No network configuration required
Works across terminal sessions and environments that share the same registry directory
2. Process-Aware Cleanup#
# Automatic stale service detection
if cls._is_process_running(service_info.get("pid")):
    services.append(service_info)
else:
    # Clean up stale service file
    service_file.unlink()
3. Cross-Platform Compatibility#
def _get_registry_dir(self) -> Path:
    # Windows: prefer %APPDATA%, falling back to the temp directory
    if platform.system() == "Windows":
        base_dir = Path(os.environ.get("APPDATA", tempfile.gettempdir()))
    else:
        base_dir = Path.home()
    registry_dir = base_dir / ".marcus" / "services"
    return registry_dir
4. Rich Service Metadata#
Each service registration includes:
Connection details (MCP command)
Project context (current project, provider)
Runtime information (PID, working directory, Python version)
Lifecycle timestamps (started_at, last_heartbeat)
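To make the shape of a registration record concrete, the following sketch describes it as a TypedDict and builds a sample record. The field names follow the registration code in this document; the `ServiceInfo` type, the `example_record` helper, and the sample values (command, project, provider) are illustrative assumptions, not part of Marcus.

```python
import json
import os
import platform
from datetime import datetime
from pathlib import Path
from typing import Optional, TypedDict


class ServiceInfo(TypedDict):
    """Hypothetical type describing one service file's contents."""

    instance_id: str
    pid: int
    mcp_command: str
    log_dir: str
    project_name: Optional[str]
    provider: Optional[str]
    status: str
    started_at: str
    last_heartbeat: str
    platform: str
    python_version: str
    working_directory: str


def example_record() -> ServiceInfo:
    """Build a sample record; command and project values are placeholders."""
    now = datetime.now().isoformat()
    return ServiceInfo(
        instance_id=f"marcus_{os.getpid()}",
        pid=os.getpid(),
        mcp_command="python -m example_server",  # placeholder, not a real command
        log_dir=str(Path("logs").absolute()),
        project_name="my_project",
        provider="github",
        status="running",
        started_at=now,
        last_heartbeat=now,
        platform=platform.system(),
        python_version=platform.python_version(),
        working_directory=str(Path.cwd()),
    )


# Print the record the way it would appear in a service file.
print(json.dumps(example_record(), indent=2))
```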
Technical Implementation Details#
Registration Process#
def register_service(self, mcp_command: str, log_dir: str, project_name: str = None,
                     provider: str = None, **kwargs) -> Dict[str, Any]:
    service_info = {
        "instance_id": self.instance_id,
        "pid": os.getpid(),
        "mcp_command": mcp_command,  # Key for client connections
        "log_dir": str(Path(log_dir).absolute()),
        "project_name": project_name,
        "provider": provider,
        "status": "running",
        "started_at": datetime.now().isoformat(),
        "last_heartbeat": datetime.now().isoformat(),
        "platform": platform.system(),
        "python_version": platform.python_version(),
        "working_directory": str(Path.cwd()),
        **kwargs,
    }
    # Write the registration file. A plain open()/json.dump() is not atomic,
    # so a concurrent reader may briefly see a partially written file.
    with open(self.registry_file, "w") as f:
        json.dump(service_info, f, indent=2)
    return service_info
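Opening the service file with mode "w" truncates it before the new JSON is written, so a client scanning the directory at that moment could read an incomplete file. One common way to avoid this is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that pattern; `write_json_atomically` is a hypothetical helper, not a function from the Marcus codebase.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def write_json_atomically(target: Path, payload: Dict[str, Any]) -> None:
    """Write payload to target so readers see the old or the new file, never a partial one."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so the rename stays on one filesystem.
    # Its ".tmp_" prefix does not match the "marcus_*.json" discovery glob.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=".tmp_", suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(payload, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        # os.replace overwrites the destination in a single rename step.
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

With this approach, discovery would not mistake an in-progress registration for a corrupted file and delete it.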
Discovery Algorithm#
@classmethod
def discover_services(cls) -> List[Dict[str, Any]]:
    registry = cls()  # assumed: an instance is created here to resolve registry_dir
    services = []
    # Scan all service files
    for service_file in registry.registry_dir.glob("marcus_*.json"):
        try:
            with open(service_file, "r") as f:
                service_info = json.load(f)
            # Health check via process existence
            if cls._is_process_running(service_info.get("pid")):
                services.append(service_info)
            else:
                try:
                    service_file.unlink()  # Cleanup stale entries
                except (OSError, PermissionError):
                    pass
        except (json.JSONDecodeError, FileNotFoundError):
            try:
                service_file.unlink()  # Cleanup corrupted files
            except (OSError, PermissionError):
                pass
    return sorted(services, key=lambda x: x.get("started_at", ""))
Instance Identification#
def __init__(self, instance_id: str = None):
    # The PID is unique among concurrently running processes
    # (the OS may reuse a PID after the process exits)
    self.instance_id = instance_id or f"marcus_{os.getpid()}"
    self.registry_file = self.registry_dir / f"{self.instance_id}.json"
Pros and Cons#
Advantages#
Simplicity: No external dependencies or complex setup
Reliability: Relies only on local file operations, although the plain file write used for registration is not strictly atomic
Performance: Fast discovery via filesystem globbing
Debugging: Human-readable JSON files for troubleshooting
Security: Uses user's home directory with standard file permissions
Multi-Platform: Works consistently across Windows, macOS, Linux
Disadvantages#
Local Only: Cannot discover services across network boundaries
File System Dependency: Requires writable filesystem access
Cleanup Timing: Stale entries persist until next discovery operation
Concurrency: No locking mechanism for concurrent registration/discovery
Scale Limitations: Not designed for high-frequency operations or many services
Why This Approach Was Chosen#
Design Rationale#
Developer Experience: Eliminates manual MCP server configuration for common use cases
Zero Dependencies: Avoids external service registry dependencies that would complicate deployment
Debugging Friendly: Service files can be inspected directly for troubleshooting
Graceful Degradation: System continues working even if some service files are corrupted
Alternative Approaches Considered#
Network-based discovery (mDNS/Bonjour): Too complex for local development use case
Database registry: Overkill and would require database setup
Configuration files: Would require manual management and updates
Environment variables: Not dynamic enough for multiple instances
Evolution and Future Directions#
Planned Enhancements#
Network Discovery: Support for remote Marcus instances via optional network protocols
Service Metadata: Enhanced metadata for capability-based discovery
Health Monitoring: More sophisticated health checks beyond process existence
Load Balancing: Client-side load balancing for multiple available instances
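As one example of a health check that goes beyond process existence, a client could also require a recent heartbeat. The sketch below assumes that last_heartbeat holds a naive ISO 8601 timestamp, as produced by datetime.now().isoformat() in the registration code. The `is_heartbeat_fresh` helper and the 60-second threshold are illustrative assumptions.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Optional


def is_heartbeat_fresh(
    service_info: Dict[str, Any],
    max_age: timedelta = timedelta(seconds=60),  # assumed threshold
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the record's last_heartbeat is within max_age of now."""
    raw = service_info.get("last_heartbeat")
    if not raw:
        return False
    current = now or datetime.now()
    try:
        last = datetime.fromisoformat(raw)
        return current - last <= max_age
    except (TypeError, ValueError):
        # Malformed timestamp, or a timezone-aware value mixed with a naive one.
        return False
```

A check like this only helps if instances actually send heartbeats, which this document describes as optional.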
Potential Improvements#
# Future: Enhanced service metadata
service_info = {
    # Current fields...
    "capabilities": ["project_management", "ai_analysis", "kanban_integration"],
    "load_metrics": {"active_agents": 3, "cpu_usage": 15.2, "memory_mb": 128},
    "api_version": "2.1.0",
    "supported_providers": ["github", "jira", "trello"],
}

# Future: Service selection by capability
def find_service_with_capability(capability: str) -> Optional[Dict[str, Any]]:
    services = discover_services()
    return next((s for s in services if capability in s.get("capabilities", [])), None)
Integration with Service Mesh#
As Marcus scales, the Service Registry could evolve to integrate with service mesh technologies:
Service discovery integration with Consul, etcd
Health check endpoints for external monitoring
Metrics exposure for observability platforms
Circuit breaker integration for resilience
Task Complexity Handling#
The Service Registry operates independently of task complexity:
Simple Tasks#
Registration occurs once at startup regardless of task complexity
Same discovery mechanism for all clients
No task-specific metadata in service registration
Complex Tasks#
Service registration includes project context that may be relevant for complex, multi-project scenarios
Heartbeat updates could include progress information for long-running operations
Multiple Marcus instances can handle different complexity levels simultaneously
Board-Specific Considerations#
Provider Integration#
# Service registration includes provider information.
# Note: register_marcus_service() is a module-level convenience wrapper with
# signature register_marcus_service(**kwargs: Any) -> Dict[str, Any].
# The named parameters (mcp_command, log_dir, project_name, provider) are
# forwarded to MarcusServiceRegistry.register_service(), not accepted by
# the wrapper's own signature.
register_marcus_service(
    mcp_command=command,
    log_dir=log_directory,
    project_name="my_project",
    provider="github",  # or "jira", "trello", etc.
)
Multi-Board Support#
Each Marcus instance can register with different provider information
Clients can discover instances by provider type
Supports scenarios where different boards require different Marcus configurations
Board-Aware Discovery#
# Future: Board-specific service discovery
def discover_services_by_provider(provider: str) -> List[Dict[str, Any]]:
    all_services = discover_services()
    return [s for s in all_services if s.get("provider") == provider]
Cato Integration#
The Service Registry is crucial for Cato's auto-connection capability:
Discovery Flow#
Cato calls MarcusServiceRegistry.discover_services()
Gets list of available Marcus instances with connection details
Uses mcp_command from service info to establish MCP connection
Can present user with choice of multiple available instances
Connection Establishment#
import shlex

# Cato discovers Marcus instances
services = MarcusServiceRegistry.discover_services()
preferred = MarcusServiceRegistry.get_preferred_service()
if preferred:
    mcp_command = preferred["mcp_command"]
    # Use mcp_command to establish the connection. shlex.split() keeps quoted
    # arguments (for example, paths containing spaces) together.
    client = MCPClient(command=shlex.split(mcp_command))
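How the command string is split into arguments matters once it contains quoted values. The short example below compares whitespace splitting with shlex.split() from the standard library. The command string is a made-up placeholder, not a real Marcus invocation.

```python
import shlex

# Placeholder command with a quoted path that contains a space.
command = 'python -m example_server --log-dir "/tmp/my logs"'

# Whitespace splitting breaks the quoted path into two arguments.
naive = command.split()

# shlex.split() applies shell-style quoting rules and keeps the path whole.
safe = shlex.split(command)

print(naive)
print(safe)
```

shlex.split() follows POSIX shell rules by default, so command strings containing Windows paths with backslashes would need separate handling.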
GUI Integration#
Service metadata provides rich information for Cato's GUI
Project names, providers, and status information for user selection
Log directory paths for integrated log viewing
Monitoring and Observability#
Health Monitoring#
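# Requires: import psutil
# Defined on MarcusServiceRegistry; discovery calls it as cls._is_process_running(pid)
@staticmethod
def _is_process_running(pid: int) -> bool:
    """Check whether a process with this PID exists."""
    if not pid:
        return False
    try:
        return psutil.pid_exists(pid)
    except Exception:
        return False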
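For readers who want to see what a PID existence check involves without the psutil dependency, the sketch below approximates it with the standard library on POSIX systems. `pid_exists_posix` is a hypothetical name. It should not be used on Windows, where os.kill() has different semantics and can terminate the target process.

```python
import os


def pid_exists_posix(pid: int) -> bool:
    """POSIX-only sketch: report whether a process with this PID exists."""
    if not isinstance(pid, int) or pid <= 0:
        return False
    try:
        # Signal 0 performs the existence and permission checks without
        # delivering a signal to the process.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

Like the psutil check, this only confirms that some process has the PID. It cannot tell whether that process is still the Marcus instance that wrote the service file.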
Service Lifecycle Tracking#
started_at: Service startup timestamp
last_heartbeat: Most recent activity indicator
status: Current service state
Automatic cleanup of dead services
Debugging Support#
Human-readable JSON service files
Rich metadata for troubleshooting connection issues
Log directory references for detailed investigation
Error Handling and Resilience#
Graceful Degradation#
Continues operation if some service files are corrupted
Automatic cleanup of invalid registrations
No cascading failures from registry issues
Recovery Mechanisms#
try:
    with open(service_file, "r") as f:
        service_info = json.load(f)
except (json.JSONDecodeError, FileNotFoundError):
    # Clean up invalid service files
    try:
        service_file.unlink()
    except (OSError, PermissionError):
        pass  # Fail silently for cleanup operations
Security Considerations#
File System Security#
Uses user's home directory with standard file permissions
No network exposure reduces attack surface
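JSON is parsed as data rather than executed, although clients do run the mcp_command it contains, so write access to the registry directory should stay restricted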
Process Isolation#
PID-based health checking confirms that a process with the recorded PID exists; it does not verify that the process is Marcus or who owns it
Each service registration is isolated to its own file
No shared state between different Marcus instances
Performance Characteristics#
Discovery Performance#
O(n) file system scan where n = number of registered services
Typically very fast for expected number of services (< 10)
Caching possible at client level for high-frequency discovery
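Client-level caching can be as simple as remembering the last discovery result for a short time. The sketch below wraps any discovery callable in a small time-based cache. `CachedDiscovery` and the 2-second default are illustrative assumptions, not part of Marcus.

```python
import time
from typing import Any, Callable, Dict, List, Optional

Services = List[Dict[str, Any]]


class CachedDiscovery:
    """Hypothetical wrapper that reuses a discovery result for ttl_seconds."""

    def __init__(
        self,
        discover: Callable[[], Services],
        ttl_seconds: float = 2.0,  # assumed default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._discover = discover
        self._ttl = ttl_seconds
        self._clock = clock
        self._cached: Optional[Services] = None
        self._fetched_at = 0.0

    def get(self) -> Services:
        """Return the cached result, rescanning only after the TTL expires."""
        now = self._clock()
        if self._cached is None or now - self._fetched_at >= self._ttl:
            self._cached = self._discover()
            self._fetched_at = now
        return self._cached
```

A client would construct it once, for example CachedDiscovery(MarcusServiceRegistry.discover_services), and call get() wherever it previously scanned the directory. The trade-off is that a newly started or stopped instance may go unnoticed for up to one TTL.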
Registration Performance#
Single file write operation (a plain write, not strictly atomic)
No network round-trips required
Minimal overhead during Marcus startup
Resource Usage#
Minimal memory footprint (small JSON files)
No persistent connections or background processes
Automatic cleanup of stale service files prevents them from accumulating
Integration Testing#
The Service Registry system should be tested with:
Multi-instance scenarios: Multiple Marcus instances registering simultaneously
Crash recovery: Service cleanup after ungraceful shutdown
Client discovery: Various clients finding and connecting to services
Cross-platform: Registry behavior on Windows, macOS, Linux
File corruption: Recovery from corrupted service files
Permission issues: Handling of read-only filesystems or permission errors
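The crash recovery and file corruption scenarios can be exercised without starting real processes. The sketch below uses a simplified stand-in for discover_services() that takes the registry directory and the liveness check as parameters. Both `discover_in` and `run_recovery_scenario` are hypothetical test helpers, and the PIDs are arbitrary test values.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, List


def discover_in(registry_dir: Path, is_alive: Callable[[Any], bool]) -> List[Dict[str, Any]]:
    """Simplified stand-in for discover_services(), intended for tests only."""
    services = []
    for service_file in registry_dir.glob("marcus_*.json"):
        try:
            info = json.loads(service_file.read_text())
        except (json.JSONDecodeError, FileNotFoundError):
            service_file.unlink(missing_ok=True)  # corrupted file
            continue
        if is_alive(info.get("pid")):
            services.append(info)
        else:
            service_file.unlink(missing_ok=True)  # stale registration
    return sorted(services, key=lambda s: s.get("started_at", ""))


def run_recovery_scenario() -> List[str]:
    """Create live, stale, and corrupted files, then return what survives discovery."""
    with tempfile.TemporaryDirectory() as tmp:
        registry_dir = Path(tmp)
        live = {"instance_id": "marcus_1", "pid": 1, "started_at": "2024-01-01T00:00:00"}
        stale = {"instance_id": "marcus_2", "pid": 2, "started_at": "2024-01-01T00:01:00"}
        (registry_dir / "marcus_1.json").write_text(json.dumps(live))
        (registry_dir / "marcus_2.json").write_text(json.dumps(stale))
        (registry_dir / "marcus_3.json").write_text("{not json")

        # Pretend only PID 1 is still running.
        found = discover_in(registry_dir, is_alive=lambda pid: pid == 1)
        remaining = sorted(p.name for p in registry_dir.iterdir())

        assert [s["instance_id"] for s in found] == ["marcus_1"]
        assert remaining == ["marcus_1.json"]
        return remaining
```

Injecting the liveness check keeps the test deterministic. A test against the real registry would instead need to start and kill actual processes.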
This service registry system provides the foundational infrastructure that makes Marcus's multi-client ecosystem possible, enabling seamless service discovery while maintaining simplicity and reliability.