src.ai.providers.local_provider module#

Local LLM Provider for Marcus AI.

Implements support for local models via Ollama or other OpenAI-compatible servers. This provider enables running Marcus with complete local AI inference, removing dependency on external API services.

Classes#

LocalLLMProvider

Local model provider supporting Ollama and OpenAI-compatible endpoints

Notes

Requires a local LLM server running (e.g., Ollama, llama.cpp server, etc.) Model selection via MARCUS_LOCAL_LLM_PATH environment variable. Base URL configurable via MARCUS_LOCAL_LLM_URL (defaults to Ollama).

Examples

>>> # With Ollama running locally
>>> os.environ['MARCUS_LOCAL_LLM_PATH'] = 'codellama:13b'
>>> os.environ['MARCUS_LLM_PROVIDER'] = 'local'
>>> provider = LocalLLMProvider('codellama:13b')
class src.ai.providers.local_provider.LocalLLMProvider[source]#

Bases: BaseLLMProvider

Local LLM provider for semantic AI analysis.

Supports Ollama and other OpenAI-compatible local inference servers. Optimized for coding and reasoning tasks with models like CodeLlama, DeepSeek-Coder, or Mixtral.

Parameters:

model_name (str) – Name of the model to use (e.g., ‘codellama:13b’, ‘deepseek-coder:6.7b’)

base_url#

Local LLM server URL (default: http://localhost:11434/v1 for Ollama)

Type:

str

model#

Model identifier for the local server

Type:

str

max_tokens#

Maximum tokens for responses

Type:

int

timeout#

API request timeout in seconds

Type:

float

client#

Async HTTP client for API calls

Type:

httpx.AsyncClient

Examples

>>> provider = LocalLLMProvider('codellama:13b')
>>> analysis = await provider.analyze_task(task, context)
__init__(model_name)[source]#

Initialize local LLM provider.

Parameters:

model_name (str) – Model to use (e.g., ‘codellama:13b’)

Return type:

None

async analyze_task(task, context)[source]#

Analyze task semantics using local LLM.

Parameters:
  • task (Task) – Task to analyze

  • context (Dict[str, Any]) – Project context including related tasks

Returns:

Comprehensive semantic analysis of the task

Return type:

SemanticAnalysis

async infer_dependencies(tasks)[source]#

Infer semantic dependencies between tasks.

Parameters:

tasks (List[Task]) – All tasks to analyze for dependencies

Returns:

Inferred dependencies with confidence scores

Return type:

List[SemanticDependency]

async generate_enhanced_description(task, context)[source]#

Generate enhanced task description.

Parameters:
  • task (Task) – Task needing better description

  • context (Dict[str, Any]) – Project context

Returns:

Enhanced, detailed task description

Return type:

str

async estimate_effort(task, context)[source]#

Estimate task effort using local LLM.

Parameters:
  • task (Task) – Task to estimate

  • context (Dict[str, Any]) – Project context with team velocity

Returns:

Hours estimate with confidence and factors

Return type:

EffortEstimate

async analyze_blocker(task, blocker, context)[source]#

Analyze blocker and suggest solutions.

Parameters:
  • task (Task) – Blocked task

  • blocker (str) – Description of the blocker

  • context (Dict[str, Any]) – Additional context including severity

Returns:

Prioritized solution suggestions

Return type:

List[str]

async complete(prompt, max_tokens=None, temperature=None)[source]#

Complete text using local LLM for direct access.

Parameters:
  • prompt (str) – The prompt to complete

  • max_tokens (Optional[int]) – Maximum tokens to generate. None (default) uses self.max_tokens which is sourced from config.ai.max_tokens at provider construction. Pass an explicit value only when a single call needs a tighter or looser budget than the project default.

  • temperature (float | None) – Sampling temperature (0.0-1.0). If None, uses config value.

Returns:

The completion text

Return type:

str