Configuration

Configuration loading and dataclass definitions.

YAML configuration loading with environment variable substitution.

Loads the application config from a YAML file (default: ~/.agent-queue/config.yaml), substitutes ${ENV_VAR} references with environment variable values, and maps the result into typed dataclass instances. Also supports loading a .env file from the same directory as the config file for local development.
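The substitution step can be sketched as follows. This is an illustrative stand-in (the helper name `substitute_env` and the exact regex are assumptions; the real logic lives in `_process_values()` in src/config.py):

```python
import os
import re

_ENV_REF = re.compile(r"\$\{([A-Z_][A-Z0-9_]*)\}")

def substitute_env(value):
    """Recursively replace ${ENV_VAR} references in strings.

    Sketch only — missing variables resolve to "" here; the real
    implementation may handle them differently.
    """
    if isinstance(value, str):
        return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), ""), value)
    if isinstance(value, dict):
        return {k: substitute_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [substitute_env(v) for v in value]
    return value

os.environ["BOT_TOKEN"] = "abc123"
cfg = substitute_env({"discord": {"bot_token": "${BOT_TOKEN}"}})
```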

The config is loaded once at startup and passed to all major components (orchestrator, Discord bot, scheduler, adapters). Individual sections are represented by dedicated dataclasses so each component can accept only the config it needs.

See specs/config.md for the full specification of all configuration fields.

Attributes

HOT_RELOADABLE_SECTIONS module-attribute

HOT_RELOADABLE_SECTIONS = {'scheduling', 'monitoring', 'hook_engine', 'archive', 'llm_logging', 'pause_retry', 'agents_config', 'auto_task', 'logging', 'agent_profiles', 'rate_limits'}

Config sections that can be safely updated at runtime without restart.

RESTART_REQUIRED_SECTIONS module-attribute

RESTART_REQUIRED_SECTIONS = {'discord', 'data_dir', 'workspace_dir', 'database_path', 'chat_provider', 'memory', 'health_check'}

Config sections that require a full restart to take effect.
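A changed-section set is classified against these two constants by plain set intersection, as `ConfigWatcher.reload()` does below (the abridged sets here are for illustration):

```python
# Abridged copies of the module-level constants, for illustration.
HOT_RELOADABLE_SECTIONS = {"scheduling", "monitoring", "archive"}
RESTART_REQUIRED_SECTIONS = {"discord", "database_path"}

changed = {"scheduling", "discord"}  # e.g. output of diff_configs()

hot_reloadable = changed & HOT_RELOADABLE_SECTIONS
restart_needed = changed & RESTART_REQUIRED_SECTIONS
```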

Classes

ConfigError dataclass

ConfigError(section: str, field: str, message: str, severity: str = 'error')

A single configuration validation error or warning.

Used by per-section validate() methods and AppConfig.validate() to collect ALL issues before reporting, so operators can fix everything in one pass.

ConfigValidationError

ConfigValidationError(errors: list[str])

Bases: Exception

Raised when the application configuration fails validation checks.

Contains a list of all validation errors found, not just the first one, so operators can fix all issues in one pass.

Source code in src/config.py
def __init__(self, errors: list[str]):
    self.errors = errors
    msg = "Configuration validation failed:\n" + "\n".join(f"  - {e}" for e in errors)
    super().__init__(msg)
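Callers typically catch the exception and print its aggregated message. A minimal round-trip, re-stating the class as shown above:

```python
class ConfigValidationError(Exception):
    """Matches the class documented above (re-stated for a runnable example)."""

    def __init__(self, errors):
        self.errors = errors
        msg = "Configuration validation failed:\n" + "\n".join(
            f"  - {e}" for e in errors
        )
        super().__init__(msg)

try:
    raise ConfigValidationError([
        "discord.bot_token: bot_token is required",
        "health_check.port: must be between 1 and 65535",
    ])
except ConfigValidationError as exc:
    report = str(exc)
    all_errors = list(exc.errors)
```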

PerProjectChannelsConfig dataclass

PerProjectChannelsConfig(auto_create: bool = False, naming_convention: str = '{project_id}', category_name: str = '', private: bool = True)

Configuration for automatic per-project Discord channel management.

DiscordConfig dataclass

DiscordConfig(bot_token: str = '', guild_id: str = '', channels: dict[str, str] = (lambda: {'channel': 'agent-queue', 'agent_questions': 'agent-questions'})(), authorized_users: list[str] = list(), per_project_channels: PerProjectChannelsConfig = PerProjectChannelsConfig())

Discord bot connection and channel routing settings.

AgentsDefaultConfig dataclass

AgentsDefaultConfig(heartbeat_interval_seconds: int = 30, stuck_timeout_seconds: int = 0, graceful_shutdown_timeout_seconds: int = 30)

Default timeouts for agent health monitoring and graceful shutdown.

SchedulingConfig dataclass

SchedulingConfig(rolling_window_hours: int = 24, min_task_guarantee: bool = True)

Controls how the scheduler distributes agent capacity across projects.

rolling_window_hours defines the lookback period for proportional credit accounting. min_task_guarantee ensures every active project gets at least one task slot regardless of credit balance.

PauseRetryConfig dataclass

PauseRetryConfig(rate_limit_backoff_seconds: int = 60, token_exhaustion_retry_seconds: int = 300, rate_limit_max_retries: int = 3, rate_limit_max_backoff_seconds: int = 300)

Backoff and retry timing for rate-limited and token-exhausted tasks.

Controls both the in-process exponential backoff (before a task is paused) and the longer pause durations (after a task enters PAUSED state and waits for resume_after to elapse).
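The in-process backoff implied by these fields can be sketched as a capped doubling schedule. The helper name and the exact doubling policy are assumptions; the orchestrator owns the real retry logic:

```python
def next_backoff(attempt: int, base: int = 60, cap: int = 300) -> int:
    """Exponential backoff with a cap — a sketch of how
    rate_limit_backoff_seconds (base) and rate_limit_max_backoff_seconds
    (cap) could interact across retries. Illustrative only.
    """
    return min(base * (2 ** attempt), cap)

# Delays for the first rate_limit_max_retries + 1 attempts at the defaults.
delays = [next_backoff(a) for a in range(4)]
```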

AutoTaskConfig dataclass

AutoTaskConfig(enabled: bool = True, plan_file_patterns: list[str] = (lambda: ['.claude/plan.md', 'plan.md', 'docs/plans/*.md', 'plans/*.md', 'docs/plan.md'])(), inherit_repo: bool = True, inherit_approval: bool = True, base_priority: int = 100, chain_dependencies: bool = True, rebase_between_subtasks: bool = False, mid_chain_rebase: bool = True, mid_chain_rebase_push: bool = False, max_plan_depth: int = 1, max_steps_per_plan: int = 5, use_llm_parser: bool = False, llm_parser_model: str = '', skip_if_implemented: bool = True)

Configuration for auto-generating tasks from implementation plans.

ArchiveConfig dataclass

ArchiveConfig(enabled: bool = True, after_hours: float = 24.0, statuses: list[str] = (lambda: ['COMPLETED', 'FAILED', 'BLOCKED'])())

Configuration for automatic archiving of terminal tasks.

When enabled, the orchestrator automatically archives tasks that have been in a terminal status (COMPLETED, FAILED, BLOCKED) for longer than after_hours. This keeps the active task list clean without requiring manual /archive-tasks commands.
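The cutoff check the orchestrator applies can be sketched like this (`is_archivable` is a hypothetical helper, not part of the module's API):

```python
from datetime import datetime, timedelta, timezone

def is_archivable(status, finished_at, now, after_hours=24.0,
                  statuses=("COMPLETED", "FAILED", "BLOCKED")):
    """Sketch of the archive eligibility rule: terminal status AND older
    than after_hours. Defaults mirror ArchiveConfig's defaults."""
    return status in statuses and (now - finished_at) >= timedelta(hours=after_hours)

now = datetime(2024, 1, 2, 12, tzinfo=timezone.utc)
old = now - timedelta(hours=30)
recent = now - timedelta(hours=2)
```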

MonitoringConfig dataclass

MonitoringConfig(stuck_task_threshold_seconds: int = 3600)

Configuration for monitoring stuck or stalled tasks.

MemoryConfig dataclass

MemoryConfig(enabled: bool = False, embedding_provider: str = 'openai', embedding_model: str = '', embedding_base_url: str = '', embedding_api_key: str = '', milvus_uri: str = '~/.agent-queue/memsearch/milvus.db', milvus_token: str = '', max_chunk_size: int = 1500, overlap_lines: int = 2, auto_remember: bool = True, auto_recall: bool = True, recall_top_k: int = 5, compact_enabled: bool = False, compact_interval_hours: int = 24, index_notes: bool = True, index_sessions: bool = False)

Configuration for the semantic memory subsystem (memsearch).

All fields have safe defaults — the subsystem is disabled unless enabled is explicitly set to True in the YAML config. See notes/memsearch-integration.md for full documentation.

LoggingConfig dataclass

LoggingConfig(level: str = 'INFO', format: str = 'text', include_source: bool = False)

Configuration for structured logging and output format.

Controls the Python stdlib logging setup. When format is "json", all log output is emitted as single-line JSON objects suitable for log aggregation systems. The "text" format (default) is human-readable with correlation context appended.
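A single-line JSON emitter of the kind described can be built on `logging.Formatter`. This is a minimal sketch of what `format: "json"` implies, not the project's actual formatter (field names are assumptions):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit each record as one JSON object per line (illustrative)."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

rec = logging.LogRecord("app", logging.INFO, __file__, 1, "hello", None, None)
line = JsonFormatter().format(rec)
```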

ChatProviderConfig dataclass

ChatProviderConfig(provider: str = 'anthropic', model: str = '', base_url: str = '', keep_alive: str = '1h')

LLM provider settings for the Discord chat agent (not the coding agents).

LLMLoggingConfig dataclass

LLMLoggingConfig(enabled: bool = False, retention_days: int = 30)

Configuration for logging LLM inputs/outputs to JSONL files.

AgentProfileConfig dataclass

AgentProfileConfig(id: str = '', name: str = '', description: str = '', model: str = '', permission_mode: str = '', allowed_tools: list[str] = list(), mcp_servers: dict[str, dict] = dict(), system_prompt_suffix: str = '', install: dict = dict())

Configuration for an agent profile loaded from YAML.

Profiles from YAML are synced to the database at startup. Profiles can also be created dynamically via Discord commands.

HealthCheckConfig dataclass

HealthCheckConfig(enabled: bool = False, port: int = 8080)

Configuration for the HTTP health check server.

When enabled, the daemon exposes /health and /ready endpoints on the configured port for external monitoring and load balancer probes.

AppConfig dataclass

AppConfig(data_dir: str = (lambda: os.path.expanduser('~/.agent-queue'))(), workspace_dir: str = (lambda: os.path.expanduser('~/agent-queue-workspaces'))(), database_path: str = (lambda: os.path.expanduser('~/.agent-queue/agent-queue.db'))(), profile: str = '', env: str = 'production', discord: DiscordConfig = DiscordConfig(), agents_config: AgentsDefaultConfig = AgentsDefaultConfig(), scheduling: SchedulingConfig = SchedulingConfig(), pause_retry: PauseRetryConfig = PauseRetryConfig(), chat_provider: ChatProviderConfig = ChatProviderConfig(), hook_engine: HookEngineConfig = HookEngineConfig(), health_check: HealthCheckConfig = HealthCheckConfig(), logging: LoggingConfig = LoggingConfig(), monitoring: MonitoringConfig = MonitoringConfig(), archive: ArchiveConfig = ArchiveConfig(), auto_task: AutoTaskConfig = AutoTaskConfig(), memory: MemoryConfig = MemoryConfig(), llm_logging: LLMLoggingConfig = LLMLoggingConfig(), agent_profiles: list[AgentProfileConfig] = list(), global_token_budget_daily: int | None = None, rate_limits: dict[str, dict[str, int]] = dict(), _config_path: str = '')

Top-level application configuration aggregating all subsystem configs.

Instantiated once by load_config() at startup and threaded through to all major components. Each component reads only its relevant sub-config.

The env field selects the environment profile (dev, staging, production). When set, load_config will look for an override file named config.{env}.yaml in the same directory as the main config file and deep-merge it over the base config.
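The deep-merge of a `config.{env}.yaml` overlay over the base config behaves roughly as below. `deep_merge` is an illustrative stand-in for the internal `_deep_merge` helper:

```python
def deep_merge(base: dict, overlay: dict) -> dict:
    """Recursive dict merge: overlay values win, nested dicts merge
    key-wise. Sketch of the overlay semantics, not the exact helper."""
    merged = dict(base)
    for key, val in overlay.items():
        if isinstance(val, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], val)
        else:
            merged[key] = val
    return merged

base = {"logging": {"level": "INFO", "format": "text"}, "env": "production"}
overlay = {"logging": {"level": "DEBUG"}}  # e.g. from config.dev.yaml
merged = deep_merge(base, overlay)
```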

The validate() method performs fail-fast checks on critical settings. The reload_non_critical() method returns a fresh config with only non-critical settings updated from disk for hot-reloading.

Functions

validate
validate() -> list[ConfigError]

Validate all configuration settings, delegating to per-section validators.

Returns a list of all ConfigError instances found (errors and warnings). Does NOT raise — callers decide how to handle errors. The load_config() function still raises ConfigValidationError for backward compatibility.

Source code in src/config.py
def validate(self) -> list[ConfigError]:
    """Validate all configuration settings, delegating to per-section validators.

    Returns a list of all ConfigError instances found (errors and warnings).
    Does NOT raise — callers decide how to handle errors. The ``load_config()``
    function still raises ``ConfigValidationError`` for backward compatibility.
    """
    errors: list[ConfigError] = []

    # Cross-field: critical path checks
    if not self.workspace_dir:
        errors.append(ConfigError("app", "workspace_dir", "workspace_dir is required"))
    elif not os.access(self.workspace_dir, os.W_OK) and not os.path.exists(self.workspace_dir):
        # Check if parent dir is writable (could create workspace_dir)
        parent = os.path.dirname(self.workspace_dir)
        if parent and os.path.exists(parent) and not os.access(parent, os.W_OK):
            errors.append(ConfigError(
                "app", "workspace_dir",
                f"'{self.workspace_dir}' is not writable and parent directory is not writable",
                severity="warning"
            ))

    if not self.database_path:
        errors.append(ConfigError("app", "database_path", "database_path is required"))
    else:
        db_parent = os.path.dirname(self.database_path)
        if db_parent and not os.path.exists(db_parent):
            # Check if we can create the parent
            grandparent = os.path.dirname(db_parent)
            if grandparent and os.path.exists(grandparent) and not os.access(grandparent, os.W_OK):
                errors.append(ConfigError(
                    "app", "database_path",
                    f"parent directory '{db_parent}' does not exist and cannot be created",
                    severity="warning"
                ))

    # Delegate to per-section validators
    errors.extend(self.discord.validate())
    errors.extend(self.agents_config.validate())
    errors.extend(self.scheduling.validate())
    errors.extend(self.pause_retry.validate())
    errors.extend(self.chat_provider.validate())
    errors.extend(self.auto_task.validate())
    errors.extend(self.archive.validate())
    errors.extend(self.llm_logging.validate())
    errors.extend(self.memory.validate())

    # Agent profiles
    for profile in self.agent_profiles:
        errors.extend(profile.validate())

    # Health check port range
    if self.health_check.enabled:
        if not (1 <= self.health_check.port <= 65535):
            errors.append(ConfigError(
                "health_check", "port",
                f"must be between 1 and 65535, got {self.health_check.port}"
            ))

    # Monitoring threshold
    if self.monitoring.stuck_task_threshold_seconds < 0:
        errors.append(ConfigError(
            "monitoring", "stuck_task_threshold_seconds",
            "must be >= 0"
        ))

    # Rate limits structure validation
    for scope, limits in self.rate_limits.items():
        if not isinstance(limits, dict):
            errors.append(ConfigError(
                "rate_limits", scope,
                f"expected a dict, got {type(limits).__name__}"
            ))

    return errors
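
Since `validate()` returns both errors and warnings, callers partition the result by severity, as `load_config()` does before raising. A minimal example using the `ConfigError` dataclass documented above:

```python
from dataclasses import dataclass

@dataclass
class ConfigError:
    """Mirrors the ConfigError dataclass documented above."""
    section: str
    field: str
    message: str
    severity: str = "error"

errors = [
    ConfigError("discord", "bot_token", "bot_token is required"),
    ConfigError("app", "workspace_dir", "not writable", severity="warning"),
]
fatal = [e for e in errors if e.severity == "error"]
warnings = [e for e in errors if e.severity == "warning"]
```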
reload_non_critical
reload_non_critical() -> 'AppConfig'

Return a new AppConfig with non-critical settings refreshed from disk.

Non-critical settings (safe to change at runtime without restart): scheduling, pause_retry, auto_task, archive, monitoring, hook_engine, llm_logging.

Critical settings (NOT reloaded; require restart): discord, database_path, workspace_dir, chat_provider, memory, health_check.

Returns a new AppConfig instance; the caller is responsible for swapping references. If the config file cannot be read or parsed, the current config is returned unchanged and the error is logged.

Source code in src/config.py
def reload_non_critical(self) -> "AppConfig":
    """Return a new AppConfig with non-critical settings refreshed from disk.

    Non-critical settings (safe to change at runtime without restart):
    - scheduling, pause_retry, auto_task, archive, monitoring
    - hook_engine, llm_logging

    Critical settings (NOT reloaded — require restart):
    - discord, database_path, workspace_dir, chat_provider, memory,
      health_check

    Returns a new AppConfig instance; the caller is responsible for
    swapping references.  If the config file cannot be read or parsed,
    the current config is returned unchanged and the error is logged.
    """
    if not self._config_path or not os.path.exists(self._config_path):
        return self

    try:
        fresh = load_config(self._config_path, profile=self.profile or None)
    except Exception as e:
        logger.warning("Config hot-reload failed, keeping current config: %s", e)
        return self

    # Create a copy of current config and update only non-critical sections
    updated = copy.deepcopy(self)
    updated.scheduling = fresh.scheduling
    updated.pause_retry = fresh.pause_retry
    updated.auto_task = fresh.auto_task
    updated.archive = fresh.archive
    updated.monitoring = fresh.monitoring
    updated.hook_engine = fresh.hook_engine
    updated.llm_logging = fresh.llm_logging

    return updated

ConfigWatcher

ConfigWatcher(config_path: str, event_bus, current_config: AppConfig, poll_interval: float = 30.0)

Watches the config file for changes and emits events on reload.

Uses mtime-based polling (not filesystem events) for maximum portability. On change detection, loads the new config, validates it, diffs against the current config, and emits config.reloaded / config.restart_needed events via the EventBus.

Only hot-reloadable sections are applied; restart-required sections trigger a warning event but are not applied.
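
The mtime-based polling can be sketched as below. `poll_for_change` is a hypothetical helper (the real loop lives in ConfigWatcher's internal `_poll_loop`, and the interval/attempt values here are shortened for the example):

```python
import asyncio
import os
import tempfile

async def poll_for_change(path, last_mtime, interval=0.01, attempts=10):
    """Return the new mtime once the file changes, else the old one.
    Sketch of ConfigWatcher's portability-first polling approach."""
    for _ in range(attempts):
        await asyncio.sleep(interval)
        try:
            mtime = os.path.getmtime(path)
        except OSError:
            continue  # file briefly missing (e.g. editor atomic save)
        if mtime > last_mtime:
            return mtime
    return last_mtime

with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
    f.write("env: production\n")
    path = f.name
first = os.path.getmtime(path)
os.utime(path, (first + 5, first + 5))  # simulate an edit
new_mtime = asyncio.run(poll_for_change(path, first))
os.unlink(path)
```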

Source code in src/config.py
def __init__(
    self,
    config_path: str,
    event_bus,  # EventBus — imported lazily to avoid circular imports
    current_config: AppConfig,
    poll_interval: float = 30.0,
):
    self._config_path = config_path
    self._bus = event_bus
    self._config = current_config
    self._poll_interval = poll_interval
    self._last_mtime: float = 0.0
    self._task: asyncio.Task | None = None
    # Initialize mtime
    try:
        self._last_mtime = os.path.getmtime(config_path)
    except OSError:
        pass

Attributes

config property
config: AppConfig

Return the current config (may have been updated by reload).

Functions

start
start() -> None

Start the background polling task.

Source code in src/config.py
def start(self) -> None:
    """Start the background polling task."""
    if self._task is None or self._task.done():
        self._task = asyncio.create_task(self._poll_loop())
stop async
stop() -> None

Stop the background polling task.

Source code in src/config.py
async def stop(self) -> None:
    """Stop the background polling task."""
    if self._task and not self._task.done():
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass
        self._task = None
reload async
reload() -> dict

Reload configuration from disk, diff, and emit events.

Returns a summary dict with changed_sections, restart_required, and applied keys.

Source code in src/config.py
async def reload(self) -> dict:
    """Reload configuration from disk, diff, and emit events.

    Returns a summary dict with ``changed_sections``,
    ``restart_required``, and ``applied`` keys.
    """
    try:
        new_config = load_config(
            self._config_path,
            profile=self._config.profile or None,
        )
    except Exception as e:
        logger.warning("Config reload failed (keeping current config): %s", e)
        return {"error": str(e), "changed_sections": [], "applied": []}

    changed = diff_configs(self._config, new_config)
    if not changed:
        return {"changed_sections": [], "restart_required": [], "applied": []}

    # Classify changes
    hot_reloadable = changed & HOT_RELOADABLE_SECTIONS
    restart_needed = changed & RESTART_REQUIRED_SECTIONS

    # Apply only hot-reloadable sections
    if hot_reloadable:
        for section in hot_reloadable:
            if hasattr(self._config, section) and hasattr(new_config, section):
                setattr(self._config, section, getattr(new_config, section))

        await self._bus.emit("config.reloaded", {
            "changed_sections": sorted(hot_reloadable),
            "config": self._config,
        })
        logger.info(
            "Config hot-reload: updated sections: %s",
            ", ".join(sorted(hot_reloadable)),
        )

    if restart_needed:
        await self._bus.emit("config.restart_needed", {
            "changed_sections": sorted(restart_needed),
        })
        logger.warning(
            "Config reload: sections require restart to take effect: %s",
            ", ".join(sorted(restart_needed)),
        )

    return {
        "changed_sections": sorted(changed),
        "restart_required": sorted(restart_needed),
        "applied": sorted(hot_reloadable),
    }

Functions

diff_configs

diff_configs(old: AppConfig, new: AppConfig) -> set[str]

Compare two AppConfig instances and return the set of changed section names.

Uses dataclasses.asdict() for deep comparison of each section. Skips internal fields (prefixed with _).

Source code in src/config.py
def diff_configs(old: AppConfig, new: AppConfig) -> set[str]:
    """Compare two AppConfig instances and return the set of changed section names.

    Uses ``dataclasses.asdict()`` for deep comparison of each section.
    Skips internal fields (prefixed with ``_``).
    """
    changed: set[str] = set()
    old_dict = dataclasses.asdict(old)
    new_dict = dataclasses.asdict(new)
    for field_name in _SECTION_FIELDS:
        old_val = old_dict.get(field_name)
        new_val = new_dict.get(field_name)
        if old_val != new_val:
            changed.add(field_name)
    return changed

load_config

load_config(path: str, profile: str | None = None) -> AppConfig

Load and validate application configuration from a YAML file.

Processing order:
  1. Load .env from the config file's directory (without overriding existing env vars)
  2. Parse the base YAML file
  3. Determine the environment profile (AGENT_QUEUE_ENV env var, or env field in config, default "production")
  4. If an overlay file config.{env}.yaml exists in the same directory, deep-merge it over the base config
  5. If a profile is specified (via --profile CLI arg or AGENT_QUEUE_PROFILE env var), load the profile overlay from profiles/{profile}.yaml relative to the config directory and deep-merge it over the config
  6. Recursively substitute ${ENV_VAR} references in all strings
  7. Map sections into typed dataclass instances
  8. Run validate() to catch misconfiguration early
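
Step 4's overlay filename is derived from the base path and the resolved env. A small sketch mirroring the `splitext` logic in the source below (`overlay_path_for` is an illustrative name, not a public function):

```python
import os

def overlay_path_for(path: str, env: str) -> str:
    """Derive the config.{env}.yaml overlay path next to the base config."""
    config_dir = os.path.dirname(path) or "."
    name, ext = os.path.splitext(os.path.basename(path))
    return os.path.join(config_dir, f"{name}.{env}{ext}")

p = overlay_path_for("/home/user/.agent-queue/config.yaml", "dev")
```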

Parameters:

Name Type Description Default
path str

Path to the base YAML config file.

required
profile str | None

Optional profile name. Falls back to AGENT_QUEUE_PROFILE env var if not provided. When set, the corresponding file {config_dir}/profiles/{profile}.yaml must exist.

None
Source code in src/config.py
def load_config(path: str, profile: str | None = None) -> AppConfig:
    """Load and validate application configuration from a YAML file.

    Processing order:
      1. Load ``.env`` from the config file's directory (without overriding
         existing env vars)
      2. Parse the base YAML file
      3. Determine the environment profile (``AGENT_QUEUE_ENV`` env var,
         or ``env`` field in config, default ``"production"``)
      4. If an overlay file ``config.{env}.yaml`` exists in the same
         directory, deep-merge it over the base config
      5. If a *profile* is specified (via ``--profile`` CLI arg or
         ``AGENT_QUEUE_PROFILE`` env var), load the profile overlay from
         ``profiles/{profile}.yaml`` relative to the config directory and
         deep-merge it over the config
      6. Recursively substitute ``${ENV_VAR}`` references in all strings
      7. Map sections into typed dataclass instances
      8. Run ``validate()`` to catch misconfiguration early

    Args:
        path: Path to the base YAML config file.
        profile: Optional profile name. Falls back to ``AGENT_QUEUE_PROFILE``
            env var if not provided. When set, the corresponding file
            ``{config_dir}/profiles/{profile}.yaml`` must exist.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(f"Config file not found: {path}")

    _load_env_file(path)

    with open(path) as f:
        raw = yaml.safe_load(f) or {}

    # Determine environment profile for overlay loading
    env = os.environ.get("AGENT_QUEUE_ENV", raw.get("env", "production"))

    # Load environment-specific overlay (e.g. config.dev.yaml)
    config_dir = os.path.dirname(path) or "."
    base_name = os.path.basename(path)
    name_part, ext = os.path.splitext(base_name)
    overlay_path = os.path.join(config_dir, f"{name_part}.{env}{ext}")
    if os.path.exists(overlay_path):
        with open(overlay_path) as f:
            overlay = yaml.safe_load(f) or {}
        raw = _deep_merge(raw, overlay)

    # Resolve profile: CLI arg > env var > none
    resolved_profile = profile or os.environ.get("AGENT_QUEUE_PROFILE", "") or ""

    if resolved_profile:
        profiles_dir = os.path.join(config_dir, "profiles")
        profile_path = os.path.join(profiles_dir, f"{resolved_profile}.yaml")
        if not os.path.exists(profile_path):
            # List available profiles for a helpful error message
            available: list[str] = []
            if os.path.isdir(profiles_dir):
                available = sorted(
                    os.path.splitext(f)[0]
                    for f in os.listdir(profiles_dir)
                    if f.endswith((".yaml", ".yml"))
                )
            msg = f"Profile '{resolved_profile}' not found: {profile_path}"
            if available:
                msg += f"\nAvailable profiles: {', '.join(available)}"
            else:
                msg += f"\nNo profiles found in {profiles_dir}"
            raise FileNotFoundError(msg)
        with open(profile_path) as f:
            profile_raw = yaml.safe_load(f) or {}
        raw = _deep_merge(raw, profile_raw)

    raw = _process_values(raw)

    config = AppConfig()
    config._config_path = path
    config.profile = resolved_profile
    config.env = env

    if "data_dir" in raw:
        config.data_dir = raw["data_dir"]
    if "workspace_dir" in raw:
        config.workspace_dir = raw["workspace_dir"]
    if "database_path" in raw:
        config.database_path = raw["database_path"]
    if "global_token_budget_daily" in raw:
        config.global_token_budget_daily = raw["global_token_budget_daily"]

    if "discord" in raw:
        d = raw["discord"]
        ppc = PerProjectChannelsConfig()
        if "per_project_channels" in d:
            pp = d["per_project_channels"]
            ppc = PerProjectChannelsConfig(
                auto_create=pp.get("auto_create", False),
                naming_convention=pp.get(
                    "naming_convention", "{project_id}"
                ),
                category_name=pp.get("category_name", ""),
                private=pp.get("private", True),
            )
        # Backward compat: if old config has separate control/notifications,
        # merge into single "channel" entry (prefer control since that's where
        # the bot listens for chat).
        raw_channels = d.get("channels", config.discord.channels)
        if "channel" not in raw_channels and ("control" in raw_channels or "notifications" in raw_channels):
            merged_name = raw_channels.get("control") or raw_channels.get("notifications", "agent-queue")
            raw_channels = {
                "channel": merged_name,
                "agent_questions": raw_channels.get("agent_questions", "agent-questions"),
            }
        config.discord = DiscordConfig(
            bot_token=d.get("bot_token", ""),
            guild_id=d.get("guild_id", ""),
            channels=raw_channels,
            authorized_users=d.get("authorized_users", []),
            per_project_channels=ppc,
        )

    if "agents" in raw:
        a = raw["agents"]
        config.agents_config = AgentsDefaultConfig(
            heartbeat_interval_seconds=a.get("heartbeat_interval_seconds", 30),
            stuck_timeout_seconds=a.get("stuck_timeout_seconds", 0),
            graceful_shutdown_timeout_seconds=a.get(
                "graceful_shutdown_timeout_seconds", 30
            ),
        )

    if "scheduling" in raw:
        s = raw["scheduling"]
        config.scheduling = SchedulingConfig(
            rolling_window_hours=s.get("rolling_window_hours", 24),
            min_task_guarantee=s.get("min_task_guarantee", True),
        )

    if "pause_retry" in raw:
        p = raw["pause_retry"]
        config.pause_retry = PauseRetryConfig(
            rate_limit_backoff_seconds=p.get("rate_limit_backoff_seconds", 60),
            token_exhaustion_retry_seconds=p.get(
                "token_exhaustion_retry_seconds", 300
            ),
            rate_limit_max_retries=p.get("rate_limit_max_retries", 3),
            rate_limit_max_backoff_seconds=p.get("rate_limit_max_backoff_seconds", 300),
        )

    if "chat_provider" in raw:
        cp = raw["chat_provider"]
        config.chat_provider = ChatProviderConfig(
            provider=cp.get("provider", "anthropic"),
            model=cp.get("model", ""),
            base_url=cp.get("base_url", ""),
            keep_alive=cp.get("keep_alive", "1h"),
        )

    if "hook_engine" in raw:
        h = raw["hook_engine"]
        config.hook_engine = HookEngineConfig(
            enabled=h.get("enabled", True),
            max_concurrent_hooks=h.get("max_concurrent_hooks", 2),
            file_watcher_enabled=h.get("file_watcher_enabled", True),
            file_watcher_poll_interval=h.get("file_watcher_poll_interval", 10.0),
            file_watcher_debounce_seconds=h.get("file_watcher_debounce_seconds", 5.0),
        )

    if "logging" in raw:
        lg = raw["logging"]
        config.logging = LoggingConfig(
            level=lg.get("level", "INFO"),
            format=lg.get("format", "text"),
            include_source=lg.get("include_source", False),
        )

    if "monitoring" in raw:
        m = raw["monitoring"]
        config.monitoring = MonitoringConfig(
            stuck_task_threshold_seconds=m.get(
                "stuck_task_threshold_seconds", 3600
            ),
        )

    if "archive" in raw:
        ar = raw["archive"]
        config.archive = ArchiveConfig(
            enabled=ar.get("enabled", True),
            after_hours=float(ar.get("after_hours", 24.0)),
            statuses=ar.get("statuses", ["COMPLETED", "FAILED", "BLOCKED"]),
        )

    if "auto_task" in raw:
        at = raw["auto_task"]
        config.auto_task = AutoTaskConfig(
            enabled=at.get("enabled", True),
            plan_file_patterns=at.get("plan_file_patterns", [
                ".claude/plan.md", "plan.md",
                "docs/plans/*.md", "plans/*.md", "docs/plan.md",
            ]),
            inherit_repo=at.get("inherit_repo", True),
            inherit_approval=at.get("inherit_approval", True),
            base_priority=at.get("base_priority", 100),
            chain_dependencies=at.get("chain_dependencies", True),
            rebase_between_subtasks=at.get("rebase_between_subtasks", False),
            mid_chain_rebase=at.get("mid_chain_rebase", True),
            mid_chain_rebase_push=at.get("mid_chain_rebase_push", False),
            max_plan_depth=at.get("max_plan_depth", 1),
            max_steps_per_plan=at.get("max_steps_per_plan", 5),
            use_llm_parser=at.get("use_llm_parser", False),
            llm_parser_model=at.get("llm_parser_model", ""),
            skip_if_implemented=at.get("skip_if_implemented", True),
        )

    if "memory" in raw:
        mem = raw["memory"]
        config.memory = MemoryConfig(
            enabled=mem.get("enabled", False),
            embedding_provider=mem.get("embedding_provider", "openai"),
            embedding_model=mem.get("embedding_model", ""),
            embedding_base_url=mem.get("embedding_base_url", ""),
            embedding_api_key=mem.get("embedding_api_key", ""),
            milvus_uri=mem.get("milvus_uri", "~/.agent-queue/memsearch/milvus.db"),
            milvus_token=mem.get("milvus_token", ""),
            max_chunk_size=mem.get("max_chunk_size", 1500),
            overlap_lines=mem.get("overlap_lines", 2),
            auto_remember=mem.get("auto_remember", True),
            auto_recall=mem.get("auto_recall", True),
            recall_top_k=mem.get("recall_top_k", 5),
            compact_enabled=mem.get("compact_enabled", False),
            compact_interval_hours=mem.get("compact_interval_hours", 24),
            index_notes=mem.get("index_notes", True),
            index_sessions=mem.get("index_sessions", False),
        )

    if "llm_logging" in raw:
        ll = raw["llm_logging"]
        config.llm_logging = LLMLoggingConfig(
            enabled=ll.get("enabled", False),
            retention_days=ll.get("retention_days", 30),
        )

    if "agent_profiles" in raw:
        profiles = []
        for pid, pdata in raw["agent_profiles"].items():
            if not isinstance(pdata, dict):
                continue
            profiles.append(AgentProfileConfig(
                id=pid,
                name=pdata.get("name", pid),
                description=pdata.get("description", ""),
                model=pdata.get("model", ""),
                permission_mode=pdata.get("permission_mode", ""),
                allowed_tools=pdata.get("allowed_tools", []),
                mcp_servers=pdata.get("mcp_servers", {}),
                system_prompt_suffix=pdata.get("system_prompt_suffix", ""),
                install=pdata.get("install", {}),
            ))
        config.agent_profiles = profiles

    if "health_check" in raw:
        hc = raw["health_check"]
        config.health_check = HealthCheckConfig(
            enabled=hc.get("enabled", False),
            port=hc.get("port", 8080),
        )

    if "rate_limits" in raw:
        config.rate_limits = raw["rate_limits"]

    # Fail fast on misconfiguration — surface all errors at once.
    # validate() returns ConfigError list; convert fatal errors to exception
    # for backward compatibility.
    config_errors = config.validate()
    fatal_errors = [str(e) for e in config_errors if e.severity == "error"]
    if fatal_errors:
        raise ConfigValidationError(fatal_errors)

    # Log warnings (non-fatal)
    for e in config_errors:
        if e.severity == "warning":
            logger.warning("Config warning: %s", e)

    return config