Hermes Agent Deep Dive: A 143K-Star Self-Evolving AI Agent Framework and the Paradigm Shift from "Use Once and Forget" to "Learns You Over Time"

2026-05-15 13:46:17 +0800 CST

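Introduction: the AI agent "amnesia" problem

Have you had this experience: every time you open ChatGPT, it feels like talking to a friend with amnesia? Yesterday you taught it your project architecture, and today you have to start over. A carefully tuned prompt resets to zero as soon as you switch to a new session window.

This is not a problem with one product. It is a structural flaw across the AI agent space: no memory, no growth, no continuity.

In February 2026, Nous Research released Hermes Agent, an open-source agent framework whose design philosophy sets it apart. Rather than chasing a dazzling single conversation, it pursues a more fundamental goal: an AI that, like a real assistant, understands you better the more you use it.

Within a month it passed 140,000 GitHub stars, and its daily token consumption on OpenRouter reached 271 billion, topping the global app token-usage ranking. This is not marketing-driven hype but market validation of a solution to a real pain point.

This article dissects Hermes Agent's internals: architecture, core mechanisms, hands-on code, deployment options, and performance optimization.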

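1. Core design philosophy: the self-learning loop

1.1 What is wrong with traditional agents?

The working mode of mainstream AI agents can be summarized as:

User input → LLM reasoning → Tool calls → Result

This is a stateless loop. Every interaction is independent, and the agent cannot benefit from past experience. After a month of use it is no smarter than on day one.

Some products try to ease this with a "memory" feature, for example letting users save important information manually or stuffing a System Prompt into the start of each conversation. These are patch-style fixes:

Manual memory: a heavy burden on the user, and easy to miss things
System Prompt injection: the context window is limited and information density is low
RAG retrieval: solves "finding" but not "understanding" or "improving"

1.2 Hermes's answer: experience → skills → self-improvement

Hermes Agent introduces a three-layer self-learning loop: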

┌──────────────────────────────────────────────────────┐
│                                                      │
│  ┌────────────┐    ┌────────────┐    ┌────────────┐  │
│  │ Experience │───→│   Skills   │───→│   Self-    │  │
│  │            │    │            │    │ improvement│  │
│  └────────────┘    └────────────┘    └─────┬──────┘  │
│        ↑                                   │         │
│        └───────────────────────────────────┘         │
│                                                      │
└──────────────────────────────────────────────────────┘

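Layer 1: Experience

When Hermes finishes a task, it does not simply discard the process data. Instead it analyzes the whole execution path and identifies which steps were key decision points, which tool-call combinations were efficient, and which intermediate results can be reused.

# Hermes experience record (simplified).
# Step, Decision, and ToolCombo are framework types not shown here.
from __future__ import annotations

from dataclasses import dataclass
from typing import List

@dataclass
class Experience:
    task_description: str          # what the task was
    execution_path: List[Step]     # steps taken
    outcome: str                   # final result
    success: bool                  # whether the task succeeded
    key_decisions: List[Decision]  # key decision points
    tool_combos: List[ToolCombo]   # tool combinations that worked
    duration: float                # execution time
    token_cost: int                # tokens consumed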

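Layer 2: Skills

Once similar experiences accumulate past a threshold, Hermes automatically abstracts them into a reusable "skill file". A skill file is not a simple prompt template but a structured unit of knowledge containing preconditions, an execution strategy, error handling, and postcondition checks.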

# Example: an automatically generated Docker deployment skill
skill:
  name: docker_deploy
  version: "1.2"
  created_from: ["exp_2847", "exp_3102", "exp_4105"]
  trigger:
    keywords: ["部署", "deploy", "docker", "容器化"]
    context: ["A Dockerfile exists in the project directory"]
  preconditions:
    - docker_installed: true
    - dockerfile_exists: true
    - port_available: true
  execution:
    strategy: "multi_stage_build"
    steps:
      - name: "Build the optimized image"
        command: "docker build -t {{project_name}}:{{version}} ."
        optimization: "layer_caching"
      - name: "Security scan"
        command: "docker scout cves {{project_name}}:{{version}}"
        on_failure: "warn_and_continue"
      - name: "Start the container"
        command: "docker run -d -p {{port}}:{{internal_port}} {{project_name}}:{{version}}"
    error_handling:
      port_conflict: "Detect a free port automatically and retry"
      build_failure: "Analyze the error log and suggest a fix"
  postconditions:
    - container_running: true
    - health_check_passed: true
  learned_shortcuts:
    - "For Node.js projects, an alpine base image can cut image size by 70%"
    - "In multi-stage builds, copying package.json before the source improves cache hits"
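
The text does not say how "similar experiences past a threshold" are detected. The following is a minimal sketch, assuming similarity means keyword overlap (Jaccard) between task descriptions and a promotion threshold of three experiences. `jaccard`, `promote_to_skills`, and both cutoffs are hypothetical names and values chosen for illustration, not Hermes APIs.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def promote_to_skills(
    experiences: Dict[str, str],
    min_similarity: float = 0.5,   # hypothetical similarity cutoff
    min_cluster_size: int = 3,     # hypothetical promotion threshold
) -> List[List[str]]:
    """Greedily cluster experience IDs by keyword overlap of their task
    descriptions and return the clusters large enough to become a skill."""
    clusters: List[dict] = []
    for exp_id, description in experiences.items():
        words = set(description.lower().split())
        for cluster in clusters:
            if jaccard(words, cluster["keywords"]) >= min_similarity:
                cluster["ids"].append(exp_id)
                cluster["keywords"] |= words
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append({"ids": [exp_id], "keywords": words})
    return [c["ids"] for c in clusters if len(c["ids"]) >= min_cluster_size]


experiences = {
    "exp_2847": "deploy node app with docker",
    "exp_3102": "deploy python app with docker",
    "exp_3500": "summarize arxiv paper",
    "exp_4105": "deploy go app with docker",
}
print(promote_to_skills(experiences))
```

In this toy data the three Docker deployments cluster together and would feed one skill's `created_from` list, while the unrelated experience stays below the threshold. A real system would more likely compare embeddings than raw keywords.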

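Layer 3: Self-improvement

Skills are not static. After each use Hermes evaluates how the execution went, and if it finds something to optimize it modifies the skill file automatically: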

# Skill self-improvement logic (pseudocode)
class SkillImprover:
    def improve(self, skill: Skill, execution_result: Result) -> Skill:
        if execution_result.success:
            # Success: check whether a shorter path was found
            if execution_result.actual_steps < len(skill.execution.steps):
                skill.execution.shortcuts.append(
                    self._extract_shortcut(execution_result)
                )
        else:
            # Failure: add a new error-handling branch
            new_handling = self._analyze_failure(execution_result)
            skill.execution.error_handling.update(new_handling)

        # Record the outcome on every run, not only when a shortcut is found
        skill.metrics.success_rate = self._update_rate(
            skill.metrics.success_rate, execution_result.success
        )

        # Bump the version
        skill.version = self._bump_version(skill.version)
        return skill
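
The pseudocode leaves `_update_rate` and `_bump_version` undefined. One plausible shape for them is sketched here, assuming the success rate is tracked as an exponential moving average and versions follow a "major.minor" string as in the skill files. The function names and the `alpha` default are illustrative assumptions, not the framework's implementation.

```python
def update_rate(current: float, success: bool, alpha: float = 0.1) -> float:
    """Exponential moving average of the success rate.

    alpha is a hypothetical smoothing factor: higher values weight
    recent executions more heavily.
    """
    observation = 1.0 if success else 0.0
    return (1 - alpha) * current + alpha * observation


def bump_version(version: str) -> str:
    """Increment the minor part of a 'major.minor' version string."""
    major, minor = version.split(".")
    return f"{major}.{int(minor) + 1}"


print(bump_version("1.2"))
print(round(update_rate(0.9, success=False), 3))
```

A moving average lets a skill recover from old failures without storing the full history. A plain running mean would also work if every execution is kept.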

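1.3 Comparison: why is this not just RAG?

Many people ask: isn't this just RAG (retrieval-augmented generation) under another name? It is not. The differences:

Dimension	Traditional RAG	Hermes self-learning loop
Knowledge source	Documents loaded in advance	Accumulated from real task execution
Knowledge form	Static text fragments	Structured, executable skills
Update method	Manual re-ingestion	Distilled automatically from experience
Adaptivity	Same for every user	Personalized to the current user
Explainability	Retrieval logic is hard to trace	Every skill has a clear provenance chain

RAG solves the problem of "finding known information"; Hermes solves the problem of "creating new knowledge from practice".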

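2. The three-layer memory system

Hermes's memory system is built on the Honcho protocol and has three layers:

2.1 Session Memory

This is the shortest-lived memory and holds the full context of the current conversation. Like human "working memory", it has limited capacity but very fast access.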

# Session memory data structure
class SessionMemory:
    """Short-term memory for the current session"""
    
    def __init__(self, session_id: str):
        self.session_id = session_id
        self.messages: List[Message] = []
        self.active_context: Dict[str, Any] = {}
        self.tool_results: Dict[str, Any] = {}
    
    def add_message(self, role: str, content: str):
        self.messages.append(Message(role=role, content=content))
        # Compress the context automatically once it grows too long
        if self.estimated_tokens() > 8000:
            self._compress_older_messages()
    
    def _compress_older_messages(self):
        """Compress older messages into a summary, keeping key information"""
        if len(self.messages) <= 10:
            return  # nothing older than the retained window
        older = self.messages[:-10]  # keep the 10 most recent messages
        summary = self._summarize(older)
        self.messages = [Message(role="system", content=summary)] + self.messages[-10:]

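2.2 Long-term Memory

Memory that persists across sessions and stores user preferences, project knowledge, past decisions, and so on. This is the core mechanism behind Hermes "learning you over time".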

# Long-term memory: storage and retrieval
import math
from datetime import datetime
from typing import Any, Dict, List, Optional

class LongTermMemory:
    """Cross-session memory built on the Honcho protocol"""
    
    def __init__(self, user_id: str, storage_backend: str = "sqlite"):
        self.user_id = user_id
        self.backend = self._init_backend(storage_backend)
        self.index = VectorIndex(dimension=1536)  # embedding dimension
    
    async def store(self, key: str, value: Any, metadata: Optional[Dict] = None):
        """Store a memory and index its embedding"""
        embedding = await self._embed(f"{key}: {value}")
        entry = MemoryEntry(
            key=key,
            value=value,
            embedding=embedding,
            metadata=metadata or {},
            timestamp=datetime.now(),
            access_count=0
        )
        await self.backend.store(entry)
        self.index.add(embedding, entry.id)
    
    async def recall(self, query: str, top_k: int = 5) -> List[MemoryEntry]:
        """Retrieve relevant memories by semantic search"""
        query_embedding = await self._embed(query)
        candidates = self.index.search(query_embedding, top_k=top_k * 2)
        
        # Re-rank: combine semantic similarity, recency, and access frequency
        scored = []
        for entry_id, similarity in candidates:
            entry = await self.backend.get(entry_id)
            recency_score = self._recency_score(entry.timestamp)
            access_score = min(entry.access_count / 10, 1.0)  # frequently accessed memories rank higher
            final_score = 0.6 * similarity + 0.3 * recency_score + 0.1 * access_score
            scored.append((entry, final_score))
        
        scored.sort(key=lambda x: x[1], reverse=True)
        return [entry for entry, _ in scored[:top_k]]
    
    def _recency_score(self, timestamp: datetime) -> float:
        """Time decay: more recent memories score higher"""
        hours_ago = (datetime.now() - timestamp).total_seconds() / 3600
        # One-week half-life: the score is exactly 0.5 after 168 hours.
        # (exp(-hours_ago / 168) would instead give about 0.37 after a week.)
        return 0.5 ** (hours_ago / 168)
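
The recency term is an exponential decay, and it matters whether the 168-hour constant acts as a time constant or as a half-life. A small sketch compares the two readings and plugs the result into the weighted score used in `recall`. The function names are illustrative and not part of Hermes.

```python
import math

WEEK_HOURS = 168.0


def decay_time_constant(hours_ago: float, tau: float = WEEK_HOURS) -> float:
    """exp(-t / tau): the score falls to 1/e (about 0.37) after tau hours."""
    return math.exp(-hours_ago / tau)


def decay_half_life(hours_ago: float, half_life: float = WEEK_HOURS) -> float:
    """0.5 ** (t / half_life): the score falls to 0.5 after half_life hours."""
    return 0.5 ** (hours_ago / half_life)


def final_score(similarity: float, recency: float, access_count: int) -> float:
    """Weighted blend from recall(): 60% similarity, 30% recency, 10% access."""
    access = min(access_count / 10, 1.0)
    return 0.6 * similarity + 0.3 * recency + 0.1 * access


one_week = WEEK_HOURS
print(round(decay_time_constant(one_week), 3))
print(round(decay_half_life(one_week), 3))
print(round(final_score(0.8, decay_half_life(one_week), access_count=5), 3))
```

With `exp(-t / 168)` the true half-life is 168 × ln 2, roughly 116 hours, so a week-old memory is penalized more than "one-week half-life" suggests.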

2.3 Meta Memory

The highest layer of memory records Hermes's cognitive model of the user: your preferred tech stack, communication style, common task patterns, and so on.

# Meta memory: the user cognitive model
class MetaMemory:
    """Deeper knowledge about the user"""
    
    def __init__(self, user_id: str):
        self.user_id = user_id
        self.user_model = {
            "tech_stack": [],        # frequently used technologies
            "communication_style": None,  # communication style
            "expertise_level": {},    # expertise per domain
            "frequent_tasks": [],     # high-frequency tasks
            "preferences": {},        # preference settings
            "working_hours": {},      # working hours
        }
    
    def update_from_interaction(self, interaction: Interaction):
        """Update the user model from each interaction"""
        # Infer the tech stack
        tech_mentions = self._extract_tech_terms(interaction.content)
        for tech in tech_mentions:
            self._increment_expertise(tech, interaction.depth)
        
        # Learn the communication style
        style = self._infer_style(interaction)
        self.user_model["communication_style"] = style
        
        # Recognize high-frequency task patterns
        task_pattern = self._identify_pattern(interaction)
        if task_pattern:
            self._update_frequent_tasks(task_pattern)
    
    def get_context_prompt(self) -> str:
        """Build the user context that is injected into the System Prompt"""
        parts = []
        if self.user_model["tech_stack"]:
            stack = ", ".join(self.user_model["tech_stack"][:5])
            parts.append(f"The user's main tech stack: {stack}")
        if self.user_model["communication_style"]:
            parts.append(f"The user prefers a {self.user_model['communication_style']} communication style")
        return "\n".join(parts)

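2.4 How the three layers work together

When the user sends a message, the three memory layers cooperate as follows:

User message "Containerize and deploy this project"
    │
    ├──→ Session memory: what is the current conversation context? Which project was the user just discussing?
    │
    ├──→ Long-term memory: has the user deployed similar projects before? What related experience exists?
    │
    └──→ Meta memory: what is the user's tech stack? Do they prefer Docker or K8s?
          │
          ▼
    Combine all three layers into a context-enriched prompt
          │
          ▼
    LLM reasoning + tool calls
          │
          ▼
    Execution result → update all three memory layers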

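3. The unified message Gateway

Another core innovation in Hermes is its unified message gateway. A single Gateway instance can connect to 12+ messaging platforms at once:

3.1 Gateway architecture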

                     ┌─────────────────┐
                     │   Hermes Core   │
                     │ (Skills/Memory) │
                     └────────┬────────┘
                              │
                     ┌────────▼────────┐
                     │     Gateway     │
                     │ (message router)│
                     └────────┬────────┘
                              │
     ┌─────────┬─────────┬────┴────┬─────────┬─────────┐
     │         │         │         │         │         │
┌────▼───┐┌────▼───┐┌────▼───┐┌────▼───┐┌────▼───┐┌────▼───┐
│Telegram││Discord ││ Slack  ││WhatsApp││ Signal ││ Email  │
└────────┘└────────┘└────────┘└────────┘└────────┘└────────┘

3.2 Gateway configuration in practice

# ~/.hermes/gateway.yaml
gateway:
  # Telegram settings
  telegram:
    enabled: true
    bot_token: "${TELEGRAM_BOT_TOKEN}"
    allowed_users: [123456789]  # restrict which users have access
    rate_limit: 30  # maximum messages per minute
  
  # Discord settings
  discord:
    enabled: true
    bot_token: "${DISCORD_BOT_TOKEN}"
    allowed_guilds: ["your-server-id"]
    command_prefix: "!"
  
  # Slack settings
  slack:
    enabled: false
    app_token: "${SLACK_APP_TOKEN}"
    bot_token: "${SLACK_BOT_TOKEN}"
  
  # WhatsApp (via WhatsApp Web API)
  whatsapp:
    enabled: true
    session_name: "hermes_session"
    phone_number: "${WHATSAPP_PHONE}"
  
  # Signal (via signal-cli)
  signal:
    enabled: false
    phone_number: "${SIGNAL_PHONE}"
  
  # Email
  email:
    enabled: true
    imap:
      host: "imap.gmail.com"
      port: 993
      username: "${EMAIL_ADDRESS}"
      password: "${EMAIL_APP_PASSWORD}"
    smtp:
      host: "smtp.gmail.com"
      port: 587
    check_interval: 300  # check every 5 minutes
  
  # CLI (local terminal)
  cli:
    enabled: true
    prompt_style: "rich"  # rich/plain
  
  # Message routing rules
  routing:
    # During working hours, route messages to Slack first
    - condition: "hour >= 9 and hour <= 18"
      platform: "slack"
      priority: "high"
    # Outside working hours, send to Telegram
    - condition: "hour < 9 or hour > 18"
      platform: "telegram"
      priority: "normal"
    # Urgent notifications go to every channel
    - condition: "priority == 'urgent'"
      platform: "all"
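
How Hermes evaluates the `condition` strings is not shown. A minimal sketch of one safe approach, assuming conditions are restricted to comparisons joined by `and`/`or`: parse the string with Python's `ast` module, reject any other syntax, then evaluate. `evaluate_condition` and `resolve_platform` are hypothetical helpers, not Hermes functions.

```python
import ast
from typing import Any, Dict, List, Optional

# Node types a routing condition is allowed to contain.
_ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.Compare,
    ast.Name, ast.Load, ast.Constant,
    ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
)


def evaluate_condition(condition: str, variables: Dict[str, Any]) -> bool:
    """Evaluate a condition such as "hour >= 9 and hour <= 18" after checking
    that it only uses comparisons, and/or, names, and literals."""
    tree = ast.parse(condition, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED_NODES):
            raise ValueError(f"unsupported syntax: {type(node).__name__}")
    code = compile(tree, "<rule>", "eval")
    return bool(eval(code, {"__builtins__": {}}, dict(variables)))


def resolve_platform(rules: List[Dict[str, str]], variables: Dict[str, Any]) -> Optional[str]:
    """Return the platform of the first rule whose condition holds."""
    for rule in rules:
        if evaluate_condition(rule["condition"], variables):
            return rule["platform"]
    return None


rules = [
    {"condition": "priority == 'urgent'", "platform": "all"},
    {"condition": "hour >= 9 and hour <= 18", "platform": "slack"},
    {"condition": "hour < 9 or hour > 18", "platform": "telegram"},
]
print(resolve_platform(rules, {"hour": 10, "priority": "normal"}))
print(resolve_platform(rules, {"hour": 22, "priority": "urgent"}))
```

The sketch lists the urgent rule first. Under first-match semantics the two hour-based rules cover every hour, so an urgent rule placed last would never fire. Whether Hermes uses first-match or priority-based resolution is not stated in the configuration.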

3.3 Core message routing code

# Gateway message router (simplified core logic)
class MessageRouter:
    """Routes messages from every platform to Hermes Core"""
    
    def __init__(self, config: GatewayConfig, core):
        self.adapters: Dict[str, PlatformAdapter] = {}
        self.config = config
        self.core = core  # Hermes Core instance used by handle_incoming
        self._background_tasks = set()  # keeps memory-update tasks alive
        self._init_adapters()
    
    async def handle_incoming(self, platform: str, message: IncomingMessage):
        """Handle an inbound message from any platform"""
        # Normalize to a unified message format
        unified = self._normalize_message(platform, message)
        
        # Load the user context (from meta memory)
        user_context = await self._load_user_context(unified.user_id)
        
        # Route to Hermes Core
        response = await self.core.process(
            message=unified,
            context=user_context,
            platform=platform
        )
        
        # Send the response back to the originating platform
        adapter = self.adapters[platform]
        await adapter.send_response(message.chat_id, response)
        
        # Update memories in the background; hold a reference so the task
        # is not garbage-collected before it finishes
        task = asyncio.create_task(self._update_memories(unified, response))
        self._background_tasks.add(task)
        task.add_done_callback(self._background_tasks.discard)
    
    async def handle_outgoing(self, notification: Notification):
        """Route a notification initiated by Hermes to a suitable platform"""
        # Pick the platform according to the routing rules
        platform = self._resolve_platform(notification, self.config.routing)
        adapter = self.adapters[platform]
        
        # Format the message (each platform has its own format)
        formatted = self._format_for_platform(notification, platform)
        await adapter.send_notification(formatted)
    
    def _format_for_platform(self, notification: Notification, platform: str) -> str:
        """Adapt the message format to each platform"""
        if platform == "telegram":
            # Telegram supports MarkdownV2
            return self._to_markdown_v2(notification.content)
        elif platform == "discord":
            # Discord supports full Markdown
            return notification.content  # already Markdown
        elif platform == "slack":
            # Slack uses mrkdwn (a Markdown subset)
            return self._to_mrkdwn(notification.content)
        elif platform == "whatsapp":
            # WhatsApp only supports simple formatting
            return self._to_plain_text(notification.content)
        return notification.content
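
The `_to_markdown_v2` helper is not shown in the article. One part of it is fixed by Telegram itself: in MarkdownV2 mode, a set of reserved characters must be backslash-escaped when they appear as plain text. The sketch below covers only that escaping step, not the conversion of Markdown formatting. `escape_markdown_v2` is a hypothetical name.

```python
import re

# Characters that Telegram's MarkdownV2 mode requires to be escaped
# with a backslash when they appear in plain text.
_MDV2_SPECIAL = r"_*[]()~`>#+-=|{}.!"
_MDV2_PATTERN = re.compile(f"([{re.escape(_MDV2_SPECIAL)}])")


def escape_markdown_v2(text: str) -> str:
    """Backslash-escape every MarkdownV2 special character in plain text."""
    return _MDV2_PATTERN.sub(r"\\\1", text)


print(escape_markdown_v2("Deploy v1.2 done (3 warnings)!"))
```

Skipping this step is a common cause of rejected Telegram messages, because an unescaped `.` or `!` in ordinary text makes the whole message fail to parse.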

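4. The skill system in depth

4.1 Built-in skill library

Hermes ships with 40+ skills covering common development scenarios:

Category	Example skills	Description
MLOps	model_train, model_deploy, data_pipeline	End-to-end machine learning workflow
DevOps	ci_setup, container_deploy, infra_provision	Infrastructure management
Code quality	code_review, refactor_suggest, test_gen	Code review and optimization
GitHub	pr_review, issue_triage, release_manage	GitHub workflows
Research	paper_search, arxiv_summarize, patent_lookup	Technical research assistance
Docs	doc_generate, api_spec, changelog_write	Documentation automation

4.2 Skill file structure in detail

Each skill is a standalone YAML file stored under ~/.hermes/skills/: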

# ~/.hermes/skills/code_review.yaml
skill:
  name: code_review
  version: "2.1"
  description: "Intelligent code review based on project context and coding conventions"
  
  # Trigger conditions
  trigger:
    keywords: ["代码审查", "code review", "review", "审查代码"]
    file_patterns: ["*.py", "*.ts", "*.go", "*.rs", "*.java"]
    context_signals: ["git_diff_available", "pr_opened"]
  
  # Precondition checks
  preconditions:
    - type: "file_exists"
      path: ".git"
      message: "Code review must run inside a Git repository"
    - type: "command_available"
      command: "git"
      message: "Git must be installed"
  
  # Execution strategy
  execution:
    strategy: "diff_aware"  # incremental review based on the diff
    
    steps:
      - name: "Get the change scope"
        action: "git_diff"
        args:
          target: "main"
          options: "--stat"
        output_key: "diff_stats"
      
      - name: "Analyze changed files"
        action: "analyze_files"
        args:
          files: "${diff_stats.changed_files}"
          analysis_types: ["complexity", "dependencies", "test_coverage"]
        output_key: "file_analysis"
      
      - name: "Review file by file"
        action: "review_code"
        args:
          files: "${diff_stats.changed_files}"
          focus_areas: "${file_analysis.risk_areas}"
          project_conventions: "${memory.project_conventions}"
        output_key: "review_findings"
      
      - name: "Generate the review report"
        action: "generate_report"
        args:
          findings: "${review_findings}"
          format: "markdown"
          severity_order: true
        output_key: "final_report"
    
    # Parallel optimization
    parallel_steps:
      - ["analyze_files", "fetch_conventions"]
    
    # Timeout control
    timeout_per_step: 60  # seconds
    total_timeout: 300
  
  # Output format
  output:
    format: "markdown"
    template: |
      ## Code Review Report
      
      ### 📊 Overview
      - Files changed: {{diff_stats.file_count}}
      - Lines added: {{diff_stats.additions}}
      - Lines deleted: {{diff_stats.deletions}}
      - Risk level: {{review_findings.risk_level}}
      
      ### 🔍 Issues found
      {{#each review_findings.issues}}
      - **[{{severity}}]** {{file}}:{{line}}: {{message}}
        💡 Suggestion: {{suggestion}}
      {{/each}}
      
      ### ✅ Highlights
      {{#each review_findings.positives}}
      - {{message}}
      {{/each}}
  
  # Self-improvement settings
  improvement:
    track_metrics: ["accuracy", "false_positive_rate", "user_acceptance"]
    min_samples_for_update: 5
    auto_update: true
    require_confirmation: false  # update automatically when confidence is high
  
  # MCP extensions
  mcp_servers:
    - name: "github"
      tools: ["get_pull_request", "create_review_comment"]
    - name: "filesystem"
      tools: ["read_file", "search_files"]

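4.3 Developing custom skills

Besides automatically generated skills, you can also write skill files by hand:

# ~/.hermes/skills/api_migration.yaml
# Custom skill: API version migration assistant
skill:
  name: api_migration
  version: "1.0"
  description: "Helps migrate an API to a new version and detects breaking changes automatically"
  
  trigger:
    keywords: ["API迁移", "migrate API", "版本升级", "breaking change"]
  
  execution:
    strategy: "interactive"  # interactive strategy that requires user confirmation
    steps:
      - name: "Detect the current API version"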
        action: "scan_api_usage"
        args:
          scan_paths: ["src/", "lib/", "app/"]
          version_indicators: ["package.json", "requirements.txt", "go.mod"]
        output_key: "current_version"
      
      - name: "Fetch the changelog"
        action: "fetch_changelog"
        args:
          package: "${current_version.package_name}"
          from_version: "${current_version.version}"
          to_version: "latest"
        output_key: "changelog"
      
      - name: "Analyze breaking changes"
        action: "analyze_breaking_changes"
        args:
          changelog: "${changelog}"
          current_usage: "${current_version.usage_patterns}"
        output_key: "breaking_changes"
      
      - name: "Generate the migration plan"
        action: "generate_migration_plan"
        args:
          changes: "${breaking_changes}"
          codebase: "${current_version.files}"
          strategy: "incremental"  # incremental vs. all-at-once migration
        output_key: "migration_plan"
      
      - name: "Execute the migration (confirmation required)"
        action: "execute_migration"
        args:
          plan: "${migration_plan}"
          dry_run: true  # do a trial run first
        requires_confirmation: true
        confirmation_message: "The migration plan is ready. Execute it?"
        output_key: "migration_result"

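4.4 MCP (Model Context Protocol) integration

Hermes extends its toolset through MCP with explicit per-server permissions, which avoids the tangled tool permissions common in traditional agent frameworks: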

// ~/.hermes/mcp_config.json
{
  "mcp_servers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_TOKEN": "${GITHUB_TOKEN}"
      },
      "permissions": {
        "allowed_tools": ["get_pull_request", "list_issues", "create_comment"],
        "denied_tools": ["delete_repository", "create_release"],
        "require_confirmation": ["merge_pull_request"]
      }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"],
      "permissions": {
        "allowed_paths": ["/home/user/projects"],
        "denied_paths": ["/etc", "/root"],
        "max_file_size": "10MB"
      }
    },
    "database": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": {
        "DATABASE_URL": "${DATABASE_URL}"
      },
      "permissions": {
        "allowed_operations": ["SELECT", "EXPLAIN"],
        "denied_operations": ["DROP", "TRUNCATE", "DELETE"],
        "max_rows": 1000
      }
    }
  }
}
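
How the three tool lists interact is not spelled out. A minimal sketch of one reasonable precedence, assuming deny rules win, confirmation comes next, and anything not explicitly allowed is denied by default. `decide` is a hypothetical helper, and the precedence is an assumption rather than documented Hermes behavior.

```python
from typing import Dict, List


def decide(tool: str, permissions: Dict[str, List[str]]) -> str:
    """Return "deny", "confirm", or "allow" for a tool call.

    Deny rules win over everything, tools that need confirmation come
    next, and anything not explicitly allowed is denied by default.
    """
    if tool in permissions.get("denied_tools", []):
        return "deny"
    if tool in permissions.get("require_confirmation", []):
        return "confirm"
    if tool in permissions.get("allowed_tools", []):
        return "allow"
    return "deny"


github_permissions = {
    "allowed_tools": ["get_pull_request", "list_issues", "create_comment"],
    "denied_tools": ["delete_repository", "create_release"],
    "require_confirmation": ["merge_pull_request"],
}
for name in ["list_issues", "merge_pull_request", "delete_repository", "fork_repository"]:
    print(name, "->", decide(name, github_permissions))
```

Default-deny means a tool added by a newer MCP server version stays unavailable until someone reviews it and adds it to a list.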

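5. Deployment options

Hermes supports 7 deployment backends, from a $5/month VPS to serverless architectures, to fit different scenarios.

5.1 One-line install (recommended)

# Linux / macOS / WSL2
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

# Load environment variables
source ~/.bashrc  # or: source ~/.zshrc

# First launch (opens the setup wizard)
hermes

What the install script does automatically:

Detects and installs uv (a Python package manager)
Clones the repository to ~/.hermes/repo/
Creates a virtual environment and installs dependencies
Generates the default configuration file
Starts the interactive setup wizard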

5.2 Docker deployment

# Dockerfile: production-grade Docker deployment
FROM python:3.11-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    git curl build-essential \
    && rm -rf /var/lib/apt/lists/*

# Install uv
RUN curl -LsSf https://astral.sh/uv/install.sh | sh
ENV PATH="/root/.local/bin:${PATH}"

# Install Node.js (required by MCP servers)
RUN curl -fsSL https://deb.nodesource.com/setup_22.x | bash - \
    && apt-get install -y nodejs

WORKDIR /app

# Copy dependency files first to take advantage of the Docker layer cache
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev

# Copy the source code
COPY . .

# Volumes for persistent memory and skills
VOLUME ["/root/.hermes/memory", "/root/.hermes/skills"]

# Health check: curl -f exits non-zero on HTTP error responses
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
    CMD curl -fsS http://localhost:8000/health || exit 1

# Start the service
CMD ["uv", "run", "hermes", "serve", "--host", "0.0.0.0", "--port", "8000"]

# Build and run
docker build -t hermes-agent .
docker run -d \
    --name hermes \
    -p 8000:8000 \
    -v hermes_memory:/root/.hermes/memory \
    -v hermes_skills:/root/.hermes/skills \
    -e OPENAI_API_KEY="${OPENAI_API_KEY}" \
    -e TELEGRAM_BOT_TOKEN="${TELEGRAM_BOT_TOKEN}" \
    hermes-agent

5.3 Serverless deployment (Modal)

# deploy_modal.py: deploy Hermes with Modal
import modal

app = modal.App("hermes-agent")

# Define the image with dependencies
image = (
    modal.Image.debian_slim(python_version="3.11")
    .apt_install("git", "curl", "build-essential")
    .pip_install("hermes-agent[all]")
)

# Persistent storage
memory_vol = modal.Volume.from_name("hermes-memory", create_if_missing=True)
skills_vol = modal.Volume.from_name("hermes-skills", create_if_missing=True)

@app.cls(image=image, volumes={"/root/.hermes/memory": memory_vol, "/root/.hermes/skills": skills_vol},
         allow_concurrent_inputs=10, timeout=300)
class HermesService:
    @modal.enter()
    def setup(self):
        """Initialize the Hermes instance"""
        from hermes import Hermes
        self.hermes = Hermes.load(profile="serverless")
    
    @modal.method()
    async def process(self, message: str, user_id: str, platform: str = "api"):
        """Process a message"""
        response = await self.hermes.process(
            message=message,
            user_id=user_id,
            platform=platform
        )
        return response
    
    @modal.method()
    async def health(self):
        """Health check"""
        return {"status": "healthy", "version": self.hermes.version}

# Webhook endpoint
@app.function(image=image, volumes={"/root/.hermes/memory": memory_vol})
@modal.asgi_app()
def webapp():
    """ASGI app that receives webhooks from each platform"""
    # The function takes no arguments; a parameter named `app` would also
    # shadow the module-level modal.App
    from hermes.gateway import create_app
    
    return create_app(profile="serverless")

5.4 Remote deployment over SSH

Hermes can connect to remote hosts over SSH to run commands, which suits setups that already have servers:

# ~/.hermes/backends.yaml
backends:
  - name: "production_server"
    type: "ssh"
    host: "your-server.com"
    port: 22
    user: "deploy"
    key_file: "~/.ssh/id_ed25519"
    working_dir: "/home/deploy/projects"
    allowed_commands: ["python", "node", "docker", "git", "npm", "uv"]
    denied_commands: ["rm -rf", "sudo", "shutdown"]
    timeout: 120
    
  - name: "staging_server"
    type: "ssh"
    host: "staging.your-server.com"
    port: 22
    user: "deploy"
    key_file: "~/.ssh/id_ed25519"
    working_dir: "/home/deploy/staging"
    sandbox: true  # sandbox mode: restricts filesystem access

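6. Cron scheduled tasks

Hermes has a built-in Cron scheduler that can run skills on a schedule or push information proactively:

# ~/.hermes/cron.yaml
cron:
  # Check GitHub notifications at 9:00 on weekdays
  - name: "github_morning_digest"
    schedule: "0 9 * * 1-5"  # weekdays at 9:00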
    skill: "github_digest"
    args:
      repos: ["my-org/main-app", "my-org/api-service"]
      include: ["new_issues", "pr_reviews", "security_alerts"]
    notify_via: "telegram"
    
  # Check service health every hour
  - name: "health_check"
    schedule: "0 * * * *"
    skill: "service_monitor"
    args:
      endpoints:
        - name: "API"
          url: "https://api.example.com/health"
          expected_status: 200
        - name: "Web"
          url: "https://www.example.com"
          expected_status: 200
      alert_threshold: 2  # alert only after 2 consecutive failures
    notify_via: "all"  # send alerts through every channel
    
  # Generate the weekly report every Friday
  - name: "weekly_report"
    schedule: "0 18 * * 5"  # Fridays at 18:00
    skill: "report_generate"
    args:
      sources: ["git_log", "jira_tickets", "slack_messages"]
      period: "week"
      format: "markdown"
    notify_via: "email"
    
  # Clean up expired memories daily at 03:00
  - name: "memory_cleanup"
    schedule: "0 3 * * *"
    action: "memory_cleanup"
    args:
      max_age_days: 90
      keep_frequently_accessed: true
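
The `schedule` strings above use the five-field cron format (minute, hour, day of month, month, day of week). A small matcher makes the semantics concrete. This is an illustrative sketch, not Hermes's scheduler: it supports only `*`, `*/n`, ranges, lists, and plain numbers, and it ANDs all five fields, whereas classic cron ORs day-of-month and day-of-week when both are restricted.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', '*/n', 'a-b', 'a,b', or a plain number."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif "-" in part:
            low, high = (int(x) for x in part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, moment: datetime) -> bool:
    """Check a five-field cron expression against a datetime."""
    minute, hour, day, month, weekday = expr.split()
    cron_weekday = moment.isoweekday() % 7  # cron: 0 = Sunday ... 6 = Saturday
    return (
        _field_matches(minute, moment.minute)
        and _field_matches(hour, moment.hour)
        and _field_matches(day, moment.day)
        and _field_matches(month, moment.month)
        and _field_matches(weekday, cron_weekday)
    )


friday_evening = datetime(2026, 5, 15, 18, 0)  # a Friday
print(cron_matches("0 18 * * 5", friday_evening))
print(cron_matches("0 9 * * 1-5", friday_evening))
```

Day-of-week `5` is Friday, which is why `0 18 * * 5` fires at the end of the working week rather than on Monday.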

7. Multiple profiles and sub-agents

7.1 Multi-Profile

Hermes can run several isolated instances, each with its own memory and skills:

# Create a new profile
hermes profile create work --base default

# List all profiles
hermes profile list
# → default (active)
# → work

# Switch profiles
hermes profile switch work

# Run a task under a specific profile
hermes run --profile work "Check the production logs"

# ~/.hermes/profiles/work.yaml
profile:
  name: work
  description: "Work environment, connected to company infrastructure"
  
  llm:
    provider: "openai"
    model: "gpt-4o"
    temperature: 0.3  # more conservative for work scenarios
    
  memory:
    namespace: "work"  # separate memory namespace
    retention_days: 180
    
  mcp_servers:
    - name: "github"
      config:
        org: "my-company"
    - name: "jira"
      config:
        server: "https://my-company.atlassian.net"
    - name: "postgres"
      config:
        url: "${WORK_DB_URL}"
  
  gateway:
    platforms: ["slack", "email"]
    routing:
      default_platform: "slack"

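7.2 Sub-Agent workflows

For complex tasks, Hermes can decompose the work and delegate it to specialized sub-agents:

# Sub-agent orchestration example (pseudocode)
class SubAgentOrchestrator:
    """Sub-agent orchestrator"""
    
    async def decompose_task(self, task: str) -> List[SubTask]:
        """Break a complex task into subtasks"""
        decomposition = await self.llm.generate(
            prompt=f"""
            Break the following task into independent subtasks:
            {task}
            
            Requirements:
            1. Each subtask has clearly defined inputs and outputs
            2. Mark the dependencies between subtasks
            3. Specify the best-suited sub-agent type
            """,
            response_format=TaskDecomposition
        )
        return decomposition.subtasks
    
    async def execute(self, task: str):
        """Run the orchestration flow"""
        subtasks = await self.decompose_task(task)
        
        # Build the execution graph
        graph = self._build_dag(subtasks)
        
        # Execute in topological order
        results = {}
        for level in graph.topological_levels():
            # Subtasks at the same level can run in parallel
            tasks_at_level = [
                self._execute_subtask(st, results) 
                for st in level
            ]
            level_results = await asyncio.gather(*tasks_at_level)
            for st, result in zip(level, level_results):
                results[st.id] = result
        
        return self._synthesize(results)

# Example: full-stack code review
# Main task: "Review this PR for code quality, security, and performance"
#
# Decomposition:
# ├── Code quality sub-agent: checks style, complexity, maintainability
# ├── Security review sub-agent: checks SQL injection, XSS, leaked secrets
# └── Performance analysis sub-agent: checks N+1 queries, memory leaks, algorithmic complexity
#
# The three sub-agents run in parallel and the main orchestrator merges their reports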
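
The orchestrator relies on `graph.topological_levels()`, which the pseudocode does not define. A minimal sketch of that step using Kahn's algorithm is given here. The function name and the dict-based graph shape are assumptions, and every dependency is assumed to also appear as a key.

```python
from typing import Dict, List


def topological_levels(dependencies: Dict[str, List[str]]) -> List[List[str]]:
    """Group task IDs into levels with Kahn's algorithm.

    dependencies maps each task ID to the IDs it depends on. Every task in
    a level depends only on tasks in earlier levels, so one level can run
    in parallel. Raises ValueError if no task is ready (for example a cycle).
    """
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    levels: List[List[str]] = []
    while remaining:
        ready = sorted(t for t, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle detected")
        levels.append(ready)
        for task in ready:
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return levels


review = {
    "quality": [],
    "security": [],
    "performance": [],
    "report": ["quality", "security", "performance"],
}
print(topological_levels(review))
```

For the code review example this yields one level with the three reviewers and a second level with the final report, which matches the parallel-then-merge flow described in the comments.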

8. Performance optimization and cost control

8.1 Token consumption optimization

Token consumption is the main cost driver when running Hermes in practice. A few key optimization strategies:

# Token optimization settings
class TokenOptimizer:
    """Token consumption optimizer"""
    
    # Strategy 1: tiered model selection
    MODEL_TIERS = {
        "simple_qa": "gpt-4o-mini",      # small model for simple Q&A
        "skill_execution": "gpt-4o",      # standard model for skill execution
        "complex_reasoning": "claude-3.5-sonnet",  # strong model for complex reasoning
        "code_generation": "gpt-4o",      # standard model for code generation
    }
    
    def select_model(self, task: Task) -> str:
        """Pick a model based on task complexity"""
        complexity = self._estimate_complexity(task)
        if complexity < 0.3:
            return self.MODEL_TIERS["simple_qa"]
        elif complexity < 0.7:
            return self.MODEL_TIERS["skill_execution"]
        else:
            return self.MODEL_TIERS["complex_reasoning"]
    
    # Strategy 2: context compression
    def compress_context(self, messages: List[Message]) -> List[Message]:
        """Compress the context while keeping key information"""
        if self._estimate_tokens(messages) < 4000:
            return messages
        
        # Keep the most recent messages plus a summary of the history
        recent = messages[-6:]
        older = messages[:-6]
        
        summary = self._summarize(older, max_tokens=500)
        return [Message(role="system", content=f"[History summary]\n{summary}")] + recent
    
    # Strategy 3: skill caching
    async def cached_skill_execution(self, skill: Skill, args: Dict):
        """Reuse cached results for identical skill calls"""
        cache_key = self._make_cache_key(skill.name, args)
        
        cached = await self.cache.get(cache_key)
        if cached and self._is_fresh(cached, max_age=3600):
            return cached.result
        
        result = await self._execute_skill(skill, args)
        await self.cache.set(cache_key, result)
        return result
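
Skill caching only works if identical calls produce identical keys. The `_make_cache_key` helper is not shown, so this is one way it could be written, assuming the arguments are JSON-serializable. `make_cache_key` is a hypothetical name.

```python
import hashlib
import json
from typing import Any, Dict


def make_cache_key(skill_name: str, args: Dict[str, Any]) -> str:
    """Build a deterministic key from the skill name and its arguments.

    Sorting keys makes the result independent of dict insertion order.
    Arguments must be JSON-serializable.
    """
    payload = json.dumps(
        {"skill": skill_name, "args": args},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = make_cache_key("service_monitor", {"url": "https://api.example.com/health", "timeout": 3})
key_b = make_cache_key("service_monitor", {"timeout": 3, "url": "https://api.example.com/health"})
print(key_a == key_b)
```

Hashing the canonical JSON avoids Python's built-in `hash`, which is randomized per process for strings and so cannot serve as a key in a shared or persistent cache.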

8.2 Memory index optimization

# Vector index performance optimization
from typing import Dict, List

import hnswlib

class OptimizedMemoryIndex:
    """High-performance memory index"""
    
    def __init__(self):
        # HNSW index (approximate nearest neighbor), roughly O(log n) queries
        self.index = hnswlib.Index(space='cosine', dim=1536)
        self.id_map: Dict[int, str] = {}
        self._init_index()
    
    def _init_index(self):
        """Initialize index parameters"""
        self.index.init_index(
            max_elements=100000,  # maximum number of memory entries
            ef_construction=200,  # build-time search depth (larger: more accurate, slower)
            M=32                  # links per node (larger: more accurate, more memory)
        )
        self.index.set_ef(100)  # query-time search depth
    
    async def search(self, query_embedding: List[float], k: int = 5):
        """Efficient vector search"""
        labels, distances = self.index.knn_query(query_embedding, k=k)
        return [
            (self.id_map[label], 1 - distance)  # convert cosine distance to similarity
            for label, distance in zip(labels[0], distances[0])
        ]
    
    async def add(self, embedding: List[float], memory_id: str):
        """Add a memory to the index"""
        idx = len(self.id_map)
        self.id_map[idx] = memory_id
        self.index.add_items([embedding], [idx])

8.3 Parallel task processing

# Parallel task processor
import asyncio
from typing import Any, Dict, List

class ParallelTaskProcessor:
    """Runs independent subtasks in parallel"""
    
    def __init__(self, max_concurrent: int = 5):
        self.semaphore = asyncio.Semaphore(max_concurrent)
        self.results: Dict[str, Any] = {}
    
    async def execute_dag(self, tasks: List[Task], dependencies: Dict[str, List[str]]):
        """Run tasks in parallel while respecting dependencies"""
        completed = set()
        running: Dict[asyncio.Task, str] = {}
        
        while len(completed) < len(tasks):
            # Start every task whose dependencies are all complete
            started = set(running.values())
            for t in tasks:
                if (t.id not in completed and t.id not in started
                        and all(d in completed for d in dependencies.get(t.id, []))):
                    running[asyncio.create_task(self._run_with_semaphore(t))] = t.id
            
            if not running:
                # Nothing is running and nothing can start: cycle or unknown dependency
                raise RuntimeError("unsatisfiable task dependencies")
            
            # Wait for at least one task to finish instead of polling
            done, _ = await asyncio.wait(running, return_when=asyncio.FIRST_COMPLETED)
            for finished in done:
                task_id = running.pop(finished)
                self.results[task_id] = finished.result()  # re-raises task errors
                completed.add(task_id)
        
        return self.results
    
    async def _run_with_semaphore(self, task):
        async with self.semaphore:
            return await self._execute(task)

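9. Security model

9.1 Permission levels

Hermes uses a strict tiered permission model:

┌──────────────────────────────────────────────────┐
│   Level 0: Read-only                             │
│   Read files, search code, view logs             │
├──────────────────────────────────────────────────┤
│   Level 1: Restricted write                      │
│   Create files (no overwrite), add comments      │
├──────────────────────────────────────────────────┤
│   Level 2: Standard operations                   │
│   Run tests, Git operations, manage containers   │
├──────────────────────────────────────────────────┤
│   Level 3: Sensitive operations                  │
│   Deploys, database changes, key operations      │
│   ⚠️ Requires user confirmation                  │
└──────────────────────────────────────────────────┘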
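
The four levels map naturally onto an ordered enum. The sketch below assumes a session is granted one maximum level and that Level 3 always asks for confirmation. `PermissionLevel`, `is_permitted`, and `needs_confirmation` are illustrative names, not Hermes APIs.

```python
from enum import IntEnum


class PermissionLevel(IntEnum):
    READ_ONLY = 0         # read files, search code, view logs
    RESTRICTED_WRITE = 1  # create files without overwriting, add comments
    STANDARD = 2          # run tests, Git operations, container management
    SENSITIVE = 3         # deploys, database changes, key operations


def is_permitted(required: PermissionLevel, granted: PermissionLevel) -> bool:
    """An operation is permitted when the granted level covers the required one."""
    return required <= granted


def needs_confirmation(required: PermissionLevel) -> bool:
    """Sensitive operations always require explicit user confirmation."""
    return required >= PermissionLevel.SENSITIVE


print(is_permitted(PermissionLevel.STANDARD, granted=PermissionLevel.RESTRICTED_WRITE))
print(needs_confirmation(PermissionLevel.SENSITIVE))
```

Using `IntEnum` keeps the comparison a plain integer check while giving each level a readable name in logs and configuration.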

9.2 沙箱执行

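# Sandbox executor
import asyncio
import re
import resource  # Unix only
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ExecutionResult:
    """Assumed result shape; the original snippet does not define it."""
    success: bool
    output: str = ""
    error: str = ""


class SandboxExecutor:
    """Run untrusted commands in a restricted environment."""

    ALLOWED_COMMANDS = {
        "python", "node", "npm", "pip", "uv",
        "git", "ls", "cat", "grep", "head", "tail",
        "docker", "kubectl"
    }

    DENIED_PATTERNS = [
        r"rm\s+-rf\s+/",           # dangerous deletion
        r"sudo\s+",                 # privilege escalation
        r"chmod\s+777",            # overly permissive mode
        r">\s*/etc/",               # writes to system configuration
        r"curl\s+.*\|\s*sh",       # piping a remote script into a shell
        r"eval\s+",                 # dynamic code execution
    ]

    def validate_command(self, command: str) -> Tuple[bool, str]:
        """Check whether a command is considered safe."""
        # Check the base command. Note: only the first token is matched
        # against the allow-list; because the command later runs through a
        # shell, chained commands (;, &&, |) are not validated individually.
        parts = command.split()
        base_cmd = parts[0] if parts else ""
        if base_cmd not in self.ALLOWED_COMMANDS:
            return False, f"command '{base_cmd}' is not in the allow-list"

        # Check for dangerous patterns
        for pattern in self.DENIED_PATTERNS:
            if re.search(pattern, command):
                return False, f"command matches a dangerous pattern: {pattern}"

        return True, "command passed validation"

    async def execute(self, command: str, timeout: int = 30) -> ExecutionResult:
        """Run a command inside the sandbox."""
        is_safe, reason = self.validate_command(command)
        if not is_safe:
            return ExecutionResult(
                success=False,
                output="",
                error=f"security check failed: {reason}"
            )

        proc = await asyncio.create_subprocess_shell(
            command,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
            # Resource limit: cap CPU seconds for the child process
            preexec_fn=lambda: resource.setrlimit(
                resource.RLIMIT_CPU, (timeout, timeout)
            )
        )
        try:
            stdout, stderr = await asyncio.wait_for(
                proc.communicate(), timeout=timeout + 5
            )
            return ExecutionResult(
                success=proc.returncode == 0,
                output=stdout.decode(),
                error=stderr.decode()
            )
        except asyncio.TimeoutError:
            proc.kill()
            await proc.wait()  # reap the killed process
            return ExecutionResult(success=False, output="", error="execution timed out")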
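
One caveat about the validator: `validate_command` allow-lists only the first token, while `execute` hands the whole string to a shell, so a chained command such as `ls; <anything>` passes the allow-list check. The sketch below shows one possible way to tighten this, under the assumption that shell features (pipes, redirects, substitution) are not needed. `tokenize_strict` and `SHELL_METACHARS` are hypothetical names, not part of Hermes.

```python
import shlex
from typing import List, Set, Tuple

# Characters that let a shell chain, redirect, or substitute commands.
SHELL_METACHARS = set(";&|<>`$()")


def tokenize_strict(command: str, allowed: Set[str]) -> Tuple[bool, List[str]]:
    """Split `command` into argv, rejecting anything a shell would reinterpret."""
    # Conservative: metacharacters are rejected even inside quotes.
    if any(ch in SHELL_METACHARS for ch in command):
        return False, []
    try:
        argv = shlex.split(command)
    except ValueError:  # for example, unbalanced quotes
        return False, []
    if not argv or argv[0] not in allowed:
        return False, []
    return True, argv
```

The resulting `argv` can then be passed to `asyncio.create_subprocess_exec(*argv, ...)`, which starts the program directly without a shell. The trade-off is that legitimate arguments containing those characters, such as a `grep` pattern with `|`, are rejected as well.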

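10. Hands-On: Building a 24/7 AI Ops Assistant from Scratch

Let's tie together everything covered so far and build a working AI ops assistant.

10.1 Goals

Online 24/7, with remote interaction through Telegram
Automatic monitoring of service health
Proactive alerts on anomalies, plus an attempt at self-healing
A daily operations report
Memory of past failure patterns, so troubleshooting advice gets more precise over time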

10.2 Step 1: Installation and Basic Configuration

# 1. One-line install
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.zshrc  # or ~/.bashrc, depending on your shell

# 2. Initial setup
hermes setup \
    --llm-provider openai \
    --llm-model gpt-4o \
    --telegram-bot \
    --memory-backend sqlite

# 3. Configure the LLM API key
hermes config set llm.api_key "${OPENAI_API_KEY}"

# 4. Configure the Telegram bot
hermes gateway add telegram --token "${TELEGRAM_BOT_TOKEN}"

10.3 Step 2: Writing a Custom Monitoring Skill

# ~/.hermes/skills/service_monitor.yaml
skill:
  name: service_monitor
  version: "1.0"
  description: "Monitor service health; supports HTTP/TCP/Docker checks"
  
  trigger:
    keywords: ["monitoring", "monitor", "health check", "service status"]
    cron: "*/5 * * * *"  # run automatically every 5 minutes
  
  execution:
    steps:
      - name: "Check all endpoints"
        action: "parallel_health_check"
        args:
          endpoints:
            - name: "Main site API"
              url: "https://api.example.com/health"
              method: "GET"
              expected:
                status: 200
                response_time_ms: 500
            - name: "Database"
              type: "tcp"
              host: "db.internal"
              port: 5432
              timeout: 3
            - name: "Redis"
              type: "tcp"
              host: "redis.internal"
              port: 6379
              timeout: 2
            - name: "Elasticsearch"
              url: "https://es.internal:9200/_cluster/health"
              expected:
                status: 200
                body_contains: '"status":"green"'
        output_key: "health_results"
      
      - name: "Anomaly analysis"
        action: "analyze_anomalies"
        args:
          results: "${health_results}"
          history: "${memory.monitoring_history}"
        output_key: "anomalies"
      
      - name: "Self-healing attempt"
        action: "auto_remediate"
        condition: "${anomalies.has_critical} == true"
        args:
          anomalies: "${anomalies.critical}"
          strategies: "${memory.remediation_strategies}"
          max_attempts: 2
        requires_confirmation: false  # low-risk remediation runs automatically
        output_key: "remediation_result"
      
      - name: "Generate report"
        action: "generate_report"
        args:
          health: "${health_results}"
          anomalies: "${anomalies}"
          remediation: "${remediation_result}"
        output_key: "report"
  
  output:
    on_success:
      store_memory: true
      notify:
        condition: "${anomalies.count} > 0"
        platform: "telegram"
        message: |
          ⚠️ Monitoring alert
          
          Anomalies: ${anomalies.count}
          Highest severity: ${anomalies.max_severity}
          
          Details:
          {{#each anomalies.items}}
          - [{{severity}}] {{name}}: {{message}}
          {{/each}}
          
          {{#if remediation_result.success}}
          ✅ Auto-remediated: ${remediation_result.actions}
          {{else}}
          ❌ Auto-remediation failed; manual intervention required
          {{/if}}
    
    on_failure:
      notify:
        platform: "telegram"
        message: "🚨 The monitoring skill failed to run; please check the Hermes logs"
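
The skill file delegates the actual probing to a `parallel_health_check` action whose implementation is not shown. As a rough idea of what the TCP part of such an action might do (for the database and Redis entries), here is a minimal asyncio sketch. The function names `check_tcp` and `check_all` and the tuple-based endpoint format are assumptions made for this example, not the Hermes API.

```python
import asyncio
from typing import Dict, List, Tuple


async def check_tcp(host: str, port: int, timeout: float) -> bool:
    """Return True if a TCP connection opens within `timeout` seconds."""
    try:
        _, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout=timeout
        )
    except (OSError, asyncio.TimeoutError):
        return False  # refused, unreachable, or too slow
    writer.close()
    await writer.wait_closed()
    return True


async def check_all(endpoints: List[Tuple[str, str, int, float]]) -> Dict[str, bool]:
    """Probe every (name, host, port, timeout) endpoint concurrently."""
    results = await asyncio.gather(
        *(check_tcp(host, port, timeout) for _, host, port, timeout in endpoints)
    )
    return {name: ok for (name, *_), ok in zip(endpoints, results)}
```

Called as `asyncio.run(check_all([("Database", "db.internal", 5432, 3), ("Redis", "redis.internal", 6379, 2)]))`, it returns a name-to-status mapping. Because the probes run concurrently, the total wait is bounded by the slowest timeout rather than the sum of all of them.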

10.4 Step 3: Configuring Scheduled Tasks

# ~/.hermes/cron.yaml
cron:
  # Health check every 5 minutes
  - name: "health_check"
    schedule: "*/5 * * * *"
    skill: "service_monitor"
    notify_via: "telegram"
  
  # Daily ops report at 9:00 (the schedule below runs Monday through Friday only)
  - name: "daily_report"
    schedule: "0 9 * * 1-5"
    skill: "ops_daily_report"
    args:
      sources: ["monitoring_history", "incident_log", "deployment_log"]
      period: "24h"
    notify_via: "telegram"
  
  # Backup verification every Sunday evening
  - name: "backup_verify"
    schedule: "0 22 * * 0"
    skill: "backup_check"
    args:
      backup_dirs: ["/data/backups", "/data/snapshots"]
      min_age_hours: 24
      verify_integrity: true
    notify_via: "email"
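
To see which moments the three schedules above select, here is a small, deliberately simplified matcher for five-field cron expressions. It is an illustration only, not how Hermes schedules jobs: `field_matches` and `cron_matches` are hypothetical helpers, step values are treated as `value % n == 0` (accurate for the minute and hour fields), all five fields are combined with AND, and `7` as an alias for Sunday is not handled.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', '*/n', 'a-b', 'a,b', or a plain number."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif "-" in part:
            low, high = map(int, part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Check a five-field expression: minute hour day month weekday."""
    minute, hour, day, month, weekday = expr.split()
    # cron counts Sunday as 0, while datetime.weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )
```

With this matcher, `"0 9 * * 1-5"` matches 09:00 on a Friday but not on a Saturday, and `"0 22 * * 0"` matches 22:00 on Sundays only.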

10.5 Step 4: Starting the Service

# Start the Hermes daemon
hermes serve --daemon

# Check status
hermes status
# → Core: running (pid 12345)
# → Gateway: telegram ✓, email ✓
# → Cron: 3 jobs active
# → Memory: 1,247 entries
# → Skills: 43 (5 custom)

# Follow the logs
hermes logs --tail -f

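10.6 A Sample Interaction

Talking to your ops assistant in Telegram:

You: The database seems to have slowed down. Can you take a look?

Hermes: Let me check the state of PostgreSQL...

📊 PostgreSQL diagnostics:

1. Current connections: 187/200 (93.5%) ← ⚠️ close to the limit
2. Slow queries (>1s): 23 in the past hour
3. Lock waits: 5 sessions waiting on locks
4. Cache hit ratio: 94.2% (normally >99%)

🔍 Root cause analysis:
The connection count is abnormally high, and most of it comes from API service instance app-server-03.
That instance started opening a large number of short-lived connections at 14:32, which suggests a misconfigured connection pool.

💡 Recommended actions:
1. Immediately: terminate idle connections from app-server-03 (expected to free about 45 connections)
2. Short term: review the connection pool configuration on app-server-03
3. Long term: raise max_connections to 300

Shall I carry out action 1? (I helped you with a similar issue last time. That one was a connection pool leak, and this pattern looks much the same.)

You: Go ahead

Hermes: ✅ Terminated 47 idle connections from app-server-03
Current connections: 140/200 (70%)
Slow queries are back to normal levels

I have recorded this incident in the skill library. Next time the connection count spikes like this, I will check the connection pool configuration first.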

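11. Comparison with Other Agent Frameworks

Dimension	Hermes Agent	OpenAI Codex	Claude Code	AutoGPT
Self-learning	✅ Experience → skill loop	❌ None	❌ None	⚠️ Limited (manual configuration required)
Cross-session memory	✅ Three-tier memory system	❌ Each session is independent	⚠️ CLAUDE.md	⚠️ Local files
Multi-platform access	✅ 12+ platforms	❌ CLI only	❌ CLI only	❌ CLI only
Serverless deployment	✅ 7 backends	❌	❌	❌
Skill reuse	✅ Auto-generated + hand-written	❌	⚠️ Slash Commands	⚠️ Plugin system
MCP support	✅ Full	❌	✅	❌
Scheduled tasks	✅ Built-in Cron	❌	❌	❌
Open-source license	MIT	❌ Closed source	❌ Closed source	MIT
Cost model	Bring your own API key	Subscription	Subscription	Bring your own API key

The core differentiator for Hermes is its self-learning loop. After a month of use, the other frameworks behave the same as they did on day one, whereas Hermes gets to know you better the more you use it: your tech stack, your coding habits, your project architecture, and the kinds of problems you run into most often.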

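12. Limitations and Future Directions

12.1 Current Limitations

Memory accuracy: semantic retrieval over long-term memory occasionally returns results that are not very relevant, especially when there are few memory entries (the cold-start problem)
Uneven skill quality: the quality of auto-generated skill files depends on the LLM's reasoning ability, and complex scenarios may need manual tuning
Cross-platform formatting: the Gateway's message formatting is not yet well adapted to every platform, WhatsApp and Signal in particular
Resource usage: the three-tier memory system, vector index, and Cron scheduler together use roughly 2-4 GB of memory, so a $5 VPS may need tuning
Hard to debug: when a skill fails, tracing the chain of events (memory retrieval → skill selection → tool call) is difficult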

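12.2 Future Directions

According to the Nous Research roadmap, the main upcoming directions are:

Skill marketplace: similar to the VS Code extension marketplace, where users can share and install community skills
Multi-user collaboration: memory sharing and task coordination across multiple Hermes instances
Local model support: integration with Ollama/LM Studio for fully offline operation
Visual debugger: a Web UI for inspecting memory state, skill execution traces, and Cron scheduling history
Federated Learning: securely sharing improved skills across multiple Hermes instances without leaking user data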

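13. Summary

The question Hermes Agent answers is not "can AI write code?" but "can AI become an assistant that keeps getting better?"

In the 2026 AI Agent landscape, the intelligence of a single conversation is no longer the bottleneck: GPT-4o, Claude 3.5, and Gemini can all write clean code. The real differentiators are continuity and growth:

Continuity: is what you taught it yesterday still there today? Will it remember last week's context next week?
Growth: can it learn from its own successes and failures? Does it actively optimize the way it works?

Hermes addresses continuity with its three-tier memory system and growth with its self-learning loop. This is not an incremental improvement but a paradigm-level shift, from "tool" to "partner".

143K stars is no accident. When an AI Agent goes from "use it and leave" to "the more you use it, the better it knows you", usage patterns shift from "occasional calls" to "continuous reliance". That is the ultimate form of the AI Agent.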

Reference Links

Hermes Agent GitHub repository: https://github.com/NousResearch/hermes-agent
Nous Research website: https://nousresearch.com
Honcho protocol specification: https://github.com/NousResearch/honcho
MCP official documentation: https://modelcontextprotocol.io
Hermes Agent official documentation: https://hermes-agent.readthedocs.io