ML Intern (GitHub Repo)
Hugging Face released ML Intern, an open-source autonomous coding agent that researches, writes, and ships machine learning projects using the Hugging Face ecosystem.
Deep dive
- The agent operates in an agentic loop with up to 300 iterations, cycling between LLM calls, tool execution, and result processing until the task completes (a minimal loop sketch follows this list)
- Built-in tools span the ML workflow: Hugging Face docs research, dataset access, repository management, paper search, GitHub code search, sandbox execution, and planning
- Context management automatically compacts message history at 170k tokens and can upload sessions to Hugging Face for persistence (a compaction sketch also follows this list)
- Approval workflows gate sensitive actions such as launching cloud jobs, sandbox execution, and destructive operations, with notifications sent via Slack or other gateways
- The doom loop detector monitors for repeated tool usage patterns and injects corrective prompts to break unproductive cycles
- Uses litellm for model abstraction, supporting both Anthropic Claude and OpenAI models with configurable parameters
- Events are emitted throughout execution for monitoring (processing, tool calls, approvals, errors, completion), enabling external integrations
- Extensible architecture allows adding custom built-in tools via Python handlers or integrating MCP servers through configuration
- Installation uses the uv package manager for fast Python environment setup, and the agent can be installed globally as a CLI tool
- Slack integration supports one-way notifications for approval requests, errors, and task completion with configurable auto-events
- The tool router centralizes all tool execution with unified error handling and result formatting
- Session state includes full message history using litellm message format, enabling resume and replay capabilities
- Headless mode with auto-approval enables fully autonomous execution for batch processing or CI/CD integration
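To make the loop and litellm bullets above concrete, here is a minimal sketch of how such an agentic loop can be written on top of litellm. The execute_tool dispatcher, message bookkeeping, and stop condition are illustrative assumptions, not ML Intern's actual code:

# Conceptual agentic loop on top of litellm (illustrative, not the project's code).
import json
import litellm

MAX_ITERATIONS = 300  # mirrors the iteration cap described above

async def execute_tool(name: str, args: dict) -> str:
    # Hypothetical dispatcher; ML Intern routes this through its ToolRouter.
    return f"(result of {name} with {args})"

async def run_agent(messages: list[dict], tools: list[dict], model: str) -> str:
    for _ in range(MAX_ITERATIONS):
        response = await litellm.acompletion(model=model, messages=messages, tools=tools)
        msg = response.choices[0].message
        if not msg.tool_calls:              # no tool calls means the task is done
            return msg.content or ""
        messages.append(msg.model_dump())   # keep the assistant turn in history
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments or "{}")
            result = await execute_tool(call.function.name, args)
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    return "Stopped: iteration limit reached."

The real loop also streams tokens, checks approvals, and runs the doom loop detector between the LLM call and tool execution.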
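The 170k-token auto-compaction could, in spirit, look like the sketch below. litellm's token_counter is a real helper, but the summarize-the-head strategy is an assumption about how compaction might work, not the project's implementation:

# Illustrative auto-compaction sketch (not ML Intern's actual ContextManager).
import litellm

COMPACTION_THRESHOLD = 170_000  # token budget described above

async def maybe_compact(messages: list[dict], model: str) -> list[dict]:
    if litellm.token_counter(model=model, messages=messages) < COMPACTION_THRESHOLD:
        return messages
    head, tail = messages[:-10], messages[-10:]   # keep the latest turns verbatim
    summary = await litellm.acompletion(
        model=model,
        messages=head + [{"role": "user", "content": "Summarize the conversation so far."}],
    )
    summary_text = summary.choices[0].message.content or ""
    return [{"role": "system", "content": "Summary of earlier context: " + summary_text}] + tail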
Decoder
- Agentic loop: A recurring cycle where an AI agent calls an LLM, receives tool instructions, executes them, and feeds results back to the LLM until a task completes
- MCP servers: Model Context Protocol servers that provide external tools and capabilities to AI agents through a standardized interface
- litellm: A library that provides a unified API for calling different LLM providers (Anthropic, OpenAI, etc.) with the same code
- uv: A fast Python package manager and environment tool, an alternative to pip and virtualenv
- Context manager: A component that tracks conversation history and automatically compresses it when token limits approach
- Tool router: A dispatcher that maps tool names to their implementations and handles execution with error handling
- Doom loop: A failure pattern where an agent repeatedly executes the same unsuccessful action without making progress (a detection sketch follows this list)
- Headless mode: Running the agent non-interactively with a single prompt, automatically approving all actions without user confirmation
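To ground the doom-loop entry above, here is a hedged sketch of a detector that watches for repeated identical tool calls and returns a corrective prompt to inject; the window size and the prompt wording are assumptions, not the project's actual logic:

# Sketch of a doom-loop check (illustrative; ML Intern's detector may differ).
from collections import deque

class DoomLoopDetector:
    """Flags N identical consecutive tool calls and suggests a corrective prompt."""

    def __init__(self, window: int = 3):
        self.recent: deque[tuple[str, str]] = deque(maxlen=window)

    def check(self, tool_name: str, arguments: str) -> str | None:
        self.recent.append((tool_name, arguments))
        if len(self.recent) == self.recent.maxlen and len(set(self.recent)) == 1:
            return (f"You have called {tool_name} with the same arguments "
                    f"{self.recent.maxlen} times in a row. Step back, reconsider "
                    "your plan, and try a different approach.")
        return None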
Original article
ML Intern
An ML intern that autonomously researches, writes, and ships good quality ML related code using the Hugging Face ecosystem — with deep access to docs, papers, datasets, and cloud compute.
Quick Start
Installation
git clone git@github.com:huggingface/ml-intern.git
cd ml-intern
uv sync
uv tool install -e .
That's it. Now ml-intern works from any directory:
ml-intern
Create a .env file in the project root (or export these in your shell):
ANTHROPIC_API_KEY=<your-anthropic-api-key> # if using anthropic models
OPENAI_API_KEY=<your-openai-api-key> # if using openai models
HF_TOKEN=<your-hugging-face-token>
GITHUB_TOKEN=<github-personal-access-token>
If no HF_TOKEN is set, the CLI will prompt you to paste one on first launch. To get a GITHUB_TOKEN, follow the tutorial here.
Usage
Interactive mode (start a chat session):
ml-intern
Headless mode (single prompt, auto-approve):
ml-intern "fine-tune llama on my dataset"
Options:
ml-intern --model anthropic/claude-opus-4-6 "your prompt"
ml-intern --model openai/gpt-5.5 "your prompt"
ml-intern --max-iterations 100 "your prompt"
ml-intern --no-stream "your prompt"
Supported Gateways
ML Intern currently supports one-way notification gateways from CLI sessions. These gateways send out-of-band status updates; they do not accept inbound chat messages.
Slack
Slack notifications use the Slack Web API to post messages when the agent needs approval, hits an error, or completes a turn. Create a Slack app with a bot token that has the chat:write scope, invite the bot to the target channel, then set:
SLACK_BOT_TOKEN=xoxb-...
SLACK_CHANNEL_ID=C...
The CLI automatically creates a slack.default destination when both variables are present. Optional environment variables for the env-only default:
ML_INTERN_SLACK_NOTIFICATIONS=false
ML_INTERN_SLACK_DESTINATION=slack.ops
ML_INTERN_SLACK_AUTO_EVENTS=approval_required,error,turn_complete
ML_INTERN_SLACK_ALLOW_AGENT_TOOL=true
ML_INTERN_SLACK_ALLOW_AUTO_EVENTS=true
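For context, a notification posted through the Slack Web API's chat.postMessage endpoint looks roughly like the sketch below; ML Intern's gateway wraps this call for you, so this is only an illustration of what happens under the hood:

# Minimal illustration of a Slack Web API notification (chat.postMessage).
import os
import requests

def notify_slack(text: str) -> None:
    resp = requests.post(
        "https://slack.com/api/chat.postMessage",
        headers={"Authorization": f"Bearer {os.environ['SLACK_BOT_TOKEN']}"},
        json={"channel": os.environ["SLACK_CHANNEL_ID"], "text": text},
        timeout=10,
    )
    resp.raise_for_status()
    if not resp.json().get("ok"):
        raise RuntimeError(f"Slack API error: {resp.json().get('error')}")

notify_slack("ml-intern: approval required for a cloud job launch")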
For a persistent user-level config, put overrides in ~/.config/ml-intern/cli_agent_config.json or point ML_INTERN_CLI_CONFIG at a JSON file:
{
  "messaging": {
    "enabled": true,
    "auto_event_types": ["approval_required", "error", "turn_complete"],
    "destinations": {
      "slack.ops": {
        "provider": "slack",
        "token": "${SLACK_BOT_TOKEN}",
        "channel": "${SLACK_CHANNEL_ID}",
        "allow_agent_tool": true,
        "allow_auto_events": true
      }
    }
  }
}
Architecture
Component Overview
┌─────────────────────────────────────────────────────────────┐
│ User/CLI │
└────────────┬─────────────────────────────────────┬──────────┘
│ Operations │ Events
↓ (user_input, exec_approval, ↑
submission_queue interrupt, compact, ...) event_queue
│ │
↓ │
┌────────────────────────────────────────────────────┐ │
│ submission_loop (agent_loop.py) │ │
│ ┌──────────────────────────────────────────────┐ │ │
│ │ 1. Receive Operation from queue │ │ │
│ │ 2. Route to handler (run_agent/compact/...) │ │ │
│ └──────────────────────────────────────────────┘ │ │
│ ↓ │ │
│ ┌──────────────────────────────────────────────┐ │ │
│ │ Handlers.run_agent() │ ├──┤
│ │ │ │ │
│ │ ┌────────────────────────────────────────┐ │ │ │
│ │ │ Agentic Loop (max 300 iterations) │ │ │ │
│ │ │ │ │ │ │
│ │ │ ┌──────────────────────────────────┐ │ │ │ │
│ │ │ │ Session │ │ │ │ │
│ │ │ │ ┌────────────────────────────┐ │ │ │ │ │
│ │ │ │ │ ContextManager │ │ │ │ │ │
│ │ │ │ │ • Message history │ │ │ │ │ │
│ │ │ │ │ (litellm.Message[]) │ │ │ │ │ │
│ │ │ │ │ • Auto-compaction (170k) │ │ │ │ │ │
│ │ │ │ │ • Session upload to HF │ │ │ │ │ │
│ │ │ │ └────────────────────────────┘ │ │ │ │ │
│ │ │ │ │ │ │ │ │
│ │ │ │ ┌────────────────────────────┐ │ │ │ │ │
│ │ │ │ │ ToolRouter │ │ │ │ │ │
│ │ │ │ │ ├─ HF docs & research │ │ │ │ │ │
│ │ │ │ │ ├─ HF repos, datasets, │ │ │ │ │ │
│ │ │ │ │ │ jobs, papers │ │ │ │ │ │
│ │ │ │ │ ├─ GitHub code search │ │ │ │ │ │
│ │ │ │ │ ├─ Sandbox & local tools │ │ │ │ │ │
│ │ │ │ │ ├─ Planning │ │ │ │ │ │
│ │ │ │ │ └─ MCP server tools │ │ │ │ │ │
│ │ │ │ └────────────────────────────┘ │ │ │ │ │
│ │ │ └──────────────────────────────────┘ │ │ │ │
│ │ │ │ │ │ │
│ │ │ ┌──────────────────────────────────┐ │ │ │ │
│ │ │ │ Doom Loop Detector │ │ │ │ │
│ │ │ │ • Detects repeated tool patterns │ │ │ │ │
│ │ │ │ • Injects corrective prompts │ │ │ │ │
│ │ │ └──────────────────────────────────┘ │ │ │ │
│ │ │ │ │ │ │
│ │ │ Loop: │ │ │ │
│ │ │ 1. LLM call (litellm.acompletion) │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 2. Parse tool_calls[] │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 3. Approval check │ │ │ │
│ │ │ (jobs, sandbox, destructive ops) │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 4. Execute via ToolRouter │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 5. Add results to ContextManager │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 6. Repeat if tool_calls exist │ │ │ │
│ │ └────────────────────────────────────────┘ │ │ │
│ └──────────────────────────────────────────────┘ │ │
└────────────────────────────────────────────────────┴──┘
Agentic Loop Flow
User Message
↓
[Add to ContextManager]
↓
╔═══════════════════════════════════════════╗
║ Iteration Loop (max 300) ║
║ ║
║ Get messages + tool specs ║
║ ↓ ║
║ litellm.acompletion() ║
║ ↓ ║
║ Has tool_calls? ──No──> Done ║
║ │ ║
║ Yes ║
║ ↓ ║
║ Add assistant msg (with tool_calls) ║
║ ↓ ║
║ Doom loop check ║
║ ↓ ║
║ For each tool_call: ║
║ • Needs approval? ──Yes──> Wait for ║
║ │ user confirm ║
║ No ║
║ ↓ ║
║ • ToolRouter.execute_tool() ║
║ • Add result to ContextManager ║
║ ↓ ║
║ Continue loop ─────────────────┐ ║
║ ↑ │ ║
║ └───────────────────────┘ ║
╚═══════════════════════════════════════════╝
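The "Needs approval?" gate in the flow above can be pictured as a check like the following sketch; the tool names and the queue-based confirmation are assumptions for illustration, not the project's actual API:

# Illustrative approval gate for sensitive tool calls (names are assumptions).
import asyncio

SENSITIVE_TOOLS = {"launch_job", "run_sandbox", "delete_repo"}  # hypothetical names

async def gate_tool_call(tool_name: str, approval_queue: asyncio.Queue) -> bool:
    """Return True if the call may proceed; block on user approval when needed."""
    if tool_name not in SENSITIVE_TOOLS:
        return True
    # Emit an approval_required event, then wait for the user's decision.
    print(f"approval_required: {tool_name}")
    decision: str = await approval_queue.get()  # e.g. "approve" or "deny" from the CLI
    return decision == "approve"

In headless mode this gate is effectively bypassed, since all actions are auto-approved.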
Events
The agent emits the following events via event_queue:
- processing - Starting to process user input
- ready - Agent is ready for input
- assistant_chunk - Streaming token chunk
- assistant_message - Complete LLM response text
- assistant_stream_end - Token stream finished
- tool_call - Tool being called with arguments
- tool_output - Tool execution result
- tool_log - Informational tool log message
- tool_state_change - Tool execution state transition
- approval_required - Requesting user approval for sensitive operations
- turn_complete - Agent finished processing
- error - Error occurred during processing
- interrupted - Agent was interrupted
- compacted - Context was compacted
- undo_complete - Undo operation completed
- shutdown - Agent shutting down
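An embedding application would typically drain event_queue in a loop along these lines; the event shape shown (a dict with a "type" key) is an assumption for illustration, not the documented schema:

# Sketch of an external integration consuming agent events (event shape assumed).
import asyncio

async def consume_events(event_queue: asyncio.Queue) -> None:
    while True:
        event = await event_queue.get()
        kind = event["type"]
        if kind == "approval_required":
            pass  # surface an approval prompt to the user
        elif kind in {"error", "turn_complete"}:
            pass  # forward to a notification gateway such as Slack
        elif kind == "shutdown":
            break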
Development
Adding Built-in Tools
Edit agent/core/tools.py:
def create_builtin_tools() -> list[ToolSpec]:
    return [
        ToolSpec(
            name="your_tool",
            description="What your tool does",
            parameters={
                "type": "object",
                "properties": {
                    "param": {"type": "string", "description": "Parameter description"}
                },
                "required": ["param"]
            },
            handler=your_async_handler
        ),
        # ... existing tools
    ]
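For completeness, a handler matching the ToolSpec above might look like the sketch below; the exact signature the router expects is an assumption based on the parameters schema:

# Hypothetical async handler for the tool declared above.
async def your_async_handler(param: str) -> str:
    # Do the real work here and return text the LLM can read as the tool result.
    return f"your_tool received: {param}"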
Adding MCP Servers
Edit configs/cli_agent_config.json for CLI defaults, or configs/frontend_agent_config.json for web-session defaults:
{
  "model_name": "anthropic/claude-sonnet-4-5-20250929",
  "mcpServers": {
    "your-server-name": {
      "transport": "http",
      "url": "https://example.com/mcp",
      "headers": {
        "Authorization": "Bearer ${YOUR_TOKEN}"
      }
    }
  }
}
Note: Environment variables like ${YOUR_TOKEN} are auto-substituted from .env.