This guide covers local development where you configure LLMs via config files and have full control over the execution environment.
Want a faster setup? Check out the Platform Quickstart: no LLM config files needed, and observability is built in!
Make sure you’ve completed the Installation steps before proceeding.

Configure LLM Settings

1. Create LLM Config File

Create an llm-config.override.jsonc file to configure your LLM models. This file overrides the default configuration.
llm-config.override.jsonc
// Your custom LLM configuration
{
  "planner": {
    "provider": "openai",
    "model": "gpt-5-nano"
  },
  "orchestrator": {
    "provider": "openai",
    "model": "gpt-5-nano"
  },
  "cortex": {
    "provider": "openai",
    "model": "gpt-5",
    "fallback": {
      "provider": "openai",
      "model": "gpt-5"
    }
  },
  "executor": {
    "provider": "openai",
    "model": "gpt-5-nano"
  },
  "utils": {
    "hopper": {
      // Needs at least a 256k context window
      "provider": "openai",
      "model": "gpt-5-nano"
    },
    "outputter": {
      "provider": "openai",
      "model": "gpt-5-nano"
    }
  }
}
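Note that JSONC (JSON with comments) is not valid JSON, so Python's standard json module cannot parse it directly. The SDK handles this internally, but if you want to inspect or validate the file from your own scripts, a minimal comment-stripping loader is enough. This is a sketch, not part of the SDK, and it only handles whole-line `//` comments (not `//` inside string values):

```python
import json
import re

def load_jsonc(path: str) -> dict:
    """Load a JSONC file by removing // line comments before parsing."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # Naive strip: drops lines that are entirely a // comment
    stripped = re.sub(r"^\s*//.*$", "", text, flags=re.MULTILINE)
    return json.loads(stripped)
```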

2. Configure Environment Variables

Create a .env file in your project root with the necessary API keys:
.env
# LLM API Keys (only include the ones you need)
OPENAI_API_KEY=your_key_here
XAI_API_KEY=your_key_here
OPEN_ROUTER_API_KEY=your_key_here
GOOGLE_API_KEY=your_key_here

# Optional: For local LLMs or custom OpenAI-compatible endpoints
# OPENAI_BASE_URL=http://localhost:1234/v1
Never commit your .env file to version control. Add it to your .gitignore.
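The SDK reads these keys from the process environment. If you run scripts directly, a loader such as python-dotenv can populate the environment from your .env file; conceptually it does something like this stdlib-only sketch (which ignores quoting and `export` syntax):

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Populate os.environ from KEY=VALUE lines, skipping comments and blanks.

    Variables already set in the environment take precedence over file values.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```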

Creating Your First Automation

Let’s write a simple script that opens a calculator app and performs a basic calculation.
For more examples, check out the mobile-use SDK examples directory on GitHub.
calculator_demo.py
import asyncio
from minitap.mobile_use.sdk import Agent
from minitap.mobile_use.sdk.types import AgentProfile
from minitap.mobile_use.sdk.builders import Builders

async def main():
    # Create an agent profile
    default_profile = AgentProfile(
        name="default", 
        from_file="llm-config.override.jsonc"
    )
    
    # Configure the agent
    agent_config = Builders.AgentConfig.with_default_profile(default_profile).build()
    agent = Agent(config=agent_config)
    
    try:
        # Initialize the agent (connect to the first available device)
        agent.init()
        
        # Define a simple task goal
        result = await agent.run_task(
            goal="Open the calculator app, calculate 123 * 456, and tell me the result",
            name="calculator_demo"
        )
        
        # Print the result
        print(f"Result: {result}")
        
    except Exception as e:
        print(f"Error: {e}")
    finally:
        # Always clean up when finished
        agent.clean()

if __name__ == "__main__":
    asyncio.run(main())

Run the script

python calculator_demo.py
1. Initialize the Agent: the agent connects to your device and starts the required servers.
2. Execute the Task: the agent interprets your goal, navigates the UI, and performs the calculation.
3. Clean Up: resources are properly released.

Getting Structured Output

The mobile-use SDK can return structured data using Pydantic models:
structured_output.py
import asyncio
from pydantic import BaseModel, Field
from minitap.mobile_use.sdk import Agent
from minitap.mobile_use.sdk.types import AgentProfile
from minitap.mobile_use.sdk.builders import Builders

# Define a model for structured output
class CalculationResult(BaseModel):
    expression: str = Field(..., description="The mathematical expression calculated")
    result: float = Field(..., description="The result of the calculation")
    app_used: str = Field(..., description="The name of the calculator app used")

async def main():
    # Create an agent
    default_profile = AgentProfile(
        name="default", 
        from_file="llm-config.override.jsonc"
    )
    agent_config = Builders.AgentConfig.with_default_profile(default_profile).build()
    agent = Agent(config=agent_config)
    
    try:
        agent.init()
        
        # Request structured output using Pydantic model
        result = await agent.run_task(
            goal="Open the calculator app, calculate 123 * 456, and tell me the result",
            output=CalculationResult,
            name="structured_calculator"
        )
        
        if result:
            print(f"Expression: {result.expression}")
            print(f"Result: {result.result}")
            print(f"App used: {result.app_used}")
        
    finally:
        agent.clean()

if __name__ == "__main__":
    asyncio.run(main())
Using Pydantic models ensures type-safe, validated output from your automation tasks.
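The type safety comes from Pydantic's validation: if the agent's output doesn't match the schema, construction fails loudly instead of silently passing bad data through. A quick standalone illustration (no device or SDK needed, just pydantic):

```python
from pydantic import BaseModel, Field, ValidationError

class CalculationResult(BaseModel):
    expression: str = Field(..., description="The mathematical expression calculated")
    result: float = Field(..., description="The result of the calculation")
    app_used: str = Field(..., description="The name of the calculator app used")

# Valid data parses, and compatible types are coerced (int -> float)
ok = CalculationResult(expression="123 * 456", result=56088, app_used="Calculator")

# Incomplete data raises ValidationError instead of producing a partial object
try:
    CalculationResult(expression="123 * 456")
except ValidationError:
    print("rejected: missing result and app_used")
```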

Understanding the Code

Agent Profile

default_profile = AgentProfile(
    name="default", 
    from_file="llm-config.override.jsonc"
)
The AgentProfile defines which LLM models power different components of the agent.

Agent Configuration

agent_config = Builders.AgentConfig.with_default_profile(default_profile).build()
The Builders.AgentConfig provides a fluent API to configure your agent.

Running Tasks

result = await agent.run_task(
    goal="Your instruction here",
    output=YourPydanticModel,  # Optional
    name="task_name"  # Optional
)
Tasks are executed asynchronously and can return structured output.
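Because run_task is a coroutine, standard asyncio tooling composes with it. For example, you can bound a task's runtime with asyncio.wait_for. The sketch below uses a stand-in coroutine so it runs without a device; in real code you would substitute agent.run_task(...):

```python
import asyncio

async def fake_run_task(goal: str) -> str:
    """Stand-in for agent.run_task(...); replace with the real call."""
    await asyncio.sleep(0.01)
    return f"done: {goal}"

async def main():
    try:
        # Cancel the task if it exceeds the timeout
        return await asyncio.wait_for(fake_run_task("demo goal"), timeout=5.0)
    except asyncio.TimeoutError:
        return None

result = asyncio.run(main())
print(result)
```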

Comparing Local vs Platform

✅ When to Use Local

  • Full control over LLM providers
  • Custom infrastructure requirements
  • Offline or air-gapped environments
  • Development and testing

🚀 When to Use Platform

  • Centralized configuration and management
  • Built-in cost monitoring and observability
  • Update tasks without code changes

Next Steps
