Recommended approach for most users - Get started faster with centralized configuration and built-in observability.
Why Use the Platform?
Faster Setup
No LLM config files needed - Start in minutes, not hours
Real-Time Monitoring
Track costs, execution time, and agent reasoning live
Dynamic Updates
Update task prompts and LLM models without code changes
Team Collaboration
Centralized tasks and profiles for your organization
All OpenRouter Models
Access to all models available on OpenRouter - no individual API key management needed
Prerequisites & Installation
First time? Complete the common Installation steps (SDK installation, device setup, etc.) before continuing with platform-specific configuration below.
Configure Platform Credentials
Create a .env file in your project root with your Minitap Platform credentials:
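The SDK reads your key from the MINITAP_API_KEY environment variable (see the PlatformTaskRequest reference below), so a minimal .env is a single line. The value here is a placeholder:

```
# Minitap Platform credentials (placeholder value)
MINITAP_API_KEY=your-api-key-here
```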
No LLM config file needed! Unlike local development, the platform manages all LLM configurations centrally.
Quick Start
1
Sign Up
Go to platform.minitap.ai and create an account
2
Create API Key
Navigate to API Keys → Create API Key → Copy your key and add it to your .env file as shown above
3
Create a Task
Go to Tasks β Create Task
- Name: check-notifications
- Agent Prompt: Open the notifications panel and list all notifications
- Click Create
4
Run from SDK
Create your first automation script:
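A minimal script might look like the sketch below. The import path and the Agent entry point are assumptions, not confirmed SDK API; PlatformTaskRequest and its parameters come from the reference further down this page:

```python
# Sketch only: the import path and Agent entry point are assumptions.
# PlatformTaskRequest and its parameters are documented in the reference below.
from minitap.mobile_use.sdk import Agent, PlatformTaskRequest  # hypothetical path

agent = Agent()

request = PlatformTaskRequest(
    task="check-notifications",  # the task name created on the platform
)

result = agent.run(request)  # hypothetical run helper
print(result)
```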
5
View Results
Go to Task Runs to see execution details, agent thoughts, and costs
What's Next?
Create Custom Tasks
Define more complex automation workflows with structured outputs
Optimize LLM Models
Create custom LLM profiles for cost vs. performance tradeoffs
View Observability
Explore agent thoughts, execution timeline, GIF traces, and cost breakdown
Collaborate with Team
Centralized tasks and profiles for your organization
Run Tasks from the Platform UI (Cloud Execution)
In addition to running tasks via the SDK, you can now execute automation tasks directly from the platform UI on cloud devices - no local setup required.
Fully Cloud-Based: Cloud Execution runs entirely on Minitap infrastructure. No Python SDK installation, no local device connection needed.
Prerequisites
Before using Cloud Execution:
1
Cloud Device Ready
Have an active cloud mobile device (booted and in Ready state)
2
LLM Profile Configured
Create at least one LLM profile on the platform
3
Task Created
Have an automation task template configured
Cloud Execution Workflow
1
Navigate to Tasks
Go to Tasks on the platform
2
Run Task
Click Run on any task card
3
Select Cloud Execution
In the dialog, select the Cloud Execution tab
4
Choose LLM Profile
Select the LLM profile to use for this execution
5
Select Cloud Device
Choose an available cloud device from the dropdown
- Ready devices can run immediately
- Stopped devices will show a boot confirmation dialog
6
Run Task
Click Run Task to start execution
7
View Results
You'll be redirected to the task run details page with real-time progress
Device Status Indicators
| Status | Description |
|---|---|
| Ready | Device is booted and available for immediate execution |
| Starting | Device is currently booting up |
| Stopping | Device is shutting down |
| Stopped | Device is powered off (will prompt to boot) |
Local vs Cloud Execution Comparison
| Feature | Local (SDK) | Cloud Execution |
|---|---|---|
| Setup required | Python + SDK install + device | None |
| Device | Physical device or emulator | Cloud managed |
| Execution | On your machine | Platform servers |
| Best for | Development, debugging, custom integrations | Production, no-code users, quick testing |
| Real-time monitoring | Via platform traces | Built-in with redirect to task run |
Learn More
Task Configuration Options
When creating tasks on the platform, you have several configuration options:
Basic Fields:
- Task Name: Unique identifier used in your SDK code
- Description: Helps team members understand the task purpose
- Agent Prompt: Detailed instructions for the agent (use the "Generate" button for AI assistance)
- Output Description: Optional - describe the expected JSON structure for structured outputs
- Locked App Package: Optional - restrict execution to a specific app (e.g., com.whatsapp)
- Enable Tracing: Shows full LLM prompts/responses on platform (disable for privacy)
- Max Steps: Limit execution steps to prevent runaway costs (default: 400)
When a Locked App Package is set, the task card displays a lock indicator with the package name. Use the <locked-app-package> placeholder in your prompt to reference it dynamically.
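For instance, an agent prompt could reference the locked app like this (illustrative only):

```
Open <locked-app-package> and list all unread notifications.
```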
LLM Profiles (Optional)
By default, tasks use a Minitap-managed profile optimized for mobile-use. Create custom profiles for:
- Cost optimization (use faster/cheaper models)
- Performance optimization (use more powerful models)
- Different task types (simple vs. complex)

Models are specified using the minitap provider in the format provider/model-name (e.g., openai/gpt-5, google/gemini-2.5-pro).
Agent Components:
The mobile-use agent uses a multi-agent architecture where different LLMs handle specific tasks:
Cortex (Most Critical - Vision Required)
Role: The "eyes" and decision-maker of the system. Analyzes screenshots, understands UI elements, and decides what action to take next.
Requirements: Must support vision/image inputs
Recommendation: Use the best vision model available:
- google/gemini-2.5-pro - Excellent vision + reasoning
- openai/gpt-5 - Strong vision capabilities
- anthropic/claude-3.5-sonnet - Good vision understanding
Planner
Role: Decomposes high-level goals into executable subgoals. Runs once at the start and potentially during replanning.
Requirements: Strong reasoning and planning capabilities
Recommendation:
- meta-llama/llama-4-scout - Fast and capable
- openai/gpt-5-nano - Quick planning
- anthropic/claude-3-haiku - Cost-effective
Orchestrator
Role: Coordinates the execution flow, decides when to use hopper vs cortex, and manages state transitions.
Requirements: Fast, good at decision-making
Recommendation: Fast models:
- openai/gpt-oss-120b - Efficient coordination
- openai/gpt-5-nano - Quick decisions
Executor
Role: Translates high-level decisions into specific device actions (tap, swipe, type).
Requirements: Instruction-following, fast response
Recommendation:
- meta-llama/llama-3.3-70b-instruct - Excellent instruction following
- openai/gpt-5-nano - Fast execution
Hopper
Role: Digs through large batches of data (historical context, screen data) to extract the most relevant information for reaching the goal.
Requirements: Large context window (256k+ tokens recommended) to handle extensive data batches
Recommendation:
- openai/gpt-4.1 - 256k context
- google/gemini-2.0-flash - Large context
Outputter
Role: Extracts structured output from task results according to the output description.
Requirements: JSON formatting, structured output capability
Recommendation:
- openai/gpt-5-nano - Good at JSON
- anthropic/claude-3-haiku - Structured outputs
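Putting the recommendations together, a custom profile might assign one model per component along these lines. This is a hypothetical sketch: profiles are configured in the platform UI, and the field names below are illustrative, not the actual schema:

```python
# Hypothetical component-to-model mapping for a custom LLM profile.
# Profiles are configured in the platform UI; these keys are illustrative only.
custom_profile = {
    "planner": "meta-llama/llama-4-scout",
    "orchestrator": "openai/gpt-5-nano",
    "cortex": "google/gemini-2.5-pro",           # must support vision
    "executor": "meta-llama/llama-3.3-70b-instruct",
    "hopper": "openai/gpt-4.1",                  # large context window
    "outputter": "openai/gpt-5-nano",            # structured JSON output
}
```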
Structured Output Example
For type-safe results, use Pydantic models:
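A minimal sketch for the check-notifications task: the model definitions are standard Pydantic, while how the schema reaches the task (e.g., via the Output Description field) is left to your platform configuration. The sample JSON is illustrative only:

```python
from pydantic import BaseModel

class Notification(BaseModel):
    app: str
    message: str

class NotificationList(BaseModel):
    notifications: list[Notification]

# Validate the structured output of a completed task run.
# The JSON string below is illustrative sample data.
raw = '{"notifications": [{"app": "Gmail", "message": "2 new emails"}]}'
result = NotificationList.model_validate_json(raw)
print(result.notifications[0].app)  # -> Gmail
```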
Viewing Task Runs
Visit Task Runs to see execution details:
- Execution status and duration
- Agent thoughts and reasoning
- Subgoal progression
- Cost breakdown

Task Run Status
Status transitions throughout execution:
- pending: Task created, waiting to start
- running: Task actively executing
- completed: Task finished successfully with output
- failed: Task encountered an error
- cancelled: Task was manually cancelled
Subgoals & Plans
The planner agent creates high-level subgoals. Each subgoal is tracked:
- Name/description
- State: pending → started → completed/failed
- Start and end timestamps
- Plan updates on replanning
Agent Thoughts
Reasoning from each agent component:
- Planner: Goal decomposition and planning
- Cortex: Visual understanding and decision making
- Orchestrator: Execution coordination
- Executor: Action translation and execution
- Hopper: Data extraction from large batches
- Outputter: Structured output extraction
LLM Traces
Detailed LLM API call metrics (when tracing enabled):
- Model used
- Token counts (input/output)
- Cost in dollars
- Latency
- Request/response content
PlatformTaskRequest Reference
Parameters:
- task (required): Task name from platform
- profile (optional): LLM profile name (defaults to Minitap-managed profile)
- api_key (optional): Overrides MINITAP_API_KEY environment variable
- record_trace (optional): Save local trace files
- trace_path (optional): Local directory for traces
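For illustration, a request using every parameter might look like this. The parameter names come from the list above; the import path is an assumption:

```python
from minitap.mobile_use.sdk import PlatformTaskRequest  # hypothetical import path

request = PlatformTaskRequest(
    task="check-notifications",   # required: task name from the platform
    profile="cost-optimized",     # optional: custom LLM profile name
    api_key="your-api-key-here",  # optional: overrides MINITAP_API_KEY
    record_trace=True,            # optional: save local trace files
    trace_path="./traces",        # optional: local directory for traces
)
```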
Platform vs Local Comparison
Platform Benefits (Recommended)
- No LLM config files - Platform manages all model configurations
- All OpenRouter models - Access to all models on OpenRouter without managing API keys
- Instant updates - Change task prompts and models without redeploying code
- Built-in observability - Real-time cost tracking, execution monitoring, and agent reasoning
- Team collaboration - Share tasks and LLM profiles across your organization
When to Use Local Instead
Use the Local approach if you need:
- Full control over LLM provider selection and API endpoints
- Custom infrastructure or air-gapped environments
- Offline capability without internet dependency
- Development and testing with local model configurations