Recommended approach for most users - get started faster with centralized configuration and built-in observability.
Why Use the Platform?
Faster Setup
No LLM config files needed - start in minutes, not hours
Real-Time Monitoring
Track costs, execution time, and agent reasoning live
Dynamic Updates
Update task prompts and LLM models without code changes
Team Collaboration
Centralized tasks and profiles for your organization
All OpenRouter Models
Access to all models available on OpenRouter - no individual API key management needed
Compare with Local Development if you need full control over LLM configuration or offline capability.
Prerequisites & Installation
First time? Complete the common Installation steps (SDK installation, device setup, etc.) before continuing with platform-specific configuration below.
Configure Platform Credentials
Create a `.env` file in your project root with your Minitap Platform credentials:
.env
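A minimal `.env` might look like the following. `MINITAP_API_KEY` is the environment variable referenced later in this guide; the value shown is a placeholder:

```shell
# Minitap Platform credentials (placeholder value -- use your own key)
MINITAP_API_KEY=your-api-key-here
```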
Never commit your `.env` file to version control. Add it to your `.gitignore`.

No LLM config file needed! Unlike local development, the platform manages all LLM configurations centrally.
Quick Start
1
Sign Up
Go to platform.minitap.ai and create an account
2
Create API Key
Navigate to API Keys → Create API Key → copy your key and add it to your `.env` file as shown above
3
Create a Task
Go to Tasks → Create Task
- Name: check-notifications
- Agent Prompt: Open the notifications panel and list all notifications
- Click Create
4
Run from SDK
Create your first automation script:
5
View Results
Go to Task Runs to see execution details, agent thoughts, and costs
What's Next?
Create Custom Tasks
Define more complex automation workflows with structured outputs
Optimize LLM Models
Create custom LLM profiles for cost vs. performance tradeoffs
View Observability
Explore agent thoughts, execution timeline, and cost breakdown
Collaborate with Team
Centralized tasks and profiles for your organization
Learn More
Task Configuration Options
When creating tasks on the platform, you have several configuration options:

Basic Fields:
- Task Name: Unique identifier used in your SDK code
- Description: Helps team members understand the task purpose
- Agent Prompt: Detailed instructions for the agent (use the "Generate" button for AI assistance)
- Output Description: Optional - describe the expected JSON structure for structured outputs
- Enable Tracing: Shows full LLM prompts/responses on platform (disable for privacy)
- Max Steps: Limit execution steps to prevent runaway costs (default: 400)
LLM Profiles (Optional)
By default, tasks use a Minitap-managed profile optimized for mobile-use. Create custom profiles for:
- Cost optimization (use faster/cheaper models)
- Performance optimization (use more powerful models)
- Different task types (simple vs. complex)

Use the `minitap` provider with the format `provider/model-name` (e.g., `openai/gpt-5`, `google/gemini-2.5-pro`).
The platform supports all models available on OpenRouter, giving you access to the latest models from OpenAI, Anthropic, Google, Meta, and more - without managing individual API keys.
Cortex (Most Critical - Vision Required)
Role: The "eyes" and decision-maker of the system. Analyzes screenshots, understands UI elements, and decides what action to take next.
Requirements: Must support vision/image inputs
Recommendation: Use the best vision model available:
- google/gemini-2.5-pro - excellent vision + reasoning
- openai/gpt-5 - strong vision capabilities
- anthropic/claude-3.5-sonnet - good vision understanding
Planner
Role: Decomposes high-level goals into executable subgoals. Runs once at the start and potentially during replanning.
Requirements: Strong reasoning and planning capabilities
Recommendation:
- meta-llama/llama-4-scout - fast and capable
- openai/gpt-5-nano - quick planning
- anthropic/claude-3-haiku - cost-effective
Orchestrator
Role: Coordinates the execution flow, decides when to use hopper vs. cortex, manages state transitions.
Requirements: Fast, good at decision-making
Recommendation: Fast models:
- openai/gpt-oss-120b - efficient coordination
- openai/gpt-5-nano - quick decisions
Executor
Role: Translates high-level decisions into specific device actions (tap, swipe, type).
Requirements: Instruction-following, fast response
Recommendation:
- meta-llama/llama-3.3-70b-instruct - excellent instruction following
- openai/gpt-5-nano - fast execution
Hopper
Role: Digs through large batches of data (historical context, screen data) to extract the most relevant information for reaching the goal.
Requirements: Large context window (256k+ tokens recommended) to handle extensive data batches
Recommendation:
- openai/gpt-4.1 - 256k context
- google/gemini-2.0-flash - large context
Outputter
Role: Extracts structured output from task results according to the output description.
Requirements: JSON formatting, structured output capability
Recommendation:
- openai/gpt-5-nano - good at JSON
- anthropic/claude-3-haiku - structured outputs
Structured Output Example
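A hedged sketch of what such a model could look like, assuming the standard Pydantic pattern. The class names and fields are illustrative (based on the check-notifications task used earlier), and how the model is wired into the task's output description may differ in the real SDK:

```python
from pydantic import BaseModel

class Notification(BaseModel):
    app: str
    message: str

class NotificationsOutput(BaseModel):
    # Illustrative shape for the hypothetical check-notifications task:
    # a list of notifications read from the panel.
    notifications: list[Notification]

# Validate a raw task result into a typed, attribute-accessible object:
raw = {"notifications": [{"app": "Messages", "message": "Hi there"}]}
output = NotificationsOutput(**raw)
```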
For type-safe results, define your output as a Pydantic model.

Viewing Task Runs
Visit Task Runs to see execution details:
- Execution status and duration
- Agent thoughts and reasoning
- Subgoal progression
- Cost breakdown

Task Run Status
Status transitions throughout execution:
- pending: Task created, waiting to start
- running: Task actively executing
- completed: Task finished successfully with output
- failed: Task encountered an error
- cancelled: Task was manually cancelled
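The lifecycle can be pictured as a small transition table. The statuses are the ones documented here, but exactly which in-flight states can be cancelled is an assumption, not something this guide states:

```python
# Documented task-run statuses and plausible transitions between them.
TRANSITIONS = {
    "pending": {"running", "cancelled"},          # cancel-before-start is assumed
    "running": {"completed", "failed", "cancelled"},
    "completed": set(),                           # terminal
    "failed": set(),                              # terminal
    "cancelled": set(),                           # terminal
}

def can_transition(current: str, nxt: str) -> bool:
    """Check whether a status change is allowed under this sketch."""
    return nxt in TRANSITIONS.get(current, set())
```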
Subgoals & Plans
The planner agent creates high-level subgoals. Each subgoal is tracked:
- Name/description
- State: pending → started → completed/failed
- Start and end timestamps
- Plan updates on replanning
Agent Thoughts
Reasoning from each agent component:
- Planner: Goal decomposition and planning
- Cortex: Visual understanding and decision making
- Orchestrator: Execution coordination
- Executor: Action translation and execution
- Hopper: Data extraction from large batches
- Outputter: Structured output extraction
LLM Traces
Detailed LLM API call metrics (when tracing enabled):
- Model used
- Token counts (input/output)
- Cost in dollars
- Latency
- Request/response content
PlatformTaskRequest Reference
Parameters:
- task (required): Task name from platform
- profile (optional): LLM profile name (defaults to Minitap-managed profile)
- api_key (optional): Overrides the MINITAP_API_KEY environment variable
- record_trace (optional): Save local trace files
- trace_path (optional): Local directory for traces
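These parameters can be pictured as a simple record. The dataclass below is an illustrative mirror of the documented fields, not the SDK's actual class definition:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PlatformTaskRequestSketch:
    """Illustrative mirror of PlatformTaskRequest's documented fields."""
    task: str                         # required: task name from the platform
    profile: Optional[str] = None     # defaults to the Minitap-managed profile
    api_key: Optional[str] = None     # overrides MINITAP_API_KEY when set
    record_trace: bool = False        # save local trace files
    trace_path: Optional[str] = None  # local directory for traces

req = PlatformTaskRequestSketch(task="check-notifications", record_trace=True)
```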
Platform vs Local Comparison
Platform Benefits (Recommended)
- No LLM config files - the platform manages all model configurations
- All OpenRouter models - access to all models on OpenRouter without managing API keys
- Instant updates - change task prompts and models without redeploying code
- Built-in observability - real-time cost tracking, execution monitoring, and agent reasoning
- Team collaboration - share tasks and LLM profiles across your organization
When to Use Local Instead
Use the Local approach if you need:
- Full control over LLM provider selection and API endpoints
- Custom infrastructure or air-gapped environments
- Offline capability without internet dependency
- Development and testing with local model configurations