Tasks and Task Requests

Tasks represent automation workflows to be executed on a mobile device. They are defined using natural language goals and can return structured, type-safe results.

With the platform, tasks are configured on platform.minitap.ai and executed using PlatformTaskRequest.

- Define tasks once on the platform
- Update prompts without code changes
- Built-in observability and cost tracking

Task Characteristics

Goal-based

Define what you want using natural language

Traceable

Record execution for debugging and visualization

Structured Output

Return typed Pydantic models

Platform Tasks

Using the platform? Create tasks on platform.minitap.ai/tasks and execute them with PlatformTaskRequest.

Creating Platform Tasks

1. Go to Tasks on the platform.
2. Click Create Task.
3. Configure the task details:
   - Name: Unique identifier (e.g., check-notifications)
   - Agent Prompt: Detailed instructions
   - Output Description: Optional structured output format
4. Use the task in your code:

from minitap.mobile_use.sdk.types import PlatformTaskRequest

result = await agent.run_task(
    request=PlatformTaskRequest(task="check-notifications")
)

Platform Task with Structured Output

from pydantic import BaseModel, Field
from minitap.mobile_use.sdk.types import PlatformTaskRequest

class NotificationSummary(BaseModel):
    total: int = Field(..., description="Total notifications")
    unread: int = Field(..., description="Unread count")

result = await agent.run_task(
    request=PlatformTaskRequest[NotificationSummary](
        task="check-notifications",
        profile="fast"  # Optional: use specific platform profile
    )
)

if result:
    print(f"Total: {result.total}, Unread: {result.unread}")

Platform Task Benefits

Centralized Management

Update task prompts on the platform without redeploying code

Built-in Observability

View execution details, costs, and agent thoughts on the platform

Team Collaboration

Share tasks across your organization

Version Control

Track changes to task configurations over time

Local Tasks

For local development, define tasks directly in code:

Simple String Output

The most basic way to run a local task:

result = await agent.run_task(
    goal="Open settings and enable dark mode"
)
print(result)  # String output

Structured Output with Pydantic

Get type-safe, validated output:

from pydantic import BaseModel, Field

class ThemeSettings(BaseModel):
    dark_mode_enabled: bool = Field(..., description="Whether dark mode is enabled")
    theme_name: str = Field(..., description="Name of the current theme")

result = await agent.run_task(
    goal="Check the current theme settings",
    output=ThemeSettings
)

print(f"Dark mode: {result.dark_mode_enabled}")
print(f"Theme: {result.theme_name}")

With Output Description

When you don't need a Pydantic model, pass a plain-text description of the format you expect back:

result = await agent.run_task(
    goal="Find all my unread emails",
    output="A comma-separated list of email subjects"
)
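
If the agent follows the description above, the result arrives as one string that still has to be split. The sketch below shows one way to do that; `parse_subjects` is a hypothetical helper (not part of the SDK), and it assumes the returned string really is comma-separated, which an LLM does not guarantee.

```python
from typing import List, Optional


def parse_subjects(raw: Optional[str]) -> List[str]:
    """Split a comma-separated string into trimmed, non-empty items."""
    if not raw:
        # Covers both a missing result (None) and an empty string.
        return []
    return [part.strip() for part in raw.split(",") if part.strip()]


# Stand-in for the string a task might return.
subjects = parse_subjects("Invoice for March,  Team lunch , ,Welcome aboard")
print(subjects)
```

A subject that itself contains a comma would be split in two here. When that matters, a Pydantic model with a list field is the safer choice.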

Task Options

Naming Tasks

Give your tasks descriptive names for logging:

await agent.run_task(
    goal="Send a message to John",
    name="send_message_john"
)
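
When many tasks are created programmatically, a consistent name can be derived from the goal text. This is only a sketch: `task_name_from_goal` is a hypothetical helper, and the snake_case style simply mirrors the `send_message_john` example above rather than any rule the SDK enforces.

```python
import re


def task_name_from_goal(goal: str, max_length: int = 40) -> str:
    """Turn a natural-language goal into a short snake_case name."""
    # Replace every run of non-alphanumeric characters with one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", goal.lower()).strip("_")
    # Truncate, then drop a trailing underscore left by the cut.
    return slug[:max_length].rstrip("_")


print(task_name_from_goal("Send a message to John"))
```

The returned string can then be passed as `name=` to `agent.run_task`.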

Using Different Profiles

Switch agent profiles for specific tasks:

await agent.run_task(
    goal="Analyze this complex form",
    profile="detail_oriented"  # Uses a different LLM configuration
)

Maximum Steps

Control how many actions the agent can take:

task = (
    agent.new_task("Complete checkout process")
    .with_max_steps(500)  # Default is 400
    .build()
)

await agent.run_task(request=task)

Task Builder Pattern

For advanced configuration, use the builder pattern:

task_request = (
    agent.new_task("Open settings and check notifications")
    .with_name("check_notification_settings")
    .with_max_steps(300)
    .with_output_description("Summary of notification settings")
    .with_trace_recording(enabled=True)
    .build()
)

result = await agent.run_task(request=task_request)

Tracing and Debugging

Enable trace recording to capture screenshots and execution steps:

from pathlib import Path

task = (
    agent.new_task("Navigate to profile settings")
    .with_trace_recording(
        enabled=True,
        path=Path("/tmp/my-traces")
    )
    .build()
)

await agent.run_task(request=task)

Traces include screenshots at each step, making it easy to debug failed tasks.
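
To browse what a run left behind, the trace directory can be walked with the standard library. The sketch below makes no assumption about how the SDK names or nests trace files; `list_trace_files` is a hypothetical helper that lists whatever files exist under the path, oldest first.

```python
from pathlib import Path
from typing import List


def list_trace_files(trace_dir: Path) -> List[Path]:
    """Return every file under trace_dir, sorted by modification time."""
    if not trace_dir.is_dir():
        # Nothing recorded yet, or the path is wrong.
        return []
    files = [p for p in trace_dir.rglob("*") if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime)


# Same path as in the trace-recording example above.
for path in list_trace_files(Path("/tmp/my-traces")):
    print(path)
```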

Saving Output

Save LLM Output

Save the final LLM response to a file:

task = (
    agent.new_task("Extract product information")
    .with_llm_output_saving(path="/tmp/llm_output.json")
    .build()
)

Save Agent Thoughts

Capture the agent’s reasoning process:

task = (
    agent.new_task("Book a restaurant reservation")
    .with_thoughts_output_saving(path="/tmp/agent_thoughts.txt")
    .build()
)
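
Once a task has run, the saved files can be read back for inspection. The exact contents the SDK writes are not specified here, so this sketch (with a hypothetical `load_saved_output` helper) tries JSON first and falls back to raw text.

```python
import json
from pathlib import Path
from typing import Any


def load_saved_output(path: str) -> Any:
    """Return parsed JSON when possible, raw text otherwise, None if missing."""
    file = Path(path)
    if not file.exists():
        return None
    text = file.read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Not JSON (for example free-form agent thoughts): return it unchanged.
        return text


# Paths taken from the two examples above.
print(load_saved_output("/tmp/llm_output.json"))
print(load_saved_output("/tmp/agent_thoughts.txt"))
```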

Complex Output Structures

Define complex, nested output structures:

from pydantic import BaseModel, Field
from typing import List

class Email(BaseModel):
    sender: str
    subject: str
    preview: str
    is_unread: bool

class InboxSummary(BaseModel):
    total_emails: int = Field(..., description="Total number of emails")
    unread_count: int = Field(..., description="Number of unread emails")
    emails: List[Email] = Field(..., description="List of recent emails")

result = await agent.run_task(
    goal="Open Gmail and analyze my inbox",
    output=InboxSummary
)

for email in result.emails:
    if email.is_unread:
        print(f"Unread: {email.subject} from {email.sender}")
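
Pydantic validates field types, but it does not check that the numbers the LLM extracted agree with each other. A small cross-field check can catch obvious mismatches. This is a sketch, not an SDK feature: `inbox_inconsistencies` is a hypothetical helper, and `SimpleNamespace` objects stand in for an `InboxSummary` so the example runs without a device.

```python
from types import SimpleNamespace
from typing import List


def inbox_inconsistencies(summary) -> List[str]:
    """Return human-readable problems found in an InboxSummary-like object."""
    problems: List[str] = []
    listed_unread = sum(1 for email in summary.emails if email.is_unread)
    if summary.unread_count > summary.total_emails:
        problems.append("unread_count exceeds total_emails")
    if len(summary.emails) > summary.total_emails:
        problems.append("more emails listed than total_emails")
    # The list holds only recent emails, so it may contain fewer unread
    # items than unread_count, but never more.
    if listed_unread > summary.unread_count:
        problems.append("more unread emails listed than unread_count")
    return problems


sample = SimpleNamespace(
    total_emails=3,
    unread_count=1,
    emails=[
        SimpleNamespace(subject="Invoice", is_unread=True),
        SimpleNamespace(subject="Lunch", is_unread=True),
    ],
)
print(inbox_inconsistencies(sample))
```

An empty list means the extracted values are at least mutually consistent; it does not prove they match what is on screen.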

Task Execution Flow

Goal Analysis

LLM analyzes the goal and creates a plan

Screen Observation

Agent captures current screen state

Action Decision

LLM decides next action based on goal and screen

Action Execution

Hardware bridge performs the action

Verification

Agent checks if goal is achieved

Output Extraction

If specified, extract structured output
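
The flow above is essentially an observe, decide, act, verify loop bounded by a step budget. The toy model below illustrates that shape only; it is not the SDK's implementation. The screen is a dictionary of flags, and `run_agent_loop`, `decide`, and `dark_mode_enabled` are hypothetical names standing in for the LLM and the hardware bridge.

```python
from typing import Callable, Dict, List, Optional

Screen = Dict[str, bool]


def run_agent_loop(
    screen: Screen,
    goal_reached: Callable[[Screen], bool],
    decide: Callable[[Screen], str],
    max_steps: int,
) -> Optional[List[str]]:
    """Return the actions taken, or None if the step budget runs out first."""
    actions: List[str] = []
    for _ in range(max_steps):
        if goal_reached(screen):  # Verification
            return actions
        action = decide(screen)  # Action decision, based on the observed screen
        screen[action] = True  # Action execution (toy: flip a flag)
        actions.append(action)
    return actions if goal_reached(screen) else None


def new_screen() -> Screen:
    return {"settings_open": False, "dark_mode_on": False}


def decide(screen: Screen) -> str:
    # Settings must be open before dark mode can be toggled.
    return "dark_mode_on" if screen["settings_open"] else "settings_open"


def dark_mode_enabled(screen: Screen) -> bool:
    return screen["dark_mode_on"]


print(run_agent_loop(new_screen(), dark_mode_enabled, decide, max_steps=5))
print(run_agent_loop(new_screen(), dark_mode_enabled, decide, max_steps=1))
```

The first call finishes in two actions; the second returns `None` because one step is not enough, which is the situation a larger step limit is meant to address.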

Best Practices

Be specific in your goals

# ✅ Good - specific
goal="Open Weather app, check temperature for New York, and tell me if it will rain tomorrow"

# ❌ Bad - vague
goal="Check weather"

Use Pydantic for structured output

Define clear field descriptions to help the LLM understand what to extract:

from pydantic import BaseModel, Field

class WeatherInfo(BaseModel):
    temperature: float = Field(..., description="Current temperature in Celsius")
    will_rain_tomorrow: bool = Field(..., description="Whether rain is forecast for tomorrow")

Break complex tasks into simpler ones

Instead of one complex task, run several simpler tasks in sequence:

# Step 1: Navigate
await agent.run_task(goal="Open banking app and go to transactions")

# Step 2: Extract data
# TransactionList is a Pydantic model you define yourself (not shown here)
transactions = await agent.run_task(
    goal="Get the last 5 transactions",
    output=TransactionList
)
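
When a workflow has more than two steps, the sequence can be driven from a list so that later steps are skipped once one fails. The sketch below rests on two assumptions: a `None` result means the step did not succeed, and `StubAgent` is a hypothetical stand-in that only echoes goals so the example runs without a device. With the real SDK, the agent would be the `Agent` instance used elsewhere on this page.

```python
import asyncio
from typing import Any, List, Optional


class StubAgent:
    """Hypothetical stand-in for the SDK agent; it never touches a device."""

    def __init__(self, failing_goal: Optional[str] = None) -> None:
        self.failing_goal = failing_goal

    async def run_task(self, goal: str) -> Optional[str]:
        if goal == self.failing_goal:
            return None  # Assumption: None signals an unsuccessful step.
        return f"done: {goal}"


async def run_in_sequence(agent: Any, goals: List[str]) -> List[Any]:
    """Run goals one after another and stop at the first step that returns None."""
    results: List[Any] = []
    for goal in goals:
        result = await agent.run_task(goal=goal)
        if result is None:
            break  # Later steps depend on this one, so do not continue.
        results.append(result)
    return results


goals = ["Open banking app and go to transactions", "Get the last 5 transactions"]
print(asyncio.run(run_in_sequence(StubAgent(), goals)))
```

Comparing the length of the returned list with the number of goals shows whether the whole sequence completed.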

Enable tracing for debugging

Always enable tracing when developing or debugging tasks

task = agent.new_task(goal).with_trace_recording(enabled=True).build()

Example: Multi-Step Workflow

import asyncio
from pydantic import BaseModel, Field
from typing import List
from minitap.mobile_use.sdk import Agent
from minitap.mobile_use.sdk.types import PlatformTaskRequest

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool

class ShoppingResults(BaseModel):
    products: List[Product]
    total_found: int

async def shop_online():
    agent = Agent()
    agent.init()
    
    try:
        # Step 1: Search (task configured on platform)
        await agent.run_task(
            request=PlatformTaskRequest(task="search-products")
        )
        
        # Step 2: Extract results (task configured on platform)
        results = await agent.run_task(
            request=PlatformTaskRequest[ShoppingResults](
                task="extract-product-results"
            )
        )
        
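        # Step 3: Filter and act (skipped if no structured result came back)
        if results:
            for product in results.products:
                if product.in_stock and product.price < 50:
                    print(f"Good deal: {product.name} - ${product.price}")

    finally:
        agent.clean()


if __name__ == "__main__":
    asyncio.run(shop_online())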

Next Steps

Profiles

Task Builder SDK
