Architecture Overview

The mobile-use SDK follows a layered architecture designed to provide both simplicity for common use cases and flexibility for advanced scenarios.

These core concepts apply to both the Platform and Local approaches. The Platform simplifies configuration by managing profiles and tasks centrally, while local development gives you full control over all components.

Architecture Diagram

Key Components

Agent

The central orchestrator for mobile automation

Tasks

Goal-based automation workflows with structured output

Profiles

Customize agent behavior and LLM configuration

Builders

Fluent APIs for configuring agents and tasks
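To illustrate what a fluent API means in practice, here is a minimal sketch of the builder pattern in plain Python. The names `TaskConfig`, `TaskConfigBuilder`, `with_profile`, `with_max_steps`, and `with_trace` are hypothetical and chosen for illustration; they are not the SDK's actual classes or methods.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskConfig:
    """Hypothetical immutable configuration produced by the builder."""
    goal: str
    profile: str = "default"
    max_steps: int = 20
    trace_dir: Optional[str] = None


class TaskConfigBuilder:
    """Hypothetical fluent builder: every setter returns self so calls chain."""

    def __init__(self, goal: str) -> None:
        self._goal = goal
        self._profile = "default"
        self._max_steps = 20
        self._trace_dir: Optional[str] = None

    def with_profile(self, name: str) -> "TaskConfigBuilder":
        self._profile = name
        return self

    def with_max_steps(self, n: int) -> "TaskConfigBuilder":
        if n <= 0:
            raise ValueError("max_steps must be positive")
        self._max_steps = n
        return self

    def with_trace(self, directory: str) -> "TaskConfigBuilder":
        self._trace_dir = directory
        return self

    def build(self) -> TaskConfig:
        # Freeze the accumulated settings into a single value object.
        return TaskConfig(self._goal, self._profile, self._max_steps, self._trace_dir)


config = (
    TaskConfigBuilder("Open settings and enable dark mode")
    .with_profile("fast")
    .with_max_steps(10)
    .build()
)
```

Chaining keeps optional settings readable, and validating inside each setter surfaces mistakes before anything runs on a device.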

Component Overview

Agent Layer

The Agent class is the primary entry point that coordinates:

Device connections (Android/iOS)
Server lifecycle management
Task creation and execution
Resource cleanup
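The four responsibilities listed for the Agent can be pictured as a lifecycle: set up, run tasks, tear down. The sketch below models that lifecycle with a context manager so that cleanup runs even if a task fails. `SketchAgent`, `managed_agent`, and their methods are hypothetical stand-ins that only record events; they do not reflect the real Agent class's signatures.

```python
from contextlib import contextmanager
from typing import Iterator, List


class SketchAgent:
    """Hypothetical stand-in for an agent; it only records lifecycle events."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def init(self) -> None:
        # Device connection and server startup happen before any task runs.
        self.events.append("connect_device")
        self.events.append("start_servers")

    def run_task(self, goal: str) -> str:
        self.events.append(f"run_task:{goal}")
        return f"done: {goal}"

    def clean(self) -> None:
        # Release resources in reverse order of acquisition.
        self.events.append("stop_servers")
        self.events.append("disconnect_device")


@contextmanager
def managed_agent() -> Iterator[SketchAgent]:
    agent = SketchAgent()
    agent.init()
    try:
        yield agent
    finally:
        agent.clean()  # runs on success and on error alike


with managed_agent() as agent:
    result = agent.run_task("Open the calculator app")
```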

Task Layer

Tasks represent automation workflows defined by:

Natural language goals - What you want to accomplish
Structured output - Type-safe results using Pydantic
Tracing - Recording execution for debugging
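As a rough illustration of how a goal, a structured result, and a trace fit together, the sketch below parses a raw answer into a typed object. To stay dependency-free it uses a standard-library dataclass in place of a Pydantic model, so this is a swapped-in technique and not the SDK's own mechanism. `WeatherResult`, `SketchTask`, and `finish` are hypothetical names.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class WeatherResult:
    """Hypothetical output schema; a Pydantic model would fill this role."""
    city: str
    temperature_c: float

    @classmethod
    def from_json(cls, raw: str) -> "WeatherResult":
        data = json.loads(raw)
        # Coerce fields so downstream code can rely on their types.
        return cls(city=str(data["city"]), temperature_c=float(data["temperature_c"]))


@dataclass
class SketchTask:
    """Hypothetical task: a natural language goal plus a trace of what happened."""
    goal: str
    trace: List[str] = field(default_factory=list)

    def finish(self, raw_answer: str) -> WeatherResult:
        # Record the raw answer for debugging, then return the typed result.
        self.trace.append(f"raw_answer={raw_answer}")
        return WeatherResult.from_json(raw_answer)


task = SketchTask(goal="Open the weather app and read today's temperature in Paris")
# Stand-in for the text that an LLM-driven run might return.
parsed = task.finish('{"city": "Paris", "temperature_c": "18.5"}')
```

The caller receives `parsed.temperature_c` as a float even though the raw answer carried it as a string, which is the practical benefit of declaring an output schema up front.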

LangGraph Integration

The SDK leverages LangGraph for:

Agent reasoning - Transparent decision-making process
Step-by-step execution - Breaking complex tasks into manageable steps
Dynamic adaptation - Responding to what’s on screen
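The idea of graph-based, step-by-step execution can be sketched without any library. The example below is plain Python and does not use LangGraph itself: nodes are functions over a shared state, and a routing function picks the next node after each one runs. The node names, the `END` sentinel, and the toy planner that splits a goal on " then " are assumptions made for this sketch.

```python
from typing import Callable, Dict, List, TypedDict

END = "__end__"  # sentinel chosen for this sketch


class State(TypedDict):
    goal: str
    pending: List[str]
    completed: List[str]


def planner(state: State) -> State:
    # Toy planner: split the goal on " then " into ordered subgoals.
    steps = [s.strip() for s in state["goal"].split(" then ")]
    return {**state, "pending": steps}


def executor(state: State) -> State:
    # Handle exactly one subgoal per visit to this node.
    current, *rest = state["pending"]
    return {**state, "pending": rest, "completed": state["completed"] + [current]}


def route(state: State) -> str:
    # Conditional edge: keep executing while subgoals remain.
    return "executor" if state["pending"] else END


NODES: Dict[str, Callable[[State], State]] = {"planner": planner, "executor": executor}


def run_graph(state: State, entry: str = "planner") -> State:
    node = entry
    while node != END:
        state = NODES[node](state)
        node = route(state)
    return state


final = run_graph({
    "goal": "open Settings then tap Display then enable dark mode",
    "pending": [],
    "completed": [],
})
```

Because every transition goes through `route`, each intermediate state can be inspected, which is what makes this style of execution easy to follow and debug.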

Device Interaction

Two key components handle device control:

Hardware Bridge (Maestro)

Performs physical actions on the device:

Tap, swipe, scroll gestures
App launching and navigation
Key press events
Text input

Screen API

Captures device state:

Screenshots for visual analysis
UI hierarchy data
Element accessibility information
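The split between acting on the device and reading its state can be expressed as two small interfaces. The sketch below uses `typing.Protocol` to describe them and an in-memory fake to exercise both. The method names (`tap`, `input_text`, `capture`), the `ScreenState` fields, and `FakeDevice` are hypothetical; they do not describe the Maestro bridge or the Screen API as actually implemented.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class ScreenState:
    """Hypothetical snapshot: raw screenshot bytes plus a flat UI hierarchy."""
    screenshot_png: bytes
    ui_hierarchy: List[dict]


class HardwareBridge(Protocol):
    """Performs actions on the device."""
    def tap(self, x: int, y: int) -> None: ...
    def input_text(self, text: str) -> None: ...


class ScreenAPI(Protocol):
    """Reads the current device state."""
    def capture(self) -> ScreenState: ...


@dataclass
class FakeDevice:
    """In-memory fake that satisfies both protocols, useful for tests."""
    actions: List[str] = field(default_factory=list)
    typed: str = ""

    def tap(self, x: int, y: int) -> None:
        self.actions.append(f"tap({x},{y})")

    def input_text(self, text: str) -> None:
        self.actions.append(f"input_text({text!r})")
        self.typed += text

    def capture(self) -> ScreenState:
        element = {"class": "EditText", "text": self.typed, "accessibility_label": "Search"}
        return ScreenState(screenshot_png=b"", ui_hierarchy=[element])


def fill_field(bridge: HardwareBridge, screen: ScreenAPI, x: int, y: int, text: str) -> ScreenState:
    # Act through the bridge, then read the result back through the screen API.
    bridge.tap(x, y)
    bridge.input_text(text)
    return screen.capture()


device = FakeDevice()
after = fill_field(device, device, 120, 300, "hello")
```

Keeping the two roles behind separate interfaces means decision logic can be tested against a fake like this one without a connected device.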

Execution Flow

Initialize

Agent connects to device and starts required servers

Plan

LLM analyzes the goal and creates a plan

Observe

Screen API captures current UI state

Decide

LLM determines next action based on screen

Act

Hardware Bridge executes the action

Repeat

Loop through steps 3-5 (Observe, Decide, Act) until the goal is achieved
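The flow can be condensed into a short control loop. The sketch below assumes initialization has already happened and takes the plan, observe, decide, and act steps as plain callables. `run_flow`, `make_counter_app`, the `max_steps` cap, and the toy counter app are hypothetical illustrations, not part of the SDK.

```python
from typing import Callable, List, Optional, Tuple


def run_flow(
    goal: str,
    plan: Callable[[str], List[str]],
    observe: Callable[[], int],
    decide: Callable[[str, List[str], int], Optional[str]],
    act: Callable[[str], None],
    max_steps: int = 10,
) -> Tuple[bool, List[str]]:
    steps = plan(goal)                        # Plan (once, up front)
    history: List[str] = []
    for _ in range(max_steps):
        screen = observe()                    # Observe
        action = decide(goal, steps, screen)  # Decide; None means the goal is met
        if action is None:
            return True, history
        act(action)                           # Act
        history.append(action)
    return False, history                     # step budget exhausted


def make_counter_app(target: int):
    """Toy device: the 'screen' is a counter and the only action increments it."""
    state = {"count": 0}

    def observe() -> int:
        return state["count"]

    def decide(goal: str, steps: List[str], screen: int) -> Optional[str]:
        return "tap_plus" if screen < target else None

    def act(action: str) -> None:
        if action == "tap_plus":
            state["count"] += 1

    return observe, decide, act


observe, decide, act = make_counter_app(target=3)
ok, actions = run_flow("reach 3", lambda g: [g], observe, decide, act)
```

The cap on iterations is a design choice for the sketch: it guarantees the loop terminates even when the decision step never reports success.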

Next Steps

Agent

Learn about the Agent class

Tasks

Understand task creation and execution
