Architecture

OpenPocket is a local-first phone-use runtime centered on a local Android emulator.

Topology

User (CLI / Telegram / Panel)
           |
           v
Gateway / Command Router
           |
           v
AgentRuntime + HeartbeatRunner + CronService
   |          |          |            |
   v          v          v            v
ModelClient AdbRuntime SkillLoader ScriptExecutor
    |          |
    v          v
 LLM APIs   Android Emulator (local adb target)

Why Local Emulator

- Automation does not consume runtime resources on the user’s main phone.
- Execution remains local instead of running on a hosted cloud phone service.
- Task artifacts and permissions stay under local control.

Control Modes

OpenPocket supports two complementary control paths on the same runtime:

- Human direct control: users can directly operate the local emulator.
- Agent control: agent actions operate the local emulator via adb.

This makes human-agent handoff practical for real app workflows.

Components

- AgentRuntime: orchestrates the task loop, step execution, and session/memory persistence.
- ModelClient: builds multimodal prompts, calls model endpoints, and parses normalized actions.
- AdbRuntime: captures snapshots and executes mobile actions (tap, swipe, type, etc.).
- EmulatorManager: manages the emulator lifecycle (start, stop, status, screenshot).
- WorkspaceStore: writes auditable session and daily memory files.
- SkillLoader: loads markdown skills from workspace, local, and bundled sources.
- ScriptExecutor: validates and executes run_script with an allowlist and safety controls.
- TelegramGateway: routes chat/task commands and sends progress.
- HeartbeatRunner: emits liveness snapshots and stuck-task warnings.
- CronService: triggers scheduled tasks from workspace/cron/jobs.json.
- runGatewayLoop: long-running gateway loop with graceful restart/stop behavior.
- HumanAuthBridge: blocks task flow on request_human_auth and waits for human approval.
- HumanAuthRelayServer: serves one-time approval web links and polling APIs for unblock flows.
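
To make the allowlist idea behind ScriptExecutor concrete, here is a minimal sketch of a pre-execution check. Everything in it is an assumption for illustration: the `validateScript` name, the `ALLOWED_COMMANDS` list, and the rule that rejects shell chaining characters are hypothetical, and the safety controls OpenPocket actually applies are not described on this page.

```typescript
// Hypothetical allowlist; the real one is configured by OpenPocket.
const ALLOWED_COMMANDS: string[] = ["adb", "echo", "sleep"];

interface ScriptCheck {
  ok: boolean;        // true when every line passed
  rejected: string[]; // the lines that were refused
}

// Checks each non-empty, non-comment line of a script against the allowlist.
function validateScript(
  script: string,
  allowlist: string[] = ALLOWED_COMMANDS
): ScriptCheck {
  const rejected: string[] = [];
  for (const rawLine of script.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;

    // In this sketch, chaining, piping, redirection, and substitution
    // are refused outright rather than parsed.
    if (/[;&|`$<>]/.test(line)) {
      rejected.push(line);
      continue;
    }

    // The first whitespace-separated token is treated as the command name.
    const command = line.split(/\s+/)[0] ?? "";
    if (allowlist.indexOf(command) === -1) rejected.push(line);
  }
  return { ok: rejected.length === 0, rejected };
}
```

A caller would run the script only when `ok` is true and otherwise report the `rejected` lines back to the session log.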

Task Flow

1. Create a session markdown file.
2. Resolve model profile and credentials.
3. For each step:
   - capture emulator snapshot
   - optionally persist screenshot
   - request next action from model
   - execute action via adb or script runner
   - append step history to session
   - emit progress callback when configured
4. Stop on finish, step cap, error, or explicit user stop.
5. Finalize session and append one daily memory entry.
6. Optionally return emulator to home screen.
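
The per-step loop and its four stop conditions can be sketched as follows. This is an illustrative outline, not the AgentRuntime implementation: `runTask`, `TaskDeps`, `Action`, and `StopReason` are hypothetical names, and the calls are kept synchronous for brevity even though real adb and model calls would be asynchronous.

```typescript
// Hypothetical shapes; the real OpenPocket types may differ.
type Action = { type: string };
type StopReason = "finish" | "step_cap" | "error" | "user_stop";

interface TaskDeps {
  captureSnapshot(): string; // e.g. a reference to a screenshot
  nextAction(task: string, snapshot: string, history: string[]): Action;
  execute(action: Action): string; // via adb or the script runner
  onProgress?: (step: number, action: Action) => void;
  shouldStop?: () => boolean; // explicit user stop
}

function runTask(
  task: string,
  maxSteps: number,
  deps: TaskDeps
): { reason: StopReason; history: string[] } {
  const history: string[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    if (deps.shouldStop?.()) return { reason: "user_stop", history };
    try {
      const snapshot = deps.captureSnapshot();
      const action = deps.nextAction(task, snapshot, history);
      if (action.type === "finish") {
        history.push(`step ${step}: finish`);
        return { reason: "finish", history };
      }
      const result = deps.execute(action);
      history.push(`step ${step}: ${action.type} -> ${result}`);
      deps.onProgress?.(step, action); // only fires when configured
    } catch (err) {
      history.push(`step ${step}: error ${String(err)}`);
      return { reason: "error", history };
    }
  }
  // The loop ran out of steps without the model reporting completion.
  return { reason: "step_cap", history };
}
```

Session creation, credential resolution, and finalization would wrap this loop; they are left out to keep the stop logic visible.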

Model Fallback

OpenPocket attempts provider endpoints in fallback order:

- Task loop (ModelClient): chat -> responses -> completions
- Chat assistant (ChatAssistant): responses -> chat -> completions

This keeps runtime compatibility across providers with partial endpoint support.
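
A minimal sketch of ordered fallback, assuming a failed or unsupported endpoint surfaces as a thrown error. The two orderings come from the list above; `callWithFallback` and `EndpointCaller` are hypothetical names, and the call is synchronous only to keep the example short.

```typescript
type Endpoint = "chat" | "responses" | "completions";

// Orders as documented for the two callers.
const TASK_LOOP_ORDER: Endpoint[] = ["chat", "responses", "completions"];
const CHAT_ASSISTANT_ORDER: Endpoint[] = ["responses", "chat", "completions"];

// Hypothetical caller: returns the model text, or throws when the
// provider does not support the endpoint or the request fails.
type EndpointCaller = (endpoint: Endpoint, prompt: string) => string;

function callWithFallback(
  order: Endpoint[],
  prompt: string,
  call: EndpointCaller
): { endpoint: Endpoint; text: string } {
  const errors: string[] = [];
  for (const endpoint of order) {
    try {
      // The first endpoint that answers wins.
      return { endpoint, text: call(endpoint, prompt) };
    } catch (err) {
      errors.push(`${endpoint}: ${String(err)}`);
    }
  }
  throw new Error(`all endpoints failed: ${errors.join("; ")}`);
}
```

Collecting the per-endpoint errors means that a total failure still reports why each attempt was skipped.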

Persistence

- Runtime state is stored under OPENPOCKET_HOME.
- Task execution is auditable through session, memory, and script artifacts.
- Screenshot storage uses configured retention limits.
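
As one way to picture a retention limit, the sketch below keeps the newest screenshots up to a maximum count and returns the rest for deletion. It assumes a count-based limit; the `selectForPruning` name, the `ScreenshotFile` shape, and the `maxCount` parameter are hypothetical, and the actual OpenPocket setting may be expressed differently.

```typescript
interface ScreenshotFile {
  path: string;
  createdAtMs: number; // creation time in milliseconds since the epoch
}

// Returns the paths to delete so that at most maxCount of the
// newest screenshots remain. The input array is not modified.
function selectForPruning(files: ScreenshotFile[], maxCount: number): string[] {
  const newestFirst = files
    .slice()
    .sort((a, b) => b.createdAtMs - a.createdAtMs);
  return newestFirst.slice(Math.max(0, maxCount)).map((f) => f.path);
}
```

Separating the selection from the deletion keeps the policy easy to test without touching the filesystem.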

Near-Term Extensions

Planned next step:

- Richer remote phone controls beyond auth approvals (pause/resume/approve/retry).
