Architecture
OpenPocket is a local-first phone-use runtime centered on a local Android emulator.
Topology
User (CLI / Telegram / Panel)
            |
            v
Gateway / Command Router
            |
            v
AgentRuntime + HeartbeatRunner + CronService
   |            |            |             |
   v            v            v             v
ModelClient  AdbRuntime  SkillLoader  ScriptExecutor
   |            |
   v            v
LLM APIs     Android Emulator (local adb target)
Why Local Emulator
- automation does not consume runtime resources on the user’s main phone
- execution remains local instead of running on a hosted cloud phone service
- task artifacts and permissions stay under local control
Control Modes
OpenPocket supports two complementary control paths on the same runtime:
- Human direct control: users can directly operate the local emulator.
- Agent control: agent actions operate the local emulator via adb.
This makes human-agent handoff practical for real app workflows.
Components
- AgentRuntime: orchestrates task loop, step execution, and session/memory persistence.
- ModelClient: builds multimodal prompts, calls model endpoints, parses normalized actions.
- AdbRuntime: captures snapshots and executes mobile actions (tap, swipe, type, etc.).
- EmulatorManager: manages emulator lifecycle (start, stop, status, screenshot).
- WorkspaceStore: writes auditable session and daily memory files.
- SkillLoader: loads markdown skills from workspace/local/bundled sources.
- ScriptExecutor: validates and executes run_script with allowlist and safety controls.
- TelegramGateway: routes chat/task commands and sends progress.
- HeartbeatRunner: emits liveness snapshots and stuck-task warnings.
- CronService: triggers scheduled tasks from workspace/cron/jobs.json.
- runGatewayLoop: robust long-running gateway loop with graceful restart/stop behavior.
- HumanAuthBridge: blocks task flow on request_human_auth and waits for human approval.
- HumanAuthRelayServer: serves one-time approval web links and polling APIs for unblock flows.
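To make the ScriptExecutor allowlist idea concrete, here is a minimal TypeScript sketch of how a script could be checked before execution. It is an illustration under stated assumptions, not OpenPocket's actual validator: ScriptPolicy, validateScript, the length limit, and the rule of rejecting command substitution are all hypothetical.

```typescript
// Hypothetical allowlist check; every name and rule here is an assumption.
interface ScriptPolicy {
  allowedCommands: string[]; // executables a script may invoke
  maxLength: number;         // upper bound on script size in characters
}

interface ValidationResult {
  ok: boolean;
  reason?: string;
}

function validateScript(script: string, policy: ScriptPolicy): ValidationResult {
  if (script.length > policy.maxLength) {
    return { ok: false, reason: "script too long" };
  }
  // Reject command substitution outright, since it can hide arbitrary commands.
  if (script.includes("$(") || script.includes("`")) {
    return { ok: false, reason: "command substitution not allowed" };
  }
  const allowed = new Set(policy.allowedCommands);
  // Split on newlines and common shell separators so chained commands are checked too.
  const segments = script
    .split(/\n|;|&&|\|\||\|/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  for (const segment of segments) {
    // The first whitespace-delimited token is treated as the command name.
    const command = segment.split(/\s+/)[0] ?? "";
    if (!allowed.has(command)) {
      return { ok: false, reason: `command not allowed: ${command}` };
    }
  }
  return { ok: true };
}
```

A token-based check like this is deliberately conservative and does not parse shell syntax fully; a real executor would likely combine it with further safety controls such as timeouts and output limits.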
Task Flow
- Create a session markdown file.
- Resolve model profile and credentials.
- For each step:
  - capture emulator snapshot
  - optionally persist screenshot
  - request next action from model
  - execute action via adb or script runner
  - append step history to session
  - emit progress callback when configured
- Stop on finish, step cap, error, or explicit user stop.
- Finalize session and append one daily memory entry.
- Optionally return emulator to home screen.
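The per-step loop and its stop conditions can be sketched in TypeScript as follows. This is a simplified illustration, not the AgentRuntime implementation: the Action shape, StepDeps, StopReason, and runTask are hypothetical names, the loop is synchronous for brevity (real model and adb calls would be asynchronous), and session creation, credential resolution, screenshot persistence, and finalization are left out.

```typescript
// Hypothetical sketch of the step loop; every name below is an assumption.
type Action =
  | { type: "finish"; message: string }
  | { type: "tap"; x: number; y: number }
  | { type: "run_script"; script: string };

interface StepDeps {
  captureSnapshot: () => string;                              // emulator snapshot
  nextAction: (snapshot: string, history: string[]) => Action; // model call
  execute: (action: Action) => string;                        // adb or script runner
  shouldStop?: () => boolean;                                 // explicit user stop
  onProgress?: (step: number, action: Action) => void;        // optional callback
}

type StopReason = "finish" | "step_cap" | "error" | "user_stop";

interface TaskResult {
  reason: StopReason;
  history: string[];
}

function runTask(deps: StepDeps, maxSteps: number): TaskResult {
  const history: string[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    if (deps.shouldStop?.()) {
      return { reason: "user_stop", history };
    }
    try {
      const snapshot = deps.captureSnapshot();
      const action = deps.nextAction(snapshot, history);
      // A finish action is recorded but not executed.
      const note = action.type === "finish" ? action.message : deps.execute(action);
      history.push(`step ${step}: ${action.type} -> ${note}`);
      deps.onProgress?.(step, action);
      if (action.type === "finish") {
        return { reason: "finish", history };
      }
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      history.push(`step ${step}: error (${message})`);
      return { reason: "error", history };
    }
  }
  // The loop ran out of steps without a finish action.
  return { reason: "step_cap", history };
}
```

Injecting the snapshot, model, and executor functions keeps the loop testable without an emulator, which is one plausible reason to separate AgentRuntime from ModelClient and AdbRuntime.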
Model Fallback
OpenPocket attempts provider endpoints in fallback order:
- task loop (ModelClient): chat -> responses -> completions
- chat assistant (ChatAssistant): responses -> chat -> completions
This keeps runtime compatibility across providers with partial endpoint support.
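As a rough illustration of this ordering, the TypeScript sketch below tries each endpoint kind in turn and returns the first success. The names EndpointKind, FALLBACK_ORDER, and callWithFallback are hypothetical, the call argument stands in for a real provider request, and the function is synchronous only to keep the example short.

```typescript
// Hypothetical sketch: names and shapes here are illustrative assumptions.
type EndpointKind = "chat" | "responses" | "completions";

// Fallback orders as described in the text above.
const FALLBACK_ORDER: Record<"taskLoop" | "chatAssistant", EndpointKind[]> = {
  taskLoop: ["chat", "responses", "completions"],
  chatAssistant: ["responses", "chat", "completions"],
};

interface FallbackResult<T> {
  endpoint: EndpointKind; // which endpoint eventually succeeded
  value: T;
}

// Try each endpoint in order; return the first success, or throw an error
// that lists every failure if none of the endpoints works.
function callWithFallback<T>(
  order: EndpointKind[],
  call: (endpoint: EndpointKind) => T,
): FallbackResult<T> {
  const errors: string[] = [];
  for (const endpoint of order) {
    try {
      return { endpoint, value: call(endpoint) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${endpoint}: ${message}`);
    }
  }
  throw new Error(`all endpoints failed (${errors.join("; ")})`);
}
```

With a provider that lacks the chat endpoint, a task-loop call under this sketch would fail once on chat and then succeed on responses, so the caller never needs to know which endpoints a given provider supports.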
Persistence
- runtime state is stored under OPENPOCKET_HOME
- task execution is auditable through session/memory/script artifacts
- screenshot storage uses configured retention limits
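One way a retention limit could work is a count-based policy that keeps only the newest screenshots. The TypeScript sketch below assumes exactly that; the source does not specify whether limits are count-based, size-based, or age-based, and ScreenshotEntry and selectForDeletion are hypothetical names.

```typescript
// Hypothetical count-based retention; the actual policy may differ.
interface ScreenshotEntry {
  path: string;
  capturedAtMs: number; // capture time as a Unix timestamp in milliseconds
}

// Return the paths that exceed the limit, keeping the maxCount newest entries.
function selectForDeletion(entries: ScreenshotEntry[], maxCount: number): string[] {
  // Copy before sorting so the caller's array is not mutated.
  const newestFirst = [...entries].sort((a, b) => b.capturedAtMs - a.capturedAtMs);
  return newestFirst.slice(Math.max(0, maxCount)).map((entry) => entry.path);
}
```

Returning the paths instead of deleting files directly keeps the policy separate from filesystem access, so it can be checked without touching disk.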
Near-Term Extensions
Planned next step:
- richer remote phone controls beyond auth approvals (pause/resume/approve/retry)