Browser Control
browser_control and browser_session_* are Wingman's native browser automation tools.
- Native runtime capability, not an MCP server.
- Supports persistent named profiles (browserProfile) for reusable login state.
- Supports config-driven transport selection: auto, playwright, or relay.
- Persistent profiles launch with Playwright persistent-context by default for reliability.
- Non-persistent sessions use CDP first, with automatic persistent-context fallback if CDP fails.
- browser_session_* is the default browser automation path for screenshots, extraction, and QA.
- Reserve browser_control for explicit requests to use or take over an existing browser or live tab.
- Supports loading unpacked extensions from host config mappings.
- Includes a bundled first-party extension install flow via wingman browser extension install --default.
For setup, see:
Input shape
- url: optional initial URL
- actions: ordered list of browser actions
- headless: defaults to true for non-persistent runs. Persistent browserProfile runs default to headed for session reliability unless headless: true is explicitly requested.
- timeoutMs: per-action timeout (ms)
- executablePath: optional Chrome/Chromium path override
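
As an illustration of the fields above, here is a minimal Python sketch that assembles an input object. The helper name build_input is hypothetical, and the shape of the single action inside actions is an assumption; only the top-level field names come from this page.

```python
import json


def build_input(actions, url=None, headless=None, timeout_ms=None, executable_path=None):
    """Assemble an input object from the documented top-level fields.

    Hypothetical helper: optional fields are omitted when not supplied,
    so the tool's own defaults (for example, headless) still apply.
    """
    payload = {"actions": list(actions)}
    if url is not None:
        payload["url"] = url
    if headless is not None:
        payload["headless"] = headless
    if timeout_ms is not None:
        payload["timeoutMs"] = timeout_ms
    if executable_path is not None:
        payload["executablePath"] = executable_path
    return payload


# The action body is an assumption; only "type" is referenced on this page.
payload = build_input(
    actions=[{"type": "screenshot"}],
    url="https://example.com",
    timeout_ms=15000,
)
print(json.dumps(payload, indent=2))
```

Leaving headless unset keeps the documented default in effect: headless for non-persistent runs, headed for persistent browserProfile runs.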
Managed sessions
Use the managed-session tools by default whenever browser state may need to survive across turns, or when you simply need the most reliable automation path.
- browser_session_start: opens a reusable browser session and returns session_id
- browser_session_action: runs more actions against an existing session_id
- browser_session_close: closes the session and finalizes any saved video recordings
- browser_session_list: lists active managed sessions for the current run
Start a managed session with video recording:
- recordVideo: true uses a default workspace-relative directory under .wingman/browser/videos/.
- Video recording is supported only for Playwright persistent launches and is finalized on browser_session_close.
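
A minimal sketch of the request objects for one recorded session, written as Python dictionaries. The tool names, recordVideo, browserProfile, and session_id come from this page. The exact input fields each session tool accepts, the profile name, and the session ID value are assumptions made for illustration.

```python
import json

# Input for browser_session_start. A named profile is used because video
# recording is supported only for Playwright persistent launches.
# "qa-recording" is a hypothetical profile name.
start_request = {
    "url": "https://example.com",
    "browserProfile": "qa-recording",
    "recordVideo": True,
}

# browser_session_start returns a session_id; this value is a placeholder.
session_id = "session-123"

# Assumed input shape for browser_session_action: the session_id plus actions.
action_request = {
    "session_id": session_id,
    "actions": [{"type": "screenshot"}],
}

# Assumed input shape for browser_session_close. Closing the session is
# what finalizes the saved video recording.
close_request = {"session_id": session_id}

for name, request in [
    ("browser_session_start", start_request),
    ("browser_session_action", action_request),
    ("browser_session_close", close_request),
]:
    print(name, json.dumps(request))
```

The recording is only finalized on close, so a session that is never closed should not be expected to yield a finished video file.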
Use browser_control only when the user explicitly wants you to work inside their browser or live tab, for example:
- "use my browser"
- "take control of my browser"
- "work in the tab I already opened"
Action types
Canonical actions:
- navigate
- click
- type
- press
- wait
- wait_for
- extract_text
- evaluate
- screenshot
Accepted aliases:
- Navigation: url, open, goto
- Wait delay: ms, sleep, pause
- Conditional wait: wait_until
- Text extraction: selector, extract, getContent, get_content, querySelector, query_selector
- JS eval: expression, js, script
- Screenshot: path, snapshot, capture
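
The alias table can be read as a lookup from alias to canonical type. The Python sketch below shows one way to normalize a type name before sending it. It is an illustration, not Wingman's implementation: the pairing of each alias group with a canonical type (for example, "Wait delay" with wait and "Conditional wait" with wait_for) is inferred from the group labels, and canonical_type is a hypothetical helper.

```python
CANONICAL_TYPES = {
    "navigate", "click", "type", "press", "wait",
    "wait_for", "extract_text", "evaluate", "screenshot",
}

# Alias groups as listed above, keyed by the canonical type each group
# appears to correspond to (an inference from the group labels).
ALIASES = {
    "navigate": ["url", "open", "goto"],
    "wait": ["ms", "sleep", "pause"],
    "wait_for": ["wait_until"],
    "extract_text": [
        "selector", "extract", "getContent", "get_content",
        "querySelector", "query_selector",
    ],
    "evaluate": ["expression", "js", "script"],
    "screenshot": ["path", "snapshot", "capture"],
}

# Invert the table once so each lookup is a single dictionary access.
ALIAS_TO_CANONICAL = {
    alias: canonical
    for canonical, names in ALIASES.items()
    for alias in names
}


def canonical_type(name):
    """Return the canonical action type for a canonical name or alias."""
    if name in CANONICAL_TYPES:
        return name
    if name in ALIAS_TO_CANONICAL:
        return ALIAS_TO_CANONICAL[name]
    raise ValueError(f"unsupported action type: {name!r}")


print(canonical_type("goto"))
print(canonical_type("get_content"))
```

A name that is neither canonical nor a listed alias raises ValueError here, which mirrors the kind of schema error reported for an invalid actions[n].type.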
Examples
Navigate and read a page:
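
A minimal sketch of such an input, built as a Python dictionary and printed as JSON. The action types are canonical names from this page; the per-action fields url and selector are assumptions and are not confirmed here.

```python
import json

# Navigate to a page, then read the text of its main heading.
# Per-action field names ("url", "selector") are assumed for illustration.
navigate_and_read = {
    "actions": [
        {"type": "navigate", "url": "https://example.com"},
        {"type": "extract_text", "selector": "h1"},
    ],
}

print(json.dumps(navigate_and_read, indent=2))
```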
Conditional wait can be expressed either as wait_for or as wait with condition fields:
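
The Python sketch below contrasts the two forms with a plain delay. The condition field (selector) and the delay field (ms) are assumptions chosen for illustration, and is_conditional_wait is a hypothetical helper rather than part of Wingman.

```python
# Hypothetical set of fields that turn a "wait" into a conditional wait.
CONDITION_FIELDS = {"selector"}


def is_conditional_wait(action):
    """Return True when an action waits on a condition rather than a delay."""
    if action["type"] == "wait_for":
        return True
    return action["type"] == "wait" and any(
        field in action for field in CONDITION_FIELDS
    )


explicit = {"type": "wait_for", "selector": "#results"}  # wait_for form
implicit = {"type": "wait", "selector": "#results"}      # wait + condition field
delay = {"type": "wait", "ms": 500}                      # plain delay

for action in (explicit, implicit, delay):
    print(action["type"], is_conditional_wait(action))
```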
Evaluate JavaScript and capture screenshot:
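
A minimal sketch of such an input as a Python dictionary. The evaluate and screenshot types are canonical names from this page; the field names expression and path, and the output file name, are assumptions.

```python
import json

# Evaluate a JavaScript expression, then capture a screenshot.
# "expression" and "path" are assumed field names; "page.png" is a
# hypothetical output file name.
evaluate_and_capture = {
    "url": "https://example.com",
    "actions": [
        {"type": "evaluate", "expression": "document.title"},
        {"type": "screenshot", "path": "page.png"},
    ],
}

print(json.dumps(evaluate_and_capture, indent=2))
```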
Aliases for lower-friction prompting:
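
A minimal sketch using alias type names in place of canonical ones. The aliases goto, extract, js, and snapshot are listed under Accepted aliases; the per-action fields (url, selector, expression) are assumptions.

```python
import json

# Same kind of flow as a canonical request, but with alias type names:
# goto (navigation), extract (text extraction), js (JS eval),
# snapshot (screenshot). Field names are assumed for illustration.
alias_request = {
    "actions": [
        {"type": "goto", "url": "https://example.com"},
        {"type": "extract", "selector": "h1"},
        {"type": "js", "expression": "document.title"},
        {"type": "snapshot"},
    ],
}

print(json.dumps(alias_request, indent=2))
```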
Output summary
Returns structured JSON including:
- browser: chrome-cdp, chrome-playwright, or chrome-relay
- transport: cdp, persistent-context, or relay-cdp
- transportRequested: configured preference (auto, playwright, relay)
- transportUsed: runtime transport after fallback decisions
- fallbackReason: reason when auto mode switches transports
- mode: headless or headed
- persistentProfile: whether a named profile was used
- profileId: profile ID when set
- extensions: loaded extension IDs
- finalUrl
- title
- actionResults
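
The transport fields make it possible to tell whether a run fell back from the requested transport. The Python sketch below reads them from a result object. The sample values, including the fallbackReason text and the value shown for transportUsed, are invented for illustration, and describe_transport is a hypothetical helper.

```python
def describe_transport(result):
    """Summarize which transport ran and whether a fallback occurred."""
    requested = result.get("transportRequested")
    used = result.get("transportUsed")
    reason = result.get("fallbackReason")
    if reason:
        return f"requested {requested}, fell back to {used}: {reason}"
    return f"requested {requested}, used {used}"


# Illustrative result only; real responses carry more fields and the
# exact values may differ.
sample_result = {
    "transportRequested": "auto",
    "transportUsed": "persistent-context",
    "fallbackReason": "CDP connection refused",
    "mode": "headless",
}

print(describe_transport(sample_result))
```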
Managed-session responses also include:
- media: saved screenshots and finalized video recordings with local path, preview url, mimeType, and display name
- video_recording: recording status, output directory, configured size, and saved video count
- content: chat-renderable resource_link blocks for each saved media artifact
Troubleshooting
- Schema errors like Invalid input at actions[n].type: use one of the supported canonical types or aliases.
- Browser profile is already in use: another run is holding the profile lock.
- ProcessSingleton profile-in-use errors: close any manually opened Chrome window using that same profile, then retry.
- CDP timeout/refused: this usually affects non-persistent sessions; persistent profiles already use persistent-context by default.
