Question 1

Is WebBrain a free alternative to Claude's browser plugin?

Accepted Answer

Yes. WebBrain provides similar AI browser agent capabilities — reading pages, extracting data, clicking buttons, filling forms, and automating multi-step workflows. Unlike Claude's proprietary browser plugin which requires a Claude Pro subscription and only works with Anthropic's models, WebBrain is completely free, open-source (MIT license), and supports multiple LLM providers including local models that run entirely on your machine.

Question 2

How does WebBrain compare to OpenClaw, Browser-Use, and other AI agent frameworks?

Accepted Answer

They're different categories of tools. WebBrain is a browser extension — you install it in Chrome or Firefox and chat with it in a side panel, no coding required. Frameworks like OpenClaw and Browser-Use are developer SDKs for building automated browser pipelines in Python, typically using headless browsers and CDP. Think of it this way: WebBrain is for daily browsing with an AI assistant; agent frameworks are for building scraping bots and test automation. You can use both — they complement each other.

Question 3

Can I use WebBrain completely offline?

Accepted Answer

Yes. WebBrain's default provider is llama.cpp, which runs a local AI model on your computer. No API keys needed, no internet required for the AI, and no data ever leaves your machine. Just download a GGUF model, start llama-server, and you have a fully private AI browser agent. You can also use Ollama with its OpenAI-compatible endpoint.

Question 4

Which AI models does WebBrain support?

Accepted Answer

WebBrain supports four provider types: llama.cpp (any local GGUF model), OpenAI (GPT-4o, GPT-4, etc.), Claude (Claude Opus, Sonnet, Haiku via native API), and OpenRouter (access to 100+ models from various providers). Any OpenAI-compatible API endpoint works, so you can also use services like Together AI, Groq, Mistral, or any local server with an OpenAI-compatible interface.

Question 5

What's the most recommended model?

Accepted Answer

As of April 21, 2026, the top recommendation is Qwen 3.6 35B. Why: in our vision benchmark (vision-model-shootout), it outperformed Gemma 4 on screenshot understanding while remaining practical for local inference. For consumer GPUs, RTX 5090 is ideal, and RTX 4090 is often workable with INT4 AutoRound quantization via Intel/Qwen3.6-35B-A3B-int4-AutoRound. For max speed, we recommend serving on vLLM. Sample command: python -u -m vllm.entrypoints.openai.api_server --model Intel/Qwen3.6-35B-A3B-int4-AutoRound --served-model-name qwen3.6-35b --quantization auto --dtype bfloat16 --max-model-len 65536 --max-num-batched-tokens 32768 --max-num-seqs 4 --host 0.0.0.0 --port 8000 --gpu-memory-utilization 0.92 --enable-prefix-caching --enable-chunked-prefill --limit-mm-per-prompt '{"image": 4, "video": 1}' --mm-processor-cache-type shm --reasoning-parser qwen3 --enable-auto-tool-choice --tool-call-parser qwen3_coder --trust-remote-code --allowed-origins '["*"]' --speculative-config '{"method": "dflash", "model": "z-lab/Qwen3.6-35B-A3B-DFlash", "num_speculative_tokens": 15}' --attention-backend flash_attn DFlash speculative decoding is optional.

Question 6

I'm getting "Failed to fetch" when connecting to a local LLM server (vLLM, Ollama, llama.cpp) on my network

Accepted Answer

If your LLM server is on a different machine on your local network (e.g. http://192.168.1.x:8000), Chrome blocks the request unless the server sends CORS headers. The fix depends on your server: vLLM: Start with --allowed-origins '["*"]' (the value must be a JSON list). Ollama: Set the environment variable OLLAMA_ORIGINS=* before starting. llama.cpp: CORS is enabled by default — no changes needed. If your server runs on localhost (same machine as the browser), CORS is usually not required. The issue only affects cross-machine connections on the local network. Make sure your base URL in WebBrain settings ends with /v1 (e.g. http://192.168.1.47:8000/v1).

Question 7

Does WebBrain work on Firefox?

Accepted Answer

Yes. WebBrain ships with both a Chrome version (Manifest V3, using the sidePanel API) and a Firefox version (Manifest V2, using sidebar_action). Both versions have identical features. The Firefox version can be loaded as a temporary add-on for development, or published to addons.mozilla.org for permanent installation.

Question 8

Can I move the Firefox sidebar from the left to the right, like Chrome's side panel?

Accepted Answer

Yes — Firefox's sidebar defaults to the left, but you can flip it. Right-click anywhere in the sidebar header and choose Move Sidebar to Right (or use View → Sidebar → Move Sidebar to Right from the menu bar). The position persists across restarts. Chrome's sidePanel defaults to the right and isn't user-movable from the panel itself.

Question 9

Is WebBrain safe to use? Can it modify web pages?

Accepted Answer

WebBrain has two modes: Ask mode (default) is read-only and cannot modify anything on the page. Act mode enables full browser agent capabilities (clicking, typing, navigating) but requires explicit user confirmation before activation, and comes with a visible warning banner. You can stop the agent at any time with the Stop button. The extension's source code is fully open for audit on GitHub.

Question 10

How do I use WebBrain for web scraping and data extraction?

Accepted Answer

Simply open any web page, open the WebBrain side panel, and ask in natural language: "Extract all product names and prices from this page", "Get all email addresses on this page", or "Summarize this article in bullet points". The AI agent reads the page content, understands the structure, and returns the extracted data. For more complex scraping, switch to Act mode and the agent can navigate between pages, click pagination buttons, and aggregate data across multiple pages.

Question 11

Does WebBrain call APIs directly, or does it always click through the UI?

Accepted Answer

By default, WebBrain always goes through the visible UI for any action that creates, modifies, deletes, sends, submits, posts, or buys anything. It will navigate to the page, fill the form, and click the button — exactly the way you would. It refuses to call REST/GraphQL endpoints directly via background fetch() for mutations. This is deliberate: API actions are invisible (you don't see what's being sent), often require separate auth tokens you may not have configured, and have a much larger blast radius than a visible mis-click. UI-first means everything is on screen, in your normal browser session, and stoppable. For reading data — fetching a README, looking up an issue, comparing prices across sites, checking a status page — WebBrain freely uses background HTTP requests via the fetch_url and research_url tools. Reading is not the same as acting; it doesn't change anything on a remote service, so the safety concerns don't apply. If you specifically want to allow API mutations for a particular task, type /allow-api at the start of your message (optionally followed by a short task description). This per-conversation override lets WebBrain fall back to API endpoints when the UI is genuinely failing or unworkable, while still preferring UI when UI works. A sticky badge stays visible above the input area while the override is active, and it clears when you reset the conversation.

Question 12

Can I use it in LM Studio too?

Accepted Answer

Yes. WebBrain's read-only network tools — fetch_url and research_url — also ship as a standalone LM Studio plugin at webbrain/web-tools. Install with lms clone webbrain/web-tools and toggle it on in any LM Studio chat — any tool-capable model can then call those two tools without you installing the browser extension. Pure Node, no headless browser. Source: lmstudio-plugin/.

Question 13

Can I switch to another tab while WebBrain is working on a page?

Accepted Answer

Yes, on Chrome — the agent runs in the background service worker and is bound to the tab it started on, so it keeps clicking, typing, and reading that specific tab even when you move focus elsewhere. Tools that target a tab (CDP click, type, navigate, screenshot) all work on backgrounded tabs in Chrome. The sidebar locks the input while a task is running so you can't accidentally start a second task on the new tab — you'll need to wait or stop the current one. Note that browsers throttle timers and animations on background tabs, so heavily animated sites may respond slightly slower. On Firefox, the agent will keep running on its original tab too, but auto-screenshots are limited: Firefox's screenshot API can only capture the currently active tab, not a specific tab in the background. WebBrain detects this and skips the screenshot for that turn rather than feeding the model an image of an unrelated page. The agent will continue planning from text-based context until you switch back to its tab. Avoid actively clicking or typing on the same tab the agent is working on — that creates race conditions where you and the agent fight over the same page. Switching tabs is fine; co-driving the same tab is not.

Question 14

How does Profile auto-fill work, and is it safe?

Accepted Answer

Profile auto-fill is an optional feature in Settings → Profile. You enter a short bio — name, work email, company, and a throwaway password for low-stakes signups — and toggle it on. When enabled, WebBrain appends that text to the agent's system prompt so it can fill signup forms without asking every time. The text is stored in plaintext in your browser's local storage. It is not transmitted to the WebBrain project, but it is sent to whichever LLM provider you have configured on every turn, as part of the system prompt. Off by default. Do not put passwords for important accounts (Google, Apple, iCloud, banking, work SSO, primary email) here. Those accounts should use 2FA and shouldn't be handed to an agent anyway. A throwaway password you reuse for newsletter signups and free trials is the intended use case.

Question 15

What does WebBrain do with cookie banners and paywalls?

Accepted Answer

Cookie banners: WebBrain recognizes consent banners from common frameworks (OneTrust, Cookiebot, Didomi, Quantcast, Google Funding Choices, TrustArc) and dismisses them before reasoning about the page. Priority is "Reject all" / "Reject non-essential" / "Only necessary" when clearly visible; it falls back to "Accept all" rather than disappearing into the "Manage preferences" maze. Paywalls: WebBrain reports the paywall honestly and tells you what it could actually see (headline, dek, first paragraphs). It does not attempt to bypass paywalls — no archive.today, 12ft.io, cookie clearing, JS disabling, or reader-mode tricks. If you want the full article, log in with a subscription or ask WebBrain to look for free coverage of the same story.

Question 16

Does WebBrain support a dry-run mode?

Accepted Answer

As of 7.0.0, not yet. Dry-run mode is planned and already on the roadmap.

Question 17

How does WebBrain keep cloud LLM bills under control?

Accepted Answer

Three independent layers: Token-conscious screenshots. Before any image leaves your machine, WebBrain resizes it (shorter side capped, preserving aspect ratio) and iteratively JPEG-compresses it until it fits a per-turn image-token budget. A 2000×1200 screenshot that would cost you ~1,500 input tokens on GPT-4o gets compressed down to ~300–500 tokens with no practical loss for page-reading tasks. Implemented in _fitImageDimensions with unit tests for the budget math. Smart context trimming. Conversation history, tool outputs, and inline DOM dumps are bounded per turn and trimmed oldest-first when the active model's context window is approaching full. You won't see a run silently balloon from 10k tokens to 100k because a read_page returned a novel-length article. Dedicated vision model. Pair a cheap text model (e.g. GPT-4o-mini) for planning and tool calls with a separate vision-capable model (e.g. GPT-4o) only for screenshots, so you don't pay multimodal-model prices on every single turn. Configure under Settings → Vision. Net result: long sessions with cloud providers stay predictable. For full control, use llama.cpp locally — zero per-token cost.

Question 18

Can I contribute to WebBrain?

Accepted Answer

Absolutely! WebBrain is MIT-licensed and welcomes contributions. Check out the GitHub repository for issues, feature requests, and contribution guidelines.

Feature	WebBrain	Claude in Chrome
Open Source	MIT License	Proprietary
Price	Free forever	Requires Claude Pro ($20/mo)
Local LLM support	llama.cpp, Ollama	No — Claude only
Multi-provider	All OpenAI-compatible endpoints	Claude only
Chrome	Yes (MV3)	Yes
Firefox	Yes (MV2)	No
Side panel UI	Yes	Yes
Ask / Act modes	Yes	Similar
Fully offline	Yes (with local LLM)	No — cloud required
Self-hostable	Yes	No

Aspect	WebBrain	OpenClaw / Browser-Use / etc.
What is it?	Browser extension (end-user tool)	Agent framework / SDK (developer tool)
Target user	Anyone — no coding needed	Developers building automations
Installation	One-click browser install	Python/Docker setup required
UI	Built-in side panel chat	No UI — code or API only
Browser control	Content script (lightweight)	CDP / Playwright (full control)
Multi-tab workflows	Per-tab conversations	Programmable multi-tab orchestration
Headless mode	No — runs in your browser	Yes — headless automation
Extensibility	Add custom LLM providers	Full Python SDK, custom tools
Best for	Daily browsing AI assistant	Automated scraping / testing pipelines

The Open-Source AI Browser Agent

Product Catalog

Watch WebBrain in action

Everything you need in a browser AI

Page Understanding

Full Browser Agent

Data Extraction

Multi-Provider LLM

Privacy First

Smart Context

Dedicated Vision Model

Profile Auto-fill

Cookie & Paywall Aware

Optional CAPTCHA Solver

Multilingual UI

Token-Conscious

Bring your own AI

Ask or Act

Ask Mode

Act Mode

Install WebBrain

Chrome & Chromium

Firefox

How does WebBrain compare?

vs. Browser AI Plugins

vs. AI Agent Frameworks (different category)

Frequently Asked Questions

Spread the word, share the love