How to Make Claude Code Control Your Computer - Stop the Refusals

The first time I asked Claude to open Chrome, it laughed at me

Not literally. But basically.

I typed "open chrome and find latest news using owl" into Claude Code and got back:

I can't open Chrome or control your browser. I'm a code assistant. I work with files, run shell commands, and help with software engineering tasks.

That was two months ago. Today, Claude books flights for me, fills out forms, posts on social media, and runs a daily automation that checks my competitor's pricing page every morning at 9am. Same Claude. Same terminal.

Here is exactly what changed.

The problem is not Claude. It is the sandbox

Claude Code lives in a terminal. It can read files, run shell commands, write code. But it has no eyes. It cannot see your screen, move your mouse, or click a button.

So when you ask it to "open Chrome," it is not being difficult. It genuinely does not have the ability to do that. It is like asking someone blindfolded to find a light switch in a room they have never been in.

The fix is giving Claude eyes and hands. That is what OpenOwl does. It is an MCP (Model Context Protocol) server that adds 34 desktop automation tools to any AI assistant that supports MCP: screenshots, clicking, typing, OCR, window management. All running locally on your machine.

But here is what nobody tells you: even after you install OpenOwl, Claude will still refuse stuff. Getting it to actually use those tools consistently took me a few days of trial and error.

What I learned: you cannot ask for everything at once

My mistake was going big too fast. I would type something like:

Open Chrome, go to kayak.com, search for flights from NYC to London next Friday, find the cheapest one, and book it.

Claude would freeze up. Too many unknowns. It does not know what your screen looks like, where the buttons are, or what will happen after each click. So it defaults to "I can't do that."

The trick is one thing at a time. Seriously. One thing.

Step 1: Just take a screenshot

This is how every automation starts. Do not ask Claude to do anything yet. Just ask it to look.

Take a screenshot of my desktop

That is it. Claude calls the OpenOwl screenshot tool, captures your screen, and describes what it sees. The first time this works, something clicks in your head. Claude can see your screen. It knows what apps are open, where the dock is, what windows are visible.

This is the trust-building step. Claude learns it has tools. You learn what Claude can see.

Step 2: Open one app

Now that Claude can see, ask it to do one thing:

Open Google Chrome

Claude will use the OpenOwl tools to find Chrome (through the Dock, Spotlight, or the Applications folder) and open it. It might take a screenshot first to figure out the current state, then execute the action.

If this works, you are 80% there. Everything else is just variations of "look at screen, do a thing, look again."

Step 3: Give it a small mission

Now combine looking and doing:

Open Chrome and search for "flights NYC to London next Friday"

Claude will open Chrome, find the address bar, type the search query, and hit enter. It takes screenshots between steps to verify what happened. It is methodical, like a careful person using someone else's computer for the first time.

At this point you might notice Claude getting more confident. It stops asking for permission and starts chaining actions together. The tools are loaded, it knows they work, and it starts using them proactively.

Step 4: Build a repeatable task

Once you have done something manually 2-3 times with Claude, it gets the pattern. So now you can be more ambitious:

Go to kayak.com, search for the cheapest round-trip flight from NYC to London,
departing next Friday and returning the following Sunday.
Take a screenshot of the results.

Claude will handle the whole flow: open the browser, navigate to Kayak, fill in the form fields, click search, wait for results, and screenshot them for you.

The key insight: you had to do the small steps first. Claude needed to learn (in this session) that it has these tools and that they work. Once it has that context, it stops refusing.

Step 5: Schedule it

This is where it gets wild. Once you have a working automation, you can run it on a schedule.

I use Claude Code's built-in scheduling (or a simple cron job) to run my automations daily:

9:00 AM - Check competitor pricing, screenshot the results, save to a folder
9:15 AM - Open LinkedIn, check messages, summarize anything important
9:30 AM - Open Gmail, flag emails from specific senders

These run every morning before I sit down at my desk. Claude opens the apps, does the tasks, takes screenshots as proof, and I review the results when I am ready.

I did not build a complex automation framework. I did not write a Playwright script. I just talked to Claude enough times that it learned the routine, and then I scheduled it.
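
If you go the cron route, the wrapper can stay small. The sketch below is one possible shape, not the exact setup described above. It assumes each task's prompt is saved in a text file, that `claude -p` runs a prompt non-interactively, and that the OpenOwl tools are already approved for unattended runs (a scheduled job cannot answer permission prompts). The function names, the `CLAUDE_BIN` override, and the folder layout are all hypothetical.

```shell
#!/bin/sh
# Scheduling sketch. Function names, CLAUDE_BIN, and paths are hypothetical.

# Create (if needed) and print today's output folder, e.g. <base>/2025-01-31.
dated_dir() {
  dir="$1/$(date +%Y-%m-%d)"
  mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Run one saved prompt non-interactively and keep the transcript as a log.
#   $1 = prompt file (e.g. pricing.txt), $2 = base folder for dated output
# Assumption: `claude -p` is Claude Code's non-interactive print mode.
# CLAUDE_BIN can be overridden to try the flow without calling Claude.
run_task() {
  prompt_file="$1"
  out_dir=$(dated_dir "$2") || return 1
  log_file="$out_dir/$(basename "$prompt_file" .txt).log"
  "${CLAUDE_BIN:-claude}" -p "$(cat "$prompt_file")" > "$log_file" 2>&1
}

# Print a crontab line that runs COMMAND every day at HOUR:MINUTE.
#   $1 = minute, $2 = hour, $3 = command
cron_line() {
  printf '%s %s * * * %s\n' "$1" "$2" "$3"
}
```

Calling `cron_line 0 9 "/bin/sh /path/to/pricing.sh"` prints `0 9 * * * /bin/sh /path/to/pricing.sh`, which can be pasted into `crontab -e`. On macOS, cron will likely need its own Screen Recording and Accessibility grants before screenshots work.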

The prompting pattern that actually works

After two months of doing this daily, here is the pattern I use for any new automation:

Stage 1: "Take a screenshot of my desktop" (build trust)

Stage 2: "Open [app]" (one action)

Stage 3: "Open [app] and do [one specific thing]" (small mission)

Repeat stages 2-3 two more times with slightly bigger asks.

Stage 4+: Give it the full task. It will not refuse because it already knows the tools work.

Then: Schedule the full task to run daily or on a trigger.

The progression matters, and it works best inside a single conversation, because that is where Claude's context lives. If you skip straight to stage 4, you get refusals. If you build up to it, Claude figures out its own capabilities and starts using them proactively.
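
The same progression can be scripted. This is a minimal sketch, assuming `claude -p` runs a single prompt non-interactively and that adding `-c` continues the most recent conversation in the current directory, so each prompt sees the tool results from the ones before it. The `run_progression` name and the `CLAUDE_BIN` override are hypothetical.

```shell
#!/bin/sh
# Progression sketch: smallest ask first, bigger asks only after it succeeds.
# CLAUDE_BIN can be overridden to try the flow without calling Claude.

run_progression() {
  first=1
  for prompt in "$@"; do
    if [ "$first" -eq 1 ]; then
      # The first prompt starts a new conversation.
      "${CLAUDE_BIN:-claude}" -p "$prompt" || return 1
      first=0
    else
      # Later prompts continue it, so earlier tool results stay in context.
      "${CLAUDE_BIN:-claude}" -c -p "$prompt" || return 1
    fi
  done
}
```

A call might look like `run_progression "Take a screenshot of my desktop" "Open Google Chrome" 'Open Chrome and search for "flights NYC to London next Friday"'`. The loop stops only when a command exits with an error. A polite refusal still exits cleanly, so read the output before adding a bigger ask.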

Why this works (technically)

When Claude Code starts a session, the tools from your registered MCP servers are available to it, but it often does not act as if it has them until it has used one. The first time you ask for a screenshot and it works, Claude's context includes "I have a screenshot tool and it works." Each successful tool use adds to that context.

By the time you are asking for complex multi-step automations, Claude has a full context of:

What tools are available (screenshot, click, type, scroll, OCR, etc.)
That they actually work on your machine
What your screen looks like
What happened in previous steps

It is not jailbreaking. You are not tricking Claude into doing something it should not do. You are giving it legitimate tools via MCP, the Model Context Protocol (which Anthropic built), and then letting it discover what it can do incrementally.

Setting it up

If you want to try this yourself:

1. Install OpenOwl (takes about 60 seconds)

npm install -g openowl

2. Get a free API key at openowl.dev

3. Save your key

mkdir -p ~/.openowl
echo "YOUR_API_KEY" > ~/.openowl/api.key

4. Register with Claude Code

claude mcp add owl --transport stdio -s user -- owl

5. Grant macOS permissions

System Settings → Privacy & Security → Screen Recording → enable for your terminal
System Settings → Privacy & Security → Accessibility → enable for your terminal

6. Start with a screenshot

claude "Take a screenshot of my desktop"

If that works, you are in. Follow the progression above and you will have Claude automating your desktop within an hour.
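
If that first screenshot does not work, a quick preflight check can narrow down which step was missed. The sketch below is a hypothetical helper, not something shipped with OpenOwl. It assumes the `claude` and `owl` commands from step 4 are on your PATH and that the key lives at `~/.openowl/api.key` as in step 3. It does not check the macOS permissions from step 5.

```shell
#!/bin/sh
# Preflight sketch (hypothetical helper, not part of OpenOwl).

# Succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if the key file exists and is not empty.
have_key() {
  [ -s "$1" ]
}

# Prints one line per check and returns non-zero if any check fails.
#   $1 = key file path (defaults to the location used in step 3)
preflight() {
  key_file="${1:-$HOME/.openowl/api.key}"
  status=0
  for cmd in claude owl; do
    if have_cmd "$cmd"; then
      echo "ok: $cmd found"
    else
      echo "missing: $cmd is not on PATH"
      status=1
    fi
  done
  if have_key "$key_file"; then
    echo "ok: key file present"
  else
    echo "missing: $key_file is absent or empty"
    status=1
  fi
  return "$status"
}
```

Running `preflight` with no arguments checks the default key location. Any line that starts with `missing:` points back to the setup step to redo.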

What I automate daily

For context, here is what my daily Claude automations look like after two months of building them up:

Competitor monitoring - Claude opens three competitor websites, screenshots their pricing pages, and saves them to a dated folder. Runs at 9 AM.
Email triage - Claude opens Gmail, scans for emails from specific senders (investors, partners, key clients), and writes me a 3-line summary. I review it over coffee.
Social posting - I write a tweet draft in a text file. Claude opens Twitter, pastes it in, and posts it. I started doing this after I realized I was spending 10 minutes just navigating to Twitter and getting distracted.
Form filling - When I have to submit the same type of form repeatedly (invoices, applications), Claude fills them out from a spreadsheet.

None of these are fancy. None required writing code. I just talked to Claude, built up the tasks gradually, and scheduled them.

The honest limitations

I should be clear about what does not work well:

Speed-sensitive tasks - Claude takes screenshots between steps, so each action takes a few seconds. It will not win any speed contests.
CAPTCHAs - Obviously.
Apps that change layout frequently - Claude re-orients each time, but if a UI changes completely between sessions, it might get confused.
Anything that needs to happen in under a second - This is methodical automation, not scripted macros.

For everything else, the repetitive, boring, click-heavy stuff that eats an hour of your day, it works surprisingly well.

Stop asking for permission

The biggest mindset shift was this: stop framing requests as questions. Do not ask Claude "Can you open Chrome?" because it will tell you all the reasons it cannot.

Instead, tell it: "Open Chrome." Or better: "Take a screenshot, then open Chrome."

Assume the tools work. If Claude pushes back, scale down to a smaller ask and work back up. The tools are there. Claude just needs to discover them.

I am Mihir, the maker of OpenOwl. I built this because I got tired of switching between terminal and GUI 100 times a day. If you get stuck setting up, join our Discord and I will help you personally.

OpenOwl ranked #4 on Product Hunt launch day. Free tier includes 50 tool calls/day.
