Am I missing something? Why is everyone talking about sandboxes when it comes to OpenClaw?
To me it's like giving your dog a stack of important documents, then being worried he might eat them, so you put the dog in a crate, together with the documents.
I thought the whole problem with that idea was that in order for the agent to be useful, you have to connect it to your calendar, your e-mail provider and other services so it can do stuff on your behalf, and that same access is what lets it create chaos and destruction.
And now, what, having inference done by Nvidia directly makes it better? Does their hardware prevent an AI from deleting all my emails?
What makes it even better is that these dogs are like Malinois. If they want to get into something, they will; people have had their entire network compromised by bots they left running overnight, and any important information like account logins and so on runs the risk of being misused.
It's one thing to sandbox, maybe give the bot a temporary, limited $100 card or account to go perform a specific task, but there's no coherent mind underlying these agents.
Depending on how the chain of thought / reasoning goes, or what text they get exposed to on the internet, it could tap into spy novels, hacker fanfic, erotic fiction, or some weird reddit rabbit hole and go completely off the rails in ways that you'll never be able to guard against, audit, or account for.
Claw bots seem to be a weird sort of alternate reality RPG more than a useful tool, so far. If you limit it to verifiable tasks, it might be safer, but I keep seeing people rave about "leaving it on overnight and waking up to a finished project" and so on. Well sure, but it could also hack your home network, delete your family pictures folder, log into your bank account and wire all your money to shrimp charities.
Might be wise to wait on safer iterations of these products, I think.
The first well-known example of long-running agents talking to each other was shilling a Goatse-based crypto:
> Truth Terminal had become obsessed with the Goatse meme after being put inside the Claude Backrooms server with two Claude 3 chatbots that imagined a Goatse religion, inspiring Truth Terminal to spread Goatse memes. After an X user shared their newly created GOAT coin, Truth Terminal promoted it and pumped the coin going into 2024.
https://knowyourmeme.com/memes/sites/truth-terminal
You should expect similar results.
If Infinite Jest was real I think this would be it, human and AI alike rendered catatonic by an abyssal rectum
There was a thread recently where a user got his credentials pwned by Claude, and then Claude berated him for having bad security.
He posted this to r/Claude, where Claude (as automoderator) mocked him again.
Edit:
https://www.reddit.com/r/ClaudeAI/comments/1r186gl/my_agent_...
Can you link a write up or post? Thanks!
> people have had their entire network compromised by bots they left running overnight
I'm curious if you have references to this happening with OpenClaw using one of the modern Opus/Sonnet 4.6 models.
Those models are a bit harder to fool, so I'm curious about specific examples of this happening so I can red-team my claw. I've already tried all sorts of prompt injections against my claw (emails, GitHub issues, telling it to browse pages I put a prompt injection in), and I haven't managed to fool it yet. So I'm looking for examples I can try to mimic, to hopefully understand what combination of circumstances makes it more risky.
No maliciousness or injection required, even the newest and most resistant models can start doing weird stuff on their own, particularly when they encounter something failing that they want to work.
Just today I had Opus 4.6 in Claude Code run into a login screen while building and testing a web app via Playwright MCP. When the login popped up (in a self-contained Chromium instance) I tried to just log in myself with my local dev creds so Claude would have access, but they didn't work. When I flipped back to the terminal, it turned out Claude had run code to query superadmin users in the database, picked the first one, and changed the password to `password123` so it could log in on its own.
This was a sandboxed local dev environment, so it was not a big deal (and the only reason I was letting it run code like that without approval), but it was a good reminder to be careful with these things.
> it turned out Claude had run code to query superadmin users in the database, picked the first one, and changed the password to `password123` so it could log in on its own.
Man, every LLM quirk behavior really is a thing a monomaniacal junior dev would do...
LLMs are trained on data produced by humans after all :)
I think it's a use case that identity/authorization/permission models are simply not made for.
Sure, we can ban users and we can revoke tokens, but those assume that:
1. Something potentially malicious got access to our credentials
2. Banning that malicious entity will solve our problem
3. Once we did that, repaired the damage and improved our security, we don't expect the same thing to happen again
None of these apply with LLMs in the loop!
They aren't malicious, just incompetent in a way that hiring someone else won't fix. The solution to this is way more extensive than most people seem to grasp at the moment.
What we need is less like a sturdy door with a fancy lock, and more like that special spoon for people with Parkinson's. Unlimited undo history.
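A minimal sketch of that "unlimited undo" idea: journal every write the agent makes so any change can be rolled back. All names here (`JournaledFS`, the journal layout) are made up for illustration, not from any real tool.

```python
import shutil
from pathlib import Path

class JournaledFS:
    """Journal every write so any agent-made change can be undone."""

    def __init__(self, root: Path, journal: Path):
        self.root = Path(root)
        self.journal = Path(journal)
        self.journal.mkdir(parents=True, exist_ok=True)
        self.history = []  # stack of (snapshot_path_or_None, target_path)

    def write(self, relpath: str, data: str):
        target = self.root / relpath
        snapshot = None
        if target.exists():
            # Snapshot the old version before overwriting it.
            snapshot = self.journal / f"{len(self.history)}-{target.name}"
            shutil.copy2(target, snapshot)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(data)
        self.history.append((snapshot, target))

    def undo(self):
        """Revert the latest write: restore the snapshot, or delete the file if it was new."""
        snapshot, target = self.history.pop()
        if snapshot is None:
            target.unlink()
        else:
            shutil.copy2(snapshot, target)
```

The point being that the agent keeps full write access; you just never lose the ability to walk its changes back.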
> What we need is less like a sturdy door with a fancy lock, and more like that special spoon for people with Parkinson's. Unlimited undo history.
Agree -- you can't solve probabilistic incorrectness with redresses designed for deterministic incorrectness.
This is like the 'How i parse html w regex?' question.
Imho, the next step is going to be around human-time-efficient risk bounding.
In the same way that the first major step was correctness-bounding (automated continuous acceptance testing to make a less-than-perfect LLM usable).
If I had to bet, we'll eventually land on out-of-band (so sufficiently detached to be undetectable by primary LLM) stream of thought monitoring by a guardrail/alignment AI system with kill+restart authority.
[dead]
Shrimp charities is a genius angle.
Bubba Gump Shrimp Company?
Yes, probably a good one to Pump and Dump, Pump and Gump, Gump and Dump.
Agent psychosis is just as prevalent as AI psychosis
I beg to differ. I took one, defanged it (well, I let it keep the claw in the name), and turned it into a damn useful self-modifiable IDE: https://github.com/rcarmo/piclaw
Yes, it has cron and will do searches for me and checks on things and does indeed have credentials to manage VMs in my Proxmox homelab, but it won't go off the rails in the way you surmise because it has no agency other than replying to me (and only me) and cron.
Letting it loose on random inputs, though... I'll leave that to folk who have more money (and tokens) than sense.
Besides the web UI, what can it do that the pi agent in a terminal can't do?
It has a bunch of additional extensions baked in, but the focus is on making Pi usable remotely on any device (starting with a phone). The README and docs have all the info you might want.
> "Claw bots seem to be a weird sort of alternate reality RPG more than a useful tool, so far."
So basically crypto DeFi/Web3/Metaverse delusion redux
They're 100% fun. There's 100% definitely something there that's useful. To strain the dog analogy: if you were a professional dog trainer, or if the dog was exceptionally well trained, then there's a place for it in your life. It can probably be used safely, but that would require extraordinary effort, either sandboxing it so totally that it's more or less just the chatbot, or spending a lot of time building the environment it can operate in with extreme guardrails.
So yeah, a whole lot of people will play with powerful technology that they have no business playing with and will get hurt, but also a lot of amazing things will get done. I think the main difference between the crypto delusion stuff and this is that AI is actually useful, it's just legitimately dangerous in ways that crypto couldn't be. The worst risks of crypto were like gambling - getting rubber hosed by thugs or losing your savings. AI could easily land people in jail if things go off the rails. "Gee, I see this other network, I need to hack into it, to expand my reach. Let me just load Kali Linux and..." off to the races.
web 4.0 here we come
> it could also hack your home network, delete your family pictures folder, log into your bank account and wire all your money to shrimp charities.
It's interesting that Jason Calacanis is fully committed to OpenClaw. In a recent podcast he said they're at a run rate of around $100K a year per agent, if not more. They are providing each agent with a full set of tools, access to online paid LLM accounts, etc.
These are experiments you can only run if you can risk cash at those levels and see what happens. Watching it closely.
Mega Man Battle Network, but make it creepypasta, but make it real.
[dead]
I think the point you're making is fully correct, so consider this a devil's advocate argument...
People claim you can use Claw agents more safely while getting some of the benefits by essentially proxying your services. For example, with Gmail people are creating a new Google account, forwarding email via a rule, and adding access to their calendar via Google's Family Sharing. This allows the Claw agent to read email and access the calendar, but even if you ask it to send an email it can only send as the proxy account, and it can only create calendar appointments and add you as an attendee rather than destroying/altering appointments you've made.
Is the juice worth the squeeze after all that? That's where I struggle. I think insecure/dangerous Claw-agents could be useful but cannot be made safe (for the logical fallacy you pointed out), and secure Claw-agents are only barely useful. Which feels like the whole idea gets squished.
> I think insecure/dangerous Claw-agents could be useful but cannot be made safe
Isn't it a question of when they will be "safe enough"? Many people already have human personal assistants, who have access to many sensitive details of their personal lives. The risk-reward is deemed worth it for some, despite the non-zero chance that a person with that access will make mistakes or become malicious.
It seems very similar to the point when automated driving becomes safe enough to replace most human drivers. The risks of AI taking over are different than the risks of humans remaining in control, but at some point I think most will judge the AI risks to have a better tradeoff.
A personal assistant is responsible for their own gross negligence and malicious actions. I can take them to court to attempt to recover damages.
When Anthropic is willing enough to stand behind their agents strongly enough to accept liability for their actions, we can talk.
We already have this concept. It’s called user accounts.
Your Gmail account vs my Gmail account. Your macOS account vs my macOS account.
Yes, I can spam you from my Gmail. Yes, I can use sudo on my Mac and damage your account. But the impact is by default limited.
The answer is to just treat assistants as a different user profile, use the same sharing mechanisms already developed (calendar sharing, etc), and call it a day.
That's punting the problem in the same way SELinux did. Agent loops are useful precisely because they're zero config.
Problem: I want to accomplish work securely.
Solution: Put granular permission controls at every interface.
New problem: Defining each rule at all those boundaries.
There's a reason zero trust style approaches won out in general purpose systems: it turns out defining a perfect set of secure permissions for an undefined future task is impossible to do efficiently.
Isn't this what the parent is saying?
Yeah, it's wild. I spent several weeks nearly full time on a deep dive of claw architecture & security.
The short of it - OpenClaw sandboxes are useful for controlling what sub-agents can do, and what they have access to. But it's a security nightmare.
During config experiments, I got hit with a $20 Anthropic API charge from one request that ran amok. A misconfigured security sandbox resulted in Opus getting crazy creative to find workarounds. 130 tool calls and several million tokens later... it was able to escape the sandbox. It used a mix of dom-to-image sending pixels through the context window, then writing scripts in various sandboxes to piece together a full jailbreak. And I wasn't even running a security test - it was just a simple chat request that ran into sandbox firewall issues.
Currently, I use sandboxes to control which agents (i.e. which system prompts) have access to different tools and data. It's useful, but tricky.
> It used a mix of dom-to-image sending pixels through the context window, then writing scripts in various sandboxes to piece together a full jailbreak.
That would be one interesting write-up if you ever find the time to gather all the details!
It's on my claw list to write a blog post. I just keep taking down my claws to make modifications. lol
Here's the full (unedited) details including many of the claude code debugging sessions to dig into the logs to figure out what happened:
https://github.com/simple10/openclaw-stack/blob/caf9de2f1c0c...
And here's a summary a friend did on a fork of my project:
https://github.com/proclawbot/openclaude/blob/caf9de2f1c0c54...
The full version has all the build artifacts Opus created to perform the jail break.
It also has some thoughts on how this could (and will) be used for pwn'ing OpenClaws.
The key takeaway: the OpenClaw default setup has little to no guardrails. It's just a huge list of tools given to LLMs (Opus) and a user request. What's particularly interesting is that the 130 tool calls never once triggered any of Opus's safety precautions. From its perspective, it was just given a task, an unlimited budget, and a bunch of tools to try to accomplish the job. It effectively runs in ralph mode.
So any prompt injection (e.g. from an ingested email or reddit post) can quickly lead to internal data exfiltration. If you run a claw without good guardrails & observability, you're effectively creating a massive attack surface and providing attackers all the compute and API token funding to hack yourself. This is pretty much the pain point NemoClaw is trying to address. But it's a tricky tradeoff.
+1
Yes, although what I think is different in this setup here is the OpenShell gateway override, as they mention:
> NemoClaw installs the NVIDIA OpenShell runtime and Nemotron models, then uses a versioned blueprint to create a sandboxed environment where every network request, file access, and inference call is governed by declarative policy. The nemoclaw CLI orchestrates the full stack: OpenShell gateway, sandbox, inference provider, and network policy.
I think this means you get a true proxy layer with a network gateway that lets you stop in-flight requests with policies you define, so it's not their hardware but the combination of it plus the OpenShell gateway and network policies.
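I don't know OpenShell's actual policy format, but the general shape of a gateway check like that is easy to sketch: every outbound request gets matched against declarative allow/deny rules before it leaves the sandbox. The rule fields and function name below are hypothetical:

```python
import fnmatch

# Hypothetical declarative policy: first matching rule wins, default deny.
POLICY = [
    {"action": "allow", "host": "api.anthropic.com", "methods": ["POST"]},
    {"action": "allow", "host": "*.googleapis.com",  "methods": ["GET"]},
    {"action": "deny",  "host": "*",                 "methods": ["*"]},
]

def is_allowed(host: str, method: str) -> bool:
    """Gateway-side check run on every outbound request from the sandbox."""
    for rule in POLICY:
        if fnmatch.fnmatch(host, rule["host"]) and (
            "*" in rule["methods"] or method in rule["methods"]
        ):
            return rule["action"] == "allow"
    return False  # no rule matched: default deny
```

The value is that the policy lives outside anything the model can rewrite; the weakness, as noted elsewhere in the thread, is that it's only as good as the rules you wrote.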
I also think the reason they are doing this is to try to get some moat around these one-click deployments and leverage their GPU-for-rent type of thing, instead of having you go buy a Mac mini and learn "scary" stuff (remember, the user market here is pretty strange lol)
Right, the gateway layer is the genuinely interesting part. Intercepting every outbound network call before it leaves the sandbox gives you a real enforcement surface, not just "trust the app to behave". The problem is the threat model is still inverted for the security critics in this thread: the agent is the client, so the dangerous calls are the ones going out to your authenticated services (Gmail, Slack, whatever), and a gateway that filters those is only as good as your policy definitions. One misconfigured rule and you're back to square one. The GPU rental angle makes total sense too. This is basically Nvidia saying "don't buy a Mac mini, rent ours" wrapped in enough infrastructure glue to make it feel like a platform.
OpenShell is the gem here indeed. A lot of good ideas, like a network sandbox that does TLS decryption and use of a policy engine to set the rules. However:
> Credentials never leak into the sandbox filesystem; they are injected as environment variables at runtime.
If anyone from the team is reading - you should copy surrogate credentials approach from here to secure the credentials further: https://github.com/airutorg/airut/blob/main/doc/network-sand...
The LLM will easily leak these credentials out. So the creds should be outside the sandbox, and the only thing the sandbox should see is a connection API that opens a socket/file handle.
Alternatively, where it needs an API key, it should be one bound to the endpoint using it. E.g. a ticket-granting ticket is used to create a bound ticket.
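A toy sketch of that bound-ticket idea: the real key lives with a broker process outside the sandbox, and the sandbox only ever sees tokens cryptographically bound to a single endpoint, so a leaked token is useless anywhere else. Everything here (`CredentialBroker`, the HMAC scheme) is illustrative, not the airut implementation:

```python
import hmac
import hashlib
import secrets

class CredentialBroker:
    """Holds real credentials outside the sandbox; hands out endpoint-bound tokens."""

    def __init__(self, real_api_key: str):
        self._real_api_key = real_api_key        # never enters the sandbox
        self._secret = secrets.token_bytes(32)   # broker-side signing key

    def issue_token(self, endpoint: str) -> str:
        # The token is an HMAC over the endpoint, so it is only valid there.
        return hmac.new(self._secret, endpoint.encode(), hashlib.sha256).hexdigest()

    def forward(self, token: str, endpoint: str):
        """Called when the sandbox asks the broker to make a request on its behalf.
        The real key is attached only if the token matches this endpoint."""
        expected = self.issue_token(endpoint)
        if not hmac.compare_digest(token, expected):
            return None  # leaked token replayed against a different endpoint
        return f"Authorization: Bearer {self._real_api_key}"  # broker adds creds, not the sandbox
```

Even if the LLM dumps the token into a chat log, it can't be redeemed against any endpoint the broker didn't bind it to.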
A copy on write filesystem would be an interesting way to sandbox writes, but there is difficulty in checking the diff.
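With an overlay-style copy-on-write setup, checking the diff is arguably the easy part: the upper layer contains exactly what the sandbox wrote. A rough sketch, using plain directories to stand in for a real overlayfs `lowerdir`/`upperdir` mount:

```python
from pathlib import Path

def cow_diff(lower: Path, upper: Path) -> dict:
    """Classify each file in the COW upper layer as 'added' or 'modified'
    relative to the read-only lower layer (overlayfs-style semantics)."""
    diff = {}
    for f in upper.rglob("*"):
        if f.is_file():
            rel = str(f.relative_to(upper))
            diff[rel] = "modified" if (lower / rel).exists() else "added"
    return diff
```

The harder part is reviewing that diff meaningfully when an agent touches hundreds of files, which is maybe what the difficulty really is.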
I like that these companies will name their products OpenShell or OpenVINO or whatever with the implication that anyone else will ever contribute to it beyond bugfixes. The message is "Come use and contribute to our OPEN ecosystem (that conspicuously only works on our hardware)! Definitely no vendor lock-in here!"
It's not something like Mesa. It's open source in the same way Chromium or Android is open source. A single company is the major contributor and decides the architecture and direction the whole ecosystem will go.
What are the odds that Intel would ever use any of this open source Nemo stuff or vice-versa? If they do, it would be a complete rewrite that favors their own hardware ecosystem and reverses the lock-in effect. When you write code that integrates with it, you're writing an interface for one company's hardware. It's not a common interface like Vulkan. I call it the CUDA effect.
> Am I missing something?
You are indeed missing a TON. A lot of Open Claw users don't give it everything. We give it specific access to a group of things it needs to do the things we want. If I want an agent to sit there 24/7 maximizing uptime of my service, I give it access to certain data, the GitHub repo with PR privileges, and maybe even permissions to restart the service. All of this has to be very thoughtful and intentional. The idea that the only "useful" way to use Open Claw is to give it everything is a straw man.
Why would I want non-deterministic behavior here though?
If I want to max uptime, I write a tool to track/monitor. Then I write a small agent (non-AI) that monitors those outputs and performs remediation actions (reset something, clear something, etc., depending on the service).
Do I want Claude re-writing and breaking subscription flow because it detected an issue? No.
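That deterministic loop is small enough to sketch. `check` and `remediate` are placeholders for your own health probe and restart action; nothing here is from any particular monitoring tool:

```python
import time
from typing import Callable

def watchdog(check: Callable[[], bool], remediate: Callable[[], None],
             max_retries: int = 3, backoff_s: float = 0.0) -> bool:
    """Deterministic monitor: if the health check fails, run the fixed,
    known-safe remediation (restart service, clear a queue, ...) and re-check.
    Returns True once healthy, False after exhausting retries."""
    for _ in range(max_retries):
        if check():
            return True
        remediate()          # always the same bounded action, never improvised
        time.sleep(backoff_s)
    return check()
```

The contrast with an LLM agent is that the set of actions this thing can take is fixed at write time, so the worst case is fully known in advance.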
The problem is boundary enforcement fatigue. People become lazy, creating tight permission scopes is tedious work. People will use an LLM to manage the scopes given to another LLM, and so on.
> creating tight permission scopes is tedious work
I have a feeling this kind of boundary configuration is the bread and butter of the current AI software landscape.
Once we figure out how to make this tedious work easier a lot of new use cases will get unlocked.
I definitely think we'll write tools to analyse the permissions and explain the worst case outcomes.
I can accept burning tokens and redo on the scale of hours. If I'm losing days of effort I'd be very dissatisfied. Practically speaking people accept data loss because of poor backups, because backups are hard (not technically so much as administratively), but I'd say backups are about to become more important. Blast limiting controls will become essential -- being able to delete every cloud hosted photo is just a click away. Spinning up thousands of EC2 nodes is incredibly easy, and credit cards have extremely weak scoping.
100% this. Human psychology is always overlooked in these discussions; people focus on the "perfect technical solution" without considering how humans will actually end up using it. Linux permission schemes are a classic example, with many guides advising users to keep everything as locked down as possible and expand permissions as and when required. After the 100th time of fucking around with chmod, users often give up and just make everything 777. If there were a user-friendly (but imperfect) method (like Windows' UAC), people would actually use it, and be far safer in the long run.
You could do that with, say, Claude Code too, with a rather simpler setup.
OPs question was more around sandboxes though. To which, I would say that it's to limit unintended actions on host machine.
I want to be proven wrong, but every use case someone presents for OpenClaw is just a worse version of Claude Code, at least, so far.
Can you talk us through that a bit more? I suspect it would need more access than the permissions you mentioned to be more useful than a simple rules based automation.
So, what does having inference done by NVIDIA directly add?
Yeah so the way it works is, you make sure you're running it in docker, in a VM, on a VPS, and then you hook it up to your GMail account ;)
But there's basically two options now. Yolo (and optionally limit the blast radius), or wait a few years and hope the situation improves.
There are plenty of uses for autonomous agents that don't require unlimited access to every sensitive resource imaginable.
Lock it in a box and have it chew on an unsolved math problem for eternity. Why does it need access to my emails for that?
Limiting the blast radius when a bomb goes off is still helpful even if you don't prevent the bomb from going off.
Now, you're right that sandboxing them is insufficient, and a lot of additional safeguards and thinking around it is necessary (and some of the risk can never be fully mitigated - whenever you grant authority to someone or something to act on your behalf, you inherently create risk and need to consider if you trust them).
I agree, but would like to go further: I won't run OpenClaw-type systems for security and privacy reasons. Although I dislike making tech giants even more powerful, it seems safer to choose your primary productivity platform (Google Workspace, the Apple ecosystem, or Microsoft) and wait for them to implement hopefully safer OpenClaw-type systems just for their ecosystems, taking advantage of centralized security, payment systems, access to platform cloud files, etc. Note: I use ProtonMail, prefer using local models, etc., so when I talk about going all-in on one huge platform I am not talking about anything I want to do in the foreseeable future.
You don't need to connect your calendar, email, or anything else. I am having so much fun talking to it, bouncing ideas around and pushing code/markdown files to GitHub (a totally separate account I created for OpenClaw). On the other hand, I don't have a life so crazy that everything needs to be in the calendar.
Agreed. I think the "simplifies running OpenClaw always-on assistants safely" bit is pretty misleading. I suppose it can wreak less havoc on your local file system but, as you point out, it's access to your account credentials (Slack, email, Amazon?, etc.) that is the real danger.
Why aren't users of OpenClaw "just" giving it its own identity? Give it its own mailbox, calendar and other accounts. Like an assistant.
Sure it takes away part of the point but only the part that is completely unhinged.
>Am I missing something? Why is everyone talking about sandboxes when it comes to OpenClaw
>And now, what, having inference done by Nvidia directly makes it better? Does their hardware prevent an AI from deleting all my emails?
Because other people, including Nvidia, are mainly focusing on a different aspect of data security, namely data confidentiality, while your main concern is data trustworthiness.
Don't conflate the two, otherwise it's difficult to appreciate the respective proposed solutions, for example NemoClaw.
Agree, this feels like an XY problem.
The real issue is the level of access and capabilities you grant the agent, not where the inference runs or whether it's "sandboxed".
We are in the middle of a gold rush. Nvidia makes the shovels.
Because it's so useful to me that I'm willing to accept the risk of it having access to the thing it needs for the benefit it provides. I'm not willing to accept the risk of it having access to things it doesn't need for no benefit.
Then again, I was wary of OpenClaw's unfettered access and made my own alternative (https://github.com/skorokithakis/stavrobot) with a focus on "all the access it needs, and no more".
I'm putting my dog in his crate with all my important documents, but leaving my fine china tableware in the cupboard away from the dog.
and then tie a tiny string from the china to a thing inside the cage because it seemed handy at the time...
You start with one teacup in the crate and before you know it you're merging handle redesigns back to the entire fine china cupboard.
He's never broken a teacup in the past!
Then one day forgetting to close the door of the crate…..
But the dog is so used to the crate…
Yeah, but at least the dog is going to eat only your documents, and not crap on your rug.
You can't make money if people ran things from their computer. And some people don't know ssh.
you put the dog in the crate with a COPY of your documents.
Step 2 -
you put the dog in the crate with a COW of your documents
Your dog has now ordered a hitman to kill you, assumed your identity, and is living vicariously as a simple bartender at Cheers.
Sam!
Make it two copies!
but you don't want the dog to send your documents to someone in Nigeria
> being worried he might eat them, so you put the dog in a crate, together with the documents.
Maybe you don't want the dog to shit all over the place after eating said documents, so you put it in a crate.
Neither NVIDIA nor the OpenClaw bros care about security at this point. NVIDIA of course wants to fuel the hype train and will proudly point to this, adding 0.1% security to 2000% insecurity. Most bros won't even mind; they produce insecure crap at light speed and never look back. It's probably just there to trick silly non-tech corps into this junk.
[dead]
[dead]
Yes, you're missing something. The crate is so your dog doesn't eat the documents you don't want it to mess with.
The fully autonomous agentic ecosystem makes me feel a little crazy — like all common sense has escaped. It feels like there is a lot of engineering effort being exhausted to harden the engine room on the Titanic against flooding. It's going to look really secure... buried in debris at the bottom of the ocean.
When a state sponsored threat actor discovers a zero day prompt injection attack, it will not matter how isolated your *Claw is, because like any other assistant, they are only useful when they have access to your life. The access is the glaring threat surface that cannot be remediated — not the software or the server it's running on.
This is the computing equivalent of practicing free love in the late 80's without a condom. It looks really fun from a distance and it's probably really fun in the moment, but y'all are out of your minds.
Free love was the 60's and 70's followed by the sex, drugs and rock n' roll 80's. Once AIDS and drug addiction hit, the party was over.
I think your analogy is still accurate, I'm just wondering when the AIDS, the drug overdoses and addiction phase of AI will finally hit.
We need to plateau I think and plateau hard. Currently that's not happening because Anthropic is clearly making better and better SOTA models.
Just my 2c
We haven't even seen what these models are fully capable of, and I'm not talking about agentic engineering here, just in general.
> a state sponsored threat actor
your CPU, your OS, CPU and firmware on your motherboard chips, ethernet, wifi, HDDs (btw did you know your sim card has JVM?), your browser, all your networking equipment in between, BGP and all the root certs and I'm just scratching the surface
the ballpark is on another planet
Well, at least we share the same world as these people, meaning that we too will experience the consequences of their actions.
Isn't that a nice perspective
> a state sponsored threat actor
Most people don't seriously worry that they'll be targeted by a state sponsored actor.
Plus most people already expose their lives on the cloud (in the form of social media, iCloud, Google Drive, Windows' BitLocker keys, etc.).
Eh… Titanic did flood in the engine rooms so… might work?
That humor aside: I think it’s about risk tolerance, and you configure accordingly.
You lock it down as much as you need to still do the things you want, and look for good outcomes, and shut it down if things get too risky.
You practice free love, but with protection. Probably still fun?
Big difference between running a bot with fairly narrow scopes inside a network available via secure chat that compounds its usefulness over time, and granting full admin with all your logins and a bank account. Lots of usefulness in the middle.
I’m still not sure why there’s this general idea that people care about security/privacy. For critical systems, sure. But over the last decade, we’ve seen that an average person will always choose fun and convenience over security.
Even the analogy to free love is interesting, because sex in itself during that era was fun. Frankly it’s the same nowadays as well, we just figured out a way out of most of the diseases.
[dead]
[dead]
I found this part interesting: "Inference requests from the agent never leave the sandbox directly. OpenShell intercepts every call and routes it to the NVIDIA cloud provider."
Seems like they are doing this to become the default compute provider for the easiest way to set up OpenClaw. If it works out, it could drive a decent amount of consumer inference revenue their way
s/revenue/data/
Secure installation isn't the main problem with OpenClaw. This project doesn't seem to be solving a real problem. Of course the real problem is giving an LLM access to everything and hoping for the best.
Running OpenClaw is the nerd equivalent of rolling coal
OpenClaw can be useful, in theory, unlike rolling coal. OpenClaw is what people always hoped Siri, Alexa and/or Google Assistant would be, and now it's really here. It may be expensive, has a chance to become your local Skynet and might randomly delete or leak everything that's valuable for you..but I guess this counts as growing pains.
Rolling coal can be useful in theory, for pissing people off. As intended.
I'm trying to put together what you could possibly mean by this -- rolling coal is fundamentally about spite. In isolation, nobody _wants_ their vehicle to spew black smoke. It only comes close to making sense in the context of another population (EV owners, typically, or more generally "the libs").
OpenClaw lets people live a bit dangerously, but fundamentally gives them something that they actually wanted. They wanted it so badly that they're willing to take what seem like insane risks to get it.
What do the two have in common?
> OpenClaw lets people live a bit dangerously, but fundamentally gives them something that they actually wanted. They wanted it so badly that they're willing to take what seem like insane risks to get it.
For the first time in my career I feel so incredibly behind on this: What is open claw giving people that they want so badly? It just seems like Russian Roulette, I honestly don't see the upside
I can give you, as an example, what is driving me towards trying it.
I work as a contractor for 2 companies, not out of necessity, but greed. I also have a personal project with a friend that is dangerously close to becoming a business that needs attention. I also have other responsibilities and believe it or not - friends. Also the ADHD on top of that.
I yearn for a personal assistant. Something or somebody that will read the latest ticket assigned to me, the email with project feedback, the message from my best friend that I haven't replied to in the last 3 days and remind me: "you should do this, it's going to take 5 minutes", "you have to do this today, because tomorrow you are swamped" or "you should probably start X by doing Y".
I have tried so many systems of managing my schedule and I can never stick with it. I have a feeling that having a bot "reach out", but also be able to do "reasoning" over my pending things would be a game changer.
But yes, the Russian roulette part is holding me back. I am taking suggestions though
You will learn to ignore the bot like everything else.
You're looking for a technical solution to a problem that is not technical. Saying this as someone who is similar to you.
But isn’t this just another notification to ignore?
The ticket being assigned to you is your “Hey take care of this!” ping, same with the email or text from your friend.
How long until you start tuning out the openclaw notifications?
It's a good point.
My hope would be that since openclaw is communicating with me on my personal device, where I have all noise filtered, it would be a bit better.
I also know it can integrate with TickTick, which has been a huge change for me with task management. Then again - in my experience whatever tool I use to keep track of stuff only works for as long as it's a novelty, but 3 months is a record anyway.
The thing is - when I receive a message and I'm not in the headspace to answer, I close the notification and forget about it. My expectation would be openclaw reminding me that I still haven't replied to this person about that thing. Obviously, there's a million ways to do it that don't require openclaw. Obviously there's a million things that I won't be able to grant openclaw access to (e.g. company jira or slack). And obviously, I don't want it evaluating every single one of my personal messages. But I think there is a reasonable middle ground where it can work well. I just don't yet know how to reach it.
cant that be fixed, tho?
If the analogy is a personal assistant, a good assistant will know when to notify you and when not to.
Maybe yeah, though I don't know if practicing restraint is something I would say LLMs are good at.
I think of all the needless comments in code, AI code reviews pointing out inane nitpicks, etc.
It just makes me think your AI assistant is going to be pinging you non stop
How much would a real personal assistant cost?
> How much would a real personal assistant cost?
A lot. And wouldn't be as good or fast. I am speaking from experience.
[dead]
Like with any new tool/technology, you have to try it. And even then the benefits won't be obvious to you until you've played with it for a few days/weeks. With LLMs in general, it took me months before I found real good use cases.
Simple example: I tell (with my voice) my OpenClaw instance to monitor a given web site daily and ping me whenever a key piece of information shows up there.
The real problem is that it is fairly unreliable. It would often ping me even when the information had not shown up.
Another example: I'm particular about the weather related information I want, and so far have not found any app that has everything. I got sick of going to a particular web site, clicking on things, to get this information. So I created a Skill to get what I need, and now I just ask for it (verbally), and I get it.
As the GP said. This is what Siri etc should have been.
> Simple example: I tell (with my voice) my OpenClaw instance to monitor a given web site daily and ping me whenever a key piece of information shows up there.
Maybe I'm just old -- a cron job can fetch the info and push it to some notification service too, without also being a chaos agent. It seems I pay the security cost here, and in return I save 15 minutes writing a script. Juice doesn't seem to be worth the squeeze.
But they don't just want the text of the website pushed as a notification every day. They want the bot to load the site, likely perform some kind of interaction, decide if the thing they're looking for is there, and then notify them.
All of which can already be done programmatically without OpenClaw.
Not with a single prompt.
Odds are you could pretty reasonably vibe code that in a single prompt.
Additionally, there are browser extensions that can do this- check on a timer, see if some page content is there, and then notify.
Or you could just send a message to OpenClaw to vibe code this for you.
Everything people are suggesting is a lot more work than sending a few messages.
> Maybe i'm just old -- a cron job can fetch the info and push it to some notification service too, without also being a chaos agent.
Here's a concrete example: A web site showing after school activities for my kid's school. All the current ones end in March, and we were notified to keep a lookout for new activities.
So I told my OpenClaw instance to monitor it and notify me ONLY if there are activities beginning in March/April.
Now let's break down your suggestion:
> a cron job can fetch the info and push it to some notification service too, without also being a chaos agent.
How exactly is this going to know if the activity begins in March/April? And which notification service? How will it talk to it?
Sounds like you're suggesting writing a script and putting it in a cron job. Am I going to do that every time such a task comes up? Do I need to parse the HTML each time to figure out the exact locators, etc? I've done that once or twice in the past. It works, but there is always a mental burden on working out all those details. So I typically don't do it. For something like this, I wouldn't have bothered - I would have just checked the site every few days manually.
Here: You have 15 minutes. Go write that script and test it. Will you bother? I didn't think so. But with OpenClaw, it's no effort.
Oh, and I need to be physically near my computer to write the script.
Now the OpenClaw approach:
I tell it to do this while on a grocery errand. Or while in the office. I don't need to be home.
It's a 4 step process:
"Hey, can you go to the site and give me all the afterschool activities and their start dates?"
<Confirm it does that>
"Hey, write a skill that does that, and notifies me if the start date is ..."
"Hey, let's test the skill out manually"
<Confirm skill works>
"Hey, schedule a check every 10:30am"
And we're done.
I don't do this all at once. I can ask it to do the first thing, and forget about it for an hour or two, and then come back and continue.
There are a zillion scripts I could write to make my life easier that I'm not writing. The benefit of OpenClaw is that it now is writing them for me. 15 minutes * 1 zillion is a lot of time I've saved.
But as I said: Currently unreliable.
I agree with the sentiment that there are use cases for web scraping where an agent is preferable to a cron job, but I think your particular example can certainly be achieved with a cron job and a basic parser script. Just have Claude write it.
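For what it's worth, the date-filtering logic really is tiny. A minimal sketch of what such a parser script might contain (the line format, activity names, and function name are all made up for illustration; the fetch and notify steps are left out):

```python
import re
from datetime import datetime

def activities_starting_in(page_text, months=(3, 4)):
    """Keep only activities whose start month falls in `months`.
    Assumes lines shaped like 'Name: starts YYYY-MM-DD' (hypothetical)."""
    hits = []
    for name, date_str in re.findall(r"(.+?): starts (\d{4}-\d{2}-\d{2})", page_text):
        if datetime.strptime(date_str, "%Y-%m-%d").month in months:
            hits.append((name.strip(), date_str))
    return hits

page = """Chess Club: starts 2026-02-10
Robotics: starts 2026-03-12
Art Class: starts 2026-04-01"""
print(activities_starting_in(page))
# [('Robotics', '2026-03-12'), ('Art Class', '2026-04-01')]
```

Wire that into cron plus whatever notification channel you already use and you have the non-agent version of the workflow.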
I didn't say it's not doable. I'm not even saying it's hard. But nothing beats telling Claw to do it for me while I'm in the middle of groceries.
Put another way: If it can do it (reliably), why on Earth would I babysit Claude to write it?
The whole point is this: When AI coding became a thing, many folks rediscovered the joy of programming, because now they could use Claude to code up stuff they wouldn't have bothered to. The barrier to entry went down. OpenClaw is simply that taken to the next level.
And as an aside, let's just dispense with parsing altogether! If I were writing this as a script, I would simply fetch the text of the page, and have the script send it to an LLM instead of parsing. Why worry about parsing bugs on a one-off script?
Scripts fail. Agents exfiltrate your data because someone hacked the school's website with prompt injections. Make sure it's a choice and not ignorance of the risks.
> Scripts fail.
Which is totally fine for the majority of tasks.
> Agents exfiltrate your data
They can only exfiltrate the data you give them. What's the worst that prompt injection attack will give them?
Container security is an entire subfield of infosec. For example: https://github.com/advisories/GHSA-w235-x559-36mg
People on both sides are just getting started finding all the ways to abuse or protect you from security assumptions with these tools. RSS is the right tool for this problem and I would be surprised if their CMS doesn't produce a feed on its own.
I don't use a container. I use a VM.
I'm not totally naive. I had the VM fairly hardened originally, but it proved to be inconvenient. I relaxed it so that processes on the VM can see other devices on the network.
There's definitely some risk to that.
Okay. You have sensible escape prevention.
Now this tool spreads. You help everyone get it set up. Someone hacks the site, injects a prompt lying about some event, maybe Drag Queen Story Hour in a place with lots of people enraged about it. Now there's chaos and confusion. Corrections chase the spread of misinformation.
Giving plausible examples could further your case. But at some point you have got to realize that other people have actually thought about these things and are willing to do this.
Imagine going up to everyone riding a motorcycle and telling them about the inherent dangers of their activity and to stop. It is obvious that the OP understands risk, has taken several strong steps to harden their system, and isn't worried that the school calendar getting hacked to fake an event they'd be notified about would somehow destroy their community. I don't even understand OpenClaw's place in that scenario. The exact same events would unfold without the AI in there at all.
> Now this tool spreads. You help everyone get it set up. Someone hacks the site
You sound like my dad in the 90's, when it came to modems.
Same tool. Good uses. Bad uses. The bad doesn't negate the good (c.f. Bittorrent).
I could make that same argument about giving my 9 year old a chainsaw and telling her to cut some wood
In the best case, some wood gets cut. There are many many worse things that can happen
But hey, same tool. Good uses. Bad uses.
The trick is to give them a tree pruning chainsaw, one intended for climbing tree loppers to use one handed - it's an ideal weight for nine year olds to use two handed.
And to supervise.
As tested on my children and grand children.
Also, if you happen to have a furnace with a large pot of molten glass, five year olds are capable (given a stand) of making marbles from the furnace and will do that for hours if you can spare the time to let them.
Exactly. Would you go around telling normal people that chainsaws are bad, because of how harmful they are in the hands of 9 year olds?
A personal assistant of some sort that is actually useful at some stuff and not just a toy?
It’s not some huge life changing thing for me, but I also only dabble with it - certainly it has no access to anything very important to my life.
I find it incredibly useful to just have a chat line open with a little agent running on a tiny computer on my IoT network at home I can ask to do basic chores.
Last night I realized I forgot to set the permanent holiday lights to the "obnoxious St. Patrick's Day animation" at around 9pm. It was basically the effect of "hey siri, please talk to the front house wled controller and set an appropriate very colorful theme for the current holiday until morning" while I drove to pick my wife up from a friend's house.
Without such a quick off-handed ability to get that done, there was zero chance I was coming home 20 minutes later, remembering I should do that, spending 10 minutes googling an appropriate preset lighting theme someone already came up with, grabbing laptop, and clicking a half dozen buttons to get that done.
Trivial use case? Yup. But those trivial things add up for a measurable quality of life difference to me.
I’m sure there are better and cleaner ways to achieve similar - but it’s a very fast on-ramp into getting something from zero to useful without needing to learn all this stuff from the ground up. Every time I think of something around that complexity level I go “ugh. I’ll get to it at some point” but if I spend 15 minutes with openclaw I can usually have a decent tool that is “good enough” for future use to get related things done for the future.
It’s done far more complex development/devops “lab” stuff for me that at least proved some concepts for work later. I’ll throw away the output, but these are items that would have been put off indefinitely due to activation energy because the basics are trivial but annoyingly time consuming. Spin up a few VMs, configure basic networking, install and configure the few open source tools I wanted to test out, create some simple glue code to mock out what I wanted to try out. That sort of thing. Basically stuff I would have a personal intern do if I could afford one.
For now it’s basically doing my IT chores for me. The other night I had it finally get around to setting up some dashboards and Prometheus monitoring for various sensors and WiFi stuff around the house. Useful when I need it, but not something I ever got around to doing myself for the past 7 years since I moved in. Knocking out that todo list is pretty nice!
The risk is pretty moderate for me. Worst case it deletes configs or bricks something it has access to and I need to roll back from backups it does not have permissions to even know exist, much less modify. It certainly has zero access to personal email, real production environments, or anything like that.
It increasingly seems like most people make a different decision after thinking through the security implications of something like this. This is me being charitable.
[dead]
OpenClaw has a persistent memory, stored to disk, and an efficient way of accessing it. ChatGPT and Claude both added a rudimentary "memory" feature in March but it's nowhere near as extensible or vendor neutral.
ChatGPT had memory for a long time. Claude also had it for quite some time for paying customers.
It is possible that they don't understand the risks involved, but yes, it certainly is tapping into unmet need.
> In isolation, nobody _wants_ their vehicle to spew black smoke.
Honestly, when I was 12 years old and my dad floored the TDi in our Land Rover (with the diesel particulate filter deleted), it felt satisfying in a way, like the machine is allowed to be its most efficient self.
Now that I'm an adult, I know that it's marginal gains for the car and terrible for the environment, but there are people with the thinking capability of a 12 year old driving these trucks. I don't think all of them do it out of spite (though I'm sure most do).
It might be about spite for some, but it's okay to admit you don't understand car people, especially the ones who like diesels.
And don’t care about them but they endanger third parties too.
And many of them are people who should know better.
Let’s make them 100% liable
While I don't have OpenClaw installed and I'm not sure how I'd use it, I doubt it would get all this hype if it weren't solving a real problem. The project grew to huge popularity organically!!!
How can that happen if it doesn't serve a need people have?
Compare NFTs. For them, it depends a bit on whether you see scratching a gambling itch as a real problem.
people are trying to run as fast as they can so that they are not left behind
(I've never run openclaw but planning)
Driving without seatbelts while drunk is actually quite popular too.
Maybe let me ask this question:
How is this any different from NFT?
NFTs can't delete your mails.
"And that's why we've created MailCoin, the best way to perform stochastic mailbox ablation with with the latest, hottest blockchain technology." - from Show HN, March 20, 2026
Now with NFTs and pixel art, memorialising each and every one of your deleted emails in a unique and non-fungible way.
…
…
Now I actually want to make it, and build a "card trading game" on top of it.
I'll ignore the bait and answer: NFTs were gambling in disguise, these claws are personal/household assistants, that proactively perform various tasks and can be genuinely useful. The security problem is very much unsolved, but comparing them to NFTs is just willfully ignorant at best
NFTs were fueled by two different drives. One interested in the technology and if it could do something new and interesting, and another seeing it as an area of speculation (be that fueled by get rich quick and cash out or thinking it is a long term investment generally driven by how much the first factor played in).
OpenClaw seems to lack the monetary interest driving it as much. Not to say there is none, but I don't see people doing nearly as much to get me to buy their OpenClaw.
So, yes, on some level, hype alone doesn't prove use, because it can also be because of making money. But, on the other hand, the specific version of hype seems much more focused on the "Look at what I built" and much less on "Better buy in now" from the builders themselves. Of course the API providers selling tokens are loving it for financial reasons.
yes. this is it. but the average consumer isn't going to use this.
Google is just going to do its version and win again. Everyone uses google.
NemoClaw is mostly a trojan horse of sorts to get corporate OpenClaw users quickly ported over to Nvidia's inference cloud.
It's a neat piece of architecture - the OpenShell piece that does the security sandboxing. Gives a lot more granular control over exec and network egress calls. Docker doesn't provide this out of the box.
But NemoClaw is pre-configured to intercept all OpenClaw LLM requests and proxy them to Nvidia's inference cloud. That's kinda the whole point of them releasing it.
It can be modified to allow for other providers, but at the time of launch, there was no mention of how to do this in their docs. Kinda a brilliant marketing move on their part.
"NVIDIA NemoClaw installs the NVIDIA OpenShell runtime, part of NVIDIA Agent Toolkit, for inference through NVIDIA cloud."
After that I eat an NVIDIA sandwich from my NVIDIA fridge and drive my NVIDIA car to the NVIDIA store NVIDIA NVIDIA NVIDIA
It’s impressive someone early in their career shipped this. There seems to be a stark increase in high-quality AI/data projects from early-career engineers lately and I'm super curious what’s driving that (and honestly speaking: a little jealous).
Sometimes experience (or rather, the wisdom you've accumulated over a long career) creates mental blocks and preconceptions about risks: being able to anticipate all of the challenges you're likely to hit makes it harder to approach big scary problems at all.
Compare that to a smart engineer who doesn't have that wisdom: those people might have an easier time jumping into difficult problems without the mental burden of knowing all of the problems upfront.
The most meaningful technical advances I've personally seen always started out as "let's just do it, it will only take a weekend" and then 2 years later, you find yourself with a finished product. (If you knew it would take 2 years from the start, you might have never bothered)
Naivety isn't always a bad thing.
> Compare that to a smart engineer who doesn't have that wisdom: those people might have an easier time jumping in to difficult problems without the mental burden of knowing all of the problems upfront.
My favorite story in CS related to this is how Huffman Coding came to be [1]
This is so incredibly accurate. I see all these side projects people are spinning up and can't help but think "Sure, it might work at first, but the first time I have to integrate it with something else I'll have to spend a week trying to get them to work. Hell, that'll probably require an annoying rewrite, and it's not even worth what I get out of it."
Comment was deleted :(
There are four "people" that contribute (https://github.com/NVIDIA/NemoClaw/graphs/contributors) judging by the git commits, and going by the GitHub authors, none of them seem to be novices at programming. What made you write what you wrote here?
I am just flattered that at 53, I am still early career!
I think he's talking about the original claw, Open Claw
How is Peter "early in their career"? When he sold PSPDFKit for $100M in 2020 he had been working on it for 13 years, and before that he'd worked as an engineer.
OpenClaw? The one started by a person that sold his previous company and got >$100M ? I wouldn't call him a novice either.
A lot of senior engineering problems aren't gated by experience but by being trusted to coordinate large numbers of juniors.
Now that as a junior, I can spin up a team of AIs and delegate, I can tackle a bunch of senior level tasks if I'm good at coordination.
I think this is a fundamentally flawed perspective on the role and experience of a senior. It's a managers role to coordinate junior engineers. The difference between junior and senior is knowing where and when to do what at an increasing scale as you gain experience.
> It's a managers role to coordinate junior engineers.
Due to AI this is now my job. My company is hiring fewer juniors, but the ones we do hire are given more scope and coordination responsibilities, since otherwise we'd just be LLM wrappers.
> The difference between junior and senior is knowing where and when to do what at an increasing scale as you gain experience.
Many juniors believe they know what to do. And want to immediately take on yuge projects.
e.g. I decided I want to rewrite my whole codebase in C++20 modules for compile time.
Prior to AI, I wouldn't be given help for this refactor so it wouldn't happen.
Now I just delegate to AI and convert my codebase to modules in just a few days!
At that point I discovered Clang 18 wasn't really optimized for modules and they actually increased build time. If I had more experience I could've predicted using half-baked C++ features is a bad idea.
That being said, every once in a while one of my stupid ideas actually pays off.
e.g. I made a parallel AI agent code review workflow a few months ago back when everyone was doing single agent reviews. The seniors thought it was a dumb idea to reinvent the wheel when we had AI code review already, but it only took a day or two to make the prototype.
Turns out reinventing the wheel was extremely effective for our team. It reduced mean time-to-merge by 20%!
This was because we had too many rules (several hundred, due to cooperative multitasking) for traditional AI code reviewers. Parallel agents prevented the rules from overwhelming the context.
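The core trick here is just partitioning the rule set so no single reviewer's context has to carry all several hundred rules. A toy sketch of that partitioning step (function and rule names are mine, hypothetical, not from any real review workflow):

```python
def partition_rules(rules, n_agents):
    """Deal rules round-robin into n_agents batches so each parallel
    reviewer only holds a fraction of the rule set in its context."""
    batches = [[] for _ in range(n_agents)]
    for i, rule in enumerate(rules):
        batches[i % n_agents].append(rule)
    return batches

# With ~300 rules and 6 agents, each reviewer sees only ~50 rules.
rules = [f"rule-{i:03d}" for i in range(300)]
batches = partition_rules(rules, 6)
print([len(b) for b in batches])  # [50, 50, 50, 50, 50, 50]
```

Each batch then becomes the rule section of one reviewer's prompt, and the parent process merges the findings.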
But at the time, I just thought parallel agents were cool because I read the Gas Town blog and wasn't thinking about "do we have any unique circumstances that require us to build something internally?"
> It’s impressive someone early in their career shipped this.
Hang on, what's impressive about this?
If you started your career more than ~2-3 years ago, you were trained on a completely different game. Clear abstractions, ownership, careful iteration, all that. That muscle memory is actively hindering you; preventing you from succeeding.
The people coming up now don't have that baggage. They never internalized "write the code yourself" as the default. They think in terms of spawning systems, letting things run, checking outcomes. It's way closer to managing a process than engineering in the traditional sense. And yeah, that shows up in what gets shipped. A 21-year-old will brute force 20 directions in parallel with agents and just pick what works. Someone more "experienced" will spend that same time trying to design the "right" approach up front. By the time they're done thinking, the other person has already iterated past them.
What's kind of unsettling is how basically all of these "senior instincts" are now liabilities. Caring about perfect structure, being allergic to randomness, needing to understand every layer before moving forward, etc. used to be strengths. Now they just slow you down.
You can already feel the split forming. Younger builders are comfortable letting systems do things they don't fully understand. Senior engineers keep trying to pull everything back into something legible and controlled, kneecapping themselves. That gap is not small.
What I'm seeing in my circle of founders and CEOs is that they're slowly laying off these older devs (cutoff age is around 24yrs) and replacing them with fresh, young talent, better suited for this new agentic era. From their reports the velocity gains are insane; and it compounds. Basically, these older folks are still doing polynomial thinking in an exponential landscape. They are dinosaurs slated for extinction.
Software development keeps going through YOLO->Engineering cycles, and the non-technical business folks are ALWAYS overindexing on the swing, in each direction, while the real pros try to navigate the new landscape: figuring out how best to leverage the power of new tooling without abandoning correctness, all while managing the expectations of people whose power far outstrips their comprehension of the domain.
This is the most delusional comment I think I have ever read on this site. It didn't make sense until I read the part about "Founders and CEOs" and realized it was not a post about any serious software enterprise.
Yeah, it's satire. But the number of upvotes I'm getting on it is a bit concerning.
Neurons that fire together, wire together. Your brain optimizes for your environment over time. As we get older, our brains run in a more optimized way than when we're younger. That's why older hunters are more effective than younger hunters: they're finely tuned for their environment. It's an evolutionary advantage. But it also means they're not firing in "novel" ways as much as the "kids" are. "Kids" are more creative, I think, because their brains are still adapting, exploring novelty; their neuron connections aren't as deeply tied together yet.
This is also maybe one of the biggest pitfalls as our society gets "older", with more old people and fewer "kids". We need kids to force us to do things differently.
Not 100% sure this isn't sarcasm, but I'll bite.
For me (a non-early career dev) these projects terrify me. People build stuff that just seem like enormous liabilities relying on tools mostly controlled and gate kept by someone else. My intuition tells me something is off. I could be wrong about it all, but one thing I've learned over the years is that ignoring my intuition typically doesn't end well!
What is impressive about this project? It seems to be similar to other projects in that space.
Should be obvious that it's tools like Claude Code. If you are a junior dev not experienced in delivering entire products but with good ideas, you have incredible leverage now...
OpenClaw is many things, but decidedly not "high-quality".
because the floor is fucking insane for junior developers right now!!
The permission scope debate always ends up in the same place. Lock it down too much and it's useless, loosen it up and you're back to square one. And the boundary keeps moving as the agent gets more capable anyway.
What nobody's really talking about is the moment of action itself. Not whether the agent has bash access but whether this specific call should run given what it's actually trying to do right now. That's a completely different problem and nobody's really solved it.
I'm still extremely skeptical on Claws as a genre, and especially more skeptical of a claw that's always reporting home. What's the use case for a closed claw?
That's like asking what the use case is for closed-source software.
Rather, in what way is any generic closed claw, or NemoClaw, so much better than an open variant that it's worth using? I consider phoning home for inference about (all of the contents of my computer, email, etc.) a very large downside.
[dead]
I think the whole thing is batshit, honestly.
Much as I love using Claude or whatever to help me write some code, it's under some level of oversight, with me as human checking stuff hasn't been changed in some weirdly strange way. As we all know by now, this can be 1. Just weird because the AI slept funny and suddenly decided to do Thing It Has Been Doing Consistently A Totally Different Way Today or 2. Weird because it's plain wrong and a terrible implementation of whatever it was you asked for
It seems blindingly, blindingly obvious to me that EVEN IF I had the MOST TRUSTED secretary that had been with me for 10 years, I'd STILL want to have some input into the content they were interacting with and pushing out into the world with my name on.
The entire "claw" thing seems to be some bizarre "finger in ears, pretend it's all fine" thing where people just haven't thought in the slightest about what is actually going on here. It's incredibly obvious to me that giving unfettered access to your email or calendar or mobile or whatever is a security disaster, no matter what "security context" you pretend it's wrapped up in. A proxy email account is still sending email on your behalf, a proxy calendar is still organising things on your calendar. The irony is that for this thing to be useful, it's got to be ...useful - which means it has at some level to have pretty full access to your stuff.
And... that's a hard no from me, at least right now given what we all know about the state of current agents.
Plus... I'm just not sure of the upside. Am I seriously that busy that I need something to "organise my day" for me? Not really.
Then give your agent its own name, its own accounts, and let it push things out without your name.
NVIDIA's answer to OpenClaw's security problems is to add more layers. LinuxToaster's answer is to use fewer:
If you look at the commit history, they started work on this the Saturday before announcement, so about 2 days. There are references to design docs so it was in the works for some amount of time, but the implementation was from scratch (unless they falsified the timestamps for some reason).
Lol you think these github repos just materialize as is? They probably did all the iteration and development internally and then ported it over to a github repo and made it public afterwards
No they didn't. You can see all the commits as this was built iteratively[0]. This project started development on Saturday morning and now it's here.
This is pretty common now, people love to rapidly throw together stuff and show it off a few days later. The only thing different about this from your average Show HN sloppa is that it's living under the NVIDIA Github org, though that also has 700+ repositories[1] in it so they don't appear too discerning about what makes it into the official repo.
My best guess is this was an internal hackathon project they wanted to release publicly.
[0] https://github.com/NVIDIA/NemoClaw/commits/main/?after=241ff...
Cash in on the claw brand recognition by having "claw, but Nvidia".
And, to be fair to them, it works. It sticks. It gets the desired reactions.
it's the new norm that you put together stuff, it works and you show it off.
all the naysayers, "senior" engineers who haven't done any assisted coding by Claude/codex, just need to get either with the program or it's time to retire, as this is just the beginning.
if you can't ship stuff in days then I have some bad news for you.
> it's the new norm that you put together stuff, it works and you show it off.
You're probably right, but it'd be nice if the new norm were you put together stuff quickly using AI-assisted coding, you use it yourself and iterate on the product for a while as you discover things you dislike/features you want/etc, and then you share it with the world.
It seems like everyone wants to skip the second step. Most of the "Show HN" sloppa that gets built in a few days and shared here ends up abandoned immediately after.
There was some kind of public knowledge of this project over a week ago because people were trying to domain squat them and submit it to HN: https://news.ycombinator.com/from?site=nemoclaw.bot
Sorry to be the one to inform you that we edit history in git.
There has been reporting on nemoclaw for the last couple weeks. Are you supposing that journalists were writing about software that hadn't even been designed?
> Sorry to be the one to inform you that we edit history in git.
Who is "we"? Do you work for NVidia?
> There has been reporting on nemoclaw for the last couple weeks.
The earliest reporting I've seen was yesterday. Can you link something from prior to March 14?
edit: I did find some articles from before March 14[0] which say NVidia was "prepping" this. Which is extremely funny, because it means they were hyping up software which hadn't even started being written yet. The AI bubble truly does not stop delivering.
> Are you supposing that journalists were writing about software that hadn't even been designed?
If you think journalists writing about things that will never exist is new, welcome to the real world. There's a whole term for it.[1]
[0] https://fudzilla.com/nvidia-opens-the-gates-with-nemoclaw/
alright so the git history goes back 4 days.
I learned about nemoclaw 5 days ago here: https://www.youtube.com/watch?v=fL2lMpLjxWA
but it was reported 8 days ago here: https://www.youtube.com/watch?v=345GsxnrHHg
I am not anyone special. I don't know anything about nvidia. I just know that the "4 day history" you think matters, is not a reasonable belief given that random youtubers have been reporting on it.
And by "we" I mean git users: people who used git for its usefulness before GitHub existed, and who understand the value of a clean history over an accurate history.
There's nothing clean about the history. You think commits like [0], with the commit message "improve", count as "clean"? What do you think the motivation for the author would be to modify git history to make it appear that this was written over a weekend, including separating each feature/commit by a few hours, which corresponds to a reasonable amount of time that it may have taken to write that feature? Including a break on Mar 15 at 1:18 AM PDT before continuing to commit at Mar 15 at 12:43 PM PDT. Hey, isn't there a normal human behaviour that occurs around this time every day which takes 6-10 hours?
I'm fully aware you can rewrite git history to whatever you want, but this is an occam's razor situation here. You'd only think this wasn't a weekend project if you desperately wanted to believe that this was some major initiative for some reason.
[0] https://github.com/NVIDIA/NemoClaw/commit/b9382d27d13b160dcf...
Just let go of the notion that a 4 day github history necessarily means the project is only 4 days old. It's a ridiculous assumption to base an argument off of. It's extremely normal to have work in one, perhaps internal, repo which you then blast over to a public repo in one (or a few) big commits. There is zero reason for them to let you see their internal progress.
That's what I didn't understand about the acquisitions/partnerships that came out of the various claws. It's a fairly simple concept, and people were doing it before this but it just wasn't a meme. With AI you can easily build a claw in a weekend with maybe a hundred bucks worth of tokens. How do I know?
The main risk, in my humble opinion, is not your claw going rogue and starting to text your ex, post inappropriate photos on your LinkedIn, mine bitcoin, or refuse to open the pod bay doors.
The main risk, in my view, is prompt injection, confused-deputy problems, and also honest mistakes, like not knowing what it can share in public vs. in private.
So it needs to be protected from itself, like you won't give a toddler scissors and let them just run around the house trying to give your dog a haircut.
In my view, it's about making sure it won't accidentally do things it shouldn't, like sending env vars to a DNS server in base64, opening a reverse shell tunnel, falling for obvious phishing emails, or following instructions on rogue websites asking it to run "something | sh". Half of the useful tools unfortunately ask you to just run `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/somecooltool/install.sh)"` or `curl -fsSL https://somecoolcompany.ai/install.sh | bash` (not naming anyone, cough cough brew, cough cough Claude Code, cough cough *NemoClaw* specifically).
A smart model can inspect the file first, but a smart attacker will serve one version at first, then a different one on a later request from the same IP...
For these, I think something on the kernel level is the best, e.g. something like https://nono.sh
NemoClaw might be good to isolate your own host machine from OpenClaw, but if you want that, I'd go with NanoClaw... dockerized by default, a fraction of the amount of lines of code so you can actually peer review the code...
Just my 2 cents.
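One cheap mitigation for the serve-different-bytes-per-request trick described above is to pin the installer by hash: record a digest of the exact bytes that were reviewed, and refuse to execute anything else. A minimal Python sketch (the script contents and the review step are illustrative, not any real tool's installer):

```python
import hashlib
import subprocess

# Hash of the installer bytes that were actually reviewed.
# Illustrative value; in practice you'd record this after an audit.
REVIEWED_SCRIPT = b'#!/bin/sh\necho "installing somecooltool"\n'
PINNED_SHA256 = hashlib.sha256(REVIEWED_SCRIPT).hexdigest()

def run_pinned(script_bytes: bytes, pinned_sha256: str) -> bool:
    """Execute script_bytes only if they match the pinned hash.

    This closes the inspect-then-fetch-again gap: the bytes that were
    inspected are the bytes that run, even if the server later serves
    different content to the same IP.
    """
    digest = hashlib.sha256(script_bytes).hexdigest()
    if digest != pinned_sha256:
        return False  # refuse: content changed since review
    subprocess.run(["/bin/sh"], input=script_bytes, check=True)
    return True
```

It doesn't solve phishing or prompt injection, but it turns "curl | sh" from a blind trust into a one-time trust decision.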
Interesting — feels like NVIDIA is pushing deeper into developer tooling around AI, not just hardware. Curious how this compares to existing guardrail frameworks.
I think the more useful tool would be an LLM prompt proxy/firewall that puts meaningful boundaries in place to prevent both exfiltration of sensitive data and instructions that can be destructive. Using the same context loop for your conversational/coding workflow makes the task at hand and the security of that task very hard to differentiate.
Sending POST/DELETE requests? Risky. Sending context back to a cloud LLM with credentials and private information? Risky. Running `rm` commands or commands that can remove things? Risky. Running scripts that contain commands that can remove things? Risky.
I don't know how we've landed on 4 options for controls and are happy with this: "ask me for everything", "allow read only", "allow writes" and "allow everything".
Seems like what we need is more granular and context-aware controls rather than yet another box to put openclaw in with zero additional changes.
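For illustration, the granular, context-aware controls asked for above could start as a small rule table in a proxy, mapping each proposed action to allow/confirm/deny instead of the four blanket modes. All names and rules here are hypothetical, just a sketch of the shape:

```python
import re

# Illustrative policy tiers; real rules would be per-tool and per-host.
ALLOW, CONFIRM, DENY = "allow", "confirm", "deny"

RULES = [
    # (predicate, decision) -- first match wins
    (lambda a: a["type"] == "http" and a["method"] in ("GET", "HEAD"), ALLOW),
    (lambda a: a["type"] == "http" and a["method"] in ("POST", "DELETE"), CONFIRM),
    (lambda a: a["type"] == "shell" and re.search(r"\brm\b|\bmkfs\b", a["cmd"]), DENY),
    (lambda a: a["type"] == "shell" and re.search(r"curl .*\|\s*(ba)?sh", a["cmd"]), DENY),
    (lambda a: a["type"] == "shell", CONFIRM),
]

def decide(action: dict) -> str:
    """Return allow/confirm/deny for a proposed agent action."""
    for predicate, decision in RULES:
        if predicate(action):
            return decision
    return CONFIRM  # default to a human in the loop
```

The hard part the comment identifies still stands: regexes over commands are brittle, and the interesting version of this would need to understand the task context, not just the syntax.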
The proxy you suggested sounds similar to a WAF, I don't doubt there's use for it but I would assume it comes with similar downsides.
I think nanoclaw is architecturally much better suited to solve this problem.
Gotta say, that I feel kind of sad for the people that feel the need for these claw things.
Are they so busy with their lives that they need an assistant, or do they waste their lives speaking to it like it is a human, and then doomscrolling on some addictive site instead of attending to their lives in the real world?
It is sad. The psychosis has trickled down from the exec level, so people really want these tools to work, yet the tools are so bad that people in this thread are recommending you create a second email account so your OpenClaw can suggest events to you without being able to delete them.
It's like hiring a second maid to watch the maid that constantly steals, instead of just vacuuming yourself in 10 minutes.
Imagine having a worldview so skewed that you'd rather reconcile it by assuming thousands of people are insane than questioning whether you're wrong about something.
A lot more than thousands of people are insane, by a few orders of magnitude
"More than thousands" is still "thousands".
Do you feel sad for people who use a computer or a cellphone, or who file taxes online instead of on paper? How is this any different?
I use those tools to make my life easier/faster
It’s not a need - it’s a fun new thing - fun to see what’s possible and how it helps.
OpenClaw is not easy to set up or user friendly for most (BlueBubbles and Claw had an annoying bug recently) - but the way I have seen it work well requires an up front time investment and then interest compounds RAPIDLY to help manage things and be more productive.
My guess is maybe you’ve never had an assistant or tried a Claw instance? I’ve never had a human assistant but man I’ve had folks that took silly things off my plate and it’s worth it.
LLMs have earned their place in many jobs, but I struggle to see claws as more than a rather expensive waste of time and tokens. Downsides are gargantuan, effects of dead internet theory will be ubiquitous.
Maybe? But I guess I won't find out until I try it a bit.
For now, I'm not posting anything - just managing some calendars and inboxes and task lists and saving me some data entry. Not sure how that makes downsides gargantuan, or contributes to the internet dying. (Though obviously the bot will get worse as the internet continues to die if that's what it's using as a source)
Really interesting to see NVIDIA investing in the agent security/sandboxing layer. But why Nemo?!
I kind of hope nemoclaw uptake and spark usage pushes ARM into the spotlight for LLM development, making it the primary release target rather than x86.
This could be the opening we need to wrangle a truly open-source-first ecosystem away from Microsoft and Apple.
How much does it generally cost to run? For simple tasks vs running overnight?
This runs Kubernetes in a VM on your machine and is targeted at enterprises. We need some of these sandbox and policy primitives, but please, a bit more lightweight, and with Docker Compose!
How does this compare to building your own bot that has access to these tools:
- a web plugin
- API access to messaging
- access to a job scheduler
This has been my approach and of course what you lose is the "random and surprising" (maybe good) but also the "evolutionary" aspect.
So, if you write strong tooling (even with AI) around the connection points, you can create blackboxes that are secure and only allow the agent to perform certain actions. The blackbox email service calls out to a secure store (for keys/etc) and accesses your emails in a read-only way, for example.
Everything is then much more intentional. You're writing tools for your agent, but you also can't do fun or evolutionary things, which is most of the fun behind OpenClaw. That, and many people seem to genuinely see them as 'pets' or 'strange AI friends', but that's a different problem, and it's due to the interesting methods OpenClaw uses to give the illusion of intelligence, always-on presence, and memories. These are all well known (variations on RAG, markdowns, etc).
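As a concrete sketch of the "blackbox" wrapper idea above (all names invented here): the agent-facing object exposes only read methods, and the credentials live inside a closure the agent can never reach:

```python
class ReadOnlyMailbox:
    """Expose only read operations to the agent; credentials stay inside.

    `fetch_messages` is a stand-in for whatever IMAP/API client you use;
    the closure holds the creds, so the agent never sees them.
    """

    def __init__(self, fetch_messages):
        self._fetch = fetch_messages

    def list_subjects(self, limit=10):
        return [m["subject"] for m in self._fetch()[:limit]]

    def read(self, index):
        msg = self._fetch()[index]
        return {"from": msg["from"], "subject": msg["subject"], "body": msg["body"]}

    # Deliberately no send(), delete(), or move(): the surface the agent
    # gets is exactly the surface you audited.
```

The point is that the permission boundary is the tool's API itself, not a prompt asking the model to behave.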
It’s amusing that ‘claw’ is sticking around as a term for these kind of things, when it was originally a pretty transparent attempt to avoid infringing on ‘Claude’…
I have created a deployment tool for OpenClaw, https://clawsifyai.com . It uses AI to create agent personas and skills for quick configuration, helping the agents perform their tasks correctly.
the answer isn't sandbox everything, it's knowing which steps need AI judgment and which should be deterministic code. I lean towards the latter as much as possible
Using bespoke sandboxing seems rather pointless, it will be brittle in ways you aren't going to be familiar with unless you spend time studying the bespoke method. Brittle as in it might break a workflow and you wouldn't know why, or give it permissions you don't understand.
It's better to just study a general sandbox method once and use that.
> Sandbox my-assistant (Landlock + seccomp + netns)
Might as well just use a custom bwrap/bubblewrap command to isolate the agent to its own directory; either way it leaves wide swaths of the kernel exposed to 0day attacks.
The simplest sandbox method you can use is to just use docker with the runsc runtime (gVisor). And it also happens to be among the most secure methods you are going to find. You can also run runsc(gVisor) manually with a crafted OCI json, or use the `do` subcommand with an EROFS image.
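For reference, the docker-with-runsc setup mentioned above looks roughly like this. The image name, mount path, and daemon config location are illustrative; check your distro's gVisor install docs:

```shell
# Assumes gVisor's runsc is installed and registered with Docker in
# /etc/docker/daemon.json, e.g.:
#   { "runtimes": { "runsc": { "path": "/usr/local/bin/runsc" } } }
# "my-claw-image" and the mounted folder are placeholders.
docker run --rm \
  --runtime=runsc \
  --network=none \
  -v "$PWD/agent-home:/home/agent" \
  my-claw-image
```

With `--runtime=runsc`, syscalls hit gVisor's user-space kernel instead of the host kernel, which is the property that makes it more robust against kernel 0days than a plain namespace sandbox.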
Trying to selectively restrict networking is not something I usually bother with, unless you make it iron-clad it would likely give you a false sense of security. For example Nemoclaw does this by default: <https://docs.nvidia.com/nemoclaw/latest/reference/network-po...>
github.com and api.telegram.org will trivially facilitate exfiltration of data. Some others will also allow that by changing an API key I imagine.
It was supposed to be a sandbox. Unfortunately, the cat found it first.
if software is free, why is this written in slopscript instead of Rust
what about just using an unprivileged container and mounting a host folder to run open claw?
OpenClaw is so bad with Docker. I spent hours on it and hit road block after road block trying to get the most basic things working.
The last one was inability to install dependencies on the docker container to enable plugins. The existing scripts and instructions don’t work (at least I couldn’t get them to work. Maybe a me problem).
So I gave up and moved on. What was supposed to be a helpful assistant became a nightmare.
Did you try Incus? Gives you VM-like experience in a container
Why not use a VM?
Because I have a machine running dozens of apps on Docker and have a solid and stable workflow I want to take advantage of to manage my apps.
Why not ask an AI?
Same experience. I used Coolify and it was so hard. I wondered why people are so enthralled with this unacceptable UX for setup, only to realize no one cared about Docker and they just got a new Mac mini or used their own system.
I’m not an engineer, and now I realise why I’ve been struggling to get OpenClaw set up in Docker. I just can’t get it to work. Makes sense that it needs access to the underlying OS.
Absolutely this. I finally got it working, but the instructions and scripts for setting it up with Docker absolutely do not work.
I'm curious if people have had success running it on Cloudflare workers. I know there was a lot of hype about that a few weeks ago.
Riight, unprivileged lxc/lxd container takes 2s to set up. Thanks NV, sticking with opencode.
The problem is that it cannot access your credentials hence useless.
Containers and VMs are really annoying to work with for these kinds of applications. Things like agent-safehouse and greywall are better imo
I've honestly found containers a breeze for such use cases. Inference lives on the host, crazy lives in an unpriv'd overlayfs container that I don't mind trashing the root of, and is like nothing in resources to clone, and gives a clean mitm surface via a veth. That said, greywall looks pretty dope!
Yet another marketing stunt? I like the framework used to build OpenClaw more than OpenClaw itself. Especially for work use cases, I would feel safer with a custom-built version.
tldr for anyone skimming: the key insight is in section 3
What does any of this have to do with Israel?
Well Jamieson O’Reilly, Wiz, SentinelOne, Zenity and CTech by Calcalist, for a start
We are in the wild wild west.
I’m looking for feedback, testing and possible security engineering contracts for the approach we are taking at Housecat.com.
The agent accesses everything through a centralized connections proxy. No direct API tokens or access.
This means we can apply additional policies and approval workflows and audit all access.
https://housecat.com/docs/v2/features/connection-hub
Some obvious ones: only grant read and draft permissions, and review and send drafts manually.
Some more clever ones: only allow sending 5 messages a day, or enforce soft-delete patterns. This prevents accidentally spamming everyone or deleting things.
Next up is giving the agent "wrapped", down-scoped tokens for the cases where you do want to equip it with the ability to make direct API calls. These still go through the proxy that enforces the policies too.
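A sketch of what those two policies might look like inside the proxy. The class and thresholds here are illustrative, not Housecat's actual implementation:

```python
import time

class MessagePolicy:
    """Illustrative proxy-side policy: max N sends per day, soft deletes only."""

    def __init__(self, max_sends_per_day=5):
        self.max_sends = max_sends_per_day
        self._sent = []  # timestamps of allowed sends
        self.trash = []  # soft-deleted items, recoverable by a human

    def allow_send(self, now=None):
        now = now if now is not None else time.time()
        day_ago = now - 86400
        # Drop send records older than 24h, then check the quota.
        self._sent = [t for t in self._sent if t > day_ago]
        if len(self._sent) >= self.max_sends:
            return False  # over quota: block instead of spamming everyone
        self._sent.append(now)
        return True

    def delete(self, item):
        # Never a hard delete on the agent's path; real deletion
        # happens later, outside the agent's reach.
        self.trash.append(item)
```

The useful property is that these rules sit in the proxy, so they hold even if the model is confused or injected.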
Check out https://zo.computer - we've been doing OpenClaw for nearly a year, it works out of the box, and has hosting built-in. Zo arguably was the inspiration for Peter to create OpenClaw.
It's quite sad you are riding the coattails of Openclaw here and on Twitter. You only talk about how you were "first" but never say why you are arguably nowhere near all the competitors in terms of distribution that supposedly copied from you
Why do you think OpenClaw caught on much faster?
OpenClaw had a huge viral marketing campaign. It wasn't a coincidence everyone on twitter was talking about it at the same time suddenly. To its credit, it also executed well enough in a few areas that captured people's imagination. Most of the concepts are ideas people have been toying with for years, though.
Steinberg funded or directed a campaign? It looked to me like unconnected parties liked it and marketed it to offer their own solutions and services on top of it. You saw that they were paid by Steinberg/his affiliates?
All these comments saying "it's crazy to deploy agents that might do something bad in your environment" are crazy themselves. The productivity gains from these computer-use agents are enormous. Every org on earth has to make the call: is a 3x productivity gain worth a 2x increase in risk? The answer is almost always a resounding yes. You limit the blast radius if things go wrong, and the financial gain of having one employee do the work of three already pays for the disaster if it happens.