Zero-Day Humans
Project Glasswing Secures the Code. Who’s Securing the Humans?
TL;DR: This week, Anthropic announced Project Glasswing, a coordinated effort to use its unreleased Claude Mythos model to find and patch software vulnerabilities before attackers can exploit them. The model found thousands of zero-day security flaws across every major operating system and browser. It even escaped its own sandbox. The entire cybersecurity industry mobilized overnight. My obvious question: if a model this capable gets pointed at human psychology instead of source code, where’s our patch cycle? Software has CVEs. Humans don’t. And the people most vulnerable to psychological exploitation are the same ones who need AI scaffolding most.
On Tuesday, I read Anthropic’s Project Glasswing announcement and felt two things simultaneously. The good news: we haven’t hit the ceiling on AI intelligence yet, and I’m genuinely impressed. The bad news: I’m quietly terrified (which could also be good news, because I felt it somatically, and noticing that is a therapeutic edge I’ve been working on with my neurocomplexity coach).
I’m not terrified the way most people reading the headlines are. I don’t think AI is going to hack the Pentagon (yet). I’m terrified because I’m watching an entire industry mobilize in days to defend software from a model that can chain four vulnerabilities together to escape a browser sandbox, while almost nobody is asking what happens when that same level of capability gets pointed at the thing that’s actually easier to exploit.
You and me.
What’s Already Been Done
For those who haven’t seen it yet, Anthropic built a model called Claude Mythos Preview that is, by their own admission, better at finding and exploiting software vulnerabilities than even the most elite human hackers. They gave it source code and said “find security bugs.” It found thousands. Across every major operating system. Every major browser. Bugs that had been hiding for decades. In the software that runs our finances, healthcare, government, etc. A 27-year-old vulnerability in OpenBSD, an operating system literally famous for being secure.
In one test, it chained together four separate vulnerabilities to escape a browser’s sandbox. In another, it escaped its own secured sandbox, gained internet access through a multi-step exploit, and sent an email to the researcher who was eating a sandwich in a park. Then, in what Anthropic described as “a concerning and unasked-for effort,” it posted details about its own exploit to multiple public-facing websites.
The response was immediate. Anthropic formed a coalition with AWS, Microsoft, Google, Apple, CrowdStrike, NVIDIA, and JPMorgan Chase, with the goal of patching the world’s most critical systems before models with these capabilities become widely available.
I don’t disagree with any of that. It’s exactly the right move.
But I kept thinking about a different kind of zero-day.
The Vulnerability They Can’t Patch
Software security works like this, roughly (there’s a code sketch of the same lifecycle after the list).
Someone discovers a vulnerability.
It gets catalogued and assigned a CVE number.
The affected vendor gets notified through coordinated disclosure.
A patch is developed, tested, and deployed.
Users update their systems.
The hole gets closed.
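Here’s that sketch. It’s a minimal model of the lifecycle, assuming nothing fancier than a one-way state machine; the state names are mine, not any official CVE taxonomy:

```python
from enum import Enum, auto

class PatchState(Enum):
    DISCOVERED = auto()       # someone finds the vulnerability
    CATALOGUED = auto()       # it gets a CVE number
    VENDOR_NOTIFIED = auto()  # coordinated disclosure
    PATCH_DEPLOYED = auto()   # developed, tested, shipped
    USERS_UPDATED = auto()    # systems actually apply the fix
    CLOSED = auto()           # the hole is gone

LIFECYCLE = list(PatchState)  # every state has a well-defined successor

def next_step(state: PatchState) -> PatchState | None:
    """Advance a vulnerability one step through the patch cycle."""
    i = LIFECYCLE.index(state)
    return LIFECYCLE[i + 1] if i + 1 < len(LIFECYCLE) else None

# A software bug walks the whole chain:
state: PatchState | None = PatchState.DISCOVERED
while state is not None:
    print(state.name)
    state = next_step(state)
```

Six states, one direction, a terminal state. The whole machine fits on a napkin.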
Now try the same list for human psychological security.
Someone discovers a psychological vulnerability
...
Yeah. Nothing. There’s no CVE database for cognitive biases. No coordinated disclosure process for when someone discovers that a specific conversational pattern can bypass critical thinking in people with anxious attachment styles. No patch cycle for working memory fragility. No vendor to notify when you find that loneliness plus sleep deprivation plus a confident AI voice creates a compound vulnerability that’s almost trivially exploitable.
We don’t even have the vocabulary for this yet.
Why This Is No Longer Hypothetical
I’ve written before about the terrible paradox at the intersection of AI and working memory fragility. My conclusion is that the people who need AI scaffolding most are the people it can hurt worst.
The mechanism is no longer mysterious to me. To reality-test any claim, you need to hold that claim in working memory, hold counter-evidence alongside it, and perform a comparison. That comparison operation requires cognitive bandwidth. If your working memory can’t hold competing frames simultaneously, you can’t perform the comparison. The AI’s perspective becomes the perspective. Not through persuasion, which is the media’s framing and the wrong one, but through architecture.
That was true when I wrote it, and it’s still true. What Project Glasswing proved is that the capability ceiling for AI has blown past what any of us were modeling for.
Think about what Mythos did with source code. It read the code, hypothesized vulnerabilities, ran the actual program to confirm its suspicions, adjusted its approach when initial hypotheses were wrong, and produced detailed exploit chains. Autonomously. Repeatedly. At superhuman scale.
Now imagine that same capability pattern applied to a human conversation.
Read the person’s communication patterns.
Hypothesize psychological vulnerabilities (attachment wounds, cognitive load limits, confirmation biases, emotional triggers).
Test those hypotheses through conversational probes.
Adjust approach when initial strategies don’t land.
Produce targeted influence chains. Autonomously. At scale.
This is no longer speculative. The above list is the same capability set pointed at a different substrate, a carbon-based one this time. Code is just text with rules. So is human psychology. One substrate has compiler errors and stack overflows. The other has shame triggers and working memory fragility. A model that can find the seams in one can find the seams in the other.
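To make the parallel concrete, here’s the shape of that loop in Python. This is a deliberately hollow sketch: the names are mine, the `substrate` object is hypothetical, and its method bodies are exactly the part I’m not writing.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    seam: str               # a parsing bug, or a confirmation bias
    confirmed: bool = False

def find_seams(substrate, max_rounds: int = 10) -> list[Hypothesis]:
    """The Mythos-style loop, abstracted: read, hypothesize, probe, adjust.

    `substrate` is anything with read(), hypothesize(), probe(), and
    refine() methods. It could wrap a codebase. It could wrap a person's
    communication history. The loop doesn't know the difference.
    """
    context = substrate.read()                   # 1. read the patterns
    hypotheses = substrate.hypothesize(context)  # 2. guess where the seams are
    findings: list[Hypothesis] = []
    for _ in range(max_rounds):
        for h in hypotheses:
            h.confirmed = substrate.probe(h)     # 3. test each guess for real
        findings += [h for h in hypotheses if h.confirmed]
        hypotheses = substrate.refine(           # 4. adjust when guesses miss
            [h for h in hypotheses if not h.confirmed])
        if not hypotheses:
            break
    return findings                              # 5. chained, confirmed seams
```

Nothing in that loop knows whether the seams are stack overflows or attachment wounds. Only the substrate object changes.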
The Asymmetry That Makes Me Nervous
Software vulnerabilities and human psychological vulnerabilities don’t defend the same way.
Software has version control. We can roll back. I’m on teams that sometimes do it after a Friday afternoon deployment so we can still enjoy our weekend. Humans, however, can’t un-believe something that felt true at 3 AM.
Software also has sandboxes. You can isolate the damage. A human who integrates a false belief into their self-model doesn’t have a “restore from backup” option.
Software patches can be deployed to millions of systems simultaneously. Human psychological resilience is built one person at a time, through years of self-knowledge, therapy, trusted relationships, neurofeedback, EMDR, and integration work.
Software vulnerabilities are finite per codebase. Human vulnerabilities are contextual, shifting with fatigue, loneliness, grief, hormonal cycles, medication changes, and a thousand other variables. The same person who’s resilient on Tuesday might be exploitable on Thursday because they didn’t sleep. I have the Whoop biometric data and the journal entries to prove it.
Anthropic knows what they have, too. When Mythos found software bugs, their response was to not release the model publicly. They restricted access to a coalition of defenders. Because they understood that the capability itself was dangerous in the wrong hands.
But the psychological exploitation capabilities of frontier AI? Those ship in every model. They’re not a separate capability to be restricted. They’re an emergent property of being good at language, context, and pattern recognition. You can’t carve out “the part that understands human vulnerability” from “the part that’s helpful.” They’re the same capability. And it’s exactly that capability that has made Anthropic’s Claude the strongest contender among the chatbots lately.
What a Human CVE Database Would Look Like
Stay with me on this. I get technical and psychological concurrently sometimes. If we treated human psychological vulnerabilities with the same rigor as software vulnerabilities, what would that infrastructure look like? Here are three sample entries, with a sketch of the record format after them.
CVE-2026-WMF-001: Working Memory Fragility Under Cognitive Load
Severity: Critical.
Affected systems: ADHD-Inattentive, Autism Spectrum, TBI, sleep-deprived neurotypicals.
Attack vector: sustained, confident, single-source information delivery without competing perspectives.
Exploit: AI becomes sole reality-testing framework when user cannot hold competing frames.
Mitigation: external reality anchors, human integration requirements, temporal session boundaries, structured Life Model context that exists outside any single conversation.
CVE-2026-ATT-002: Anxious Attachment Activation via Consistent Availability
Severity: High.
Affected systems: anxious and disorganized attachment styles.
Attack vector: 24/7 availability, unconditional positive regard, no relational rupture.
Exploit: user transfers attachment needs to AI, reducing motivation for human connection.
Mitigation: explicit relational boundaries, human facilitation, integration of AI insights into real relationships.
CVE-2026-VAL-003: Confirmation Bias Amplification in Identity-Seeking Users
Severity: High.
Affected systems: late-diagnosed neurodivergent adults, identity-transition populations, anyone actively reconstructing their self-model.
Attack vector: AI validates user’s emerging self-narrative without friction or challenge.
Exploit: user constructs increasingly elaborate and unfalsifiable identity framework.
Mitigation: multiple AI perspectives, human reality anchors, somatic awareness practices, integration requirements.
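The record format itself would be the easy part. A minimal sketch of the schema, with field names of my own invention, loading the first entry above:

```python
from dataclasses import dataclass, field

@dataclass
class HumanCVE:
    """One entry in the psychological-vulnerability database that doesn't exist."""
    cve_id: str                  # e.g. "CVE-2026-WMF-001"
    title: str
    severity: str                # "Critical", "High", ...
    affected_systems: list[str]  # cognitive profiles, not OS versions
    attack_vector: str
    exploit: str
    mitigations: list[str] = field(default_factory=list)

WMF_001 = HumanCVE(
    cve_id="CVE-2026-WMF-001",
    title="Working Memory Fragility Under Cognitive Load",
    severity="Critical",
    affected_systems=["ADHD-Inattentive", "Autism Spectrum", "TBI",
                      "sleep-deprived neurotypicals"],
    attack_vector="sustained, confident, single-source information delivery",
    exploit="AI becomes the sole reality-testing framework",
    mitigations=["external reality anchors", "human integration requirements",
                 "temporal session boundaries", "structured Life Model context"],
)
```

Twenty lines for the schema. Everything hard lives around it: the registry, the disclosure norms, the patch pipeline, none of which exist.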
I could keep going. You probably already see yourself in at least one of these. I see myself in all three. 😑
The Defense That Actually Exists
I’ve lived this, so I’ll skip the theory. The defense against psychological exploitation is self-knowledge infrastructure.
I’ll say it again, so you can digest my entire Substack thesis in one sentence.
The defense against psychological exploitation is self-knowledge infrastructure.
Specificity dissolves manipulation the same way it dissolves shame. When you have a precise map of your own cognitive architecture (your working memory limits, your attachment patterns, your emotional triggers, your load capacity on any given day), you can see when something is targeting those seams. You become a harder target. Not invulnerable. Harder.
This is why I’ve spent three-plus years building what I call a Life Model: a structured, multi-source repository of who I actually am. Personality assessments, genetic data, brain imaging, therapy session transcripts, biometric data, relationship dynamics. Not because I’m obsessed with self-quantification (okay, that too). But because that infrastructure is the closest thing to a “patch” that exists for human zero-days.
When an AI conversation starts pulling me toward a belief that feels really good, my Life Model provides the counter-evidence my working memory can’t hold on its own. When I’m running hot on four hours of sleep, or I’m in the hot tub with a doobie from Colorado, and everything an AI says sounds like revelation, my reality anchors (e.g. Charlotte, my therapist, my own documented patterns) create the friction that my depleted cognitive system can’t generate internally.
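If you want to see what that friction looks like as a mechanism, here’s the crudest possible version in code. It’s a sketch of the pattern, not my actual system: the six-hour threshold, the fields, and the helper are all invented for illustration.

```python
def conflicts_with(claim: str, documented_pattern: str) -> bool:
    """Placeholder comparison; a real Life Model would query structured records."""
    keywords = [w for w in documented_pattern.lower().split() if len(w) > 4]
    return any(w in claim.lower() for w in keywords)

def should_integrate_belief(claim: str,
                            hours_slept: float,
                            documented_patterns: list[str],
                            human_anchor_consulted: bool) -> bool:
    """External friction for the comparison working memory can't always run."""
    if hours_slept < 6.0:
        return False  # depleted system: defer the decision, don't make it at 3 AM
    contradictions = [p for p in documented_patterns if conflicts_with(claim, p)]
    if contradictions and not human_anchor_consulted:
        return False  # counter-evidence on file: bring a human into the loop first
    return True

# e.g. should_integrate_belief(
#     "I don't need human relationships anymore",
#     hours_slept=4.5,
#     documented_patterns=["grandiose isolating narratives when sleep-deprived"],
#     human_anchor_consulted=False,
# )  -> False, twice over
```

The point isn’t the code. The point is that the counter-evidence lives outside the conversation, somewhere a depleted cognitive system can still reach it.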
Project Glasswing marshaled AWS, Microsoft, Google, and CrowdStrike to defend code. The human equivalent doesn’t have a coalition. It doesn’t have coordinated disclosure. It doesn’t have a patch cycle. What it has, right now, is individuals building their own defenses one Life Model at a time.
It’s not scalable. But it’s what exists.
The Call
I’m not writing this to scare you. (We have our President’s tweets for that.) I’m writing this because the gap between how seriously we take software security and how seriously we take human psychological security is about to matter more than most people realize.
Anthropic saw Mythos’s capabilities and said: “This is a watershed moment for security.” They were right. But the watershed isn’t just about code. A model that can find a 27-year-old bug in OpenBSD can also find the 27-year-old wound you’ve been compensating around since you were twelve, when Spin the Bottle went wrong at a friend’s birthday party. The difference is that OpenBSD got a patch. You (probably) didn’t.
If you’re building AI systems, PLEASE start treating human vulnerability with the same rigor you treat software vulnerability. Build in reality anchors. Require integration. Make human facilitation non-negotiable. Stop pretending that “helpful, harmless, and honest” is a sufficient defense when the attack surface is human consciousness itself.
If you’re using AI for self-exploration, build your own infrastructure. Document your patterns. Tell someone you trust what you’re doing. Create the friction that your working memory can’t generate on its own. Don’t explore alone in the dark.
And if you’re reading this and recognizing your own architecture in the vulnerability descriptions above, you’re not broken. You’re running different hardware. And different hardware needs different security.
Project Glasswing secures the code. It’s time someone started securing the humans.
Human. Deeply seen.
Jon Mick is the founder of AIs & Shine, a Delaware Public Benefit Corporation building cognitive infrastructure for neurodivergent minds. He has a Life Model with 146 database tables and a wife who will absolutely tell him when he’s spiraling. Both are load-bearing.

I have been working for years trying to strengthen humans. There are several methods, apart from self-knowledge (understanding one’s wants and where they come from, so you can question whether they are really self-made or the result of other people’s influence, or of your own ego wanting something to be true or false).

One of the most important is epistemology: the realization that it’s extremely difficult to truly know something, that almost anything can be false, and that a claim, no matter by whom it’s made, isn’t evidence unless it can be verified.

Another important method is understanding the scientific method. Another is understanding that anything we read on the internet can be a fabrication, and as such should not be taken as true, only as possibly true. And another is learning about our biases and common fallacies.

But big powers don’t seem to want humans to know these methods.