How to Secure Vibe Coded Apps (Before You Get Hacked)

AI code has security flaws in 45% of tests. Learn to scan and fix vibe coded apps with Snyk, Semgrep, Nuclei, and AI hacking agents.

Last updated: May 5, 2026 · 9 mins read

AI is pumping out code at an insane rate right now. Non-developers are building and shipping entire web apps in minutes. And most of that code has never been reviewed by a single human being.

Here's the problem: according to Veracode, AI-generated code introduced risky security flaws in 45% of tests [1]. Almost half. And that's not even counting the vulnerabilities that sneak in through external packages and plugins, or the fact that vibe coders rarely maintain anything or follow real dev practices.

So I ran an experiment. I vibe coded a full web app with Claude Code — no manual code review, no technical knowledge assumed — deployed it to the internet, and then threw every security tool I could find at it. Static scanners, AI security reviews, vulnerability scanners, even AI hacking agents.

Here's exactly what I found, what each tool caught, and what they all missed.

What Is Vibe Coding (And Why It's a Security Problem)

Vibe coding is a term coined by Andrej Karpathy, former Tesla AI director and OpenAI co-founder, to describe the practice of building software purely through AI prompts [2]. You describe what you want, the AI writes the code, and you accept it without reviewing or even understanding what it produced.

It's powerful. You can spin up a fully functional app in minutes. But the code that comes out the other side often has serious security holes that you'd never know about unless you specifically look for them.

The typical vibe coder sets auto-accept on their AI coding tool and just lets it run. If you're not a developer, you probably don't understand what each command is doing: what it's installing, what permissions it's granting, what shortcuts it's taking. And that's exactly where things go wrong.

How Secure Is AI-Generated Code, Really?

Short answer: not very. Research consistently shows that AI coding tools produce vulnerable code at alarming rates. Stanford researchers found that participants who used an AI coding assistant wrote less secure code than those who worked without one, on tasks that included cryptographic operations and SQL queries [3]. Snyk's 2025 Developer Security Report found that 67% of developers using AI coding assistants rarely or never reviewed the generated code for security issues [4].

And here's what makes vibe coding specifically dangerous compared to regular AI-assisted development: there's zero human review in the loop. A developer using Copilot might catch an obvious SQL injection. A vibe coder won't — because they're not reading the code at all.

The vulnerabilities aren't just in the code AI writes directly. They also come from:

Dependencies: AI suggests packages that may be outdated or even non-existent (opening the door to typosquatting attacks)
Business logic flaws: Automated tools struggle to understand what your app _should_ do versus what it _does_ do
Configuration mistakes: Default settings, exposed environment variables, overly permissive access controls
No maintenance: Vibe coders ship and move on. Dependencies rot. Known vulnerabilities accumulate.
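To make the dependency risk a bit more tangible, here is a minimal sketch of a lookalike check: it compares the package names in a project against a list of packages you actually meant to use. This is a hypothetical helper, not part of any tool in this article. The KNOWN_PACKAGES list and the 0.8 similarity cutoff are assumptions for illustration, and the example is in Python for brevity even though the app itself is JavaScript. It only catches near-misses of known names; confirming that a package really exists still means checking the registry.

```python
import difflib

# Hypothetical allowlist of packages you expect in this project.
KNOWN_PACKAGES = ["next", "next-auth", "react", "react-dom", "better-sqlite3", "bcryptjs"]

def suspicious_dependencies(dependencies, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return {name: lookalike} for names that are not known but closely resemble a known one."""
    flagged = {}
    for name in dependencies:
        if name in known:
            continue
        # get_close_matches returns the best matches at or above the similarity cutoff.
        matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
        if matches:
            flagged[name] = matches[0]
    return flagged

deps = ["next", "react", "next-auht", "bcryptjs", "left-pad"]
print(suspicious_dependencies(deps))
```

An unrelated package like left-pad passes through untouched; the check only speaks up when a name is suspiciously close to one you trust.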

The Experiment: Vibe Coding a Finance Tracker

To test this properly, I built a personal finance tracker from scratch using Claude Code within Claude Desktop. No terminal usage, no technical knowledge — just a plain English prompt:

_Build a personal finance tracker with AI insights. Login/signup, add transactions, upload receipts, AI spending summary, share dashboard, admin page with stats, and CSV export. Keep it simple and quick._

Claude came back with a plan. SQLite database, Next.js, NextAuth — didn't even ask me follow-up questions. I approved the plan, set auto-accept on edits, and let it build the entire thing. Within minutes, I had a working app with login, transactions, file uploads, AI-powered spending insights (using DeepSeek via OpenRouter), and a shareable dashboard.

Then I deployed it to Vercel. One prompt. Done. Live on the internet.

Now the real question: how secure is this thing?

Finance tracker dashboard

Step 1 — Static Security Scans with Snyk and Semgrep

The first layer of defense is static analysis — tools that scan your code and dependencies without actually running the app.

Snyk is one of the most popular options. It has a massive database of known vulnerabilities and scans your dependencies and packages. You connect it to your GitHub repo and it starts scanning automatically. They have a free plan, so there's really no excuse not to use it.

Snyk dashboard showing 0 results for finance tracker app

Semgrep takes a different approach. It's open source, uses rule-based static analysis that catches custom patterns, and can run in your CI pipeline. It also scans pull requests automatically and periodically checks for security issues.

Semgrep dashboard showing 0 results for finance tracker app

What they found: basically nothing. Because I just created the app, all the packages were on their latest versions. No known vulnerabilities yet. But here's the important nuance — _give it a few months_. Dependencies get outdated fast. Known vulnerabilities get disclosed. And if you're not actively updating (which most vibe coders aren't), both tools will start lighting up with issues.

In my other, older projects, Snyk flagged several issues simply because I hadn't updated dependencies in a while. That's the thing most developers also skip — and vibe coders almost never do.
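If "rule-based static analysis" sounds abstract, here is a toy version of the idea: a handful of regex rules run line by line over source text. This is not how Semgrep works internally (it matches on parsed code structure rather than raw regexes), and the rule names and patterns below are made up for illustration.

```python
import re

# Hypothetical rules: (rule id, compiled pattern). Real scanners ship far richer rule sets.
RULES = [
    ("hardcoded-secret", re.compile(r"(api[_-]?key|secret|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]", re.IGNORECASE)),
    ("sql-string-concat", re.compile(r"(SELECT|INSERT|UPDATE|DELETE)\b[^\n]*['\"]\s*\+", re.IGNORECASE)),
    ("eval-call", re.compile(r"\beval\s*\(")),
]

def scan_source(source):
    """Return a list of (line_number, rule_id) for every rule that matches a line."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for rule_id, pattern in RULES:
            if pattern.search(line):
                findings.append((line_number, rule_id))
    return findings

# A made-up snippet of app code to scan.
sample = '''const apiKey = "sk-live-1234567890abcdef";
const rows = db.prepare("SELECT * FROM users WHERE email = '" + email + "'").all();
const total = items.length;
'''

for line_number, rule_id in scan_source(sample):
    print(f"line {line_number}: {rule_id}")
```

The point of the sketch is the shape of the approach: rules only fire on patterns someone already wrote down, which is why a fresh app with no known-bad patterns can come back clean.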

Step 2 — AI-Powered Security Review with Claude Code

Claude Code has a built-in /security-review command that instructs it to scan your codebase for security flaws. You can also just ask it directly: _"Scan this entire app for security flaws."_

This is where things got interesting. Claude found what the static scanners missed:

Critical: Admin privilege escalation — Anyone who signs up with a specific email address automatically gets admin access. Claude flagged this and suggested removing auto-admin on signup.
High: File upload spoofing — The receipts upload endpoint didn't properly validate file types, allowing someone to upload malicious files.
High: Exposed password hashes — The API was returning password hashes in responses. Not the passwords themselves, but hashes should never leave the server.
Medium: API key exposure — The OpenRouter API key was configured in a way that could be accessed client-side (though Vercel's environment variable handling provided some protection).

Claude immediately came up with a plan to fix all of these. You can approve the plan and let it harden the app for you. One prompt, and you're significantly more secure.
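Two of these findings have fixes that are easy to picture in code. The sketch below is in Python for brevity (the app itself is Next.js) and is not the patch Claude produced. The field name password_hash, the list of accepted file types, and the function names are all assumptions. Checking leading bytes is a baseline, not complete upload validation.

```python
# Leading "magic bytes" for the receipt formats we choose to accept (an assumed policy).
ALLOWED_SIGNATURES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "application/pdf": b"%PDF-",
}

SENSITIVE_FIELDS = {"password_hash"}  # hypothetical column name

def detect_upload_type(data):
    """Return the MIME type whose signature matches the file content, or None."""
    for mime_type, signature in ALLOWED_SIGNATURES.items():
        if data.startswith(signature):
            return mime_type
    return None

def is_upload_allowed(data, claimed_type):
    """Accept a file only if its content matches an allowed type AND the claimed type."""
    detected = detect_upload_type(data)
    return detected is not None and detected == claimed_type

def public_user(record):
    """Copy a user record without fields that must never leave the server."""
    return {key: value for key, value in record.items() if key not in SENSITIVE_FIELDS}

fake_png = b"<script>alert(1)</script>"          # HTML pretending to be an image
real_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16   # starts with the PNG signature

print(is_upload_allowed(fake_png, "image/png"))
print(is_upload_allowed(real_png, "image/png"))
print(public_user({"id": 1, "email": "me@example.com", "password_hash": "placeholder"}))
```

Both fixes follow the same principle: decide on the server what is acceptable, and never trust what the client claims or return more than the client needs.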

Step 3 — Vulnerability Scanning with Nuclei

Nuclei is a fast, customizable vulnerability scanner powered by thousands of community-built YAML templates [5]. It's open source, free, and currently has over 9,000 detection templates covering everything from missing security headers to known CVEs.

To set it up, install it via Homebrew (brew install nuclei), update the templates (nuclei -update-templates), and point it at your deployed app:

nuclei -u https://your-app.vercel.app

Nuclei found 18 issues, including multiple missing security headers: Content-Security-Policy, X-Frame-Options, and Strict-Transport-Security, which help mitigate common attacks like XSS, clickjacking, and protocol downgrade.

The fix is straightforward: copy the Nuclei output, paste it into Claude Code, and say _"Help me fix these."_ It'll add the right headers and configurations.
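As a rough picture of what a missing-header check does, the sketch below compares a response's headers against a small required set. The header names are real, but which ones to require is my assumption (X-Content-Type-Options is an addition not mentioned in the scan results above), and Nuclei's own templates check far more than this.

```python
# Headers we decide a response must carry (an assumed policy for illustration).
REQUIRED_HEADERS = [
    "Content-Security-Policy",
    "X-Frame-Options",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
]

def missing_security_headers(response_headers):
    """Return the required headers that are absent, ignoring case in header names."""
    present = {name.lower() for name in response_headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]

# Example response headers, as a plain dict.
headers = {"content-type": "text/html", "strict-transport-security": "max-age=63072000"}
print(missing_security_headers(headers))
```

Header names are case-insensitive in HTTP, which is why the comparison lowercases both sides before checking.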

Step 4 — AI Hacking Agents with Strix

This is the most fascinating part. Strix is an open-source tool that deploys AI agents to actually _attack_ your application [6]. These agents have browser access, terminal access, and specialized security tools like Nuclei built in. They try to hack your app the way a real attacker would.

You need Docker running, an LLM API key (I used DeepSeek via OpenRouter to keep costs down), and your target URL:

strix --target https://your-app.vercel.app

The root agent spawns sub-agents with different specializations — one explores the login page and tests for SQL injection, another analyzes JWT tokens and session management, another tries to bypass authentication entirely.

Watching it work is genuinely impressive. It's testing SQL injection on login forms, analyzing session cookies, probing for broken authentication, and documenting everything.

The results: Overall risk posture was rated low. The app had decent baseline security thanks to Vercel's security checkpoint, NextAuth's CSRF implementation, and secure cookie configurations. But Strix still recommended session token rotation, security header improvements (echoing Nuclei's findings), MFA implementation, and enhanced logging.

The cost: About $3.50 using DeepSeek, burning through 17 million tokens. Running this with a more expensive model like Claude Opus would cost significantly more. You can set spending limits on your API key to control this.
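To budget a run before you start it, you can back out a blended per-token price from these numbers and scale it. This is a back-of-the-envelope sketch: it treats input and output tokens as costing the same, uses the rounded figures above, and the comparison price is a placeholder, not a real quote for any model.

```python
def blended_price_per_million(total_cost_usd, total_tokens):
    """Average price in USD per one million tokens for a finished run."""
    return total_cost_usd / (total_tokens / 1_000_000)

def estimate_run_cost(total_tokens, price_per_million_usd):
    """Projected cost in USD for a run of the given size at a given blended price."""
    return total_tokens / 1_000_000 * price_per_million_usd

# Figures from my run: about $3.50 for about 17 million tokens.
observed = blended_price_per_million(3.50, 17_000_000)
print(f"observed blended price: ${observed:.3f} per million tokens")

# Placeholder price, NOT a real quote for any model.
hypothetical_price = 5.00
projected = estimate_run_cost(17_000_000, hypothetical_price)
print(f"same run at ${hypothetical_price:.2f} per million tokens: ${projected:.2f}")
```

Check your provider's current pricing and plug in the real number before you kick off a scan with a pricier model.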

Pro tip: Run Strix with the -n flag for headless mode — it prints real-time findings and the final report directly in the CLI. Also use the -t flag to chain multiple targets together, scanning both your deployed app and your source code.

What Every Tool Found (And What They All Missed)

Here's the breakdown:

Snyk (Dependency scanning) — 0 issues found in fresh packages. Great for ongoing dependency monitoring.
Semgrep (Static analysis) — 0 issues found in fresh code. Rule-based patterns with CI integration.
Claude Code (AI code review) — 4 issues found (1 critical, 2 high, 1 medium). Understands code logic, not just patterns.
Nuclei (Vulnerability scanning) — 18 issues found. Massive template database, fast.
Strix (AI pentesting) — Low risk, several recommendations. Simulates real attacks dynamically.

But here's the thing none of them caught: anyone could sign up for my personal finance tracker and generate AI insights using my OpenRouter API key, burning through my credits.

That's a business logic flaw. I didn't want a public signup page — I wanted a private app where only I can log in. But the AI built a public registration flow because that's what "login/signup functionality" means in most contexts. No static scanner, no vulnerability scanner, and no AI hacking agent flagged this as a problem because technically, the signup flow works exactly as implemented.

This is the fundamental blind spot of automated security tools: they can tell you if something is _broken_, but they can't tell you if something is _wrong for your specific use case_.
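Once a human spots a flaw like this, the fix itself is usually small. Here is a minimal sketch of two guards that would have closed this gap: an email allowlist on signup and a per-user daily cap on paid AI calls. It's in Python for brevity (the app itself is Next.js), and the names, the allowlisted address, and the limit are all assumptions. The counter lives in memory, so a real deployment would need persistent storage.

```python
ALLOWED_EMAILS = {"me@example.com"}   # hypothetical: the only account that may exist
DAILY_INSIGHT_LIMIT = 5               # hypothetical cap on paid AI calls per user per day

def can_sign_up(email):
    """Allow registration only for allowlisted addresses (case-insensitive)."""
    return email.strip().lower() in ALLOWED_EMAILS

class InsightQuota:
    """In-memory counter of AI insight requests per user per day."""

    def __init__(self, limit=DAILY_INSIGHT_LIMIT):
        self.limit = limit
        self.used = {}  # (user_id, day) -> number of requests so far

    def try_consume(self, user_id, day):
        """Record one request and return True, or return False if the cap is reached."""
        key = (user_id, day)
        if self.used.get(key, 0) >= self.limit:
            return False
        self.used[key] = self.used.get(key, 0) + 1
        return True

print(can_sign_up("stranger@example.com"))
print(can_sign_up("Me@Example.com"))

quota = InsightQuota(limit=2)
print([quota.try_consume("u1", "2026-05-05") for _ in range(3)])
```

Neither guard is clever. The hard part was knowing the app needed them, and that knowledge came from the owner's intent, not from the code.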

How to Actually Secure Your Vibe-Coded App

The Layered Security Approach

No single tool catches everything. The tools that found nothing (Snyk, Semgrep) will become critical as your dependencies age. The tools that found things (Claude Code, Nuclei) caught different issues from different angles. And Strix tested the running application in ways static analysis never could.

Use all of them. Layer them:

Static analysis (Snyk, Semgrep) — Set up once, runs automatically on your repo. Catches dependency vulnerabilities and known code patterns over time.
AI code review (Claude Code /security-review) — Run after building new features. Catches logic-level issues that pattern matchers miss.
Vulnerability scanning (Nuclei) — Run against your deployed app. Catches configuration issues and missing security headers.
AI pentesting (Strix) — Run periodically. Simulates real attacks and tests your defenses dynamically.

Business Logic Flaws: The Blind Spot

The biggest lesson from this experiment is that no automated tool caught the most obvious flaw: an open registration page on a personal app. This is consistent with what security professionals have known for years: business logic vulnerabilities require human judgment.

After running your automated scans, sit down and ask yourself:

Who should actually be able to access this app?
What actions should each user role be allowed to perform?
Where does the app spend money (API calls, storage, compute)?
What happens if someone signs up and does something I didn't intend?

If you're not a developer, this is where consulting with an experienced developer or security expert pays for itself. The scans handle the technical vulnerabilities. The human judgment handles the "wait, should this even be possible?" questions.
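One way to turn the "who can do what" question into something checkable is to write the answer down as a table and deny everything that isn't in it. The sketch below is a hypothetical example of that idea in Python; the role names and actions are assumptions loosely based on the finance tracker's features, not code from the app.

```python
# Hypothetical permission table: anything not listed is denied.
PERMISSIONS = {
    "owner": {"view_dashboard", "add_transaction", "export_csv", "generate_insights", "view_admin_stats"},
    "viewer": {"view_dashboard"},  # e.g. someone opening a shared dashboard link
}

def is_allowed(role, action):
    """Deny by default: unknown roles and unlisted actions are rejected."""
    return action in PERMISSIONS.get(role, set())

print(is_allowed("owner", "generate_insights"))
print(is_allowed("viewer", "generate_insights"))
print(is_allowed("stranger", "view_dashboard"))
```

Writing the table forces you to answer the questions above explicitly, and anything you forgot to think about ends up denied instead of silently allowed.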

Key Takeaways

Vibe coding is powerful. You can build and ship apps incredibly fast. But never ship without running security scans first — especially if you're putting it on the public internet.

Use layered tools: static analysis for dependencies, AI reviews for code logic, vulnerability scanners for configuration, and AI pentesting for dynamic testing. Feed the results back into your AI coding tool and let it fix what it finds. Most of the time, it's one prompt away from being significantly more secure.

But remember: automated tools have limits. They catch technical flaws, not business logic flaws. The thing that makes your app actually dangerous might be something no scanner will ever flag — like letting strangers burn through your API credits.

If your app handles real money, real data, or real users, don't just scan it. Have a human look at it too.

References

[1] Veracode State of Software Security Report. Veracode. 2025.
[2] Andrej Karpathy on "Vibe Coding". X (Twitter). February 2025.
[3] Do Users Write More Insecure Code with AI Assistants? Stanford University. 2023.
[4] Snyk Developer Security Report. Snyk. 2025.
[5] Nuclei: Fast and Customizable Vulnerability Scanner. ProjectDiscovery. Open Source.
[6] Strix: AI-Powered Pentesting Agent. Stricks Security. Open Source.

About the author

Tobias Wupperfeld

Tobias is an independent AI engineer and operator who has shipped AI systems inside startups and scale-ups across fintech, procurement, and developer tooling. He runs Made By Agents and consults for JAN3, where he leads AI integration across the AQUA product line.
