Cursor ships faster agents while security researchers flag the gap

Cursor's new AI agents complete most coding tasks in under 30 seconds, four times faster than rivals, the company says. The same day, Forrester reported that 45% of AI-generated coding tasks contain security weaknesses. The gap between agent speed and vulnerability detection is widening fast.

Cursor 2.0: Fast AI Coding Agents Meet Security Concerns

Cursor released version 2.0 on Wednesday with a new model called Composer that completes most coding tasks in under 30 seconds—four times faster than comparable models, the company claims. The same day, Forrester published research showing 45% of AI-generated coding tasks contain security weaknesses. The collision wasn't subtle.

The San Francisco startup, valued at $10 billion with 1 million daily users, is betting developers want to run multiple AI agents in parallel and let them handle increasingly complex tasks from start to finish. Composer was trained specifically for "low-latency agentic coding" and includes codebase-wide semantic search to navigate large projects. Early testers reported trusting the model for multi-step work—the kind that traditionally required careful human orchestration.

But Forrester's Janet Worthington tested the same kind of tooling and found the generated code wasn't secure by default. Her weather app worked. It also exposed API keys in plain text, lacked input sanitization, and had no rate limiting. When she prompted Cursor to run a security review, it caught the problems. The question: Why didn't it write secure code from the start?
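
Worthington's write-up doesn't include code, but the flaws she lists are concrete enough to sketch. Below is a minimal, hypothetical Python/Flask example of that flaw pattern and the obvious fixes; the route, key name, and upstream API are placeholders, not details from her weather app.

```python
# Hypothetical sketch of the flaw pattern Worthington describes, not her actual app.
import os
import re

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

# Insecure default the tools tend to produce: the key hard-coded in source, e.g.
#   API_KEY = "sk-live-abc123"
# Fix: read the secret from the environment instead of committing it.
API_KEY = os.environ["WEATHER_API_KEY"]

CITY_RE = re.compile(r"^[A-Za-z .'-]{1,64}$")  # basic input sanitization

@app.route("/weather")
def weather():
    city = request.args.get("city", "")
    if not CITY_RE.match(city):  # reject anything that isn't a plausible city name
        return jsonify(error="invalid city"), 400
    # Rate limiting (e.g. Flask-Limiter) is omitted here; production code would add it.
    resp = requests.get(
        "https://api.example-weather.com/v1/current",  # placeholder upstream API
        params={"q": city, "key": API_KEY},
        timeout=5,
    )
    return jsonify(resp.json())
```

None of this is exotic, which is the point of her test: the agent applied these basics only after being asked to run a security review.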

The Breakdown

• Cursor 2.0 ships Composer model completing tasks in under 30 seconds, four times faster than comparable AI coding tools

• Forrester research shows 45% of AI-generated coding tasks contain security weaknesses; tools don't write secure code by default

• Multi-agent interface runs parallel agents via git worktrees, testing same problem across different models for better results

• Microsoft and Google report 25%+ of their code now AI-generated, scaling vulnerability surface area at productivity's pace

The agent-first architecture

Cursor 2.0 redesigns the entire interface around agents, not files. Andrew Milich, the company's head of product engineering, says development work has changed more in the past 18 months than in the prior 18 years. The new layout lets developers focus on outcomes while agents manage implementation details. You can still open files or revert to the classic IDE view.

The technical foundation uses git worktrees or remote machines to run multiple agents in parallel without interference. Cursor discovered something interesting in testing: assigning the same problem to different models and picking the best result "significantly improves the final output," especially for harder tasks. It's brute force through parallelism—throw computational power at uncertainty.
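
Cursor hasn't published how its agent runner is built, but the worktree mechanism itself is plain git. The sketch below shows the general shape of best-of-N attempts in parallel worktrees; run_agent, the model names, and the scoring are hypothetical placeholders, not Cursor's implementation.

```python
# Illustrative only: parallel attempts at one task via git worktrees.
import subprocess
from concurrent.futures import ThreadPoolExecutor

MODELS = ["model-a", "model-b", "model-c"]  # hypothetical model names

def make_worktree(path: str, branch: str) -> None:
    # Each worktree is an independent checkout of the same repository,
    # so agents can edit files without stepping on each other.
    subprocess.run(["git", "worktree", "add", "-b", branch, path], check=True)

def run_agent(model: str, workdir: str, task: str) -> float:
    # Placeholder: have the model edit files in its own checkout, run the
    # tests there, and return a score used to pick a winner.
    ...
    return 0.0

def solve_in_parallel(task: str) -> str:
    futures = {}
    with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
        for i, model in enumerate(MODELS):
            path, branch = f"../attempt-{i}", f"attempt-{i}"
            make_worktree(path, branch)
            futures[model] = pool.submit(run_agent, model, path, task)
        scores = {m: f.result() for m, f in futures.items()}
    return max(scores, key=scores.get)  # keep the best attempt, discard the rest
```

Worktrees are the cheap part of the trick: they share the repository's object database, so each agent gets an isolated checkout without the cost of a full clone.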

The company also built a native browser tool so agents can test their own work and iterate until they produce correct results. That moves another step closer to autonomous development, where agents write code, validate it, and fix problems without waiting for human review cycles.

The vulnerability gap widens

Forrester's findings suggest the security layer isn't keeping pace. Open source LLMs hallucinate nonexistent packages more than 20% of the time; commercial models do it 5% of the time. Attackers create malicious packages with those names, and developers unknowingly introduce vulnerabilities. The faster agents generate code, the faster this risk scales.
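
The basic mitigation is unglamorous: before installing a dependency an agent (or a human) proposed, confirm the package actually exists and isn't brand new. Here is a minimal sketch against PyPI's public JSON API; the endpoint is real, while the 90-day threshold and function name are arbitrary assumptions.

```python
# Sketch: sanity-check a proposed dependency before installing it.
# Uses PyPI's JSON API (https://pypi.org/pypi/<name>/json); thresholds are arbitrary.
from datetime import datetime, timezone

import requests

def looks_legitimate(package: str, min_age_days: int = 90) -> bool:
    resp = requests.get(f"https://pypi.org/pypi/{package}/json", timeout=10)
    if resp.status_code != 200:
        return False  # package doesn't exist: likely a hallucinated name
    releases = resp.json().get("releases", {})
    upload_times = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in releases.values() for f in files
    ]
    if not upload_times:
        return False  # name is registered but has no files: suspicious
    age = datetime.now(timezone.utc) - min(upload_times)
    return age.days >= min_age_days  # brand-new packages deserve extra scrutiny

print(looks_legitimate("requests"))                    # long-established package: passes
print(looks_legitimate("definitely-not-a-real-pkg"))   # made-up name: fails the existence check
```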

Microsoft and Google report that more than 25% of their code now comes from AI. That's a lot of surface area. Worthington's warning is direct: "The amount of vulnerable code will only increase, especially in the short term." DevSecOps practices need to apply to all code, AI-generated or not, internal or third-party, open-source or proprietary. Without that discipline, companies fail to innovate securely.

The pattern's familiar. New technology accelerates output. Quality controls lag. Technical debt accumulates. The question is whether organizations adopt security practices faster than vulnerabilities pile up.

The review bottleneck emerges

Cursor acknowledges two new constraints: reviewing code and testing changes. As agents handle more writing, humans become editors. The company simplified the interface to make reviewing agent changes easier, and the browser tool helps agents validate their own work. But the Forrester research suggests human oversight remains critical.

There's tension here. Agents that iterate in 30 seconds need equally fast review processes, or the bottleneck just shifts. If developers trust agents for multi-step tasks—and early Composer users apparently do—the pressure to skip thorough security reviews grows. Speed feels good. Vulnerabilities are silent until they're exploited.

The convergence with low-code platforms is already happening, Worthington notes. Tools like Cursor, Claude Code, and Cognition Windsurf are embedded in professional development. In three to five years, she predicts, the software development lifecycle collapses and developers evolve from programmers to "agent orchestrators." AI-native platforms will integrate ideation, design, coding, testing, and deployment into single generative acts.

That's the promise. The reality check: AI security agents will need to emerge simultaneously to prevent a "tsunami of insecure, poor-quality, and unmaintainable code." The race isn't just about who builds the fastest coding agents. It's about who solves the review and security problem at agent speed.

The competitive contradiction

Cursor faces heated competition from the same companies it depends on. OpenAI and Anthropic both build AI coding assistants and serve as model providers for Cursor's platform. OpenAI is also an investor. The company supports multiple models—including those from OpenAI and Anthropic—which creates flexibility but also strategic vulnerability.

The $1.1 billion in funding buys time to differentiate through interface design, model training, and workflow integration. Composer's codebase-wide semantic search and sub-30-second latency target specific developer pain points. The multi-agent interface is a bet that parallelism and model diversity matter more than single-model excellence.

But if OpenAI or Anthropic ship comparable multi-agent features or faster models, Cursor's advantage narrows. The company needs to stay ahead on velocity while the security research community is flagging that velocity as the core problem. It's harsh. It may also be necessary.

Why this matters:

  • The speed-security tradeoff in AI-assisted coding is becoming systemic—25%+ of code at major companies now AI-generated means vulnerabilities scale at the same rate as productivity gains
  • Developer roles are shifting from writing code to orchestrating agents and reviewing outputs, but the tooling and practices for high-speed security validation haven't caught up to 30-second iteration cycles

❓ Frequently Asked Questions

Q: What exactly is "vibe coding"?

A: Vibe coding is a programming approach where developers describe what they want in natural language and let AI write the code without looking at it. Andrej Karpathy coined the term in February 2025, noting you "fully give in to the vibes" and "forget that the code even exists." It works for quick prototypes but raises security concerns for production applications.

Q: How does git worktrees technology let Cursor run multiple agents in parallel?

A: Git worktrees create separate working directories from the same repository, letting multiple agents work on different versions of code simultaneously without conflicts. Each agent operates in its own worktree, so they can attempt different solutions to the same problem at the same time. Cursor then picks the best result—especially useful for complex tasks.

Q: Why would OpenAI invest in Cursor if they're competitors?

A: OpenAI profits from Cursor's growth because Cursor uses OpenAI's models as backend options for users. Even as OpenAI builds competing coding assistants, they earn API revenue when Cursor customers choose GPT-4 or other OpenAI models. Cursor also supports Anthropic's Claude models, creating flexibility but strategic dependence on the companies it competes against.

Q: Should I trust AI coding tools for production work given the security problems?

A: Not without security reviews. Forrester's Janet Worthington found AI-generated code lacks input sanitization, rate limiting, and proper error handling by default. Even when Cursor fixed security issues after being prompted, the question remains why it didn't write secure code initially. DevSecOps practices—code reviews, security scans, testing—need to apply to all AI-generated code before production deployment.

Q: What are "AI security agents" and when will they exist?

A: AI security agents would automatically scan code for vulnerabilities as agents write it, catching problems at agent speed rather than human review speed. Forrester predicts they'll emerge within 3-5 years as part of "AI-native AppGen platforms" that integrate design, coding, testing, and deployment. Without them, the research warns of a "tsunami of insecure, poor-quality, and unmaintainable code."
