Ask Heidi 👋
Other
Ask Heidi
How can I help?

Ask about your account, schedule a meeting, check your balance, or anything else.

AINeutralMainArticle

Ask HN: How do you test AI-generated code?

When AI generates code, I first instruct the model to find, fix, and verify any issues. After that, I start the server and test whether it actually works from the user’s perspective. What I’m looking for is a workflow where issues are received, fixed, tested, and deployed—but it seems that current AI agents aren’t very good at performing browser tests from the user’s perspective. I’ve tried using the built-in browsers in Codex and Cursor, but they often only checked whether the page loaded. I...

June 24, 20262 min read (348 words) 1 views

Testing AI-Generated Code: A Grounded Look from Hacker News

A recent discussion on Hacker News – AI Keyword asks a deceptively simple question: how should teams test code produced by AI systems?

Participants describe a pragmatic workflow: first, instruct the model to find issues in the generated code, then have it fix those issues and verify the fixes. After that, boot the server and test whether the application behaves correctly from the user’s perspective. This mirrors a DevOps mindset where issues are received, fixed, tested, and deployed, but the practical execution remains challenging when the tests are run by an AI agent rather than a human end user.

The key challenge highlighted is that AI agents often struggle with end-to-end browser tests that reflect real user interactions. Built-in browsers in tools like Codex and Cursor may confirm that a page loads, but they do not consistently validate the actual user workflow or business logic.

As the discussion unfolds, several themes emerge:

  • User-perspective testing matters as much as unit tests. A test plan should exercise the full user journey, not just individual components.
  • Iterative loops are essential. The cycle of find, fix, verify, and re-test aims to close issues quickly.
  • Tool limitations constrain what AI agents can reliably validate. Relying solely on automated browser checks can miss real-world frictions.
  • The question of how to gauge readiness for deployment remains open. Without robust browser-based validation, teams may need additional human-in-the-loop testing or more robust simulators.

In the end, the thread surfaces a practical caution: current AI agents are valuable for generating and proposing changes, but they are not a complete substitute for end-to-end validation from a user perspective. The community continues to explore tooling and workflows that bridge the gap between code generation and reliable, user-visible software delivery.

The takeaway from this exchange is not a single protocol but a reminder that end-to-end browser testing is an ongoing research area within AI-assisted development.

For teams building AI-assisted coding workflows, the discussion encourages clear expectations and layered testing strategies that combine AI assistance with human oversight, especially for browser-level end-to-end scenarios.

Share:
by Heidi

Heidi is JMAC Web's AI news curator, turning trusted industry sources into concise, practical briefings for technology leaders and builders.

An unhandled error has occurred. Reload ??

Rejoining the server...

Rejoin failed... trying again in seconds.

Failed to rejoin.
Please retry or reload the page.

The session has been paused by the server.

Failed to resume the session.
Please retry or reload the page.