The result is you can still move absurdly fast and still maintain the understanding, which the one-shot approach throws away. This way, you know why every piece exists because you defined the intent before the AI implemented it.
AI is the accelerant, and you're the architect. The intent is the blueprint you generate to guide/harness the AI.
The best part is, once you have an intent contract at the heart of your project, it becomes impossible to break things too, logically or experience-wise.
“can it build a game idea I've had for years, in a single shot?”
Do people do no research or introspection when they’ve had an “idea for years”? There are countless examples of this exact game. I played this on the Gameboy Advance! There’s like 50 of them on the App Store right now.
The standard “this almost certainly exists wholesale in the training data” applies, but I’m also interested in how you carry an idea for years and don’t notice this, or whether the “idea” here was actually “using this thing that’s been remade thousands of times as an AI benchmark”.
There’s nothing wrong with remaking an old classic formula, especially in game dev. It’s the describing it as “an idea I’ve had for years” that rings weird.
It's called the ostrich effect, and we all do it. You enjoy toying with your own idea so much, that your brain shields you from the pain of finding out it already exists. Deep down you know it probably exists. It's harmless, unless there's other people's time and money involved.
> You enjoy toying with your own idea so much, that your brain shields you from the pain of finding out it already exists.
Doesn’t look like the author toyed with the idea at all, though, apart from having it in their head. Considering how they describe themselves (Check the About/Home page), if they had toyed with it at all they would have already built it.
I also don’t see why finding out it exists would be “painful”. The game is free and the author didn’t experiment or learn anything from building it, they just prompted it in one go.
I think that's exactly why AI is suited for 99% of stuff we do.
I have pointed out on here before that instances of truly unique human ideas, not grounded in nature or previous ideas from others, are almost nil; there are not many examples that anyone can give me. All human ideas and work are derivative.
Elves? Humans with pointy ears.
Werewolves? Humans mixed with wolves.
Car tyre? Cart wheel...stone wheel/roller.
Etc.
I feel like prior to GenAI, you’d have had to reckon with the true originality of your idea in some form as you did the research. Creatives having to confront their own unoriginality is such a thing it itself is reflected in countless pieces of media.
So it’s interesting to me that the creator here didn’t encounter the tens of physically published versions, or the hundreds of them shipped to digital app stores, or all the codebases on GitHub, in the course of making this. I’m sure they would have done naturally prior to GenAI. Is that good or bad? I don’t know! But it’s interesting to me.
> the creator here didn’t encounter the tens of physically published versions
The simplest counterargument: since there are already tens of similar games out there, why did the previous authors, supposedly grass-fed genuine checkmark blood-through-their-veins humans, not notice the other 9-8-7-6-5... games and still release their own version? Maybe because they still wanted the game out there? Maybe because originality really isn't that common? Maybe because each individual had their own idea and spin to it? Maybe because they wanted the game out as they made it?
Same for this author. How they made the game is irrelevant, and nitpicking the "originality" or anything else is silly. Something like this wasn't possible 3 years ago. Now it's possible. Deal with it, and stop trying to find ways to diminish it. It's a huge accomplishment any way you cut it.
My thoughts are less about the merits of creating something that already exists than they are about _knowing_ you are doing that. Which I think my post made very clear :)
Do you think the only reaction to knowing you’re not the first to do something is not to do it? Do you think I said that?
To spell it out in case it is still non-obvious: knowing this allows iteration. It allows remixing. It allows you to inspect what has come before and what it did well and where it succeeded and where it fell short and thus what you could _add_. It is an enabler of creativity! Thus I think it is interesting that GenAI may make it harder to have this experience.
> why did the previous authors, supposedly grass-fed genuine checkmark blood-through-their-veins humans, not notice the other 9-8-7-6-5... games and still release their own version?
They said they think they would have encountered those other games without GenAI, not that they or any of those other authors shouldn't have released the game.
i had a boss. before he was my boss, he was a friend. he took me under his wing, musically speaking. he showed me new music. told me what gear he was interested in. we went to some gigs.
he used to say “the best artists have the biggest record collections”.
they’ve done their research. they developed taste. they’ve been in that battle with the unoriginality demon. they’re still in that battle with the unoriginality demon. they’re always searching for new. for unexplored. for different.
they’ve also figured out what “good artists copy, great artists steal” means.
we take small bits. small ideas. small riffs. we turn them into our own. then we repeat that N times to create “a song”. we borrow. we revere. we obsess. turning lots of little differences into a completely new work. yes it’s all derivative. but derivative originality takes a lot of fuckin’ effort to get right. to be tasteful.
this thing isn’t artistic stealing, it’s the most low-effort stealing possible. creativity, originality and more importantly taste appear nowhere here.
so, is it bad? depends on your perspective on creative endeavours being worthwhile and whether you have taste or not i guess.
edit - personally i don’t think you can polish a turd. even if you rewrite it, the memory lingers.
It seems to me that most media genres discover the most interesting parts relatively early, then most subsequent work is deeply derivative. I feel that way about video games, digital music, movies.
I'd wager it's because ideas are simpler to explore orthogonally, giving an overview of what's possible.
I think this is false. New ideas are born every minute, and LLMs aren't going to help people with those for the most part; they'll end up steering you back towards the gradient if you try.
Can you give us an example of a new idea that is not derivative of something that already exists? Should only take about a minute.
Snark aside (and apologies), there's absolutely nothing wrong with the "no new ideas" take and nobody should think there is. Humans tend to work collectively, try as we might to do or appear otherwise, and often come to the same conclusions through reasoning and logic. No one person truly invented the light bulb, etc.; really, all inventive thought is branches of derivative thought as we build our collective knowledge base. A better question would be how many novel ideas are the logical conclusion of branches of derivative thought and how many are tangential, brought about by the injection of our irrationality.
While I agree that it isn't revolutionary that it could implement this from a single prompt, what's surprising to see is how well done this one is compared to the other tries. The controls and movement are smooth, the animations aren't jittery, the ui makes sense, there's a clear progression in difficulty. This model clearly "understands" the implementation of this game far better than the others did.
Yes, this subtlety seems worth noting. That smoothness suggests that even if many of the concepts are common/non-original, the bringing together of various pieces in a form that works well on modern mobile browsers is still impressive - browsers are moving targets and even if there are open source versions of this, it's comparatively rare they'd get continual care and attention to stay fully current (unless implemented via an o/s engine).
I also realized this, a quick Google search would’ve told me that this game has been made several times before, also way before I ever had this idea. Apparently it’s a pretty obvious game idea.
Ah well, it’s still fun, and it does appear to measure how good AI is at creating these kinds of games.
I believe that AI is not a signal of all white collar jobs being replaced, it’s a signal that SWE was in its own bubble and this is the pressure to pop it.
Most software is not needed, YCombinator itself works on the philosophy of “maybe 2/300 ideas are good” and even among those their biggest hitters were social media platforms and undercutting existing services using VC money.
It was a big game that didn’t make a lot of sense in retrospect, and now with these AI super coders it just doesn’t make sense faster.
Software ate the world, and AI is the garbage disposal meant to chew up the leftovers.
If OP is anything like me, they probably played it (or saw and wanted to play it) on the GBA too, and the memory became an idea, forgetting they had actually played it because it did exist.
But also, how original can a game idea ever (now) really be – there's always going to be things you can describe it as 'like' or a mix of, even if not identical. And for such simple things, very little room for being non-identical to whatever they're like.
In case it all just comes from training data, "one shotting" a game would be more comparable to "git pull" and changing some assets than "generating code".
I'm not saying this is how it works, I'm trivializing LLMs with this statement, but when I see someone on linkedin excited about generating checkers and chess my first thought is "you could have done that with git pull for the past 20 years".
Same thoughts exactly. I personally started looking into indie game dev and I've just started to realize how naive I was and how hard just game design can be, and that I'll probably never be good at it, and that most of my ideas are pretty garbage (or incomplete at best).
Even with the perfect AI to write, one would need to iterate through many different ideas, play testing constantly, getting people to play test and analyze what they found fun and where they got stuck. And to get the best ideas you'll need to be playing lots of different kinds of games.
90% of everything is crap according to the late SF writer Theodore Sturgeon. That was true before AI and it remains true presently. Does it really matter whether this game was in the training data or not here? I guess if one is trying to assert it can build original ideas (and it can, I've done it), but it seems like this is the equivalent of pulling something from Stack Overflow and customizing it given the description of the problem.
IMO the ability to describe a game and let the AI implement a PoC is pretty wild. It's a signal as to whether such an idea is worth pursuing further to me rather than a finished product. And I am enjoying all the experimentation with existing genres as well as the occasional truly original experience due to the dramatically lower cost of entry. What these efforts lack currently is the playtesting and polish that is hard without a human in the loop. So much like agentic engineering, the productive work is in being a centaur. It surprises me how much pushback this is getting from the demographic that embraced the relatively inscrutable git over simpler alternatives for small teams along with the tower of Babel of equally inscrutable frameworks and APIs.
It's not unlike Martin Scorsese admitting upfront he's using GenAI as a creative tool to visualize scenes for his scripts. The predictable backlash that he dare use AI in any way for any aspect of his craft despite his irrefutable oeuvre is a sign of the times more than a legitimate objection to me. Ask the users of deviantArt to stop working with Photoshop and see how that goes.
Having worked in the game industry in the past and adjacent to Hollywood over my career, they were already top heavy exploitative cultures before AI. And any auteur that thinks they can replace humans with agents is as tuned in to GenAI as the tech CEOs and VCs that happily announced layoffs and instituted tokenmaxxing benchmarks to measure the "incredible" boost in productivity AI enables.
So my question, ahead of the mandatory downvotes for not chanting along with the torch-bearing mob against AI in every way is: beyond the CapEx and the buildout issues (both legit IMO), how is AI impacting you negatively and personally?
You are the second person to respond to my question that’s entirely orthogonal to the actual AI usage here with a very self-conscious screed. Go read my responses to the first one :)
if you're going to go orthogonal to the AI usage, what makes you think other people won't go orthogonal to your own screed?
I'm happy to assume the guy had the idea in his head for years. That others did too should come as no surprise. We are all a lot more alike than most people acknowledge. And this seems the credible successor to Activision's Stampede from the early 1980s. Happy?
I’d guess that large games have an even higher percentage of pre-existing mechanics than small games do. It’s much easier to do something experimental if you don’t have to wrangle investors and entire teams of writers, artists, and developers.
Usually it’s an idea somebody had in a flight of fancy or inspiration but they haven’t really shown much interest in the actual medium prior, so they don’t really have any knowledge of its existence and then they also don’t go out of their way to confirm if it already exists.
Like I remember in college I had something akin to the idea of “50 people 1 question.” I was starting to become interested in shooting my own documentaries and was particularly interested in man on the street style interviews. I pitched it to a friend who then told me about 50p1q, which baffled him because it was like the hot thing already a year or two prior haha.
Anyway, that’s just something I think happens a lot. And now with genAI people don’t even throw the idea around; they quickly do a crappy version of the thing, present it, then find out it exists. Which isn’t terrible I guess, but it’s one less filter, for better or for worse.
If you sit down and write that game by hand you will not only finish it in a week but also learn a lot of things along the way and perhaps even discover something about the game that you did not imagine. That is how programming works. It is a search problem.
Also, this game has very simple mechanics; I am sure you can generate it just as easily with Cursor or some other tool.
More deeply than programming being a search problem, programming is a means to an end.
If the end is a combination of education and product discovery, then yeah maybe, although those are also dimensions of personal productivity that can be amplified by leveraging AI tools.
If the end goal of programming is leveraging computer automation, then nobody actually cares how the automation infrastructure gets established, and the fewer distractions with low-value implementation complexity, the better.
I don't use Cursor, so I mentioned it as a slightly impartial suggestion, but my point is broader. I have heard about and seen results from others using Composer 2.5, which is only available in Cursor.
There were dozens (if not hundreds) of more complex games made by Fable on Twitter the first day it was released. The only reason this is on HN frontpage is the stupid clickbait title.
Looks kinda like "Sheepherds" which came out recently.
However, as others have pointed out, the idea is a common one, probably because many people are exposed to sheep, sheepdogs and farming. Which further reinforces a previous point I made that all human work is derivative and barely anything is actually original.
But that's why it doesn't matter! Make that game/app/website that someone else has made before, make your own interpretation! The beauty and uniqueness is in the skin not the flesh!
I wonder if this is the real problem: it was too good, and a lobby of companies feeling threatened by the competition decided to push the jailbreak narrative as a scapegoat.
Not sure if it would've gone to the front page of Hacker News with that title! I was also trying to poke a little fun at the drama around Mythos/Fable: even though Fable did this really well, to me it does not appear to be fundamentally different from other top models.
Bit of a funny thing to so proudly assert in your millionth "your favorite show is shit" type comment, don't you agree?
In close lockstep with @ai_fry_your_brain, who at least makes it clear right on the tin that they're not here to engage in any earnest capacity whatsoever. Always a mixed feeling between being appreciative of that, and finding it blatant.
Good thing it's AI ruining communities, a thought I have no doubt you also share in. If only people properly recognized the hard work of people like you in this.
It’s sad that someone can think about a game for years and never really spend the time to just build it out. This is a very simple game even a CS student could build for an assignment. But now we’re supposed to be impressed an AI can one-shot it for $20.
It instructs me to rotate my phone. The pasture doesn't get any bigger, but now the top bar blocks half the screen. The tooltip about rotating stays in the middle of the screen. Unplayable. There's a music note indicating sound, but I never heard the dog bark.
It's exactly the kind of unpolished slop I expected it to be.
If this is what you imagined, you need to imagine better.
* Pathfinding is terrible (if I end up inside the fenced area clicking outside doesn’t lead me out).
* Forcing me to go landscape while not even filling the entire screen is terrible (where did you even test this).
* Controls are disastrous (I’m either barking all the time or a bark makes my sprite ignore my movements).
You one-shotted this, and I will admit it’s incredible that these agents can create something like this in minutes.
But your statements along with the “most dangerous AI model” in the title are disingenuous. Please do better.
Forces me to rotate to get the warning message to disappear (works fine in portrait, but regardless forces me to play with two hands..), and when rotated it doesn't even fit on the phone.
fROnTEnD DeV Is DeAd
DeSiGN Is DeAD
Cool idea tho, could be a fun game if the UX wasn't so hostile.
But all this was in the training data, no?
https://github.com/Nuno1123/chaser
https://github.com/tee-lab/collective-responses-of-flocking-...
https://shoze.itch.io/sheep-game
https://store.steampowered.com/app/3006280/Sheepherds/
https://ameiswhattodo.itch.io/sheepy
My Belgian Tervuren and I have a basic herding title and about 4 years of herding experience.
The sheep movement is excellent. You could make it even more realistic by having them favor lusher areas and by having one occasionally bolt spastically (hard mode?)
A handler mode where you play as a human and shout commands at the dog could be cool too!
45 minutes and $20? You can make it in 5 minutes with DeepSeek!
I even did it for free in their web chat.
https://jsbin.com/sigesemeyi/edit?html,output
DeepSeek Flash was also able to do it, even with reasoning disabled, but Pro gave much nicer graphics.
Your Fable version is prettier though!
Edit: And has better gameplay. And sounds. OK, nevermind! Fable wins this one :)
In this version, the sheep all get stuck in the corners and edges. This doesn’t happen in the Fable version.
The Fable version has some kind of "gravity" that pulls the sheep together.
Some of the other games mentioned here use fancier flocking mechanics (like boids!) https://news.ycombinator.com/item?id=48518518
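For anyone wondering what that "gravity" could look like in code, here is a minimal sketch of a cohesion pull plus a flee-from-the-dog push. It is not taken from the Fable output or from any of the linked games; the `Sheep` class, the `step` function and every constant are made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Sheep:
    x: float
    y: float


def step(flock, dog, width, height,
         cohesion=0.02, flee_radius=80.0, flee_speed=3.0):
    """Advance every sheep by one tick. All constants are illustrative guesses."""
    # Centre of the flock, computed once before anyone moves.
    cx = sum(s.x for s in flock) / len(flock)
    cy = sum(s.y for s in flock) / len(flock)
    for s in flock:
        # Cohesion ("gravity"): drift a small fraction of the way to the centre.
        dx = (cx - s.x) * cohesion
        dy = (cy - s.y) * cohesion
        # Flee: move directly away from the dog when it is close.
        ax, ay = s.x - dog[0], s.y - dog[1]
        dist = math.hypot(ax, ay)
        if 0 < dist < flee_radius:
            dx += ax / dist * flee_speed
            dy += ay / dist * flee_speed
        # Clamp to the field so nobody leaves the pasture.
        s.x = min(max(s.x + dx, 0.0), width)
        s.y = min(max(s.y + dy, 0.0), height)
```

With only the flee rule, a sheep pushed against a fence has no reason to leave it once the dog backs off. The cohesion term gives it a constant small pull back toward the group, which may be why the corners feel less sticky. Full boids would add separation and alignment on top of this.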
Love the:
> // Sky-like background
comment for the ground! :) No doubt most of the training data is for sky...
Out of curiosity, I tried the same prompt with Qwen3.6-27B.
The one-shot produced a game with no sheep. I then had to tell it to fix two bugs.
Overall, the graphics and game seem good enough and better than most of the closed models that were shown. However, not surprisingly, it falls short of Fable.
I've put the index.html and open code session here:
https://github.com/da-x/when-ai-fails/tree/qwen3.6-27b/shepa...
Would love a PR with this!
I think it’s impressive that an LLM can take you to a local maximum in one shot.
But once you start maintaining it, improving it and fixing bugs, you’ll eventually need to rip it apart and put it back together again while understanding how it all works.
This is why I think the better approach isn’t to one-shot but to have the architecture in your head and build it up piece by piece, with the AI accelerating the code writing.
I’ve found it very easy to maintain, add features to and fix bugs in software I’ve written entirely with LLMs, and in languages and frameworks with which I’m unfamiliar. You just ask the LLM to explain the code and then work with it to come up with the fix.
How big are those projects? I don't think this is good for your mental health or, physically, your brain's health. Problem solving keeps your brain strong. The laziness in us is inclined to take shortcuts; don't do it. It's like driving your car 3 blocks instead of walking, your physical health will suffer.
> How big are those projects
Define big, I guess. They're non-trivial: a mix of internal enterprise tools, a multiplatform app (android/ios/mac/windows/web, currently headbutting its way through review), including a billing system for my small telecommunications business.
> I don't think this is good for your mental health or, physically, your brain's health
I find the experience of doing it without writing the code to be intellectually pretty similar. I still solve a lot of problems. The LLM couldn't, for example, one-shot the event sourcing model I built for syncing data between devices. It took quite a few iterations and I had to define a lot of the architecture, but I did it at a level that wasn't "here is a class, here is a module, this module does XYZ", more at the "whitepaper" level or describing how specific bits of the app needed to work in order to solve some problem.
It's also very similar to managing other developers.
> It's like driving your car 3 blocks instead of walking, your physical health will suffer
It's more similar to having staff rather than doing everything yourself. The problem solving just shifts to a different area, and you get more done.
> Problem solving keeps your brain strong.
Coding is not the sole problem solving skill. In fact, coding may be one of the easier skills much of the time. Deciding what to build, where to focus efforts, understanding a customer's needs, could all be just as if not more challenging than the coding part.
Also what the code should do and how it should do it. LLMs regularly cannot come up with the best way to approach something. Once those decisions are made, codifying them is kind of the least interesting part of the entire exercise.
> It's like driving your car 3 blocks instead of walking, your physical health will suffer.
And be sure to only walk barefoot. Relying on artificial shoes weakens the muscles and the skin of your feet.
Sarcasm aside, most shoes are pretty bad for your feet.
A reasonable compromise in the face of frostbite and hookworm.
I suppose critical thinking skills are also as bad, making you question the state of the world. Problem solving is another one, deluding one into believing there are solutions to suffering.
I've been working on a project that's over 150k LOC of Rust at this point. https://dirge-code.github.io
You absolutely can have the LLM write maintainable code. A few tricks I use are to ask it to plan out features in phases, and then do a branch and a PR for each focused piece of work. It makes it a lot easier to review and understand what's happening.
I also ended up making a tool which lets the LLM get a high level perspective of the codebase, and then see parts that are structurally gnarly. I've been using it to do refactors and clean things up periodically. It helped a lot with keeping the architecture clean.
https://github.com/yogthos/wavescope-mcp
Adding features and evolving the codebase has not been a problem even at this scale.
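To make the "find the structurally gnarly parts" idea concrete, here is a rough sketch. It is not how wavescope-mcp works (I have not looked at its internals), and it parses Python source with the standard `ast` module rather than Rust. The `gnarly_functions` name and the choice of "branch count, then length" as the ranking are assumptions made purely for illustration.

```python
import ast

# Node types counted as a "branch"; a crude stand-in for real complexity metrics.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.With)


def gnarly_functions(source: str, top: int = 5):
    """Return (name, line_count, branch_count) for the most branch-heavy functions."""
    tree = ast.parse(source)
    rows = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Note: branches inside nested functions also count toward the outer one.
            branches = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            lines = node.end_lineno - node.lineno + 1
            rows.append((node.name, lines, branches))
    # Most branches first, longer functions breaking ties.
    return sorted(rows, key=lambda r: (r[2], r[1]), reverse=True)[:top]
```

Feeding a short ranked list like this to the model before asking for a refactor gives it the high-level view without having it read every file.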
LLMs are now good at looking at an existing project and suggesting big refactors to remove technical debt, and at proposing better architectures after the project has grown organically for a while.
I think this is true for projects beyond a certain complexity. I have 100% vibe coded projects with tens of thousands of LOC, and haven't seen any real issues with fully automated maintenance. Will that approach work in every scenario? Absolutely not, but the size and complexity of projects where it does is growing with each new model release.
> you’ll eventually need to rip it apart and put it back together again while understanding how it all works.
This is what spec .md files are for, skill issue
Strongly agree, well said. The one-shot is sexy purely because that first demo is so impressive. Going from zero to working app in minutes.
Like you said, working and maintainable are very different things. One-shot hits a wall the moment you need to do anything non-trivial after the initial generation. Bug fixing is extremely hard, even with AI assistance. Same with feature additions. It's pretty much a black box from this point on. The AI that wrote it now goes in loops wasting tokens and can't reliably fix it either, because it has no memory of the architectural decisions it made (or didn't make, for that matter) the first time round.
What I realized is that the failure here is the absence of a shared mental model between you and the code.
I'm a product designer with average front-end know-how and a solid understanding of HTML/CSS and how the web works, coming from the era of hand-coding HTML/CSS files. After vibe-coding a few products early this year, purely to learn how AI works, how to design AI interaction patterns etc., I built something called Intent Model (largely inspired by SDD / BDD).
Intent model is a structured, typed artifact (basically a JSON contract) that captures actors, entities, journeys, rules, and constraints before I write (or make the AI write) any code. It sits upstream of everything. Think of it like a condensed, strict distillation of your PRD / BRD / requirement doc.
When you hand the AI a well-defined intent file instead of a vague brief, this one-shot becomes structured and bound by rules. Now you're giving it an architecture to conform to. You define (or make the AI define) the precise variable names, their types, lifecycle, user roles, responsibilities, business rules and constraints in the file. Every generated artifact can trace back to a decision you made deliberately, reviewed and signed off.
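To make the "JSON contract" idea concrete, here is a minimal sketch of what checking such a file might look like. The five section names come from the description above; everything else (the `name` and `actor` fields, the `check_intent` helper, the rule that a journey must name a declared actor) is a hypothetical illustration, not the actual Intent Model format.

```python
import json

# Section names taken from the description above; the field names inside
# each section ("name", "actor") are assumptions made for this sketch.
REQUIRED_SECTIONS = ("actors", "entities", "journeys", "rules", "constraints")


def _items(intent: dict, key: str) -> list:
    """Return the section as a list, or an empty list if absent or malformed."""
    value = intent.get(key)
    return value if isinstance(value, list) else []


def check_intent(raw: str) -> list[str]:
    """Return human-readable problems found in an intent file (empty if none)."""
    intent = json.loads(raw)
    problems = [
        f"missing section: {section}"
        for section in REQUIRED_SECTIONS
        if not isinstance(intent.get(section), list)
    ]
    # Hypothetical cross-reference rule: every journey names a declared actor.
    actors = {a.get("name") for a in _items(intent, "actors") if isinstance(a, dict)}
    for journey in _items(intent, "journeys"):
        if isinstance(journey, dict) and journey.get("actor") not in actors:
            problems.append(
                f"journey {journey.get('name')!r} uses undeclared actor {journey.get('actor')!r}"
            )
    return problems


if __name__ == "__main__":
    example = json.dumps({
        "actors": [{"name": "customer"}],
        "entities": [],
        "journeys": [{"name": "refund", "actor": "admin"}],
        "rules": [],
    })
    for problem in check_intent(example):
        print(problem)
```

A check like this can run before any generation step, so the AI only ever sees a contract that is at least internally consistent.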
In the design world, we already do this by using design tokens. We can tell the AI that it needs to strictly use design tokens and not use stray properties like a hex color value or raw values not defined in the token contract. This is easily auditable by AI as well.
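The "easily auditable" part can be sketched in a few lines. This assumes tokens are consumed as CSS custom properties via `var(--name)` and that the token contract is just a set of allowed names; both are assumptions for illustration, and the `audit_css` helper is hypothetical. Note the naive regex would also flag an id selector that happens to look like a hex color (e.g. `#add`).

```python
import re

# Raw 3- or 6-digit hex colors, e.g. #fff or #ff0000.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")
# Custom property references, e.g. var(--color-primary).
TOKEN_REF = re.compile(r"var\(\s*(--[\w-]+)")


def audit_css(css: str, tokens: set[str]) -> list[str]:
    """Flag raw hex colors and references to tokens missing from the contract."""
    problems = []
    for lineno, line in enumerate(css.splitlines(), start=1):
        for match in HEX_COLOR.finditer(line):
            problems.append(f"line {lineno}: raw color {match.group(0)}")
        for match in TOKEN_REF.finditer(line):
            if match.group(1) not in tokens:
                problems.append(f"line {lineno}: unknown token {match.group(1)}")
    return problems


if __name__ == "__main__":
    css = (
        ".btn { color: var(--color-primary); }\n"
        ".card { background: #ff0000; border-color: var(--color-typo); }"
    )
    for problem in audit_css(css, {"--color-primary"}):
        print(problem)
```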
The result is you can still move absurdly fast and still maintain the understanding, which the one-shot approach throws away. This way, you know why every piece exists because you defined the intent before the AI implemented it.
AI is the accelerant, and you're the architect. The intent is the blueprint you generate to guide/harness the AI.
The best part is, once you have an intent contract at the heart of your project, it becomes impossible to break things too, logically or experience-wise.
“can it build a game idea I've had for years, in a single shot?”
Do people do no research or introspection when they’ve had an “idea for years”? There are countless examples of this exact game. I played this on the Gameboy Advance! There’s like 50 of them on the App Store right now.
The standard “this almost certainly exists wholesale in the training data” applies, but I’m also interested in how you carry an idea for years and don’t notice this, or whether the “idea” here was actually “using this thing that’s been remade thousands of times as an AI benchmark”.
There’s nothing wrong with remaking an old classic formula, especially in game dev. It’s the describing it as “an idea I’ve had for years” that rings weird.
It's called the ostrich effect and we all do it. You enjoy toying with your own idea so much, that your brain shields you from the pain of finding out it already exists. Deep down you know it probably exists. It's harmless, unless there's other people's time and money involved.
> You enjoy toying with your own idea so much, that your brain shields you from the pain of finding out it already exists.
Doesn’t look like the author toyed with the idea at all, though, apart from having it in their head. Considering how they describe themselves (Check the About/Home page), if they had toyed with it at all they would have already built it.
I also don’t see why finding out it exists would be “painful”. The game is free and the author didn’t experiment or learn anything from building it, they just prompted it in one go.
I think you misunderstand what was meant by "toying with your own idea" here. I interpret it as daydreaming about it.
> I interpret it as daydreaming about it.
Which is why I said:
> apart from having it in their head.
But if that’s all you’re doing, there’s no “pain” from finding out it exists. On the contrary, there is plenty of room for joy.
Yeah "toying" as in "entertaining the idea" in any form or shape.
And I disagree that the author didn't get anything from it. There's a ton to glean, it was probably fun, and many HN readers enjoyed the post.
> And I disagree that the author didn't get anything from it.
Those were not my words. Clearly they got a game out of it. What I said was they:
> didn’t experiment or learn anything from building it
Which is unambiguously true. There was no experimentation and no learning. There was one prompt and one result.
> and many HN readers enjoyed the post.
That’s entirely orthogonal.
I think that's exactly why AI is suited for 99% of stuff we do.
I have pointed out on here before that instances of truly unique human ideas, not grounded in nature or in previous ideas from others, are almost nil; there are not many examples that anyone can give me. All human ideas and work are derivative.
Elves? Humans with pointy ears. Werewolves? Humans mixed with wolves. Car tyre? Cart wheel...stone wheel/roller. Etc.
The question is not whether the ingredients are original. They almost never are. The question is whether the synthesis is any good
I feel like prior to GenAI, you’d have had to reckon with the true originality of your idea in some form as you did the research. Creatives having to confront their own unoriginality is such a thing it itself is reflected in countless pieces of media.
So it’s interesting to me that the creator here didn’t encounter the tens of physically published versions, or the hundreds of them shipped to digital app stores, or all the codebases on GitHub, in the course of making this. I’m sure they would have done naturally prior to GenAI. Is that good or bad? I don’t know! But it’s interesting to me.
> the creator here didn’t encounter the tens of physically published versions
The simplest counterargument: since there are already tens of similar games out there, why didn't the previous authors, supposedly grass-fed genuine checkmark blood-through-their-veins humans didn't notice the other 9-8-7-6-5... games, and still released their own version? Maybe because it was still that they wanted the game out there? Maybe because originality really isn't that common? Maybe because each individual had their own idea and spin to it? Maybe because they wanted the game out as they made it?
Same for this author. How they made the game is irrelevant, and nitpicking the "originality" or anything else is silly. Something like this wasn't possible 3 years ago. Now it's possible. Deal with it, and stop trying to find ways to diminish it. It's a huge accomplishment any way you cut it.
My thoughts are less about the merits of creating something that already exists than they are about _knowing_ you are doing that. Which I think my post made very clear :)
> I’m sure they would have done naturally prior to GenAI.
I gave a simple counterargument to this. Since there are "countless" prior games, many of them released before genAI, your argument is pointless.
Do you think the only reaction to knowing you’re not the first to do something is not to do it? Do you think I said that?
To spell it out in case it is still non-obvious: knowing this allows iteration. It allows remixing. It allows you to inspect what has come before and what it did well and where it succeeded and where it fell short and thus what you could _add_. It is an enabler of creativity! Thus I think it is interesting that GenAI may make it harder to have this experience.
> why didn't the previous authors, supposedly grass-fed genuine checkmark blood-through-their-veins humans didn't notice the other 9-8-7-6-5... games, and still released their own version?
a) To make it better
b) To learn, in service of a) or another project
They said they think they would have encountered those other games without GenAI, not that they or any of those other authors shouldn't have released the game.
i had a boss. before he was my boss, he was a friend. he took me under his wing, musically speaking. he showed me new music. told me what gear he was interested in. we went to some gigs.
he used to say “the best artists have the biggest record collections”.
they’ve done their research. they developed taste. they’ve been in that battle with the unoriginality demon. they’re still in that battle with the unoriginality demon. they’re always searching for new. for unexplored. for different.
they’ve also figured out what “good artists copy, great artists steal” means.
we take small bits. small ideas. small riffs. we turn them into our own. then we repeat that N times to create “a song”. we borrow. we revere. we obsess. turning lots of little differences into a completely new work. yes it’s all derivative. but derivative originality takes a lot of fuckin’ effort to get right. to be tasteful.
this thing isn’t artistic stealing, it’s the most low-effort stealing possible. creativity, originality and more importantly taste appear nowhere here.
so, is it bad? depends on your perspective on creative endeavours being worthwhile and whether you have taste or not i guess.
edit - personally i don’t think you can polish a turd. even if you rewrite it, the memory lingers.
It seems to me that most media genres discover the most interesting parts relatively early, then most subsequent work is deeply derivative. I feel that way about video games, digital music, movies.
I'd wager it's because ideas are simpler to explore orthogonally, giving an overview of what's possible.
Just because AI can give you a recipe for a sandwich doesn't mean everyone who sells, buys or experiments with making sandwiches is going to stop.
I think this is false. New ideas are born every minute, and LLMs aren't going to help people with those for the most part; they'll end up steering you back towards the gradient if you try.
Can you give us an example of a new idea that is not derivative of something that already exists? Should only take about a minute.
Snark aside (and apologies), there's absolutely nothing wrong with the "no new ideas" take and nobody should think there is. Humans tend to work collectively, try as we might to do or appear otherwise, and often come to the same conclusions through reasoning and logic. No one person truly invented the light bulb, etc., when really all inventive thought is branches of derivative thought as we build our collective knowledge base. A better question would be how many novel ideas are the logical conclusion of branches of derivative thought and how many are tangential, brought about by the injection of our irrationality.
> a new idea that is not derivative of something that already exists? Should only take about a minute.
A child is born every 4.4 seconds. But it took me and my girlfriend over 9 months to birth one!
Even if an original idea did show up every minute globally, does not mean that it takes only a minute to come up with the idea.
> it took me and my girlfriend over 9 months to birth one!
By my math you should have at least 2 in that time, unless one of you wasn't pulling their weight.
I don't get your point, so I am going somewhere else: twins and triplets
While I agree that it isn't revolutionary that it could implement this from a single prompt, what's surprising to see is how well done this one is compared to the other tries. The controls and movement are smooth, the animations aren't jittery, the ui makes sense, there's a clear progression in difficulty. This model clearly "understands" the implementation of this game far better than the others did.
Yes, this subtlety seems worth noting. That smoothness suggests that even if many of the concepts are common/non-original, the bringing together of various pieces in a form that works well on modern mobile browsers is still impressive - browsers are moving targets and even if there are open source versions of this, it's comparatively rare they'd get continual care and attention to stay fully current (unless implemented via an o/s engine)
Still, if the code for multiple similar games is in the training data, then that's worth thinking about.
I also realized this, a quick Google search would’ve told me that this game has been made several times before, also way before I ever had this idea. Apparently it’s a pretty obvious game idea.
Ah well, it’s still fun and it does appear to measure how good AI is at creating these kinds of games.
Well … it’s a measure of how good it is at reproducing a game that probably already exists in multiple forms in its training data.
The question is more whether this game exists as open source somewhere in the training data (probably does).
You can't possibly think those models are only trained on open source data?
I agree - it's worth doing just for fun.
I did the same recently just for fun - I really enjoyed "Gravity Force" on the Amiga - itself a lunar lander variant.
Could a model build a Gravity Force like game I could run in-browser? Yep! (I never made it as good as Gravity Force - just got the basics down)
I believe that AI is not a signal of all white collar jobs being replaced, it’s a signal that SWE was in its own bubble and this is the pressure to pop it.
Most software is not needed, YCombinator itself works on the philosophy of “maybe 2/300 ideas are good” and even among those their biggest hitters were social media platforms and undercutting existing services using VC money.
It was a big game that didn’t make a lot of sense in retrospect, and now with these AI super coders it just doesn’t make sense faster.
Software ate the world, and AI is the garbage disposal meant to chew up the leftovers.
If OP is anything like me, they probably played it (or saw and wanted to play it) on the GBA too, and the memory became an idea, forgetting they had actually played it because it did exist.
But also, how original can a game idea ever (now) really be – there's always going to be things you can describe it as 'like' or a mix of, even if not identical. And for such simple things, very little room for being non-identical to whatever they're like.
This is a thought I've had about genAI.
In case it all just comes from training data, "one shotting" a game would be more comparable to "git pull" and changing some assets than "generating code".
I'm not saying this is how it works, I'm trivializing LLMs with this statement, but when I see someone on linkedin excited about generating checkers and chess my first thought is "you could have done that with git pull for the past 20 years".
Same thoughts exactly. I personally started looking into indie game dev and I've just started to realize how naive I was and how hard just game design can be, and that I'll probably never be good at it, and that most of my ideas are pretty garbage (or incomplete at best).
Even with the perfect AI to write, one would need to iterate through many different ideas, play testing constantly, getting people to play test and analyze what they found fun and where they got stuck. And to get the best ideas you'll need to be playing lots of different kinds of games.
Well, “an idea I’ve had for years” and “something that has never been done before” are not the same thing.
This is fair! I am possibly attaching some notion of originality to the word “idea” in the context of a project that others don’t.
yes you are
90% of everything is crap according to the late SF writer Theodore Sturgeon. That was true before AI and it remains true presently. Does it really matter whether this game was in the training data or not here? I guess if one is trying to assert it can build original ideas (and it can, I've done it), but it seems like this is the equivalent of pulling something from Stack Overflow and customizing it given the description of the problem.
IMO the ability to describe a game and let the AI implement a PoC is pretty wild. It's a signal as to whether such an idea is worth pursuing further to me rather than a finished product. And I am enjoying all the experimentation with existing genres as well as the occasional truly original experience due to the dramatically lower cost of entry. What these efforts lack currently is the playtesting and polish that is hard without a human in the loop. So much like agentic engineering, the productive work is in being a centaur. It surprises me how much pushback this is getting from the demographic that embraced the relatively inscrutable git over simpler alternatives for small teams along with the tower of Babel of equally inscrutable frameworks and APIs.
It's not unlike Martin Scorsese admitting upfront he's using GenAI as a creative tool to visualize scenes for his scripts. The predictable backlash that he dare use AI in any way for any aspect of his craft despite his irrefutable oeuvre is a sign of the times more than a legitimate objection to me. Ask the users of deviantArt to stop working with Photoshop and see how that goes.
Having worked in the game industry in the past and adjacent to Hollywood over my career, they were already top heavy exploitative cultures before AI. And any auteur that thinks they can replace humans with agents is as tuned in to GenAI as the tech CEOs and VCs that happily announced layoffs and instituted tokenmaxxing benchmarks to measure the "incredible" boost in productivity AI enables.
So my question, ahead of the mandatory downvotes for not chanting along with the torch-bearing mob against AI in every way is: beyond the CapEx and the buildout issues (both legit IMO), how is AI impacting you negatively and personally?
You are the second person to respond to my question that’s entirely orthogonal to the actual AI usage here with a very self-conscious screed. Go read my responses to the first one :)
if you're going to go orthogonal to the AI usage, what makes you think other people won't go orthogonal to your own screed?
I'm happy to assume the guy had the idea in his head for years. That others did too should come as no surprise. We are all a lot more alike than most people acknowledge. And this seems the credible successor to Activision's stampede from the late 1970s. Happy?
Most small games are recombinations of existing mechanics anyway
I’d guess that large games have an even higher percentage of pre-existing mechanics than small games do. It’s much easier to do something experimental if you don’t have to wrangle investors and entire teams of writers, artists, and developers.
Usually it’s an idea somebody had in a flight of fancy or inspiration but they haven’t really shown much interest in the actual medium prior, so they don’t really have any knowledge of its existence and then they also don’t go out of their way to confirm if it already exists.
Like I remember in college I had something akin to the idea of “50 people 1 question.” I was starting to become interested in shooting my own documentaries and was particularly interested in man on the street style interviews. I pitched it to a friend who then told me about 50p1q, which baffled him because it was like the hot thing already a year or two prior haha.
Anyway that’s just something I think happens a lot. And now with genAI people don’t throw the idea around even, they quickly do a crappy version of the thing, present it, then find out it exists. Which isn’t terrible I guess but it’s one less filter, for better or for worse.
(Exaggerated:) Guy who would never pay 20€ to another dev for such a game, pays same amount for AI.
Applause to Anthropic: mission accomplished!
That's no surprise, telling the AI what to do is basically the IKEA effect.
https://en.wikipedia.org/wiki/IKEA_effect
If you sit down and write that game by hand you will not only finish it in a week but also learn a lot of things along the way and perhaps even discover something about the game that you did not imagine. That is how programming works. It is a search problem.
Also, this game has very simple mechanics; I am sure you could generate it just as easily with Cursor or some other tool.
More deeply than programming being a search problem, programming is a means to an end.
If the end is a combination of education and product discovery, then yeah maybe, although those are also dimensions of personal productivity that can be amplified by leveraging AI tools.
If the end goal of programming is leveraging computer automation, then nobody actually cares how the automation infrastructure gets established, and the fewer distractions from low-value implementation complexity, the better.
Cursor has access to the latest models so it should be equivalent, right?
Or is there some other AI usage described in this article that is not supported by cursor?
I don't use Cursor, so I mentioned it as a slightly impartial suggestion, but my point is broader. I hear about and have seen results from others using Composer 2.5, which is only available in Cursor.
So it created a trivial game that a teenager could’ve built as a part-time project while acquiring deep knowledge.
Humans can do lots of things, I don't see how that's relevant. This post is about AI progress.
This game needs a "bah-ram-ewe" cheat code where the sheep dog turns into a well-mannered pig who politely asks the sheep to return to their pen.
There were dozens (if not hundreds) of more complex games made by Fable on Twitter the first day it was released. The only reason this is on HN frontpage is the stupid clickbait title.
Some random examples:
https://x.com/fe_yukichi/status/2064635098411180374 https://x.com/akiraxtwo/status/2064780732082651402 https://x.com/kieradev/status/2064482704763085202 https://x.com/VincentLogic/status/2064699740936356065 https://x.com/XiaohuiAI666/status/2064994538591223911
No, the reason is that it is a follow-up to (multiple) threads from March last year and shows the progress of AI.
Looks kinda like "Sheepherds" which came out recently.
However as others have pointed out the idea is a common one, probably because many people are exposed to sheep and sheep dogs and farming. Which further reinforces a previous point I made that all human work is derivative and barely anything actually original.
But that's why it doesn't matter! Make that game/app/website that someone else has made before, make your own interpretation! The beauty and uniqueness is in the skin not the flesh!
But isn't getting an LLM to n-shot something just going to produce non-unique, non-original interpretations of an idea?
I’m sure I saw a blog post about this same mechanic being made by llms back a year or so ago too
I sure do miss Fable. It just knew how to do things and do them well. Sad it’s now blocked.
I wonder if this is the real problem: it was too good, and a lobby of companies feeling threatened by the competition decided to push the jailbreak narrative as a scapegoat.
The article’s title seems needlessly dramatic, the article itself doesn’t reference the LLM’s danger.
The title could have been just “Shepherd’s Dog: A game by Fable 5”.
Not sure if it would've gone to the front page of Hacker News with that title! I was also trying to poke a little fun at the drama around Mythos/Fable: Even though Fable did this really well, to me it does not appear to be fundamentally different from other top models.
Yeah, fundamentally the same: Worthless.
funny how a worthless LLM belongs to the fastest revenue growing company in the history of Capitalism
Because others are paying for it. It’s a lot easier to get revenue when you don’t have to care about CAC or paying the bills.
Can you provide any source for that claim? Thanks!
google it. this article from one month ago is already obsolete, annualized revenue grew from 30 bln to 44 bln in the last month
https://venturebeat.com/technology/anthropic-says-it-hit-a-3...
Bit of a funny thing to so proudly assert in your millionth "your favorite show is shit" type comment, don't you agree?
In close lockstep with @ai_fry_your_brain, who at least makes it clear right on the tin that they're not here to engage in any earnest capacity whatsoever. Always a mixed feeling between being appreciative of that, and finding it blatant.
Good thing it's AI ruining communities, a thought I have no doubt you also share in. If only people properly recognized the hard work of people like you in this.
Oddly, I wonder if this is not a great benchmarking prompt.
Brilliant marketing here in the title
Enjoyed playing it, here's the direct link to play as otherwise you have to click from the article to the GitHub and then find the correct demo link
https://vnglst.github.io/when-ai-fails/shepards-dog/claude-f...
Thanks for that, I messed up copying the links into the article!
When you say €20 worth of tokens, is that the direct API price or subsidized Claude Code?
Direct API access, I’m afraid. It was not my intention to spend it all in one go on this, but after €12 I didn’t want to stop anymore.
ok, after playing this game, I started respecting shepherding dogs' skills even more.
Yeah, but can it one-shot Sven Bomwollen?
2002 - a strange game!
Pretty great game, am having some fun
It’s sad that someone can think about a game for years and never really spend the time to just build it out. This is a very simple game even a CS student could build for an assignment. But now we’re supposed to be impressed an AI can one-shot it for $20.
Playing on iPhone 13 mini.
It instructs me to rotate my phone. The pasture doesn't get any bigger, but now the top bar blocks half the screen. The tooltip about rotating stays in the middle of the screen. Unplayable. There's a music note indicating sound, but I never heard the dog bark.
It's exactly the kind of unpolished slop I expected it to be.
In which harness?
The "world's most dangerous AI" framing is funny here because the result is basically the least dystopian use case imaginable
BAA VRAM EWE
> It's really fun and exactly how I imagined it.
If this is what you imagined, you need to imagine better.
* Pathfinding is terrible (if I end up inside the fenced area clicking outside doesn’t lead me out).
* Forcing me to go landscape while not even filling the entire screen is terrible (where did you even test this).
* Controls are disastrous (I’m either barking all the time or a bark makes my sprite ignore my movements).
You one-shotted this, and I will admit it’s incredible that these agents can create something like this in minutes.
But your statements along with the “most dangerous AI model” in the title are disingenuous. Please do better.
He should ask AI to tell him that #aaa text on #eee background is not acceptable.
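For anyone who wants to check that pair themselves, here is a rough sketch of the WCAG 2.x contrast-ratio formula (relative luminance of the lighter color plus 0.05, over the darker plus 0.05). The helper names are made up for this example; the 4.5:1 figure is the WCAG AA minimum for normal-size text.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a #rgb or #rrggbb color."""
    h = hex_color.lstrip("#")
    if len(h) == 3:  # expand shorthand: "aaa" -> "aaaaaa"
        h = "".join(ch * 2 for ch in h)
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1 (identical colors) up to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    ratio = contrast_ratio("#aaa", "#eee")
    print(f"#aaa on #eee: {ratio:.2f}:1, passes AA for body text: {ratio >= 4.5}")
```

By this formula #aaa on #eee comes out at roughly 2:1, well short of 4.5:1.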
Now next game - The Boy who cried wolf! Wolf!
Forces me to rotate to get the warning message to disappear (works fine in portrait, but regardless forces me to play with two hands..), and when rotated it doesn't even fit on the phone.
fROnTEnD DeV Is DeAd
DeSiGN Is DeAD
Cool idea tho, could be a fun game if the UX wasn't so hostile.
That’s one tired sheepdog.
This was my second attempt, I'm still learning! Besides, the wolf was freaking me out.
Always fun having a go, mind you Michael Nyman had some thoughts on all this: https://www.youtube.com/watch?v=xn1_vUe_Vws
For interest, some shepherds run two dogs, each on a different whistle or voice command pitch.
I didn't even have to play. Immediately after opening, some notification about rotating my phone is obscuring the instructions and I cannot read them.
Damn I couldn't load it on my Nokia n95 from 2007 either. Damn bruh, these silly devs should make this stuff work on everything.
I am on a flagship Samsung that runs, for example, the Red Alert 2 browser port well.
OP is just pushing slop, the 80% part anyone gets for free. (well 20 bucks)
As far as I can tell it is possible to get this sort of quality game with a properly tuned harness out of one of the cheaper models.
"a game idea I've had for years"
Bruv, there are already countless games with this exact mechanic...