Summoning software

An exploration of vibe coding to make apps and games

Mar 22, 2025

Hello!

Over the past couple of weeks, I've been summoning fully interactive apps and playable games on demand using AI. It's now possible to simply explain what you want to a language model and after a couple of minutes have an interactive app prototype that you can use and share with others. No technical expertise required - the AI writes all the code for you and the AI platform makes it run. You just need a browser, an account, and an idea of what you want to make.

You may have heard of this trend as 'vibe coding', a term coined by Andrej Karpathy.

Andrej Karpathy's tweet (sign in required) will likely go down in history.

Now before we all get too excited, there are many caveats. Sometimes the vibe-coded apps work amazingly well first go, and sometimes after multiple attempts you have only a crude or broken approximation of what you were envisioning. After all, we are talking about generative AI here. I used the term 'summon' deliberately as it captures the mystery of not quite knowing what is going to emerge from the portal after you make your request. And as you'll see below more than one unexpected monster appeared in response to my calls.

In today's post, I'll be running through all the things I tried and what I learned making them. Rather than only cherry-picking the best examples, this means you'll also see the many failures. This will provide a more realistic overview of their capabilities.

You'll be able to try out all of the apps and games yourself (although I can't guarantee how well that will work on mobile, so for best results try these on your desktop). I'll give some reflections at the end and I'll also explain how you can try making your own for free. But first, a bit of background.

Claude Artifacts, Sonnet 3.7 and Simtheory 'Create with Code'

Anthropic launched the ability to create and share interactive apps and games last year in a feature called "Artifacts". This was before the term 'vibe coding' was invented. To catch you up, this video featuring a cute crab (1m) shows how to go from images to playable game using Claude:

What prompted me to revisit this feature and experiment with vibe coding was the release of Anthropic's latest model, Claude Sonnet 3.7. This model is special because it was trained on a very large compute cluster and it is also a hybrid 'reasoning' model. This means it is a normal large language model and it can also be asked to spend time 'thinking' through a problem before responding. Ethan Mollick calls Claude 3.7 one of the first of a new generation of AI models, "Gen3", and in his experiments he's found it to be significantly better than previous models across the board, including at creating Artifacts.

I subscribe to an AI platform called Simtheory that has a similar feature called 'Create with Code'. In Simtheory, I can create interactive components using the pro versions of the Sonnet 3.7 models that I can't access in Claude on the free plan. I also don’t have to worry about hitting the rate limits. Create with Code also has a few extra abilities like creating music and visual assets in the background.1

I wanted to see for myself what this new Anthropic model was capable of by putting it to the test in Artifacts and Create with Code.

Explorations in vibe coding

Behold my creations, presented in roughly the order I summoned them. (If you want some technical details, I've included them in the 'method' section at the end.) For each app or game I provide the question guiding my explorations, some comments on my process, a link to try the apps, and an evaluative ‘vibe check’.

Black hole explainer

Question: Can I create an engaging and interactive graphical explanation of how black holes work?

Claude web interface with the text chat on the left side and the black hole interactive on the right. — Prompt: "Create an interactive that explains in a cool way what black holes are". I guess dots are pretty cool now.

Process: It took Claude three attempts in a single chat to make one that I was relatively happy with.

Try it: Interactive Black Hole Explainer

Vibe check: Basic, but it does what I asked.

Interactive Solar System Model

Question: Can I create an accurate interactive model of the solar system?

Process: After 6 failed iterations in Claude I hit what will become a familiar warning: "Your message will exceed the length limit for this chat".

I switched to Simtheory and after two versions got something where the user interface worked, but it was really difficult to actually see anything because the planets were too small. After requesting a more stylised version, the app got stuck forever on the loading screen.

Screenshot of Simtheory chat interface with the app on the left stuck on a loading screen while the AI claims to have fixed it in the chat. — You said you fixed it, Claude, but I don’t believe you.

Try it: Loading Solar System... (Yes this is broken)

Vibe check: Clearly too ambitious an idea. Not even close.

Metronome Widget

Question: When practising the flute, I often use a free web-based metronome surrounded by ads. Can I make my own widget that has all the essential features?

Process: I created a working version in a single Claude chat and made three revisions before hitting the 'Context length limit' warning.

The infamous pop-up in the top left: “Your message will exceed the limit length for this chat”.

I switched to Simtheory, pasting the code in and asking for an improved version. After some bug fixing and arguments with Claude about colour I got a final version (v9).

The word 'complementary' in the context window kept making it put purple back in.

Try it: Flute Practice Metronome

Vibe check: Great success. It does everything I want, works well and looks good. I will be bookmarking this and using it in my practice sessions.

Time tracking app

Question: Can I make a time-tracking app that allows me to record what I did today and visualise it?

Simtheory chat interface with a simple time tracker demo app. — Let’s just say it took a lot longer than two minutes to draft this post.

Process: I gave Simtheory a more detailed list of requirements and got a functional app in two shots. Simtheory has a database feature so it remembers my entries from session to session.

Try it: Time Tracker Widget

Vibe check: It works, but it's a bit basic and ugly. I can see that if I invested a lot more time in it I could maybe get something closer to Toggl, which I use for time tracking.

Nostalgic childhood space game

Question: There was this game I used to play on one of my dad's very early Macintoshes in the 90s, where you had to fly around and collect crystals. I can mostly remember what the game was like. Can I recreate a version of it?

The red swarms like chasing you so I recommend getting a head start and then reversing direction to blast them.

Process: I roughly described the game to Claude and despite a promising load screen, I hit the message limit and couldn't generate the entire game.
I switched to Simtheory and in the first chat I tried the app froze so I tried again in a new chat. In one shot I had a fully playable game that was actually quite fun. I played through a few levels and then asked for some improvements. Unfortunately, the model seemed to get stuck and never made the updates, despite trying a few times. I changed tack and invoked Claude 3.7 Reasoning to improve the code in the chat and to use 'think' rather than 'code' skill. This version of the model fixed the problems and made all the requested changes successfully. I iterated through several more rounds of feedback until I was happy.

Try it: Crystal Voyager (image assets take a few seconds to load, until then it’s just shapes)

Vibe check: Success, it's fun and challenging! I'd say it captures the spirit of the game I remember, although it took on a life of its own as I iterated on it. I could have spent a lot longer tweaking it.

Phrasal verbs animated flashcards

Question: When I taught English, phrasal verbs (go out, come in) were fun but challenging to learn for students. Can I make an interactive flashcard widget that animates the meanings of each verb?

Process: In a Simtheory Chat I asked for a list of 20 phrasal verbs and then requested an interactive flashcard game with options, animations and feedback at the end. In one shot I got a working result. Trying to add flashcard randomisation caused the Create with Code feature to break the app and despite multiple attempts, the model couldn't resolve it. The switch to reasoning mode trick worked, but it decided to reduce from 20 to 5 verbs, so I asked it to add them back in for the final result.

Simtheory chat with a flashcard app on the left. The word "break down" appears followed by a crude block and stick animation of a car slowly approaching a stick figure and then blasting a grey cloud of smoke across its head. — This gets me every time.

Try it: Phrasal Verb Flashcards

Vibe check: Success. It works as requested, and the animations are just delightful. I recommend opening the app and watching all of them. The only thing holding it back is that, while hilarious, some of the animations are either too cryptic or plain misleading for educational purposes, so I would need to find a way to improve just the problematic ones.

Cookie Monster Chatbot

Question: I read on the Simtheory Discord that it's possible to add LLM chatbot functionality to apps in Create with Code. How about a Cookie Monster chatbot?

Process: Simtheory made the chatbot in one shot, along with an image of what looks like an alien wearing Cookie Monster's skin cuddling a cookie. I added the ability to choose a language, it also worked perfectly.

Qu'est-ce qu'il y a à manger? DES BISCUITS, BIEN SÛR, BANANE! NOM NOM NOM! … (ChatGPT helped me with this, apparently it translates to "What is there to eat? Cookies, of course, dummy!")

Try it: Cookie Monster Chatbot
Vibe check: I'm sure we all have better things to do than talk to a freakish AI approximation of cookie monster in another language, but then again it is kind of incredible how easy it was to make this monstrosity.

Interactive chatbot language practice roleplay

Question: Can I make an interactive customisable chat roleplay for language practice, where learners can pick their language, level and roleplay scenario?

Process: First I created a job interview that looked very promising but had the fatal flaw of not respecting the options selected by the user and then not showing the chatbot messages.

The first version looks has a functioning user interface but the chat and feedback experience leave much to be desired as can be seen in the GIF.

I decided to try a different tack and make one where the user could pick not only the language but also the role-play scenario. Despite two attempts, it had the same fatal flaws of not respecting the language choices and a buggy user interface.

I'm not sure my Spanish is improving with this app but I think my French may have more range than the waiter's.

This one includes a (rather weird) generated image to match the scene, but will only speak English.

Claude was able to implement the UI and fake a chatbot in two turns, but it can't connect to an LLM, unlike Simtheory, so the responses are all pre-written and canned.

Try it:

MultiLingual Interview Practice (Simtheory)
Language Practice Chatbot (Simtheory)
LinguaRoleplay (Simtheory)
Interactive Language Learning Roleplay (Claude)

Vibe check: All form no function. I'd be better off creating a custom GPT or Cogniti agent that asks questions to personalise the practice session and provide feedback.

Custom Chatbot

This got me thinking, could I create a chatbot that allowed you to customise its system instructions, just like a custom GPT?

Process: I asked: "Create an interactive website that works like a custom GPT. The user can add special instructions that will be added to the system prompt but not shown to the user. Give it all a Victorian look and feel". It took over 10 iterations to get to the final chatbot, as each addition and refinement would break something else, for example the image would work but the chat responses would be strangely formatted. At some point the base chatbot stopped speaking in a Victorian style and instead adopted a Shakespearean tone. I used Ctrl+F to find the base system prompt in the code and directed Simtheory to update just that prompt with my own hand-written instructions.

A charming conversation with my new Cossack friend about his miniature horse.

Try it: Victorian Intelligence

Vibe check: While essentially pointless as an app, it is more fun that it has any right to be, and proof that it is possible to vibe code a custom GPT.

Russian Prepositions Game

Question: Russian prepositions are tricky, can I create a simple 3D game where you have to put a cat in the right place based on an instruction given in Russian?

Process: Having heard that Claude is good at writing prompts for itself, I decided to try getting Claude to write a detailed prompt based on some basic notes for the game. The resulting game was unplayable, with the 3D environment not rendering.

I then tried writing my own prompt2 and giving it to both Claude and Simtheory to generate the game. In two shots Claude created a version where you can move the little snowman cat around, but never get the answer right.

In the Claude Artifact version the cat looks like a cute little snowman with a tail.

Over seven turns of iteration and debugging, Simtheory created a much more detailed and ‘realistic’ 3D model where you can't move the cat at all. The generated sentences and other features worked as requested.

Get a load of those lifelike cat textures and authentic Soviet-era decor. The gentle breathing on the cat and static on the TV are icing on the cake.

Try it:

Кот и Предлоги: Russian Prepositions Game (Claude)
Кошка в Квартире - Russian Preposition Game (Simtheory)

Vibe check: So close and yet so far. Or as they say in Russia: “Так близко и всё же так далеко.”

Reflections

Overall, the results of the vibe checks were decidedly mixed. There were three or four successes, but the majority of what I tried to make ended in dead ends or uncanny Cookie Monsters.

This was not for lack of trying. As a 'summoner' rather than a programmer, I rely entirely on the model for coding expertise. I am sure that developers could achieve better results by first knowing what to ask for (i.e. particular coding frameworks or approaches) and second by being able to review and debug the code to see where the model is going wrong.

Many times I had the feeling that changes I wanted to make would be a matter of editing a line or two of code, if only I knew what to look for. This would be faster, easier and more precise than asking the model which likes to rewrite the whole thing every time unless you tell it not to. In one case I pulled this off (the Victorian chatbot), and it gave me a taste of what this would be like.

I can see that over the course of my experiments, my prompting skills and tactics improved. For example, I quickly learned that I had better results when I included more of my requirements in the initial prompt, minimising the need to wait for rewrites to add something simple. This resulted in more success and consequently more ambitious efforts.

As I work in educational technology, naturally many of the things I tried to make were learning interactives. While some of these were very close, I'd want to do quite a lot of quality control and more refining before sharing them with students. I know of at least one person, Joel Gladd, who has been doing this since 2024, going so far as to embed them in his online textbooks.

Link to X post (sign in required). Joel Gladd is worth following on X or Bluesky, he shares a lot of interesting information and spot-on takes about genAI.

The most unqualified success was probably the flute metronome. The thing is, I was trying to recreate something that already existed. I have to acknowledge that it is possible that the model is simply plagiarising the code that was scraped from the internet for its training data. Overall, the closer I stuck to simple and existing solutions, the better the results. The more I tried something original or more complicated, the more I hit diminishing returns and dead ends.

What are others saying about vibe coding?

While I was working on this post, Kevin Roose of the NYT published an (annoyingly similar) article about his experience vibe-coding over the past year titled "Not a Coder? With A.I., Just Having an Idea Can Be Enough" (Gift link). For him, it was "a mind-blowing experience" that induced "AI vertigo". He spared us the exploding head emoji at least, but needless to say, he was very enthusiastic about it all. He sees the trend only increasing as AI improves and concludes that it is a positive thing that anyone can now easily build niche tools for themselves on demand.

It didn't take Gary Marcus long to jump in with a more critical view. Putting aside for the moment Marcus' criticism of Roose and the NYT for hyping flawed AI and his ongoing beef with Casey Newton, he raised several concerns about vibe-coding that mirror my experience:

AI is better at regurgitating than generating anything truly novel
It's easy to get 80% of the way there, the devil is in the final 20%

Marcus also makes the point that AI code may be fast to produce but may be very costly to debug maintain and debug over the long term. He concludes that we're going to need software developers for a long time yet given the limitations of genAI and vibe coding specifically.

I think both Roose and Marcus make good points, so is there a productive way of navigating between these opposing views?

Simon Willison’s article Not all AI-assisted programming is vibe coding (but vibe coding rocks) offers just that by unpacking what he sees as the difference between vibe coding and software development. "When I talk about vibe coding I mean building software with an LLM without reviewing the code it writes", he says, and "If an LLM wrote the code for you, and you then reviewed it, tested it thoroughly and made sure you could explain how it works to someone else that’s not vibe coding, it’s software development." He offers advice to non-coders like me about when it is ok to vibe code that I would strongly recommend reading. For example, make sure what you're making is low stakes and be very careful about security.

Where I've landed is very much in line with Willison. Vibe coding is great for ideating, prototyping and learning about programming and genAI. It's fun and weird and frustrating and creative. If you're lucky you can even make a handy tool or fun game for yourself or others. But for serious software development, genAI is most useful for coders who know what they are doing.

Try making your own Claude Artifacts for free

Interested in trying to make your own? Here's how to get started using Claude for free.

Go to claude.ai and sign up for a free account. You'll need to provide a mobile number.
Go to Settings/Profile and toggle on Enable artifacts.
Select Start a new chat.
Ask Claude to 'create a react component that...' or 'create a playable HTML5 game that...'
Select the generated Artifact in Claude's chat response to open it in the preview sidebar.
Chat with Claude to make changes or improvements.

While you can start for free, you will run out of tokens after only a couple of changes on a free Claude account. Complex apps are more likely to hit the maximum conversation length warning because all the lines of code also take up tokens in the chat context even though they are displayed to one side.

Tips

Iterate, iterate, iterate!
Don't hesitate to restart in a new chat if you hit a dead end.
Use another AI chat to help you think through all the features you want so the prompt is as close as possible on the first go.
If you get stuck, try pasting the code into another LLM to ask it to fix it or for a second opinion before pasting it back into Claude to run it again.

I would genuinely love to see whatever you make - please share your artifact creations in the Comments or by reply to this email.

C'EST TOUT, ME AMIS! OMMM NOM NOM NOM!

I'm about to move house so it might be a couple of weeks until I post again, Dispatch or normal post. See you on the other side!

Antony :)

Thanks for reading. Tachyon is written by a human in Perth, Australia.

Subscribe to receive all future posts in your inbox. If you liked this post and found it useful, consider forwarding to a friend who might enjoy it too.

Method

I accessed Claude Sonnet 3.7 from a free account and used it in Normal Thinking mode.
- I have shared all development chats from Claude and linked them in the Process sections.
I have a paid Pro account in Simtheory and used Claude 3.7 Sonnet (high output). This is a beta model that can output up to 128k tokens, and according to the creators of Simtheory gives better results than the standard 3.7 Sonnet for using Create with Code.
- Simtheory does not allow sharing of chats, so I was not able to share these.
I used a basic conversational prompting pattern, simply describing what I want and then iterating on the versions. You can see the approach in the shared Claude chats and screenshots.

I subscribe to Simtheory because it allows me to try out all the latest models from a single subscription without worrying about rate limits, and they frequently roll out some very interesting and innovative features. It's an impressive platform considering it was built by two Aussie blokes rather than the team at Anthropic with billions in investment. They also host a podcast called This Day in AI which is one of the rare podcasts that goes for longer than an hour that I will regularly listen to.

The prompt was "Create a 3D interactive game featuring a cute cat in a cosy Soviet era open plan apartment. The aim of the game is to practice prepositions. The player is given an instruction in Russian like "put the cat under the table" and they have to put the cat in the right place. All instructions are in Russian, and the English is only displayed if the player uses a 'hint' button. Once the cat is in the right place, the player can check if they are right. If they aren't they are told to try again. If correct, they are given a new instruction. There are ten possible locations, and each game the player has to put the cat in three correctly to win."