From vague idea to working app with AI (real process walkthrough)

Not a tutorial. A real account of how a vague frustration became a working tool: what the AI produced, what got cut, what broke, and what shipped. The process is messier and faster than most people expect.

Where it actually starts

The idea was not a product idea. It was an annoyance. Spending had drifted upward for three months, not dramatically but persistently, and I had no clear picture of when it had started or why. Every budgeting app I tried answered a different question than the one I was asking.

The question was not "how much did I spend on coffee last month." The question was "is my spending going up, and if so, where is the drift coming from."

That gap between the question I had and what the tools answered was the starting point. Not a vision, not a product concept. Just a thing that didn't exist and should.

Most builds start this way if you let them. The problem is usually smaller and more specific than people expect when they imagine "having an app idea." The specificity isn't a limitation. It's what makes the build possible.

Before the first prompt

There is a step most people skip, and it's the one that determines whether the AI session produces something useful or something that has to be completely rebuilt.

Before writing a single prompt, I wrote three sentences:

  1. What the tool does: shows weekly spending totals with drift between weeks
  2. What the tool doesn't do: no accounts, no bank connections, no categories, no cloud storage
  3. What success looks like: open the app, log a number, see whether spending is up or down vs last week

That's it. Three sentences. But they changed everything about what came out of the AI.

Without the second sentence (what the tool doesn't do), the AI's default is to add. More features, more views, more options. With it, the constraint was built into the first prompt. The output was narrower and immediately more useful.

If you skip this step, you will spend two sessions removing features instead of one session building the right ones. The time investment is the same. The frustration is not.

The first prompt (and what came back)

The prompt was specific about the output format, not just the feature list. Instead of "build me a spending tracker," it described what the screen should look like:

"Build a single-page HTML/CSS/JS spending tracker that runs in the browser with no backend. Store data in localStorage. The main view shows this week's total, last week's total, and the week before — three numbers, side by side. Below that, an input form: amount, description, date. That's the full interface."

The AI produced a working first version in one response. It ran in the browser, stored entries in localStorage, and showed the three-week comparison exactly as described.

It also added a category dropdown, a monthly chart, and a delete-all button, none of which were in the prompt. This is normal. The AI defaults to comprehensive. The prompt was specific but the AI still padded it out.

The second prompt was: "Remove the category dropdown, the monthly chart, and the delete-all button. Simplify the entry form to amount, description, and date only."

That produced the actual first version.
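
For concreteness, the storage layer the prompt asks for fits in a few lines. This is a sketch, not the actual build's code; the function names and the storage key are illustrative assumptions.

```javascript
// Minimal sketch of the localStorage layer described in the prompt.
// STORAGE_KEY, loadEntries, and addEntry are illustrative names,
// not taken from the actual build.
const STORAGE_KEY = "spending-entries";

// Each entry: { amount: number, description: string, date: "YYYY-MM-DD" }
function loadEntries() {
  // getItem returns null when the key doesn't exist yet
  return JSON.parse(localStorage.getItem(STORAGE_KEY) || "[]");
}

function addEntry(amount, description, date) {
  const entries = loadEntries();
  entries.push({ amount: Number(amount), description, date });
  localStorage.setItem(STORAGE_KEY, JSON.stringify(entries));
  return entries;
}
```

The whole data model is one array of plain objects. That flatness is a direct consequence of the "what it doesn't do" sentence: no accounts and no categories means there is nothing to normalize.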

What broke, and how it got fixed

Two things broke in the first working version.

First, the weekly grouping was off. Weeks were being calculated from Sunday, but the entries I was logging ran Thursday-forward, so the week boundary didn't match how I thought about it. A quick prompt to change the week start to Monday fixed it, but only after I had spent twenty minutes wondering why the numbers didn't add up the way I expected.

The lesson: AI-generated logic makes assumptions you can't see until you test the output with real data. Testing immediately after each step is not optional. If I had tested the date grouping before moving on, I would have caught it in two minutes instead of twenty.
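
The fix itself is a small date helper. This is a sketch of the Monday-start logic, under the assumption the tool groups by calendar week; `startOfWeek` is an illustrative name, not the actual code.

```javascript
// Monday-start week boundary. JavaScript's Date.getDay() returns 0 for
// Sunday, which is why a naive grouping starts weeks on Sunday; shifting
// the index by 6 makes Monday day 0 and Sunday day 6.
function startOfWeek(date) {
  const d = new Date(date);
  const mondayOffset = (d.getDay() + 6) % 7; // Mon=0 ... Sun=6
  d.setDate(d.getDate() - mondayOffset);
  d.setHours(0, 0, 0, 0); // normalize to midnight so comparisons are stable
  return d;
}
```

The bug is exactly the kind a prompt won't surface: `getDay()` silently encodes a Sunday-first convention, and nothing looks wrong until real Thursday entries land in the "wrong" week.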

Second, the drift comparison didn't visually distinguish between "spending is up" and "spending is down." Both states looked identical, just different numbers. A small prompt to add a color signal (green for lower than last week, neutral for similar, a muted red for higher) fixed the readability immediately.

That change required one prompt and took less than a minute. But I would not have thought to ask for it until I used the tool on real data and noticed the display felt flat.
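
The color signal amounts to a threshold function on relative drift. A sketch, assuming a class name per state; the 5% "similar" band is an illustrative choice, not a value from the build.

```javascript
// Map week-over-week drift to a display state: "green" when spending is
// meaningfully lower than last week, "red" (muted in the CSS) when higher,
// "neutral" when similar. The 5% band is an assumed threshold.
function driftColor(thisWeek, lastWeek) {
  if (lastWeek === 0) return "neutral"; // no baseline to compare against
  const drift = (thisWeek - lastWeek) / lastWeek;
  if (drift > 0.05) return "red";
  if (drift < -0.05) return "green";
  return "neutral";
}
```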

What was cut (and why each cut mattered)

At the end of the first session, the tool had five features that didn't make it to the final version. Each cut had a reason:

Category tagging

The AI kept suggesting it. Every second iteration, a category field would reappear somewhere. The reason it was cut: categories create work at the point of entry and rarely produce insight proportional to that work. The question wasn't "how much did I spend on food." It was "is total spending drifting." Categories don't answer that.

Export to CSV

Cut because it implies the data matters beyond the current session. The whole point of the tool is that you don't need to archive it. You need to see the trend right now. Export creates the illusion that more data management options equals more value. It doesn't.

Budget limit alerts

Cut because they shift the tool from descriptive to prescriptive. The tool's job is to show reality, not to tell you whether you're good or bad. That framing change would have turned a useful mirror into an annoying judge.

Recurring subscription tracker

Cut because it was a separate tool masquerading as a feature. It would have required its own data model, its own view, and its own interaction logic. It belonged in a different build. (It didn't get built.)

Notes field on each entry

Cut because it adds input friction every time you log something. A spending tracker you use every day has to be faster than the alternative, which is not logging at all. One extra field is enough to drop the habit.

What the finished version looked like

One HTML file. Four sections: this week's total, last week's total, the week before, and a simple entry form. Drift is color-coded. Data lives in the browser. No setup, no account, no app to download.

The entire build took one session, roughly two hours including testing, iteration, and the detour caused by the date-grouping bug. The session produced a tool I use every day.

That's the real output measure. Not how it looks in a screenshot. Not how impressive the feature list sounds. Whether it gets used.

You can try the finished tool: open the Spending Reality Check. The full case study, including what failed and what I'd do differently, is in the build case study.

What the process actually looks like

Here's the real sequence, with no steps removed:

  1. Notice a frustration that no existing tool addresses well
  2. Write three sentences: what it does, what it doesn't do, what success looks like
  3. Write the first prompt describing the output format, not the feature list
  4. Test the first version with actual data, not hypothetical inputs
  5. Remove everything the AI added that wasn't asked for
  6. Fix the one thing that broke (there will be one thing)
  7. Use it for a week before touching it again

That's it. The whole process fits in an afternoon. The hard part is step 2, specifically the middle sentence: what the tool doesn't do. That sentence is what keeps the scope small enough to finish.

If you find it hard to write that sentence, it means the idea isn't clear enough yet. That's not a problem. It's diagnostic. It tells you what to figure out before building anything.

The Idea Clarifier tool is useful at exactly this point. It forces that clarity before you start.

Applying this to your own build

The spending tracker is specific. The process is not. The same sequence applies to any small tool you want to build with AI: a note-taking utility, a checklist app, a calculator for some workflow you repeat every week.

The constraint is always the most important decision. Not the features. The boundaries. What will this tool never do, and why? Figure that out first, and the build stays small enough to finish.

If you're still figuring out what to build, what you can actually build with AI shows real categories with real examples. If you have an idea but it's not sharp enough yet, start with the Idea Clarifier.