I Had the Idea Two Years Ago. Now I'm Building It Properly.
Week 1 of DeskSpirit – a real AI assistant built locally in Python over 30 days.
DeskSpirit uses voice input, local AI models (via Ollama), and a simple memory system to create a practical daily productivity companion. Here's what worked, what failed, and how you can build your own.
What this is not
This is not a waifu chatbot.
This is not a novelty assistant.
This is not a weekend toy.
It is a practical companion for doing real work.
The idea (again)
Two years ago, I tried to build a personal AI companion that could talk with me, remember what I was working on, and help me stay productive.
The idea was good. The tools were not ready.
Now I am trying again – with better tools and a different approach.
What I tried before (2023)
Back in 2023, I built a basic "Jarvis"-style assistant in Python. It used SpeechRecognition, PyAudio, and pyttsx3.
It could listen through the microphone, convert speech to text, detect simple commands, and respond using text-to-speech.
It worked. But only as a demo.
It didn't remember anything. It didn't improve. It didn't help with real work.
At the time, I thought I needed better code. Looking back, the real problem was structure – and the tools.
Week 1 – Voice Loop
Goal: Can I talk to it and hear it answer back?
The "voice loop" is the basic cycle: I speak → it understands → it replies → it speaks back.
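As a rough illustration of that cycle, here is a minimal sketch, not DeskSpirit's actual code. The names `run_turn`, `listen`, `think`, and `speak` are hypothetical, and each stage is passed in as a plain function so the loop can be tried without a microphone or a model.

```python
from typing import Callable


def run_turn(
    listen: Callable[[], str],
    think: Callable[[str], str],
    speak: Callable[[str], None],
) -> tuple[str, str]:
    """Run one pass of the loop and return (what was heard, what was replied)."""
    heard = listen()      # I speak -> it understands
    reply = think(heard)  # it replies
    speak(reply)          # it speaks back
    return heard, reply


# Stand-in stages (assumptions for the demo): fixed input, echo reply, print as "voice".
run_turn(
    listen=lambda: "what am I working on today?",
    think=lambda text: f"You asked: {text}",
    speak=lambda text: print(f"assistant> {text}"),
)
```

Keeping the stages as swappable functions means each one can later be replaced by the real recorder, model call, or speech engine without touching the loop itself.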
Current progress
The local Python app now completes the full loop:
- microphone recording
- speech-to-text
- local AI response (via Ollama)
- text-to-speech
- conversation saving
Every interaction is stored in a simple append-only JSONL file.
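An append-only JSONL log can be sketched with the standard library alone. The field names (`timestamp`, `user`, `assistant`) and the helper names below are assumptions for illustration; the real file may be shaped differently.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_turn(log_path: Path, user_text: str, reply_text: str) -> None:
    """Append one interaction as a single JSON line; earlier lines are never rewritten."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user_text,
        "assistant": reply_text,
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as f:  # "a" = append mode
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_turns(log_path: Path) -> list[dict]:
    """Load every stored interaction, oldest first."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

One JSON object per line keeps writes cheap and means a crash mid-session can damage at most the last line.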
It works end-to-end. No UI. No memory system. No productivity layer. Just the core loop.
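The "local AI response" step can be sketched as one HTTP call to the Ollama server running on the same machine. This assumes Ollama is listening on its default port (11434) and that the `gemma3:4b` model has already been pulled; `build_request` and `ask_ollama` are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_request(prompt: str, model: str = "gemma3:4b") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt: str, model: str = "gemma3:4b", timeout: float = 120.0) -> str:
    """Send the prompt and return the model's reply text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"].strip()
```

With the server running, `ask_ollama("Say hello in five words.")` should return a short string. Turning streaming off keeps the sketch simple, at the cost of waiting for the whole reply before anything can be spoken.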
What I cut (Week 1)
I deliberately did not build:
- a UI
- a memory system
- multi-step task execution
- automation features
All of these were part of the original idea. They were removed to make the core loop work first.
What worked
A simple loop: input → process → output.
No memory layer. No complexity. Just something that works immediately.
What failed
The first version failed immediately. Too many moving parts: voice input lag, unclear command structure, no defined core loop.
The biggest issue: it was trying to do everything. That's the same mistake I made two years ago.
Setup (Week 1)
If you want to try this yourself, here is the setup.
Requirements
- Python 3.12 (or similar)
- VS Code or any editor
- Ollama installed locally
- A working microphone
Install Python libraries
These libraries handle microphone input, speech-to-text, and voice output. The second command downloads the local model through Ollama.
pip install sounddevice faster-whisper pyttsx3
ollama pull gemma3:4b
Notes
- sounddevice – microphone recording
- faster-whisper – local speech-to-text
- pyttsx3 – offline text-to-speech
On Windows, pyttsx3 uses system voices like Microsoft David or Zira.
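To show how the three libraries fit together, here is a hedged sketch rather than the project's real code. The function names, the 16 kHz sample rate, the `"base"` Whisper model size, and the CPU `int8` settings are all assumptions. Imports sit inside the functions so the file still loads on a machine where the audio packages are not installed yet.

```python
SAMPLE_RATE = 16_000  # assumption: 16 kHz mono, the rate Whisper models work with


def record(seconds: float):
    """Record from the default microphone and return a 1-D float32 array."""
    import sounddevice as sd

    frames = int(seconds * SAMPLE_RATE)
    audio = sd.rec(frames, samplerate=SAMPLE_RATE, channels=1, dtype="float32")
    sd.wait()  # block until the recording has finished
    return audio.flatten()


def join_segments(texts) -> str:
    """Merge transcribed segment texts into one clean line."""
    return " ".join(t.strip() for t in texts if t.strip())


def transcribe(audio, model_size: str = "base") -> str:
    """Turn recorded audio into text with a local Whisper model."""
    from faster_whisper import WhisperModel

    # Loading the model on every call is slow; a real app would load it once and reuse it.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio)
    return join_segments(segment.text for segment in segments)


def speak(text: str) -> None:
    """Say the text aloud with the system's offline voice."""
    import pyttsx3

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()  # block until speech has finished
```

With a microphone attached, `speak(transcribe(record(5)))` records five seconds and reads back whatever was understood, which is a quick way to hear how good or bad the transcription is.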
Build note – what actually slowed things down
The first blocker wasn't AI. It was setup.
Python was installed, but not available on PATH. After fixing that, the real issues were microphone timing, imperfect transcription, and choosing a usable local model.
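A small preflight check can surface that kind of setup problem before the loop even starts. This is a sketch using only the standard library; the command names are assumptions (on some systems the interpreter is `python3` rather than `python`), and `find_missing` is a hypothetical helper.

```python
import importlib.util
import shutil


def find_missing(commands: list[str], modules: list[str]) -> dict[str, list[str]]:
    """Report which executables are not on PATH and which packages are not installed."""
    missing_commands = [c for c in commands if shutil.which(c) is None]
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    return {"commands": missing_commands, "modules": missing_modules}


report = find_missing(
    commands=["python", "ollama"],
    modules=["sounddevice", "faster_whisper", "pyttsx3"],
)
print(report)
```

Two empty lists mean the tools are at least findable. It says nothing about microphone timing or transcription quality, which still have to be tested by ear.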
The system works – but it's still rough. That's intentional.
Why this matters
This is not impressive. But it is the first version that actually works.
Most AI assistant projects fail because they try to build everything at once. This one doesn't. It proves the smallest useful loop can exist first.
Why I am structuring this differently
The biggest change from my first attempt is structure. Before, I tried to build everything in one go. This time, I am breaking it down from the start.
I'm using a folder and markdown system so tools like Claude Code and Codex stay focused. The idea is simple: one main map file, separate folders for each part of the build, small context files, and clear rules about what not to build yet.
This keeps the project from drifting.
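For anyone who wants to try a similar structure, here is one way it might be scaffolded. Every file and folder name below (`MAP.md`, `voice-loop/CONTEXT.md`, `voice-loop/NOT_YET.md`) is a hypothetical stand-in, not the project's real layout; only the shape (one map file, one folder per part, small context files, explicit "not yet" rules) comes from the description above.

```python
from pathlib import Path

# Hypothetical layout: illustrative names only, not DeskSpirit's real structure.
LAYOUT = {
    "MAP.md": "# Project map\n\nStart here. Links to each part of the build.\n",
    "voice-loop/CONTEXT.md": "# Voice loop\n\nScope: record, transcribe, reply, speak, save.\n",
    "voice-loop/NOT_YET.md": "# Not yet\n\n- UI\n- memory system\n- multi-step tasks\n- automation\n",
}


def scaffold(root: Path) -> list[Path]:
    """Create any missing files from LAYOUT under root; existing files are left alone."""
    created = []
    for relative, text in LAYOUT.items():
        target = root / relative
        if target.exists():
            continue  # never overwrite notes that were already edited
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        created.append(target)
    return created
```

Running `scaffold(Path("deskspirit"))` twice is safe: the second call creates nothing, so hand-edited context files survive.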
Where Week 1 is now
The foundation is in place. The project now has:
- a dedicated voice-loop workspace
- config and path helpers
- local audio storage
- local conversation storage
- a working loop
- clear build rules
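Config and path helpers of this kind can be very small. The sketch below is an assumption about what such a helper might look like; the `Paths` class, the `data/audio` folder, and the `conversations.jsonl` file name are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Paths:
    """Single place that decides where local audio and conversations live."""

    root: Path

    @property
    def audio_dir(self) -> Path:
        return self.root / "data" / "audio"

    @property
    def conversation_log(self) -> Path:
        return self.root / "data" / "conversations.jsonl"

    def ensure(self) -> None:
        """Create the storage folders if they do not exist yet."""
        self.audio_dir.mkdir(parents=True, exist_ok=True)
        self.conversation_log.parent.mkdir(parents=True, exist_ok=True)

    def new_audio_file(self, stamp: str) -> Path:
        """Return the path for a recording named after a caller-supplied stamp."""
        return self.audio_dir / f"{stamp}.wav"
```

Routing every path through one object means the rest of the loop never builds file paths by hand, so moving the data folder later is a one-line change.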
It is not impressive yet. That's not the goal. The goal is to make the loop work.