May 7, 2026
Blog

Why I Moved Mokey.me from Gemini Vertex to Ollama Cloud (Gemma 4 31B)

In the world of software engineering, the best technical decision isn’t always the one backed by the most marketing budget; it’s the one that actually stays up when your users need it.

For the past few months, I’ve been deep in the trenches building Mokey.me, experimenting with various AI models to find the right “brain” for the project. My journey took an unexpected turn from a Big Tech giant to a more streamlined, open-source stack.

The Gemini honeymoon that ended early

Like many, I started with high hopes for Gemini. I moved from the preview versions to the Vertex AI API, assuming that the “official enterprise solution” would provide the stability and production-ready performance a financial product requires.

The reality, however, was a series of friction points:

  • Inconsistent Latency: The API would perform brilliantly one moment and hit a wall of timeouts the next.
  • Infrastructure “Ghosts”: From random quota issues to unexplained server IP blocks, the overhead of managing the “official” path became a full-time job.
  • The Trust Gap: For a financial tool, accuracy is non-negotiable. I found Gemini’s responses to be hit-or-miss; sometimes I only felt 45% confident in the output. That’s a dangerous margin for production.

The “Pub Talk” epiphany

Interestingly, the pivot happened during a casual drinking session with friends. Between rounds, we talked about self-hosting, API costs, and the rapid evolution of local AI. It hit me: Sometimes, a simpler stack is the superior stack.

I decided to stress-test Ollama Cloud paired with Gemma 4 31B. The difference was night and day:

  • Predictability: The API behavior became boring; in a good way. No surprises, just consistent performance.
  • Confidence: My trust in the model’s output jumped to roughly 98%. It wasn’t just about “intelligence”; it was about the reliability of the integration.

Why Ollama wins for the Indie Developer

As an indie dev, I don’t have time to fight my infrastructure. I need a partner that fits a specific philosophy:

  1. Stability over Benchmarks: I care more about a model that works every time than one that wins a theoretical leaderboard.
  2. Predictable Costs: No opaque enterprise billing; just straightforward, scalable usage.
  3. Simplicity: The developer experience (DX) with Ollama is lean and stays out of my way.

The Road Ahead for Mokey.me

Going forward, Mokey.me is migrating to Ollama Cloud with Gemma 4 31B.

This experience has fundamentally shifted my approach. I’m leaning further into open models and moving away from heavy dependence on closed, “black-box” ecosystems.

Gemini is a powerhouse with massive potential, but model intelligence is only half the battle. If the infrastructure isn’t rock-solid, the intelligence is wasted. Today, open models are no longer just the “budget alternative”; they are the practical, professional choice for developers who value their sanity.

Thanks for reading!