How much time did you spend second-guessing your AI this week?

Apr 29

Better Answers, Less Waste: Working smarter with multiple AI models

Using a single LLM within a tax research platform, or when creating AI agents to help with firm workflows (e.g. Claude Cowork), may sound tempting. One AI model: I’m familiar with it, I trust it. Simple.

In practice, relying on a single model is often the fastest way to build in blind spots, overspend on unnecessary horsepower, and lock yourself into someone else’s roadmap.

At Elfworks, we’ve taken a different approach: we use four models (ChatGPT, Claude, Gemini and Grok) to underpin our Tax Research capability and our AI Agent building process. Not because we like complexity for its own sake, but because multiple models produce better judgment, better outputs, and better commercial outcomes.

Here’s why.

1. Tax Research: Four models. Four biases. One better answer.

Every model has its own strengths, weaknesses, and bias profile. Some are excellent at reasoning but prone to over-explaining. Others are fast and cheap but occasionally shallow. Some are careful on legislation review tasks but weaker on case law analysis. None are perfect.

That’s exactly why Elfworks uses four models.

By sending the same complex tax problem through four different LLMs, we let four different biases bash against each other until a consensus view emerges. It’s a practical way of reducing model-specific error and surfacing a better final answer than any single model produces on its own. When models disagree, our algorithm steps in to adjudicate, the same way a principal reviews conflicting advice before it goes to a client.
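To make the idea concrete, here is a minimal sketch of the "ask several models, then look for agreement" pattern. It is an illustration only, not Elfworks' actual algorithm, which is not described here. Everything in it is an assumption: the name cross_check, the min_agree threshold, the use of a simple majority vote, and the models themselves, which are plain stand-in functions rather than calls to any real vendor API.

```python
from collections import Counter
from typing import Callable, Dict

# Hypothetical stand-in: a "model" is any function from a question to an answer.
ModelFn = Callable[[str], str]


def normalise(answer: str) -> str:
    """Lower-case and collapse whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())


def cross_check(question: str, models: Dict[str, ModelFn], min_agree: int = 3) -> dict:
    """Ask every model the same question and report whether enough of them agree.

    Assumption: agreement is measured by exact match after normalisation,
    which only suits short, categorical answers.
    """
    if not models:
        raise ValueError("at least one model is required")

    # Collect one raw answer per model, keyed by model name.
    answers = {name: ask(question) for name, ask in models.items()}

    # Count how many models gave each (normalised) answer.
    tally = Counter(normalise(a) for a in answers.values())
    top_answer, votes = tally.most_common(1)[0]

    if votes >= min_agree:
        return {"status": "consensus", "answer": top_answer,
                "votes": votes, "answers": answers}

    # No sufficient majority: hand the raw answers to a separate adjudication step.
    return {"status": "needs_adjudication", "answer": None,
            "votes": votes, "answers": answers}
```

With four stub models where three return "Deductible" in varying capitalisation and one returns "Not deductible", cross_check reports a consensus with three votes. A two-against-two split is flagged as needs_adjudication, which is where a reviewing step, human or automated, would take over. Real tax answers are free text, so a production system would need a far richer comparison than exact matching.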

2. AI Agent build: Sledgehammer to crack a nut? No - horses for courses instead

As Elfworks moves into the build of AI agents for the accounting industry, the multi-model ‘horses for courses’ approach is becoming even more valuable. An AI Agent is a collection of skills linked together through an orchestrating LLM or series of LLMs. Not every skill requires the same level of AI firepower. Why throw premium reasoning capacity at a job that is basically extraction, summarisation, or drafting from a fixed template? Sometimes the smartest move is not the smartest model. It’s the right model.

This modular approach lets us match model capability to the task:

• certain models for high-quality reasoning,

• another for structured extraction,

• another for fast and inexpensive drafting,

• another for verification and critique.

That means we can keep quality high without paying premium rates across the board.

You may need a hydraulic press to finally halt the Terminator, yet you don’t need such a machine to crack a walnut.
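The "horses for courses" matching described above can be sketched as a simple routing table. This is a hedged illustration rather than a description of how Elfworks configures its agents: the tier names, the skill labels, and the relative cost figures are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class ModelTier:
    name: str             # hypothetical label, not a real model identifier
    cost_per_call: float  # made-up relative cost units, for comparison only


# Assumed mapping from skill type to the tier that handles it.
ROUTES: Dict[str, ModelTier] = {
    "reasoning": ModelTier("premium-reasoner", 10.0),
    "extraction": ModelTier("structured-extractor", 2.0),
    "drafting": ModelTier("fast-drafter", 1.0),
    "verification": ModelTier("critic", 4.0),
}


def route(skill: str) -> ModelTier:
    """Return the tier configured for a skill, failing loudly if none is set."""
    try:
        return ROUTES[skill]
    except KeyError:
        raise ValueError(f"no model configured for skill {skill!r}") from None


def plan_cost(skills: Iterable[str]) -> float:
    """Total relative cost of running each skill on its routed tier."""
    return sum(route(skill).cost_per_call for skill in skills)
```

Under these invented numbers, an agent made of extraction, drafting and verification steps costs 7.0 units when each step goes to its matched tier, against 30.0 units if all three ran on the premium tier. The actual saving depends entirely on real pricing and on how a firm's tasks split across skills.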

3. AI model agnostic: we care about performance, not brand loyalty

One of the biggest risks at this juncture of the AI journey is getting too attached to a single model.

Building around it. Assuming its roadmap is your roadmap. Waiting for its future improvements as if those improvements are guaranteed to solve everything.

We don’t do that.

Elfworks is AI model agnostic. We are not locked into one model and its future upgrades. We don’t build around a single vendor’s promises. We focus on which four models produce the best output for the task at hand, right now.

That gives us freedom:

• freedom to pick the best-performing model,

• freedom to avoid being overexposed to one vendor,

• freedom to respond quickly as the market changes,

• freedom to keep improving without rewriting the whole system.

That flexibility is one of the reasons Elfworks is moving faster and more intelligently than competitors that are tied to a single AI model.
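One common way to stay model agnostic in code is to have the rest of the system depend on a small interface rather than on any one vendor's client. The sketch below shows that general pattern under stated assumptions: TextModel, StubModel and ModelRegistry are hypothetical names, and the stub simply echoes its prompt instead of calling a real model. It is not a description of Elfworks' architecture.

```python
from typing import Dict, Protocol


class TextModel(Protocol):
    """Minimal interface the rest of the system depends on (assumed shape)."""

    def complete(self, prompt: str) -> str: ...


class StubModel:
    """Stand-in for a vendor adapter; a real one would wrap that vendor's client."""

    def __init__(self, label: str) -> None:
        self.label = label

    def complete(self, prompt: str) -> str:
        # Echo the prompt with a label so swaps are visible in the output.
        return f"[{self.label}] {prompt}"


class ModelRegistry:
    """Maps a role (e.g. 'drafting') to whichever model currently fills it."""

    def __init__(self) -> None:
        self._by_role: Dict[str, TextModel] = {}

    def assign(self, role: str, model: TextModel) -> None:
        # Re-assigning a role replaces the model without touching any caller.
        self._by_role[role] = model

    def run(self, role: str, prompt: str) -> str:
        if role not in self._by_role:
            raise KeyError(f"no model assigned to role {role!r}")
        return self._by_role[role].complete(prompt)
```

Code that calls registry.run("drafting", ...) does not change when the drafting role is reassigned from one adapter to another. Only the single assign call changes, which is the sense in which a system can improve without being rewritten.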

Elfworks is purpose-built for accounting firms that want AI outputs they can actually stand behind. If you'd like to see how the multi-model approach works in practice, contact us at info@elfworks.ai or visit www.elfworks.ai to start your trial today.
