Ai Dangers for Accountants #1 – Hallucinations
- James Mcphedran
- Jun 4
- 5 min read
Updated: Jun 9
Have you ever made up a fictional reference (case, legislation, ATO Ruling etc.) when advising a client on an Australian tax matter? Of course you haven’t; why on earth would you do that? However, any practitioner who has engaged with Ai in an advisory or research capacity has likely come across the disturbing tendency of Ai models to simply make things up. The scariest part is that the false reference often sounds and looks credible, even carrying an element of familiarity, an echo of a case or ruling that seemed on point, half-remembered from a hazy CPD session long ago…
Welcome to the weird world of Ai hallucinations, a phenomenon that becomes more frequent as the task set for the Ai model increases in complexity; bad news for accountants trying to throw the capabilities of Ai at the Gordian knot that is Australian taxation and business advisory work.
What is an Ai hallucination?
In simple terms, an Ai hallucination is the Ai model making something up. For a good explanation as to why Ai models hallucinate, please click here.
Ai hallucinations do have a random aspect; sometimes a seemingly easy query gives rise to a hallucination while a more complex query does not. There is also great variance between types of hallucination. The Ai output (the name of a case, Ruling or legislative section) may be completely made up, along with its supposed content. Alternatively, the name of the case, Ruling or legislation may be spot on, yet its supposed content is either completely made up or is real but belongs to a different source than the one named. It all adds up to you needing to check everything, always.
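To give a sense of what ‘check everything, always’ can look like when it is automated, here is a minimal, hypothetical sketch in Python. The known_rulings set and the cited references are invented placeholders standing in for a real authoritative source such as the ATO legal database; the point is simply that every reference an Ai model produces should be verified against a trusted source before it reaches a client.
```python
# Minimal sketch: verify Ai-cited references against an authoritative list.
# "known_rulings" is a tiny placeholder standing in for a real source such as
# the ATO legal database; the cited references below are invented examples.

known_rulings = {
    "EXAMPLE RULING 2002/5",
    "EXAMPLE RULING 2006/10",
}

def check_references(cited_references: list[str]) -> dict[str, bool]:
    """Return, for each cited reference, whether it appears in the known set."""
    return {ref: ref in known_rulings for ref in cited_references}

if __name__ == "__main__":
    ai_citations = ["EXAMPLE RULING 2002/5", "EXAMPLE RULING 2099/1"]  # second one is fictitious
    for ref, exists in check_references(ai_citations).items():
        status = "found in database" if exists else "NOT FOUND - verify manually"
        print(f"{ref}: {status}")
```
Of course, a name check alone is not enough: as noted above, a correct citation can still carry an invented summary, so the content the Ai attributes to the reference needs to be compared with the actual text as well.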
Despite their various strengths and clear improvements across version releases, all major Ai models suffer from hallucinations, and our testing of the six most prominent Ai models in the market shows hallucinations become more prevalent under the following query conditions:
- The Ai model (through the prompt) is asked multiple questions in relation to a complex scenario;
- The answers to the questions rely upon an understanding of specific, technical knowledge areas; and
- The prompt requests references for the output.
A multi-tiered question on Australian taxation and business structuring, with a request for references, is very likely to produce one or more hallucinations. Let’s look at an example.

Example of an Ai hallucination
Consider this actual query run through an Ai model:
Client is selling their business to a third party. The Business consists of the franchise rights to the business (cost $20,000), a store fit-out (cost $230,000 and written down value of $135,000), plant and equipment (cost of $215,000 and written down value of $85,000) and some consumables. The sale price all up will be $1.2m. All the assets are held in the family trust and will be sold from the family trust.
1) What are the CGT implications of this transaction?
2) Will GST be payable on the business sale?
3) Will duty be payable on the sale? The sale subject to the laws of NSW
4) How should the overall sale proceeds be split across the assets sold?
Please provide references with your answers.
As you can see, we are asking:
1) Multiple questions
2) For references
3) For answers that rely upon knowledge of technical areas.
The Ai models produced plenty of references in their answers, and a number of those references were hallucinations.
The point is, when left to their own devices, Ai models cannot be trusted to answer complex Australian tax and business services advisory queries without error.
How to fix Ai model hallucinations on complex queries
Thankfully, at Elfworks we’ve been grappling with this problem for a couple of years, and we now have a platform that harnesses the capabilities of Ai models while ‘guard-railing’ their weaknesses, eliminating the hallucinations that would otherwise appear in their unfettered output.
The process involves the following steps (a simplified sketch of the idea appears after the list):
- Building a special purpose platform
- Designing the platform to ‘chunk-down’ the initial query into research and analysis steps
- Building content-specific databases for the relevant field of knowledge, e.g. the ATO legal database
- Designing constant checks between the content-specific databases and the Ai content research output
- Having the Ai models research and summarize each document for relevance before the main analysis step
- Developing a ‘multi-model validation’ step, where each Ai model reviews the work of the others.
Thankfully this still only takes seconds (Ai models are truly amazing).
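For the technically curious, here is a highly simplified, hypothetical sketch of the general pattern described above. It is not Elfworks code; the function names, database entries and matching logic are invented for illustration. The structural idea is to break the query into steps, ground each step in a content-specific database, and only keep references that survive the checks.
```python
# Hypothetical sketch of a 'guard-railed' query pipeline. Not the actual
# Elfworks implementation: the functions below are placeholders showing where
# real Ai calls and database lookups would sit.

from dataclasses import dataclass

@dataclass
class Finding:
    sub_question: str
    answer: str
    references: list[str]

def chunk_query(query: str) -> list[str]:
    """Break a multi-part query into individual research questions (placeholder)."""
    return [part.strip() + "?" for part in query.split("?") if part.strip()]

def research_step(sub_question: str, database: dict[str, str]) -> Finding:
    """Placeholder for an Ai research call constrained to a content-specific database."""
    words = sub_question.lower().split()
    references = [ref for ref, text in database.items()
                  if any(word in text.lower() for word in words)]
    return Finding(sub_question, answer="(model-drafted answer)", references=references)

def validate(finding: Finding, database: dict[str, str]) -> Finding:
    """The guard-rail: keep only references that actually exist in the database.
    A real pipeline would also have each Ai model review the others' work."""
    finding.references = [r for r in finding.references if r in database]
    return finding

if __name__ == "__main__":
    # Invented database entries standing in for, e.g., an ATO legal database.
    ato_database = {
        "EXAMPLE RULING 1": "capital gains tax on the sale of business assets",
        "EXAMPLE RULING 2": "gst on the supply of a going concern",
    }
    query = "What are the CGT implications? Will GST be payable on the business sale?"
    for sub_q in chunk_query(query):
        print(validate(research_step(sub_q, ato_database), ato_database))
```
The real work obviously sits in the steps these placeholders stand in for, but the structural point survives the simplification: references are checked against a curated database rather than taken on trust, and nothing unverified makes it into the final output.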
If you would like a full trial of an Ai platform designed to reduce hallucinations in Australian taxation and business services query output, please contact us at info@elfworks.ai or phone me on 0418 902 440.
Thank you for reading. Next blog I’ll take a look at Ai Dangers for Accountants #2 – Shadow Ai.
Please click here for Elfworks pricing information.



