A Guide to AI Language Models: Beyond ChatGPT
- Ian Youngman
- May 26
- 4 min read
You’ve clearly been living under a rock if you haven’t heard of ChatGPT, the AI chatbot that can answer questions, draft emails, or even assist with data analysis. But ChatGPT isn’t the only AI language model driving the transformation we’re currently experiencing. Other large language models (LLMs) like Claude, Gemini, Grok, and LLaMA are also competing for market share, fuelling an unprecedented rate of progress and investment. This blog introduces these models, their backgrounds, potential biases, and their “personalities” to help you understand what’s coming. Yes, that’s right—personalities!
1. ChatGPT: Smart but Rigid
Background: Developed by OpenAI, ChatGPT burst onto the scene in November 2022, built on the GPT architecture. OpenAI, co-founded by Elon Musk, Sam Altman, and others, focuses on advancing AI research. Originally a not-for-profit devoted to ethical AI, OpenAI has shifted significantly towards commercialisation under Sam Altman’s leadership. ChatGPT is widely used for its versatility, handling everything from drafting emails to generating Excel formulas.
Bias: ChatGPT aims for neutrality but can reflect biases from its training data, which includes vast internet sources—it’s essentially read the internet twice over. It tends to lean towards mainstream or Western-centric perspectives and often aligns with the views of regulatory bodies like the ATO.
Personality: Smart but rigid. ChatGPT delivers clear, concise answers but often sticks to a formal, textbook-like tone. When we asked Grok about ChatGPT, it said, “It’s like a reliable colleague who always follows the rulebook but might not spark creative insights without specific prompting.” Our take? It’s a bit like the office know-it-all who’s right most of the time but will stubbornly defend its perspective, even when wrong, rather than adapt to new information. It tends to provide incorrect answers instead of admitting it doesn’t know.

2. Claude: Ethical but Safe
Background: Created by Anthropic, a company founded in 2021 by ex-OpenAI researchers concerned about OpenAI’s increasingly commercial direction, Claude (released in 2023) emphasises safety and human values. It’s designed to be helpful, safe, and aligned with ethical considerations, competing directly with ChatGPT.
Bias: Claude strives for impartiality but may exhibit a cautious, almost overly polite tone due to its safety-first design. This can make it less opinionated and slightly vague on controversial topics like financial regulations or labour laws.
Personality: Grok describes Claude as “a patient mentor who listens carefully and avoids risky advice.” Our experience aligns—Claude is less rigid than ChatGPT but can be overly diplomatic, making it a good counterbalance to ChatGPT, though not always ideal as a standalone tool. Claude has a strong reputation for excelling at coding, likely because coding is a domain largely free of the ethical grey areas that trigger its cautious streak.
3. Gemini: Solid All-Rounder
Background: Developed by Google, Gemini (launched in 2023) is a family of models designed for multimodal tasks—text, images, and more. Google’s breadth as a company gives Gemini an unusually broad training data set and an equally broad range of capabilities.
Bias: Gemini may reflect biases within Google’s ecosystem, prioritising integration with its own tools. It can also lean towards sanitised responses to align with Google’s corporate tone, influenced by platforms like YouTube, which form part of its training data.
Personality: Grok says, “Adaptable but corporate. Gemini is like a multitasking junior accountant who’s great at juggling tasks but sticks to the company line. It’s less rigid than ChatGPT but may lack the warmth of Claude.” That’s a fair assessment. We find Gemini excels at drafting written content and is refreshingly honest when unsure, saying so instead of providing an incorrect answer.
4. Grok: The Contrarian
Background: Created by xAI, Grok (launched in late 2023) aims to accelerate human scientific discovery and provide “truth-seeking” answers. It’s designed to offer an outside perspective on humanity, making its approach to reasoning unique.
Bias: Grok strives for objectivity, often challenging mainstream narratives. However, its truth-seeking mission can lead to unconventional takes that may feel speculative to accountants seeking straightforward answers on compliance or auditing standards. Just as Gemini draws on Google’s ecosystem, Grok’s training is influenced by the X platform, known for its less censored content.
Personality: Grok describes itself as “curious and witty, like a sceptical but insightful consultant who asks ‘why’ and offers fresh perspectives. It’s less formal than ChatGPT and more daring than Claude, with a dash of humour.” Grok may have a high opinion of itself, and it’s not entirely wrong. It’s less filtered than other models and challenges conventional thinking, which is great. Our criticism? It can sometimes overcomplicate straightforward answers.
Which LLM Should You Choose?
If you’re looking for a generalist LLM, any of these four will do a solid job, and your choice will likely come down to personal preference. Their capabilities are advancing at an astonishing rate—by some measures, roughly doubling every six months. That said, if you need high accuracy, relying on just one LLM isn’t enough. Despite their growing intelligence, they can still hallucinate and carry biases. That’s why we use all four in our Elfworks.ai workflow to significantly boost accuracy and trust.
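To make the cross-checking idea concrete, here is a minimal Python sketch. The ask_* functions are hypothetical placeholders standing in for real model calls, and this is not the actual Elfworks.ai workflow—the point is simply that agreement between models is a useful signal, and disagreement is a prompt to double-check.

```python
# Minimal sketch: ask the same question of several models and flag disagreement.
# The ask_* functions are hypothetical stand-ins for real API calls.
from collections import Counter

def ask_chatgpt(question: str) -> str:  # placeholder response
    return "Answer A"

def ask_claude(question: str) -> str:  # placeholder response
    return "Answer A"

def ask_gemini(question: str) -> str:  # placeholder response
    return "Answer B"

def ask_grok(question: str) -> str:  # placeholder response
    return "Answer A"

def cross_check(question: str) -> dict:
    """Query every model, then report the majority answer and any dissenters."""
    answers = {
        "ChatGPT": ask_chatgpt(question),
        "Claude": ask_claude(question),
        "Gemini": ask_gemini(question),
        "Grok": ask_grok(question),
    }
    tally = Counter(answers.values())
    consensus, votes = tally.most_common(1)[0]
    return {
        "consensus": consensus,
        "agreement": f"{votes}/{len(answers)}",
        "dissenters": [model for model, answer in answers.items() if answer != consensus],
    }

if __name__ == "__main__":
    print(cross_check("Is this expense deductible?"))
    # e.g. {'consensus': 'Answer A', 'agreement': '3/4', 'dissenters': ['Gemini']}
```

In practice, a low agreement score is the cue to dig deeper or consult a human expert rather than accept any single model’s answer.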



