In my last post, I committed to providing an overview of some of the currently available AI models and exploring some common use cases. Here goes.
They’re Not All the Same – And That Matters
As a standalone tool, artificial intelligence is most commonly used as a smart assistant across all kinds of knowledge-worker roles, from clerical functions to senior management decision-making. In its simplest form, many people have begun replacing their search tools with these AI engines to get more immediate and direct answers to their daily queries. This use case alone is a tremendous time-saver, but there are caveats.
At the time of writing, this is a short list of some of the more popular large language models (LLMs) -
OpenAI – ChatGPT 5.1 (proprietary)
Google – Gemini 3 (proprietary)
Anthropic – Claude 4.5 (proprietary)
xAI – Grok 4.1 (proprietary)
Each model typically offers a standard version and a deep-research or ‘thinking’ version. The standard version is adequate for quick responses, while the deep-research version uses far more compute, taking the time to research more thoroughly and apply some critical thinking before presenting its results. In addition, most of these models now include multi-modal capabilities – text, images, audio and video – without the need to invoke a separate tool. All offer free and paid tiers, which vary in the features, usage volumes and model versions available.
If you are wondering why I didn’t include Microsoft Copilot or Perplexity in the list above, it’s because they are not LLMs themselves. Copilot is an application embedded in Microsoft’s Edge browser and Office tools; it relies on Microsoft’s licensing agreement with OpenAI for access to different versions of ChatGPT. Perplexity is what I would call an aggregator for other LLMs: through the Perplexity chat interface, you are accessing a number of the models listed above, as well as other models (like Llama) and live web searches, to produce a consolidated result.
There are no limits to the possibilities, but even with the simplest queries I have found significant discrepancies and errors in the responses across the various models. If you’ve been following the news, there is no shortage of stories about people taking the results of their AI queries as fact and repeating them in published articles without verifying them first. In the best case, this may only cause some embarrassment. In the worst case, it can cost them their job and their reputation, and expose them to legal consequences. Risk mitigation, especially in business, has to be a consideration in any AI implementation, so any process that relies on outputs from an AI engine must include a human-in-the-loop (HITL). This should be considered standard practice.
Practical Use Cases That Deliver Results
Unless you are a true start-up, able to begin fresh and unconstrained by existing systems and people, your approach to AI has to work within the context of your current business. That means finding areas where you can limit risk while still injecting tangible, measurable value. Your goal should be to introduce AI in non-threatening ways, applied to use cases where the returns can be easily seen and calculated.
Some typical use cases for AI-driven assistance include -
Customer service automation using AI-driven chat and agent-assist tools. Done right, this will reduce response times, improve the accuracy of the responses provided and lift customer satisfaction scores.
Marketing and sales enablement. AI can be used to reduce campaign development time, assist with content creation, including visuals, and drive consistency with company messaging.
Supply chain analysis to reduce costs, consolidate suppliers, find anomalies, prevent errors and assist with auditing.
Cybersecurity. To you, it may be an embedded and invisible part of your IT system. In today’s threat landscape, however, you need to be sure that your cybersecurity systems include AI components that constantly monitor for vulnerabilities and prevent breaches.
These examples are not hypothetical. A quick search can provide you with multiple case studies of how AI has been used to improve accuracy, productivity and/or decision-making in these areas.
But there is a deeper point that requires consideration. These examples also highlight your dependency on application vendors and managed service providers. You need to understand what they are doing to integrate AI into their product and service offerings. It needs to be more than just bolting on some superficial AI with limited usefulness. You cannot afford to ignore this and should pay special attention to how they are leveraging one or more of the foundational LLMs and their plans to keep up with the staggering pace of innovation.
Running a Pilot – One Change, One Metric, 90 Days
To illustrate the practical nature of using these tools, and to understand why cross-model validation can be critical, let me share a personal example.
The Resume Test – When Models Disagree
I recently received a resume from somebody asking for my input. In addition to providing my own thoughts, I decided to see what the different AI tools would say about it, so I fed the same job description and prompt to four of them. Without going into the details of the position being applied for or the candidate’s actual history, here’s a brief summary of the answers I received. The variances aren’t an anomaly; they validate my position that no single model is perfect and that HITL is a non-negotiable part of the process.
Query – “Please play the role of an automated screening system and assess whether the attached resume would make it through to the first-line recruitment humans.”
Summarized Responses
Gemini – “SCREENING VERDICT: PASS .. The resume has successfully passed the initial automated screening and is recommended for review by the first-line Recruitment Specialist.”
ChatGPT – “Overall ATS Compatibility Score: 73 / 100 - Shows adaptability, digital system adoption, and measurable results. Uses concise bullet format; ATS readable. Would likely pass to a first human screen.”
Perplexity – “would likely pass an automated screening, particularly at the first-line review stage. .. checks the boxes .. both direct and transferable skills, measurable outcomes, and tech proficiency.”
Claude – “Pass Rate: 25-35% - VERDICT: LIKELY FILTERED OUT .. Timeline inconsistencies .. Format that many ATS cannot parse correctly .. Missing current (employment) status .. Insufficient exact keyword matches.”
As you can see, three engines gave it a pass and the fourth failed it. Beyond the excerpts above, each response provided details and suggestions for improvement, though each set was different. Ultimately, it is up to the human to decide what’s best and to make whatever adjustments seem most sensible and most likely to produce a better outcome.
Cross Model Validation as Standard Practice
You really should try this for yourself. Pick a document – perhaps a presentation you are working on – and ask multiple AI models to critique it. Start your query by providing the context, your audience and the outcomes you are looking for. You should also set guardrails for what you are asking it to do: for example, ask how you might make the piece more compelling, easier to read or more humorous. If you prefer the deep-research route, ask it to check your work against publicly available papers and presentations that could support your position, and to provide links you can embed as references. To further your game-time preparation, ask it to generate a series of questions your presentation might provoke and to help you draft possible responses. Doing this across multiple engines will produce a wide range of outputs, letting you choose those that are most relevant and make the most sense to you. In the end, you’ll have a stronger, well-informed position and a call-to-action that is more likely to achieve better results.
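For readers comfortable with a little scripting, the fan-out-and-compare pattern behind cross-model validation can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the model callables here are stubbed placeholders, and in practice each would wrap a real vendor API (OpenAI, Anthropic, Google and so on, via their own SDKs and keys). The logic it demonstrates – send one prompt everywhere, tally the verdicts, and surface the suggestions that more than one model agrees on – is the part that matters.

```python
# Cross-model validation sketch: send the same prompt to several models,
# tally their verdicts, and flag suggestions raised by more than one model.
# The model callables below are hypothetical stand-ins for real API calls.
from collections import Counter

def cross_model_check(prompt, models):
    """models: dict mapping a model name to a callable that returns
    (verdict, suggestions) for the given prompt."""
    results = {name: ask(prompt) for name, ask in models.items()}
    # Count how many models said pass vs fail.
    verdicts = Counter(verdict for verdict, _ in results.values())
    # Suggestions repeated across models deserve the most attention.
    all_suggestions = [s for _, suggestions in results.values() for s in suggestions]
    repeated = [s for s, n in Counter(all_suggestions).items() if n > 1]
    return verdicts, repeated

# Stubbed example: in practice each lambda would wrap a vendor API call.
models = {
    "model_a": lambda p: ("pass", ["add keywords", "fix dates"]),
    "model_b": lambda p: ("pass", ["add keywords"]),
    "model_c": lambda p: ("fail", ["fix dates", "simplify format"]),
}
verdicts, repeated = cross_model_check("Assess the attached resume ...", models)
print(verdicts)          # Counter({'pass': 2, 'fail': 1})
print(sorted(repeated))  # ['add keywords', 'fix dates']
```

Note that the final judgment stays with the human: the script only organizes the disagreement so you can see where the models converge and where they diverge.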
In my next instalment of this blog post series, I’ll walk through a design-led readiness process that will provide a practical framework for moving forward. After that, we’ll explore agentic AI and the use of APIs – tools that can help you move towards task automation.
I hope you’ve enjoyed this series so far, and I invite you to provide your feedback. I’m always looking for ways to improve. Thanks for reading.
© 2026 by Roy Gowler. All rights reserved.
This article was originally published in November 2025 and posted on Medium.com. As its author, I have updated it and posted it to my own website to increase visibility and reach.