Our Testing Philosophy
AI Herald evaluates AI models, tools, and platforms based on real-world use — not marketing claims. Every model we review is tested hands-on by our editorial team using actual production tasks: writing code, analysing documents, answering complex questions, and comparing outputs side by side.
We are editorially independent. We have no affiliate relationships with AI labs and receive no payment for coverage. When we say a model is better or worse, it reflects our genuine testing results.
How We Evaluate LLMs
When comparing large language models, we test across five dimensions:
- Coding accuracy — identical coding prompts run across every model, comparing output quality, error rates, and whether the code actually runs
- Instruction following — complex multi-step instructions with specific constraints to test how well a model follows directions
- Reasoning — mathematical problems, logical puzzles, and multi-step analysis tasks
- Writing quality — long-form content, tone consistency, and avoiding AI clichés
- Speed and cost — real API latency measurements and cost calculations for standard workloads (see the sketch after this list)
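To make the speed-and-cost dimension concrete, here is a minimal sketch of the kind of harness this implies: send the same prompt to each model, time the round trip, and estimate cost from token counts. This is illustrative, not our actual tooling — the `call_model` stub, the model names, and the per-million-token prices are all placeholders, and real runs would use each provider's SDK and the token counts it reports.

```python
import random
import time

# Placeholder per-million-token prices (input, output) in USD.
# Illustrative only; check each provider's pricing page for current rates.
PRICES = {
    "model-a": (3.00, 15.00),
    "model-b": (0.25, 1.25),
}

def call_model(model: str, prompt: str) -> tuple[str, int, int]:
    """Stand-in for a real SDK call. A production harness would return
    the completion text plus the provider-reported token counts."""
    time.sleep(random.uniform(0.5, 2.0))  # simulate network latency
    return "stub completion", len(prompt) // 4, 200

def benchmark(models: list[str], prompt: str) -> None:
    for model in models:
        start = time.perf_counter()
        _, in_tok, out_tok = call_model(model, prompt)
        latency = time.perf_counter() - start
        in_price, out_price = PRICES[model]
        cost = (in_tok * in_price + out_tok * out_price) / 1_000_000
        print(f"{model}: {latency:.2f}s  ${cost:.6f}  ({in_tok} in / {out_tok} out)")

if __name__ == "__main__":
    benchmark(list(PRICES), "Write a function that parses ISO 8601 dates.")
```

In practice we repeat each prompt several times and report medians, since single-shot latency numbers are noisy.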
We reference published benchmarks (SWE-bench Verified, MMLU, HumanEval, GPQA Diamond) but always note their limitations and supplement them with our own testing.
Pricing and Specification Accuracy
All pricing and technical specifications are verified against official documentation at the time of publication. We include the verification date on articles covering pricing, as API costs change frequently. If you find outdated information, contact us and we'll update within 24 hours.
AI-Assisted Content
Some articles on AI Herald are produced with AI writing assistance, particularly news roundups and model specification summaries. All AI-assisted content is:
- Reviewed by a human editor before publication
- Fact-checked against primary sources (official documentation, research papers, announcements)
- Labelled with a disclosure note at the bottom of the article
- Never published without human oversight
In-depth model reviews, comparison articles, and tool evaluations involving hands-on testing are written by our editorial team without AI assistance in the drafting process.
Corrections Policy
We correct factual errors promptly. If an article contains incorrect information — wrong pricing, outdated benchmarks, inaccurate release dates — we update the article, add a correction note at the top, and update the publication date. Email editor@artificialintelligenceherald.com with the URL and the specific error.
About the Author
AI Herald was founded by Eric Samuels, a Software Engineering graduate and certified Python Associate Developer with 5+ years of experience building production applications with large language models and AI agents. Eric personally tests every model reviewed on this site.