AI-Based Creativity Tests: Université de Montréal Shows LLMs Outperform Average Humans…

Researchers at Université de Montréal ran a large-scale experiment comparing leading large language models with humans on creativity benchmarks, with the Divergent Association Task (DAT) as its core metric. The study pitted models including ChatGPT, Claude and Gemini against more than 100,000 participants. Each was asked to list 10 words, as unrelated to one another as possible, in four minutes; the semantic distance between the words serves as a measure of divergent linguistic creativity.
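
To make the metric concrete, here is a minimal sketch of DAT-style scoring. It assumes the commonly described recipe: take the first seven valid words, compute the cosine distance between every pair of word embeddings, average those distances, and multiply by 100. The function names and the three-dimensional toy vectors are hypothetical and exist only for illustration. Real scoring relies on pretrained word embeddings, and this is not the study's own code.

```python
from itertools import combinations
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_style_score(words, embeddings, max_words=7):
    """Average pairwise cosine distance (times 100) over the first valid words.

    A word is treated as valid here if it has an embedding and has not
    already been used. The cutoff of seven words is an assumption.
    """
    valid = []
    for word in words:
        key = word.strip().lower()
        if key in embeddings and key not in valid:
            valid.append(key)
        if len(valid) == max_words:
            break
    if len(valid) < 2:
        raise ValueError("need at least two valid words to score")

    pairs = list(combinations(valid, 2))
    total = sum(cosine_distance(embeddings[a], embeddings[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)


# Hypothetical toy embeddings, not real word vectors.
TOY_EMBEDDINGS = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.9, 0.1, 0.0],      # deliberately close to "cat"
    "volcano": [0.0, 1.0, 0.0],
    "algebra": [0.0, 0.0, 1.0],
}

related = dat_style_score(["cat", "dog"], TOY_EMBEDDINGS)
unrelated = dat_style_score(["cat", "volcano", "algebra"], TOY_EMBEDDINGS)
print(f"related words: {related:.2f}, unrelated words: {unrelated:.2f}")
```

In this toy setup, semantically close words such as "cat" and "dog" yield a low score, while mutually unrelated words yield a high one, which is the property the DAT uses as a proxy for divergent thinking.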

The paper reported that LLMs scored above the average human on the DAT, though roughly half of participants still outperformed the models and the top 10% far exceeded AI results. The team also evaluated creative writing (haiku, synopses, short stories) and found that models performed best when given precise human guidance, which points to prompt quality as a factor in how creative the output is.

For consumers and creators, this matters because it frames generative AI as a productivity and ideation tool rather than a replacement. It can elevate average output and accelerate exploration, while top human creators remain unrivaled on several creative measures. The study urges nuanced comparisons and continued human–AI collaboration.

Image Credit: DK Studio21 / Shutterstock

Why This Trend Is Growing

AI-augmented Ideation: Generative models bolstering average creative output present opportunities for platforms that blend human curation with AI-suggested concepts to scale ideation workflows.
Prompt Engineering Premiumization: As model performance becomes highly sensitive to instruction quality, specialist tools and services that craft, refine, and standardize high-impact prompts could redefine creative production value chains.
Benchmarking Creativity Metrics: The emergence of large-scale, model-to-human creativity comparisons highlights demand for reproducible evaluation frameworks and analytics that quantify divergent and narrative originality.

Industries Being Reshaped

Advertising and Marketing: Creative agencies and campaign teams may incorporate LLM-assisted ideation layers to increase concept throughput while maintaining human-led strategic differentiation.
Education and Assessment: Testing and learning providers could leverage AI-evaluated creativity benchmarks to personalize curricula and measure divergent thinking at scale.
Creative Software and Tools: Design and writing platforms integrating advanced prompt templates and model-tuning features might shift how professionals prototype narratives, visuals, and brand content.