ChatGPT Eval: The Most Requested Breakdown Yet
You’ve seen the hot takes. Now here’s the strategic breakdown: what ChatGPT gets right, where it falls short, and what PMs can learn from it.
Last week, I dropped my Threads Eval and my DMs blew up.
I got messages from everyone: PMs, founders, designers, even people I hadn’t heard from in years. Some just dropped a “🔥” emoji, while others sent their own hot takes. Two of my premium subscribers even booked 1:1s to walk through their product ideas and run live Evals with me.
One founder told me, “I’ve been building in stealth for 8 months. That Eval made me realize I’ve been thinking in features, not strategy.” This is the power of a well-done product Eval. It doesn’t just explain a product, it reveals how product thinkers think.
If you’re new here, an Eval is a structured, opinionated teardown of a product. You look under the hood. Dissect UX choices. Speculate on the strategy. You ask:
Why was this built this way?
What’s working?
What’s missing?
What would you do differently and why?
This has become one of the top things I look for when hiring PMs. It’s also how I refine my own product sense in public.
Today, we’re going deeper, not into just any app, but into the one that has gradually, then suddenly, redefined how we all work, write, and brainstorm: ChatGPT.
1. What It Is
ChatGPT is OpenAI’s flagship consumer product: a chat interface built on top of the company’s GPT family of large language models (it launched on GPT-3.5 and later added GPT-4 and its successors). It looks like a messaging app, but it behaves like a research assistant, a writing partner, a search engine, and a creative co-pilot.
It started as a novelty in late 2022. As I write this in mid-2025, it has become a utility and, for many, a daily habit.
2. Who It’s For
Knowledge workers chasing leverage.
Creators drafting, brainstorming, or outlining.
Curious users just playing around.
Developers building with the GPT API or creating custom GPTs.
Enterprise teams adopting AI in productivity workflows.
It’s rare for a product to span that many personas without diluting itself. And yet, ChatGPT mostly pulls it off.
3. What It Gets Right
Frictionless onboarding - There is almost no learning curve and no tutorial. All you need to do is type, and within seconds, you get value.
Tone and personality - It feels helpful and, for the most part, not robotic, with the right level of friendliness. It doesn’t try to “wow” you with AI; it just works.
Progressive complexity - For a casual user with everyday tasks, it is simple to use. Dig deeper, and the tools, memory, and custom GPTs reveal themselves.
Interface - The interface is a blank canvas, which means you decide what kind of product it is: a word processor, a research tool, a tutor, or a thought partner. I believe this flexibility is both a strength and one of its challenges.
4. What Still Feels Off
Lack of UX modes - One text box serves everyone, but your needs differ when you are summarizing a document vs. planning a trip vs. debugging code. The interface doesn’t adapt to context.
Memory UX is confusing - What does it remember about me? When does it forget? Where’s the line between helpful and creepy? Is the personalization even safe?
Trust gap - Where is this answer coming from? Why does it say this? Unless it is searching the web, answers come with little attribution or sourcing, which is a major concern for professionals using it for real work.
No collaboration model - Sharing is limited to sending someone a link to a static copy of a chat. There is no real-time co-chat and no multiplayer “project mode”. For something used at work, it still feels like a solo tool.
5. What They Are Optimizing For (My Take)
Token throughput per session (engagement proxy).
DAU/MAU across free vs. paid tiers.
GPTs built and reused.
Enterprise conversion, especially ChatGPT Team.
Memory activation and stickiness.
OpenAI needs ChatGPT to move from “cool toy” to “default assistant” fast. This is why we are seeing such high shipping velocity: new updates seem to land almost every week.
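To make two of the metrics above concrete, here is a minimal sketch of how a PM might compute DAU/MAU stickiness and tokens per session from a raw event log. Everything in it is an assumption for illustration: the event tuples, the user names, and the function names are made up, and nothing here reflects how OpenAI actually measures engagement.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, day, session_id, tokens_used).
# All values are invented for illustration.
EVENTS = [
    ("ana", date(2025, 6, 1), "s1", 1200),
    ("ana", date(2025, 6, 2), "s2", 800),
    ("ben", date(2025, 6, 1), "s3", 300),
    ("cy", date(2025, 6, 15), "s4", 2500),
    ("cy", date(2025, 6, 15), "s4", 500),
]


def stickiness(events, days_in_period):
    """Average daily active users divided by monthly active users."""
    daily_users = defaultdict(set)
    monthly_users = set()
    for user, day, _session, _tokens in events:
        daily_users[day].add(user)
        monthly_users.add(user)
    if not monthly_users:
        return 0.0
    # Days with no activity count as zero active users.
    avg_dau = sum(len(users) for users in daily_users.values()) / days_in_period
    return avg_dau / len(monthly_users)


def tokens_per_session(events):
    """Mean number of tokens per session (a rough engagement proxy)."""
    per_session = defaultdict(int)
    for _user, _day, session, tokens in events:
        per_session[session] += tokens
    if not per_session:
        return 0.0
    return sum(per_session.values()) / len(per_session)


if __name__ == "__main__":
    print(f"DAU/MAU stickiness: {stickiness(EVENTS, days_in_period=30):.3f}")
    print(f"Tokens per session: {tokens_per_session(EVENTS):.0f}")
```

A stickiness of 1.0 would mean every monthly user shows up every day, so the closer a product gets to that number, the more it behaves like a daily habit rather than an occasional tool.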
6. What I Would Do Differently
Mode-specific UX - Let me opt into “writing mode,” “planning mode,” or “debug mode” with custom UI affordances and toolkits. Let the UX adapt to the need.
Make sessions shareable - Imagine if ChatGPT chats worked like Google Docs, with comments, revision history, and versioning. The future of work is multiplayer. At least for now, this isn’t it.
Explain memory, visually - Give me more control: think of something like a dashboard or control center. Let me inspect what it knows and tweak it. Memory shouldn’t be a mystery box.
Trust-enhancing overlays - Let me toggle sources, confidence levels, or footnotes, especially for research and fact-heavy prompts.
7. Where It Fits Strategically
This is what I think most people miss: ChatGPT isn’t just a product. It’s a distribution engine.
It is how OpenAI:
Collects user feedback to improve the model
Monetizes with predictable SaaS revenue (ChatGPT Plus)
Funnels developers into its GPT Store and API platform
Trains user behavior before LLMs go OS-wide
Think of ChatGPT as the Google Search bar of the LLM era, but smarter, more adaptable, and (eventually) embedded everywhere.
In Summary
For many, ChatGPT was the first real taste of AI. But more than that, it’s a masterclass in behavior change, abstraction design, and platform ambition.
As a product manager, it’s a goldmine to study. Not because it’s flawless, but because it’s defining a new category while juggling user expectations, ethical concerns, and cutting-edge tech.
Final Word
If the Threads Eval opened your mind, this one might stretch it even further.
And yes, we’re making this a series now. I’m already working on the next one (hint: it’s the product you open right after ChatGPT).
Got a product you want me to Eval next? Drop it in the Eval Club (join the conversation here) or leave a comment below.
If you’re building something AI-first and want your own product Eval, my DMs are open, or you can book a 1:1 like a couple of other readers have.
ICYMI: Evals are a key part of building strategic thinking. They’re one of the first ways a PM can start digging into the why behind any product. That’s why I’ve written the articles below to help product builders run better Evals and ask sharper questions that get to the root of real problems.
I wrote this piece as a starting point for running quality Evals. It outlines a 7-point framework PMs can use to evaluate any product. There’s also a bonus case study on the Threads app. Read it here 👇🏾
Ever wondered what really happens behind the button? When you tap “Log in” on Netflix, what’s going on under the hood that makes streaming your movie possible? I broke it down in this product case study. Read it here 👇🏾