ChatGPT Eval: The Most Requested Breakdown Yet

You’ve seen the hot takes. Now here’s the strategic breakdown: what ChatGPT gets right, where it falls short, and what PMs can learn from it.

Jul 18, 2025

Hi 👋! You are reading one of my free articles - I hope it helps you level up your PM skills. Want to go deeper? My premium subscribers get exclusive frameworks and strategies I don’t share anywhere else - content that has helped hundreds of product managers land better roles and make bigger impact.

Ready to accelerate your product career? Subscribe now for premium access and join a community of exceptional PMs gaining an edge every week.
Now, let’s dive into today’s insights…

Last week, I dropped my Threads Eval and my DMs blew up.

I got messages from everyone: PMs, founders, designers, even people I hadn’t heard from in years. Some just dropped a “🔥” emoji, while others sent their own hot takes. Two of my premium subscribers even booked 1:1s to walk through their product ideas and run live Evals with me.

One founder told me, “I’ve been building in stealth for 8 months. That Eval made me realize I’ve been thinking in features, not strategy.” This is the power of a well-done product Eval. It doesn’t just explain a product, it reveals how product thinkers think.

If you’re new here, an Eval is a structured, opinionated teardown of a product. You look under the hood. Dissect UX choices. Speculate on the strategy. You ask:

Why was this built this way?
What’s working?
What’s missing?
What would you do differently and why?

This has become one of the top things I look for when hiring PMs. It’s also how I refine my own product sense in public.

Today, we’re going deeper, not into just any app, but into the one that has gradually, then suddenly, redefined how we all work, write, and brainstorm: ChatGPT.

1. What It Is

ChatGPT is OpenAI’s flagship consumer product: a chat interface built on top of its large language models (GPT-3.5 and GPT-4 at first, newer models since). It looks like a messaging app, but it behaves like a research assistant, a writing partner, a search engine, and a creative co-pilot.

I remember it starting as a novelty in late 2022. As I write this in mid-2025, it has become a utility and, for many, a daily habit.

2. Who It’s For

Knowledge workers chasing leverage.
Creators drafting, brainstorming, or outlining.
Curious users just playing around.
Developers building with the GPT API or creating custom GPTs.
Enterprise teams adopting AI in productivity workflows.

It’s rare for a product to span that many personas without diluting itself. And yet, ChatGPT sort of does.

3. What It Gets Right

Frictionless onboarding - There is almost no learning curve. No tutorial. All you need to do is type, and within seconds, you get value.
Tone and personality - It feels helpful rather than robotic, with the right level of friendliness. It doesn’t try to “wow” you with AI; it just works.
Progressive complexity - For casual users with everyday tasks, it is simple to use. Dig deeper, and the tools, memory, and custom GPTs reveal themselves.
Interface - The interface is a blank canvas, which means you decide what kind of product it is: a word processor, a research tool, a tutor, or a thought partner. I believe this flexibility is both a strength and one of its challenges.

4. What Still Feels Off

Lack of UX modes - One text box serves everyone. But your needs are different when you are summarizing a document vs. planning a trip vs. debugging code. The interface doesn’t adapt to context.
Memory UX is confusing - What does it remember about me? When does it forget? Where’s the line between helpful and creepy? Is it even personalized in a safe way?
Trust gap - Where is this answer coming from? Why does it say this? Answers often arrive without attribution or sourcing, which is a major concern for professionals using it for real work.
No collaboration model - I found there’s no easy way to work through a session together with other people. No real-time co-chat. No “project mode”. For something used at work, it still feels like a solo tool.

5. What They Are Optimizing For (My Take)

Token throughput per session (engagement proxy).
DAU/MAU across free vs. paid tiers.
GPTs built and reused.
Enterprise conversion, especially ChatGPT Team.
Memory activation and stickiness.
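
If you want to make the first two metrics concrete, here is a minimal sketch of how they could be computed. Everything in it is an assumption for illustration: the event rows are invented, the field layout is hypothetical, and I have no visibility into how OpenAI actually defines or measures these numbers.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, session_id, day, tokens_generated).
# These rows are invented for illustration; they are not real ChatGPT data.
events = [
    ("u1", "s1", date(2025, 7, 1), 400),
    ("u1", "s2", date(2025, 7, 2), 600),
    ("u2", "s3", date(2025, 7, 1), 200),
    ("u3", "s4", date(2025, 7, 2), 800),
]


def tokens_per_session(rows):
    """Average tokens generated per session (an engagement proxy)."""
    per_session = defaultdict(int)
    for _user, session, _day, tokens in rows:
        per_session[session] += tokens
    return sum(per_session.values()) / len(per_session)


def stickiness(rows):
    """Average daily active users divided by all users active in the period.

    This is the usual DAU/MAU ratio when the rows cover one month.
    Days with no activity at all are not counted in the average.
    """
    users_by_day = defaultdict(set)
    for user, _session, day, _tokens in rows:
        users_by_day[day].add(user)
    period_users = set().union(*users_by_day.values())
    avg_dau = sum(len(users) for users in users_by_day.values()) / len(users_by_day)
    return avg_dau / len(period_users)


print(tokens_per_session(events))
print(round(stickiness(events), 3))
```

Splitting the same rows by plan (free vs. paid) before calling `stickiness` would give the per-tier view from the list above.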

OpenAI needs ChatGPT to move from “cool toy” to “default assistant” fast. That explains the high shipping velocity: new updates seem to land almost every week.

6. What I Would Do Differently

Mode-specific UX - Let me opt into “writing mode,” “planning mode,” or “debug mode” with custom UI affordances and toolkits. Let the UX adapt to the need.
Make sessions shareable - Imagine if ChatGPT chats worked like Google Docs with comments, revision history, and versioning. The future of work is multiplayer. For now, ChatGPT isn’t.
Explain memory, visually - Give me more control through something like a dashboard or control center. Let me inspect what it knows and tweak it. Memory shouldn’t be a mystery box.
Trust-enhancing overlays - Let me toggle sources, confidence levels, or footnotes, especially for research and fact-heavy prompts.

7. Where It Fits Strategically

This is what I think most people miss: ChatGPT isn’t just a product. It’s a distribution engine.

It is how OpenAI:

Collects user feedback to improve the model
Monetizes with predictable SaaS revenue (ChatGPT Plus)
Funnels developers into its GPT Store and API platform
Trains user behavior before LLMs go OS-wide

Think of ChatGPT as the Google Search bar of the LLM era, but smarter, more adaptable, and (eventually) embedded everywhere.

In Summary

For many, ChatGPT was the first real taste of AI. But more than that, it’s a masterclass in behavior change, abstraction design, and platform ambition.

As a product manager, it’s a goldmine to study. Not because it’s flawless, but because it’s defining a new category while juggling user expectations, ethical concerns, and cutting-edge tech.

Final Word

If the Threads Eval opened your mind, this one might stretch it even further.

And yes, we’re making this a series now. I’m already working on the next one (hint: it’s the product you open right after ChatGPT).

Got a product you want me to Eval next? Drop it in the Eval Club (join the conversation here) or leave a comment below.

If you’re building something AI-first and want your own product Eval, my DMs are open, or you can book a 1:1 like a couple of other readers have.

ICYMI: Evals are a key part of building strategic thinking. They’re one of the first ways a PM can start digging into the why behind any product. That’s why I’ve written the articles below to help product builders run better Evals and ask sharper questions that get to the root of real problems.

I wrote this piece as a starting point for running quality Evals. It outlines a 7-point framework PMs can use to evaluate any product. There’s also a bonus case study on the Threads app. Read it here 👇🏾

Why Every PM Should Master Product Evals (feat. Threads App Case Study)

Salem - Product Buddy

Jul 11

Ever wondered what really happens behind the button? When you tap “Log in” on Netflix, what’s going on under the hood that makes streaming your movie possible? I broke it down in this product case study. Read it here 👇🏾

How Netflix Works- A Product Teardown for Curious People

Salem - Product Buddy

Mar 31
