You’re in a product review, walking through results of a recent experiment.
Someone leans forward, unmutes, and says:
“But is that even a real experiment?”
“Where’s the control group?”
“This isn’t statistically significant. What can we really trust here?”
I’ve been in that meeting. I’ve probably asked those questions. They’re not bad questions exactly (okay, some of them are), but they reflect a deeper confusion.
Experiments aren’t just a process to perform; they’re a tool for reducing uncertainty.
Sometimes you need a “Capital-E” (E)xperiment. But often, something I call a “lowercase-e” (e)xperiment will get you there just as well, and faster.
That’s the language I’ve come to use to describe activities you learn from but that wouldn’t qualify as formal experiments. It’s not perfect, but it helps me separate the form of an experiment from the function of experimentation. In modern software development, I’ve found we talk a lot about experimentation but rarely inspect how we experiment or why.
Somewhere along the way, “experimentation” became a ritual characterized by formal hypotheses, control groups, and scientific theater. That level of rigor has its place, but not every product decision needs to be treated like a peer-reviewed journal submission.
The function of experimentation: learning under uncertainty
I hear A/B tests described as gold-standard truth machines. As if anything less structured is too flimsy and doesn’t “count.”
But in the real world, rigor doesn’t guarantee truth or success. It just guarantees structure. And structure isn’t the same as insight.
The purpose of experimentation is not to check a box. It’s to reduce uncertainty. To discover value. To get smarter.
When we confuse the tool (an experiment) with the goal (learning), we start optimizing for the wrong things.
I think of the spectrum of experimentation roughly like this:
When “Capital-E” (E)xperiments shine
Use (E)xperiments for your high-stakes bets. Think science rather than art: pricing changes, new customer retention strategies, or anything that could meaningfully affect the business at scale. (E)xperiments stand up to objective scrutiny in a way that scrappier “just ship it and see” efforts don’t. You need strong signal with low noise. You’re trying to validate or kill an idea with confidence.
But (E)xperiments come with trade-offs. They take more time, coordination, and investment. They need stable traffic and clean attribution; without those, you risk false precision or end up delaying action while you wait for statistical significance.
Examples of Capital-E (E)xperiments: A/B tests, multivariate tests, time-bound baseline tests, etc.
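To make that time cost concrete, here’s a back-of-the-envelope sketch. The baseline rate, lift, and traffic numbers are invented for illustration; it just shows how quickly the required sample size grows when the effect you’re hunting is small:

```python
# Back-of-the-envelope sample size for a two-arm A/B test.
# All numbers below (baseline rate, lift, traffic) are hypothetical.
from math import ceil, sqrt
from statistics import NormalDist

def visitors_per_arm(baseline, lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per arm to detect an absolute `lift`
    over a `baseline` conversion rate with a two-sided z-test."""
    p1, p2 = baseline, baseline + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96
    z_beta = NormalDist().inv_cdf(power)            # ~0.84
    pooled = (p1 + p2) / 2
    n = ((z_alpha * sqrt(2 * pooled * (1 - pooled))
          + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2) / (p2 - p1) ** 2
    return ceil(n)

n = visitors_per_arm(baseline=0.05, lift=0.005)  # detect a 5.0% -> 5.5% lift
print(n)                                         # roughly 31,000 visitors per arm
print(f"~{n / 1_000:.0f} days at a hypothetical 1,000 visitors per arm per day")
```

At those made-up numbers you’re waiting about a month for a readable answer, which is exactly the coordination and patience cost described above.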
You earn the right to run these types of tests after you’ve eliminated shaky underlying assumptions and explored the edges of the idea. Not before. If you don’t already have an opinion you want to test, you are not ready to run an (E)xperiment.
The power of “lowercase-e” (e)xperiments
These can be everyday sense-making activities. Think having a conversation with a few users, sketching a janky prototype on a literal piece of paper, or doing a hallway test with a colleague. They're about directional learning and momentum towards removing shaky assumptions.
They often don’t look like experiments at all. But that’s kind of the point. You don’t need to be “doing science” to learn something useful.
If you’re trying to answer:
“Is there some signal here?”
“Are we on the right track?”
Then these small (e)xperiments are invaluable. They won’t give you confidence intervals, but they might give you 80% of the learning in 10% of the time.
The real risk: performing science instead of practicing learning
This is where teams get stuck: they run (E)xperiments™ without learning. A/B tests are launched with fuzzy hypotheses. Success is measured with metrics that don’t map to value. And when the experiment ends…nothing changes. Or statistical significance is declared, but the analysis never accounted for the other tests running at the same time. Once the change ships, the expected impact…never materializes.
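Here’s a toy illustration of that last failure mode, with every number invented: change A does nothing at all, but a second test B is rolled out mostly to the same users as A’s treatment group. Read naively, A looks like a statistically significant win:

```python
# Toy simulation of the "other ongoing tests" trap. All numbers are invented.
# Change A has zero real effect. Change B does have an effect, and its rollout
# happens to overlap heavily with A's treatment group.
import random
from math import sqrt
from statistics import NormalDist

random.seed(7)
N = 20_000
base_rate, b_lift = 0.05, 0.02                     # hypothetical baseline and B's real lift
arms = {"control": [0, 0], "treatment": [0, 0]}    # A's arm -> [conversions, visitors]

for _ in range(N):
    arm_a = random.choice(["control", "treatment"])
    # B's rollout is correlated with A's assignment: 80% overlap with A's
    # treatment group, only 20% with A's control group.
    sees_b = random.random() < (0.8 if arm_a == "treatment" else 0.2)
    rate = base_rate + (b_lift if sees_b else 0.0)  # A itself adds nothing
    arms[arm_a][0] += random.random() < rate
    arms[arm_a][1] += 1

(c_conv, c_n), (t_conv, t_n) = arms["control"], arms["treatment"]
lift = t_conv / t_n - c_conv / c_n
pooled = (c_conv + t_conv) / (c_n + t_n)
z = lift / sqrt(pooled * (1 - pooled) * (1 / c_n + 1 / t_n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
# Typically prints a "significant" p-value, even though every bit of the
# measured lift belongs to test B, not to A.
print(f"A's apparent lift: {lift:.2%}, p = {p_value:.4f}")
```

None of that lift is A’s doing; remove the overlap with B and the “win” disappears. A significant result is not the same thing as learning.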
The point is, don’t confuse ceremony with substance. A clean room doesn’t mean you’re learning the right thing.
This is why I agree with Ash Maurya, who says “Running experiments is not the most important thing scientists do. Explaining what they observe is.”
And that’s the thing, real learning often happens after the test ends.
How to choose the right tool
When you’re deciding how to approach a problem, start with the uncertainty before selecting the method.
Ask yourself:
What decision am I trying to make?
How much uncertainty exists?
What’s the cost of being wrong?
How fast do I need a signal?
If you need direction, start small. If you need confidence, scale up. Instead of striving for purity, strive for usefulness.
This matches what Itamar Gilad points out: “Experimentation is a learning engine, not a performance review.”
When rigor becomes a roadblock
(E)xperiments sound great in theory. But in practice? They’re fragile. They require clean starting conditions to work well.
Messy attribution? Leaky funnel? Flawed measurement? You might end up trusting noise dressed up as certainty, or running a test you’ll never implement. Or worse, setting up the perfect experiment takes so long that momentum dies before the learning even begins.
That’s the irony: chasing rigor for rigor’s sake often slows down the very thing experimentation is meant to accelerate: learning.
Think through the tradeoffs of the tool you select
Every experiment is a tradeoff between confidence and speed, thoroughness and momentum, the desire to be right and the need to move.
The simplest way I’ve found to frame it: (E)xperiments favor confidence over speed, while (e)xperiments prioritize speed over certainty. Neither is better. But both come at a cost. So ask:
Is confidence more important than speed right now?
Will the precision of the answer actually change the outcome?
Are we in explore mode or refine mode?
The best teams don’t chase rigor, they chase relevance. They know exactly what kind of learning they need right now.
Experiments are tools, not identities
There’s a subtle pressure to look rigorous. To be seen as someone who “does things right.” But experimentation isn’t a badge of craft. It’s a tool. And tools only work if you use the right one for the job.
Next time someone says:
“But that’s not a real experiment,”
Try reframing:
“It might not be rigorous, but it helped us learn something we didn’t know yesterday.”
“It’s not conclusive but it’s enough to take the next step.”
Sometimes, that’s all you need.
The real goal: reduce uncertainty, not prove a point
If you’re chasing whispers of demand or validating a hunch, you probably don’t need p-values. You need direction. If something seems promising, then yes: tighten the variables. Run a cleaner test. Break out your lab coat.
But not before.
The best teams are the ones learning fastest, with the least effort, and acting on what they find.
Final thought: constant learning rarely looks like science
Great product teams don’t just run (E)xperiments. They think experimentally. They build feedback loops into everything they do. They care less about looking scientific, and more about getting smarter, faster.
And that mindset really compounds over time.
You don’t need a lab coat to reduce uncertainty. You need curiosity, context, and the humility to say: we don’t know…yet.