Picture this: You’re a program manager at Microsoft in 2012. Your inbox is flooded with hundreds of feature requests from engineers across the company. Most are complex, resource-intensive ideas that promise revolutionary changes to how Bing displays search results.
Then there’s that email—a simple suggestion about changing how ad headlines appear. A few days of coding work, maybe less. You glance at it, shrug, and mark it “low priority.” After all, how much impact could tweaking a headline format possibly have?
Six months later, that dismissed idea would generate over $100 million in annual revenue.
The Signal
Here’s what actually happened: A Microsoft employee working on Bing proposed a seemingly minor change to how the search engine displayed ad headlines. The idea required minimal effort—just a few days of engineering time—but it was buried among hundreds of other proposals. Program managers deemed it insignificant and let it languish.
The breakthrough came when an engineer, recognizing the low development cost, decided to test the idea anyway. He launched a simple A/B experiment to measure its impact. Within hours, the modified headline format was generating revenue at such unusually high rates that it activated the system’s automated “too good to be true” warning.
The results were staggering: a 12% increase in revenue, translating to more than $100 million annually in the United States alone.
But here’s the twist that should keep every business leader awake at night: this turned out to be the best revenue-generating idea in Bing’s history, yet until the experiment ran, its value went completely unrecognized by the very experts responsible for spotting promising concepts.
The Humbling Truth About Expert Intuition
This story reveals something uncomfortable about business decision-making: even experts consistently misjudge which ideas will succeed.
Leading search engines and digital platforms report that only a small fraction of experiments, typically 10-20%, actually improve key metrics. Across major technology companies more broadly, roughly one-third of tested ideas prove beneficial, one-third are neutral, and one-third make things worse. Either way, the large majority of seemingly promising concepts fail to deliver measurable improvements.
The Bing headline story illustrates why the traditional approach of having program managers rank ideas by potential impact is fundamentally flawed. The cognitive bias toward complex, resource-intensive projects blinds us to the transformative power of simple changes. Meanwhile, the ideas we’re certain will revolutionize our business—like Bing’s $25 million-plus integration with social media—often produce negligible results.
John Wanamaker’s famous marketing insight applies perfectly here: “Half the money I spend on advertising is wasted; the trouble is that I don’t know which half.” The same principle governs innovation: most of our ideas fail, and even seasoned experts can’t predict which ones will succeed.
The Microsoft Experimentation Machine
What separates companies like Microsoft from their competitors isn’t the quality of their initial ideas—it’s their ability to test everything quickly and cheaply. Microsoft and similar technology leaders now run over 10,000 online controlled experiments each year, with many tests involving millions of users.
This “experiment with everything” methodology has enabled Bing to uncover dozens of revenue-boosting changes monthly, with combined effects that increase revenue per search by 10-25% each year. These improvements, alongside hundreds of user experience enhancements, transformed Bing from capturing 8% of U.S. desktop searches in 2009 to commanding 23% market share as a profitable search engine.
The key insight: breakthrough discoveries can emerge not from betting big on a few supposedly brilliant ideas, but from testing lots of small ones. Consider another Microsoft example: opening Hotmail links in a new tab instead of the same window. This change, just a few lines of code, boosted user engagement by 8.9% and became one of Microsoft’s most effective user retention techniques.
The Compound Effect of Small Wins
The real power of systematic experimentation lies in accumulation. While each test might seem insignificant, the compound effect creates substantial competitive advantages. Amazon discovered that moving credit card offers from its homepage to the shopping cart page increased annual profits by tens of millions of dollars. Netflix’s recommendation algorithm improvements, built through thousands of small experiments, fundamentally changed how people discover content.
These companies understand something crucial: in the digital world, success can come from getting lots of small changes right, not from implementing one transformative idea.
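To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python. The monthly lift figures are hypothetical, chosen only to illustrate how a steady stream of roughly 1% wins compounds into the 10-25% annual range cited above.

```python
# Illustrative arithmetic: how small, independent lifts compound.
# These monthly lift figures are hypothetical, not Microsoft's actual numbers.
monthly_lifts = [0.012, 0.008, 0.015, 0.005, 0.010, 0.007,
                 0.011, 0.009, 0.013, 0.006, 0.014, 0.010]

combined = 1.0
for lift in monthly_lifts:
    combined *= 1.0 + lift  # independent lifts multiply, they don't just add

print(f"Sum of individual lifts:  {sum(monthly_lifts):.1%}")   # 12.0%
print(f"Compounded annual effect: {combined - 1.0:.1%}")       # ~12.7%
```

The compounded effect slightly exceeds the naive sum, and none of these wins would have looked impressive on its own.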
Actionable Takeaways
1. Audit Your Idea Evaluation Process (Week 1)
- List your last 20 major business decisions and categorize them by resource investment level.
- Identify how many “big” ideas were implemented without testing and how many “small” ideas were dismissed without testing.
- Calculate the total resources spent on “big bets” versus incremental improvements.
2. Establish Minimum Viable Testing (Week 2-3)
- Set up a simple A/B testing capability for at least one customer touchpoint (a minimal assignment sketch follows this list).
- Create a bias toward testing rather than debating ideas that cost less than X hours to implement.
- Document your prediction for each test before seeing the results.
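As a starting point, here is a minimal sketch of deterministic variant assignment in Python. The experiment name, user ID format, and 50/50 split are all hypothetical; the point is that hashing a stable user ID gives each visitor a consistent bucket across visits.

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "headline-format") -> str:
    """Deterministically bucket a user into 'control' or 'treatment'.

    Hashing (experiment + user_id) gives every user a stable assignment
    per experiment, so repeat visits always see the same variant.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100                    # uniform value in 0-99
    return "treatment" if bucket < 50 else "control"  # 50/50 split

# Usage: resolve the variant at request time, render accordingly, and log
# (user_id, experiment, variant, outcome) so results can be analyzed later.
print(assign_variant("user-42"))
```

Salting the hash with the experiment name matters: it keeps assignments independent across experiments, so the same users are not always grouped together.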
3. Build Your Experimentation Pipeline (Month 1-2)
- Commit to testing one “obvious” improvement per week for 30 days.
- Track both your prediction accuracy and the actual business impact.
- Share results across your organization to build an experimentation culture.
4. Scale Based on Learning (Month 3+)
- Invest more heavily in experimentation infrastructure only after proving its value.
- Train team members to propose testable hypotheses rather than feature requests.
- Celebrate both successful experiments and valuable failures.
The Experiment
Test this approach in your organization:
Hypothesis: “Small, easily implemented changes tested systematically will generate more measurable business value than fewer, larger initiatives over 90 days.”
Your experiment: Identify 10 minor customer experience improvements that each require less than one day of implementation time. Test 5 of them using simple A/B methodology (a basic significance check is sketched below) while continuing your normal approach for the other 5.
Success metrics: Measure what matters. Compare total business impact (revenue, engagement, satisfaction) generated by the tested minor changes versus the untested ones.
Timeline: 90 days
Complexity level: Beginner
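For the A/B comparisons themselves, a two-proportion z-test is one basic way to check whether a tested change actually moved a binary metric such as conversion. The sketch below uses only the Python standard library, and the traffic and conversion counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test: is variant B's conversion rate different from A's?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided p-value
    return p_b - p_a, p_value

# Hypothetical counts: 10,000 users per arm.
lift, p = two_proportion_z(conv_a=480, n_a=10_000, conv_b=560, n_b=10_000)
print(f"Observed lift: {lift:.2%}, p-value: {p:.3f}")
```

Treat p < 0.05 as a conventional, not infallible, threshold, and remember Bing’s lesson: results that look too good to be true deserve a rerun before you celebrate.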
The Bing headline story should remind us that the most valuable business insights often hide in plain sight, dismissed by the very expertise we trust most. In a world where every company claims to be data-driven, the organizations that actually test their assumptions systematically will discover advantages their competitors never see coming.
Sometimes the ideas worth $100 million are the ones sitting in your “low priority” pile.

