The bug fix that almost cost LinkedIn millions
Lessons on drilling deep into data, navigating a crisis and proactive collaboration
Context
Small changes can have a big impact. This is the story of how the LinkedIn Ads team faced a sudden decline in revenue on one of their most profitable ad slots. This post will take you through the investigative process, the discovery of the underlying issue, and the challenging decision the team ultimately had to make.
Crisis
In 2013, I was the lead PM for the LinkedIn Ads team. These ads are the small text ads that you still see around LinkedIn today. One ad slot, located at the top of the LinkedIn profile page (see example below), generated around $20M in revenue annually and accounted for ~25% of the total revenue for the LinkedIn Ads product. One day, we observed a 40% drop in click-through rate (CTR) on this ad slot, which threatened an $8M loss per year.
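For anyone who wants to see where the $8M comes from, here is a minimal back-of-the-envelope sketch. It assumes revenue on the slot scales linearly with clicks (a simplification on my part), and the function name is made up for illustration.

```python
def annual_revenue_at_risk(slot_revenue: float, relative_ctr_drop: float) -> float:
    """Estimate yearly revenue lost if clicks fall by the given fraction.

    Assumption: revenue is proportional to clicks, so a 40% drop in CTR
    at constant impressions means roughly 40% less revenue.
    """
    return slot_revenue * relative_ctr_drop


slot_revenue = 20_000_000  # ~$20M per year from the profile-page ad slot
ctr_drop = 0.40            # observed 40% lower CTR

at_risk = annual_revenue_at_risk(slot_revenue, ctr_drop)
print(f"Revenue at risk: ${at_risk / 1e6:.0f}M per year")
```

The real relationship between CTR and revenue depends on pricing and auction dynamics, so treat this as a rough estimate rather than a forecast.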
Problem
Once we confirmed the issue wasn't an anomaly, the cross-functional LinkedIn Ads team began investigating potential causes. We examined various factors such as geo, region, ad formats, and seasonality but found nothing conclusive. We also asked the ads recommendation engineering team whether any algorithm changes could have lowered ad quality, but they found no significant issues.
A week into the crisis, the team still hadn't identified the cause of the problem. My manager (Gyanda) and skip-level manager (David) were asked to give daily updates to our CEO (Jeff). We held daily meetings with cross-functional leaders from both the Ads and Profile teams to brainstorm and test hypotheses. Eventually, Josh Clemm on the Profile team hypothesized that the ad CTR was affected by the browser used, and he confirmed it by running IE on a Windows machine. The data showed that the majority of the drop came from Internet Explorer (IE) users, who also tended to click on those ads more often, making the problem worse (see sample data below).
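The kind of breakdown that surfaces a browser-specific drop can be sketched in a few lines. This is not the analysis we actually ran, and every number below is HYPOTHETICAL, invented purely to show the shape of the technique: compute CTR per browser before and after the change, then compare.

```python
from collections import defaultdict

# HYPOTHETICAL numbers for illustration only; this is not LinkedIn data.
# Each row: (period, browser, impressions, clicks)
rows = [
    ("before", "IE", 100_000, 500),
    ("after", "IE", 100_000, 200),
    ("before", "Chrome", 100_000, 200),
    ("after", "Chrome", 100_000, 200),
    ("before", "Firefox", 50_000, 100),
    ("after", "Firefox", 50_000, 100),
]


def ctr_by_segment(rows):
    """Aggregate impressions and clicks per (period, browser) and return CTR."""
    totals = defaultdict(lambda: [0, 0])
    for period, browser, impressions, clicks in rows:
        totals[(period, browser)][0] += impressions
        totals[(period, browser)][1] += clicks
    return {key: clicks / imps for key, (imps, clicks) in totals.items()}


def relative_change(ctr, browser):
    """Relative CTR change for one browser between the two periods."""
    before = ctr[("before", browser)]
    after = ctr[("after", browser)]
    return (after - before) / before


ctr = ctr_by_segment(rows)
browsers = sorted({row[1] for row in rows})
changes = {b: relative_change(ctr, b) for b in browsers}

for browser, change in changes.items():
    print(f"{browser}: {change:+.0%} CTR change")

# The segment with the most negative change is the first place to look.
worst = min(changes, key=changes.get)
print(f"Largest drop: {worst}")
```

The same loop works for any dimension (geo, ad format, device), which is why slicing by one more dimension is often the cheapest hypothesis to test.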
Compromise
After finding the culprit, the team rolled back all the changes on the profile page, but that didn't fix the issue. They then rolled back the changes from the global navigation team, whose shared code affected the area of the page in question. This rollback revealed that the global nav team had fixed an IE-specific bug. Months earlier, someone had written code with a syntax error that only affected IE, and IE handled it by adding a new line, which put 15 pixels of extra padding between the top of the nav and the LinkedIn ad. Fixing the bug made the experience consistent with other browsers, but it also made the ad less visible, resulting in fewer clicks.
Dilemma
The team faced a difficult dilemma: should they keep the bug fix to ensure a uniform user experience across all browsers, or restore the old behavior to protect revenue? They weighed the advantages and disadvantages of each choice, taking into account both customer satisfaction and the company's financial health. In the end, the team reached a compromise, opting for slightly more padding on all browsers, resulting in a more consistent user experience without sacrificing the $8M in annual revenue.
Learnings
Emphasize collaboration: The daily meetings with cross-functional leaders helped identify and resolve the issue quickly.
Test hypotheses continuously: Persistence in testing various hypotheses, such as examining geo, region, and finally browsers, led to the discovery of the underlying issue.
Leverage data wisely: Data-driven insights helped the team make informed decisions and stay on track. Additionally, it's crucial to recognize when metrics improve, not just when they decline.
Foster a no-blame culture: Focusing on testing and finding solutions, rather than blaming anyone for the issue, fostered a productive environment.
Conclusion
The LinkedIn Ads story illustrates the importance of collaboration, persistence in testing hypotheses, and a data-driven approach when solving complex problems. By applying these lessons, you can navigate similar challenges and strike a balance between user experience and revenue generation. Feel free to share any similar experiences or thoughts in the comments below!
Revised the post to correct inaccuracies. Thanks, Josh, for bringing them to my attention!