
Retention cohorts

Understanding retention cohorts and using them effectively

Retention cohorts are a valuable analytical tool for understanding entity behavior over time. They provide insights into the effectiveness of marketing campaigns, product changes, or other business initiatives. To keep the rest of the article clear, let’s define the core concepts first.

What is a cohort?

A cohort is a group of entities that share a common characteristic or experience during a specific time frame. In the context of retention, cohorts are groups of customers or users who make their first interaction or purchase within a given time frame, often referred to as the "cohort period."
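
As a rough illustration of this definition, the sketch below assigns each author to the quarter of their first post. The function names and the sample events are hypothetical and are not taken from the Hacker News dataset; quarterly granularity is assumed because the example later in the article reads retention in quarters.

```python
from datetime import date


def quarter_label(d: date) -> str:
    """Return a label such as '2007Q1' for the quarter containing d."""
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"


def assign_cohorts(events):
    """Map each author to the quarter of their earliest post.

    events: iterable of (author, post_date) pairs, in any order.
    """
    first_post = {}
    for author, post_date in events:
        if author not in first_post or post_date < first_post[author]:
            first_post[author] = post_date
    return {author: quarter_label(d) for author, d in first_post.items()}


# Hypothetical sample data for illustration only.
events = [
    ("alice", date(2007, 5, 2)),
    ("alice", date(2007, 2, 14)),
    ("bob", date(2008, 11, 30)),
]
cohorts = assign_cohorts(events)
```

Here `alice` lands in the 2007Q1 cohort because her earliest post is from February 2007, even though a later post appears first in the input.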

What is a retention period?

The retention period defines the duration for which you want to monitor the cohort’s engagement or activity after they have performed a specific initial action, such as signing up for a service, making a first purchase, or downloading an app.
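
To make the two concepts concrete, here is a minimal sketch of a retention table, assuming quarterly cohorts and quarterly retention periods. For every cohort it computes the share of its authors who posted 0, 1, 2, ... quarters after their first post. The helper names and the sample events are hypothetical; this is one possible way to compute such a table, not necessarily how the chart in this article was built.

```python
from collections import defaultdict
from datetime import date


def quarter_index(d: date) -> int:
    """Number quarters consecutively so that differences are quarter offsets."""
    return d.year * 4 + (d.month - 1) // 3


def retention_table(events):
    """Return {cohort_quarter_index: {offset: share_of_cohort_active}}.

    events: iterable of (author, post_date) pairs.
    Offset 0 is the cohort quarter itself, so its share is always 1.0.
    """
    # Cohort of each author = quarter of the earliest post.
    first = {}
    for author, d in events:
        q = quarter_index(d)
        first[author] = min(q, first.get(author, q))

    cohort_size = defaultdict(int)
    for q in first.values():
        cohort_size[q] += 1

    # Sets ensure an author with several posts in a quarter counts once.
    active = defaultdict(set)
    for author, d in events:
        cohort = first[author]
        active[(cohort, quarter_index(d) - cohort)].add(author)

    table = defaultdict(dict)
    for (cohort, offset), authors in active.items():
        table[cohort][offset] = len(authors) / cohort_size[cohort]
    return {cohort: dict(row) for cohort, row in table.items()}


# Hypothetical sample data: three authors who all started in Q1 2007.
events = [
    ("alice", date(2007, 1, 10)),
    ("alice", date(2007, 4, 5)),
    ("alice", date(2007, 4, 20)),
    ("bob", date(2007, 2, 1)),
    ("carol", date(2007, 3, 15)),
    ("carol", date(2007, 8, 1)),
]
table = retention_table(events)
```

In this sample, one of the three authors is active one quarter after the cohort quarter and one is active two quarters after it, so both offsets show a retention of one third.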

Practical example - analyzing Hacker News author retention

Hacker News is a social news website focusing on computer science and entrepreneurship. Retaining productive authors over time is vital for maintaining a steady stream of fresh and engaging content as well as building a healthy community of readers around the platform. By understanding author retention, the platform can recognize and incentivize the most valuable contributors. Let’s look at the author retention cohort below.

...
...
...
...
...
  • From the visualization, we can see that around 20% of authors who started posting on Hacker News in 2007 were still posting 12 quarters (3 years!) later. Overall, the first cohorts of 2007 retained quite well, with ~10% of them still posting 28 quarters (7 years!) after their first post.
  • Throughout the 2008/2009 cohorts, the novelty of the platform slowly diminished, and new authors struggled to stay productive or gain traction. Possible reasons for this change include fading launch hype, competitive forces, and product changes.
  • Later, around 2010, came the first wave of more consistent authors. One possible reason is the expansion of social media: Facebook continued to grow rapidly, surpassing 500 million users in 2010, and Twitter was also gaining popularity. "Social media" was on everyone's lips and was seen as a powerful tool for communication and marketing, so the perceived value of building an online presence may have contributed to this wave.
  • The second wave, around the end of 2012 and the start of 2013, could be partially attributed to two trends. The first is the maturing of cloud computing: services such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform were gaining traction, and the cloud was becoming a mainstream solution for businesses, offering scalable and cost-effective infrastructure. The second is big data and analytics: the concept of big data was gaining attention, and companies were investing in data analytics tools and platforms to gain insights from large datasets, which laid the foundation for the growth of data-driven decision-making.
...

What’s next?

This perspective has a significant drawback: it groups users along a single, not very actionable dimension, the timing of their first article. Knowing that older authors tend to stick around more isn't particularly valuable, because you can't magically make more authors sign up in the past. Additionally, a decline in author retention doesn't automatically imply that reverting to an older version of your product or business process is the right solution.

Segmenting our cohorts is the right approach to figuring out what to do next.

Just as authors who started posting last month are different from those who started today, authors who write about crypto might be different from those who write about analytics. Similarly, some users might find the platform through an ad while others might sign up after reading an article. Some folks probably enter their work email addresses, while others use a Gmail account. And then there are geographical differences.

The Hacker News dataset on BigQuery offers limited options for author segmentation, so we have to improvise. Articles on HN can be scored by readers, which could be a valuable segmentation angle: getting the first article rated highly might lead to better author retention down the line. This segmentation is applied to the overall retention rate in the image below.
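
A minimal sketch of this kind of segmentation is shown below. It pools all cohorts, splits authors by the score of their first post, and compares the share of each segment that posted a given number of quarters later. The threshold of 10 points, the segment names, and the sample posts are assumptions made for illustration; the article does not state which cut-off was used.

```python
from collections import defaultdict
from datetime import date

HIGH_SCORE_THRESHOLD = 10  # hypothetical cut-off, not stated in the article


def quarter_index(d: date) -> int:
    """Number quarters consecutively so that differences are quarter offsets."""
    return d.year * 4 + (d.month - 1) // 3


def retention_by_first_score(posts, offset):
    """Share of each segment that posted `offset` quarters after the first post.

    posts: iterable of (author, post_date, score) tuples.
    Authors are segmented as 'high' or 'low' by the score of their first post.
    """
    first = {}  # author -> (date, score) of the earliest post
    for author, d, score in posts:
        if author not in first or d < first[author][0]:
            first[author] = (d, score)

    def segment(author):
        return "high" if first[author][1] >= HIGH_SCORE_THRESHOLD else "low"

    size = defaultdict(int)
    for author in first:
        size[segment(author)] += 1

    # Sets ensure each author is counted at most once per segment.
    retained = defaultdict(set)
    for author, d, _ in posts:
        if quarter_index(d) - quarter_index(first[author][0]) == offset:
            retained[segment(author)].add(author)

    return {seg: len(retained[seg]) / n for seg, n in size.items()}


# Hypothetical sample data for illustration only.
posts = [
    ("alice", date(2010, 1, 5), 42),
    ("alice", date(2010, 5, 1), 3),
    ("bob", date(2010, 2, 10), 1),
    ("carol", date(2010, 3, 1), 15),
    ("dave", date(2010, 1, 20), 0),
]
one_quarter_later = retention_by_first_score(posts, offset=1)
```

Only the score of the first post decides the segment, so a low score on a later post (such as the second post by `alice`) does not move an author out of the "high" group.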

...

From the visualization, it is clear that a high initial article score goes hand in hand with better retention. This is a good signal for the product team to innovate in this direction. A few ideas to be tested come to mind:

  • Authors could be nudged to evaluate each other’s content;
  • A dedicated page for new authors’ articles, to bring more exposure to their content;
  • Adjusting recommendation algorithms to surface more of the newcomers’ content.

Conclusion

In this article, I have provided a glimpse into cohort analysis, explained its main concepts, and shown how to read and interpret a cohort chart. The main point with cohort analysis (as with any analytical tool) is to use it proactively to make informed decisions, evaluate the effectiveness of those decisions, and course correct accordingly.


