How to prepare your data for user segmentation — tips for early-stage startups

8 min read · Jun 12, 2020

Originally published at the mobile spoon.

So your product starts to show traction and you want to collect some data and start making data-driven decisions.

It’s time to think about segmentation.

You know the drill:
Different types of users => different usage => different needs => different priorities.
Different types of users => different spend => different priorities.
Different types of users => different pain points => different messaging.
The list goes on…

You need a systematic, scalable way to divide your users into smaller segments based on the characteristics they share.

Here’s how I think you should do it:

1. Collect as much data as possible

At the early stages of a product — scale, performance, and data efficiency shouldn’t be a concern.
The volumes are still relatively low, so you can store as much data as possible, even if you don’t need it right now.

Don’t worry, you’ll have plenty of time to optimize when you’re successful and rich…

2. Make sure it’s all in one place

Whether you’re using a professional analytics tool or storing all the information in your own source of truth, make sure everything stays in one place so it’s easily accessible, comparable, and usable.
If different information is stored in different systems, you’ll end up spending a lot of effort trying to sync them all up to understand the big picture.

BTW, I personally prefer to have everything integrated back into the core system. This approach ensures that useful data is not only available for analytics, sales, and marketing purposes but is also available for the application itself to use (e.g. to support personalized user flows, in-app promotions, etc.).

Now let’s talk about the data you need to collect:

3. Demographic data:

That’s the old-school stuff: gender, birth-date, location, language, occupation, device, and so on.
You might not have everything available at first, but as your relationship with your users evolves, you’ll find elegant ways to collect this information without asking for it upfront and hurting conversion rates.

If you’re developing a B2B product — collect firmographic data such as company age, number of employees, and industry. At some point, you will need this to qualify and prioritize leads (more about qualification below).

Why is it important?

Some of it is basic functionality: for example, your push notification microcopy should fit the user’s language and gender.
But there are plenty of other examples, where product workflows and messaging might work differently for each segment. When it happens, you need to know about it and use it for the benefit of your product.

For example, you might find that parents use the product differently than younger users: different hours, different features, different deal sizes. This can help you tune the workflows a little and emphasize different things for each segment.
It can also convince you to focus on the segment that generates more revenue, and focus, as you know, is very important in the early stages.

4. When did they join?

If you want to measure your improvement (week over week or month over month) — you must work with cohorts.
The most basic cohorts are based on activation dates. This can be different than the installation date or sign-up, depending on the product’s onboarding flow.

Why is it important?

Conversion rate: when optimizing the onboarding experience for maximum conversion rate you will need a way to measure the improvement. Working with cohorts will help you compare apples to apples and track the improvements in your KPIs as the product evolves.

In addition, as your product evolves, you will notice some differences in users’ behavior.
For instance, early adopters are usually more engaged and less sensitive to technical limitations or prices than the majority of users. This will probably result in different engagement numbers, which will be easier to explain through a cohort-to-cohort comparison.

Retention: if you want to measure retention and perform a cohort analysis — you will need this information available.

For example, e-commerce products might see a spike in new users’ conversion rates during holidays. Those users come with a clear intent to buy, which means they’ll convert faster and better, but then they’re more likely to churn.
Is that a product problem? A bug in the latest version? No, it’s just how the market behaves, and you need data to help you understand it.
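
Here is a minimal Python sketch of activation-based cohorts, assuming you store an activation date and a last-active date per user. The function names, the weekly grouping, and the 30-day retention window are my assumptions for illustration, not a prescribed method.

```python
from collections import defaultdict
from datetime import date


def cohort_key(activation: date) -> str:
    """Label a cohort by the ISO year and week of the activation date."""
    iso = activation.isocalendar()
    return f"{iso[0]}-W{iso[1]:02d}"


def retention_by_cohort(activations, last_active, min_days=30):
    """Share of each cohort still active at least `min_days` after activation."""
    totals = defaultdict(int)
    retained = defaultdict(int)
    for user_id, activated_on in activations.items():
        key = cohort_key(activated_on)
        totals[key] += 1
        # Users with no recorded activity count as last seen on activation day.
        last_seen = last_active.get(user_id, activated_on)
        if (last_seen - activated_on).days >= min_days:
            retained[key] += 1
    return {key: retained[key] / totals[key] for key in totals}


# Made-up sample data: three users across two weekly cohorts.
activations = {"a": date(2020, 6, 1), "b": date(2020, 6, 2), "c": date(2020, 6, 9)}
last_active = {"a": date(2020, 7, 15), "b": date(2020, 6, 5), "c": date(2020, 7, 20)}
retention = retention_by_cohort(activations, last_active)
```

Comparing these per-cohort numbers week over week is what lets you tell a product regression apart from a seasonal batch of users who were always going to churn.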

5. What’s the source of those users?

This one relates to users’ origins and attribution.
Basically, you want to know what drove those users to your product.

Was it organic or paid acquisition?
If it’s paid — what channel was it?
If it’s social — what social network was it and what was the campaign?
If it’s a user invite — who sent the invitation?

Every user should be “labeled” with this type of information (e.g. three fields for acquisition type, channel, and campaign).
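
As a rough illustration, the three labels could look like this in Python. The class, field names, and sample values are hypothetical; use whatever fits your own data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Acquisition:
    acquisition_type: str            # e.g. "organic", "paid", "invite"
    channel: Optional[str] = None    # e.g. "facebook"; None when unknown
    campaign: Optional[str] = None   # campaign name, or the inviter's user id


def label_user(user: dict, acquisition: Acquisition) -> dict:
    """Return a copy of the user record with the three attribution fields attached."""
    return {
        **user,
        "acquisition_type": acquisition.acquisition_type,
        "channel": acquisition.channel,
        "campaign": acquisition.campaign,
    }


# Hypothetical example: a user who arrived through a paid social campaign.
user = label_user({"id": 42}, Acquisition("paid", "facebook", "spring_sale"))
```

Storing the labels on the user record itself (rather than only in the ad platform) keeps them available for every later analysis.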

Why is it important?

Business-wise: you’ll need a way to calculate the ROI and understand the unit economics of your business. If your user acquisition doesn’t pay off, the machine is broken.

Marketing-wise: you’ll want to double down on the channels that are converting well and stop/change the ones that are not.
Keep in mind that conversion is not just about installations and sign-ups, so it’s important to measure how those users behave over time, and what their lifetime value (LTV) is.

Product-wise: knowing the users’ origin will help you understand their behavior.

For example, users that came through an aggressive marketing campaign with a big discount coupon might convert better into making the first purchase, but if they came mostly because of that one-time discount, they’ll use it and churn faster than other users.
You will need this data to figure out what the hell happened and evaluate the success of this campaign.
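
Once every user carries a channel label, a per-channel ROI check becomes a few lines of code. This is a simplified sketch: the field names and numbers are made up, and "lifetime revenue" here is just the revenue observed so far, not a forecasted LTV.

```python
from collections import defaultdict


def channel_roi(users, spend_by_channel):
    """ROI per channel = (revenue from its users - acquisition spend) / acquisition spend."""
    revenue = defaultdict(float)
    for user in users:
        revenue[user["channel"]] += user["lifetime_revenue"]
    return {
        channel: (revenue[channel] - spend) / spend
        for channel, spend in spend_by_channel.items()
        if spend > 0  # skip channels with no spend to avoid dividing by zero
    }


# Made-up sample data for two paid channels.
users = [
    {"channel": "facebook", "lifetime_revenue": 120.0},
    {"channel": "facebook", "lifetime_revenue": 30.0},
    {"channel": "google", "lifetime_revenue": 40.0},
]
roi = channel_roi(users, {"facebook": 100.0, "google": 80.0})
```

A negative ROI for a channel is the "broken machine" signal: the users it brings in return less than they cost to acquire.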

6. Behavioral data

This is where the fun begins.
Significant user actions (or milestones) must be collected and stored, so you can analyze your users based on their actual behavior.
It’s a game of flags, dates, and frequency: first order date, total number of orders, last order date, first share, total shares, total invites, average spend, advanced features used, support tickets, uninstalls, etc.
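
To show what "flags, dates, and frequency" can mean in practice, here is a small Python sketch that collapses a raw event log into a per-user profile. The event names and the (name, day, amount) tuple layout are assumptions for the example.

```python
from datetime import date


def build_profile(events):
    """Collapse one user's (event_name, day, amount) tuples into flags, dates, and counts."""
    orders = sorted((day, amount) for name, day, amount in events if name == "order")
    shares = [day for name, day, _ in events if name == "share"]
    total_spend = sum(amount for _, amount in orders)
    return {
        "has_ordered": bool(orders),                               # flag
        "first_order_date": orders[0][0] if orders else None,      # date
        "last_order_date": orders[-1][0] if orders else None,      # date
        "total_orders": len(orders),                               # frequency
        "average_spend": total_spend / len(orders) if orders else 0.0,
        "first_share_date": min(shares) if shares else None,
        "total_shares": len(shares),
    }


# Made-up event log for a single user.
events = [
    ("order", date(2020, 5, 1), 40.0),
    ("share", date(2020, 5, 3), 0.0),
    ("order", date(2020, 6, 1), 60.0),
]
profile = build_profile(events)
```

Keeping these aggregates on the user record means segments like "ordered at least twice" or "shared but never ordered" become simple filters.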

Why is it important?

Actions are stronger than any other data available for segmentation.
Knowing what those users are up to (including during trials) lets you personalize the product experience, and it helps the marketing team personalize its communication (drip emails, in-app messaging). Personalization leads to a better user experience and better results.

Knowing your users based on their product usage lets you highlight different things for different segments. It can be content (e.g. movies they are more likely to watch) or features (e.g. exposing experimental capabilities only to the most engaged users).

A word about SaaS products: a good technique to qualify leads, especially if you’re using PQLs (product qualified leads — based on the product-led growth methodology), is to evaluate the quality of leads during free trials (or while using a freemium version) based on the performed actions.
Unlike traditional MQLs, which are qualified by engagement outside of the product (downloading content, opening drip emails), here the evaluation is based on actual usage inside the product, hoping that users will reach their aha moment and understand the value of the product.

To evaluate your leads using the product-led growth model you’ll have to measure everything those trial users did and classify them according to their firmographic information, roles, behavior, engagement with certain activities that happened during the trial, and more.
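
One simple way to turn trial behavior into a lead score is a weighted sum of actions. The sketch below is only an illustration: the action names, weights, cap, and threshold are all hypothetical and would need tuning against your own conversion data.

```python
# Hypothetical weights: which in-trial actions count, and how much, is an assumption.
ACTION_WEIGHTS = {
    "created_project": 3,
    "invited_teammate": 5,
    "used_advanced_feature": 4,
    "opened_app": 1,
}
PQL_THRESHOLD = 10  # assumed cut-off; tune against real conversions
MAX_COUNT = 3       # cap per action so one repeated action can't dominate the score


def pql_score(action_counts):
    """Weighted sum of in-trial actions, counting each action at most MAX_COUNT times."""
    return sum(
        weight * min(action_counts.get(action, 0), MAX_COUNT)
        for action, weight in ACTION_WEIGHTS.items()
    )


def is_pql(action_counts):
    """True when the trial user's score reaches the (assumed) qualification threshold."""
    return pql_score(action_counts) >= PQL_THRESHOLD


# Made-up trial users: one who explored the product, one who only opened the app.
explorer = {"created_project": 1, "invited_teammate": 1, "opened_app": 10}
lurker = {"opened_app": 10}
```

With these sample weights the explorer scores 11 and qualifies, while the lurker scores 3 and does not. In practice you would combine a score like this with firmographic data and roles before handing leads to sales.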

7. Inferred data:

This one is for later stages when you have enough behavioral data to identify patterns.
Those patterns can help you generate inferred/speculative data that will help you segment your customers even further, predict their behavior, and identify anomalies.

For example, at Missbeez (a marketplace for lifestyle services on demand), we knew some customers were sensitive to time and availability, while others were more concerned about the service quality.
By analyzing their behavior (time of order, time to select a service provider, order size, reactions to special sales, rating, complaints) we were able to infer their main sensitivity factor: time, quality, or price. Once we knew what mattered most to each segment, we could tailor the product and provide a much better experience.

We used a similar approach to improve our rating system and minimize our marketplace leakage.

Summary:

The more you segment your data, the more you’ll be able to learn from it.
In early stages — learning is extremely important in order to make the right product decisions, focus on the right customers, and understand how to bring them in.

Here’s one example (illustrated below) of how segmentation can help you understand your data:

And here’s another example, this time involving paid acquisition vs. organic (below).

Without understanding the marketing source of each user group it might seem like “bad” leads are converting better than “good” leads (targeted audience).

This, of course, might lead to false conclusions and bad decisions.
However, if you split the user group into two segments, paid vs. organic, you will see why this is happening: the paid group brings only good leads, but paid acquisition doesn’t convert as well as organic, which drags down the overall conversion rate of the good leads.
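
To make the paid vs. organic effect concrete, here is a small Python sketch. The numbers are made up purely for illustration and are not taken from any real product.

```python
# Illustrative, made-up numbers: (source, lead quality) -> (users, conversions)
SEGMENTS = {
    ("organic", "good"): (100, 30),
    ("organic", "bad"): (200, 40),
    ("paid", "good"): (400, 40),  # paid brings only good leads
}


def conversion_rate(segments, quality, source=None):
    """Conversion rate for a lead quality, optionally restricted to one source."""
    users = conversions = 0
    for (seg_source, seg_quality), (n_users, n_converted) in segments.items():
        if seg_quality == quality and source in (None, seg_source):
            users += n_users
            conversions += n_converted
    return conversions / users if users else 0.0


# Without the source split, "bad" leads look better than "good" ones.
good_overall = conversion_rate(SEGMENTS, "good")
bad_overall = conversion_rate(SEGMENTS, "bad")

# Within organic only, "good" leads convert better, as you would expect.
good_organic = conversion_rate(SEGMENTS, "good", source="organic")
bad_organic = conversion_rate(SEGMENTS, "bad", source="organic")
```

With these sample numbers, good leads convert at 14% overall versus 20% for bad leads, yet within organic traffic good leads convert at 30% versus 20%. The aggregate is misleading only because most good leads came through the lower-converting paid channel.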

Collecting the data itself is not a big effort, but the benefits are huge.
That’s why I think it’s a good practice to start thinking about it before entering the growth phase, where having this data can be very useful.

Hey, before you drop off and leave me all alone in the dark, check out the best of the mobile spoon or subscribe to my newsletter and become 23% more awesome than average.

BTW, I’ll be happy to know who drew the superhero characters I used in my images (couldn’t find a reference), so I can credit them.

Originally published at https://www.mobilespoon.net on June 12, 2020.

Written by Gil Bouhnick