The Analytical Skills No One Teaches You
Estimation, Baselines, Root Cause Analysis, and Metrics That Actually Matter
Hi, fellow future and current Data Leaders; Ben here 👋
Today we’ve got an amazing guest author, Olga Berezovsky!
Olga is an analytics and data science leader focused on building impactful data products and helping businesses turn insights into better business decisions. She brings a strong ability to bridge business and technology and is passionate about mentoring analysts and growing high-performing data teams.
If you’re looking to learn more about everything analytics, then you should check out her newsletter. It’s filled with great content and how-tos.
Now let’s jump into the article!
When I asked Olga to write an article, I wanted it to be focused on skills that data professionals don’t get taught explicitly.
There aren’t a lot of videos out there on how to deliver an impactful analysis to executives.
Even when it comes to running an analysis, many of us likely had to feel around in the dark a bit. I was just speaking to another data science leader who said an executive essentially had to take them aside and let them know their analyses weren’t great.
There are so many of these skills that analysts and engineers alike have to pick up on the job and no one tends to tell you what is good or bad.
So let’s talk about some of those skills you need to start working on!
1. How To Develop Analytical Intuition
Many companies will ask candidates questions that might seem out of the blue. Like, how many dentists are in the world?
What they are trying to gauge is your analytical intuition.
Essentially, given a problem with limited information, can you come up with a reasonable framework or approach to answer the question, or at least know what information you’d need to look for?
Here are some tips if you’re working to improve your analytical intuition.
Ability to set reasonable ranges and break down sampling:
Example: How many windows are in New York City? How many teachers are there in the world?
How to: Start with an educated guess based on something related to the question — a proxy value that you do have some intuition about. Then, work your way toward a ballpark estimate using averages and scaling logic.
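Here is a minimal sketch of that scaling logic for the teachers question. Every input below (population, share of school-age people, enrollment rate, students per teacher) is an illustrative guess, not a sourced statistic, and the function name is made up for this example. The point is the structure: start from a proxy you have intuition about, scale it down, and bracket the uncertain inputs to get a range.

```python
# Fermi estimate: how many teachers are there in the world?
# All inputs are rough, illustrative guesses, not sourced statistics.

def estimate_teachers(population, school_age_share, enrollment_rate, students_per_teacher):
    """Scale a known proxy (population) down to a ballpark count of teachers."""
    students = population * school_age_share * enrollment_rate
    return students / students_per_teacher

# Proxy value we have some intuition about: roughly 8 billion people.
WORLD_POPULATION = 8_000_000_000

# Bracket the uncertain inputs to get a range instead of a single number.
low = estimate_teachers(WORLD_POPULATION, school_age_share=0.20,
                        enrollment_rate=0.70, students_per_teacher=30)
high = estimate_teachers(WORLD_POPULATION, school_age_share=0.30,
                         enrollment_rate=0.90, students_per_teacher=15)

print(f"Ballpark: {low / 1e6:.0f}M to {high / 1e6:.0f}M teachers")
```

If the range feels too wide, that tells you which assumption to research first.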
Critical thinking: What goes up must come down
If a value goes up, it should come down by a similar degree at some point, and vice versa. That’s why we use continuous distributions and probabilities - to account for natural variance, not just isolated changes.
If you’re new to a dataset or project, the first thing you should do is understand the degree of natural traffic fluctuations and baseline variance. This helps you separate expected changes from those driven by external factors and understand how much deviation is “normal.”
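One simple way to put a number on “normal” is a band around the historical mean. The sketch below uses made-up daily session counts and a band of two sample standard deviations. The threshold `k=2.0` and the function names are assumptions for illustration, and a plain mean-and-deviation band ignores weekly or yearly seasonality, so treat it as a starting point only.

```python
from statistics import mean, stdev

def normal_range(history, k=2.0):
    """Return (low, high): the mean plus or minus k sample standard deviations."""
    center = mean(history)
    spread = stdev(history)
    return center - k * spread, center + k * spread

def is_unusual(value, history, k=2.0):
    """True when a value falls outside the band of natural fluctuation."""
    low, high = normal_range(history, k)
    return value < low or value > high

# Hypothetical daily sessions for the past two weeks (made-up numbers).
daily_sessions = [1020, 980, 1005, 995, 1010, 990, 1000,
                  1015, 985, 1008, 992, 1003, 997, 1000]

band_low, band_high = normal_range(daily_sessions)
print(f"Normal range: {band_low:.0f} to {band_high:.0f}")
print("1012 unusual?", is_unusual(1012, daily_sessions))
print("900 unusual?", is_unusual(900, daily_sessions))
```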
Math and fractions: Part of a whole
Every metric is a fraction. It’s just one piece of a broader ecosystem made up of other interconnected parts. Let’s say you confirm that the payment success rate is 25%. That means 25% of users successfully complete a transaction. But it also means that 75% do not. The more related metrics you identify, the easier it is to cross-check them.
This same logic applies to funnels and conversions. Every metric is a fraction of a whole. If one thing is rising, something else (a) should be falling or (b) should be rising as well. If you don’t see (a) or (b), question everything. Once you understand the relationships between metrics, figuring out what is declining and why becomes much easier.
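The “part of a whole” check can be automated in a few lines. This sketch assumes a hypothetical breakdown of payment attempts into outcomes (all names and counts are invented) and reports how much of the total the parts fail to explain.

```python
def check_parts_sum_to_whole(whole, parts, tolerance=0):
    """Return (gap, reconciles): the unexplained remainder and whether it is within tolerance."""
    gap = whole - sum(parts.values())
    return gap, abs(gap) <= tolerance

# Hypothetical payment attempts and their recorded outcomes (made-up numbers).
attempts = 4000
outcomes = {"succeeded": 1000, "declined": 2200, "abandoned": 700}

success_rate = outcomes["succeeded"] / attempts
gap, reconciles = check_parts_sum_to_whole(attempts, outcomes)

print(f"Success rate: {success_rate:.0%}")
print(f"Unexplained attempts: {gap} (reconciles: {reconciles})")
```

In this made-up data the outcomes do not add up to the total, which is exactly the kind of gap worth questioning before trusting the 25%.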
Develop a habit of checks: random users, session flow, against different tools
For every sample or report, pull 5-10 random users from the dataset and manually check their attributes - paid price, invoice details, subscription plan, country, number of transactions, etc.
Build the habit of manual spot checks and cross-checks in every report or dashboard. Don’t trust tooling or automation.
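Pulling the random users is the easy part to script; the manual review is still on you. A minimal sketch, assuming the report is available as a list of records (the `users` data and its field names are invented for illustration):

```python
import random

def sample_for_spot_check(records, n=5, seed=None):
    """Pick up to n distinct records at random for manual review."""
    rng = random.Random(seed)  # a fixed seed lets a teammate review the same sample
    return rng.sample(records, k=min(n, len(records)))

# Hypothetical report rows (made-up data and field names).
users = [
    {"user_id": i,
     "plan": "annual" if i % 3 == 0 else "monthly",
     "paid_price": 99.0 if i % 3 == 0 else 12.0}
    for i in range(1, 201)
]

picked = sample_for_spot_check(users, n=5, seed=42)
for user in picked:
    # Compare each row by hand against the billing tool or another source.
    print(user)
```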
2. How To Do a Root Cause Analysis
We apply root cause analysis when a metric shows unexpected movement. For example:
Average DAU starts gradually declining M/M, but MAU stays flat.
Net new transactions increase 5% W/W, but total revenue doesn’t change.
Customer churn for annual subscriptions doubles.
The trial-to-paid rate drops by 10%.
The keyword is unexpected. Analysts own the concept of a baseline - an estimated (modeled or forecasted) value with applied seasonal, M/M, and Y/Y adjustments.
For example, you may notice that total transactions sharply decline in September after August. This decline may be expected if the same pattern appears each year during this period. Or it may point to a bug. Or it may be a mix of both. To break it down, you need to know your baseline: how many transactions you typically expect at this time of the month and year, and how much that number has changed.
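To make the August-to-September example concrete, here is a rough sketch of one very simple baseline: apply the average historical month-over-month ratio to last month’s actual. The numbers and the function name are invented, and a real baseline would usually come from a proper forecast; this only shows how to split a decline into an expected part and an unexplained part.

```python
def seasonal_baseline(prior_years, previous_month_actual):
    """Expected value = last month's actual times the average historical M/M ratio."""
    ratios = [september / august for august, september in prior_years]
    average_ratio = sum(ratios) / len(ratios)
    return previous_month_actual * average_ratio

# Hypothetical (August, September) transaction counts from prior years.
history = [(10_000, 9_000), (12_000, 10_800), (15_000, 13_500)]

august_actual = 20_000
september_actual = 15_000

expected = seasonal_baseline(history, august_actual)
unexplained = september_actual - expected

print(f"Expected September: {expected:,.0f}")
print(f"Actual September:   {september_actual:,}")
print(f"Unexplained change: {unexplained:,.0f}")
```

In this made-up case a seasonal dip is expected, but the actual drop is larger than the pattern explains, so both stories are partly true.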
First things first: ensure the data you see is correct.
Find at least two other data sources showing a similar decrease to confirm it’s real.
Possible issues: a broken ETL job, schedules that break over holidays and long weekends, etc.
If you’re confident the data is accurate, and this is an actual decline, proceed with modeling different hypotheses on what the issue may be.
Analysis: Generate multiple hypotheses that you have to confirm or reject.
Product hypothesis: the drop is related to a product bug or a specific product launch. There is usually a sharp drop if it’s a bug. The drop can also be gradual for releases because teams often do slow rollouts with 1% traffic release, then 20% → 50% → 100%. It can also be for a segment of users (e.g., new users only or paid).
Market or competition hypothesis: a new tool may be taking market share, spending may have slowed down, or the user acquisition strategy may have shifted. You will notice it as a gradual decline, not necessarily tied to a specific campaign or promotion.
User hypothesis: This is typically a gradual, inconsistent decline. Given that (a) the proportion of different personas tends to change, and (b) they are not necessarily tied to seasonality (rather to marketing campaigns that brought them in), it can be challenging to capture the beginning of a decline. You would need to check the drop against different user cohorts - registered but not active, active but lapsing, power users, premium or free, business or individuals, or whatever personas you work with.
External factors: pandemic, war, or big social movement (#MeToo, BLM, abortion law, etc.). These may cause a sharp decline (or an increase) in usage across many top user actions, features, and platforms.
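A quick way to start confirming or rejecting these hypotheses is to break the drop down by segment and see where it concentrates. The sketch below uses invented weekly active user counts and segment names. A drop concentrated in one segment hints at a product or acquisition issue for that group, while a drop spread evenly across segments looks more like a market or external factor.

```python
def drop_by_segment(before, after):
    """Return (segment, change, share_of_total_change) rows, largest drop first."""
    total_change = sum(after.values()) - sum(before.values())
    rows = []
    for segment, previous in before.items():
        change = after.get(segment, 0) - previous
        share = change / total_change if total_change else 0.0
        rows.append((segment, change, share))
    return sorted(rows, key=lambda row: row[1])

# Hypothetical weekly active users per segment (made-up numbers).
before = {"new": 2000, "returning free": 5000, "paid": 3000}
after = {"new": 1200, "returning free": 4950, "paid": 2950}

rows = drop_by_segment(before, after)
for segment, change, share in rows:
    print(f"{segment:15s} change={change:+d} share of drop={share:.0%}")
```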
Once you’ve proven the hypothesis based on the data, whether it’s a product bug, marketing issue, or user behavior, escalate it to the appropriate team to gather more context. Ideally, you’re already a few steps ahead, with hypotheses and analyses prepared on which metric is declining and by how much.
3. How To Develop A KPI And Connect It To Action
KPIs can be financial, customer-focused, and process-focused.
There are many types of metrics:
Top-level metrics - a measure of strategic direction and performance. Each top-level metric is tailored and adjusted to the specific business model, environment, and strategy. This is when context and nuances really matter. Top-level metrics and KPIs are usually reported monthly and quarterly.
North Star Metric - represents one company goal. From Mixpanel North Star Metric: “To qualify as a ‘North Star,’ a metric must do three things: lead to revenue, reflect customer value, and measure progress.” It can be DAU, LTV, MRR, or another measurement, and its main purpose is to align teams around one main goal.
Secondary metrics - more granular health indicators and product targets. They measure how successful the product and process are. Secondary metrics are sensitive to any product changes. Therefore, you would measure A/B tests, feature adoptions, or bug impacts against secondary metrics. Due to their sensitivity, they are also usually monitored and reported weekly.
Vanity metrics - impressive but not useful or actionable metrics that don’t lead to growth or revenue and aren’t relevant to anything you can do to improve them. They are often too simple and ignore the context. Examples: the number of social media followers or total registered users. It’s like only working out your arms when you go to the gym and ignoring your core. More about Vanity metrics.
OMTM - One Metric That Matters. This is different from the North Star Metric and is meant to be a temporary goal unifying all the teams at the company around one issue. An example: if your software got hacked and all user accounts got deleted, you would set the OMTM as the number of reinstated accounts. For a less dramatic example, when you begin a migration, your OMTM can be the number of successfully migrated accounts. Or when your churn significantly outweighs new accounts and renewals, you have to pause everything and focus on retention.
The right metric should be:
Relevant - represent the result you want to see. If you make a change to a user flow, you should measure user steps and following actions, not net new revenue.
Measurable - do you even have the data to get the metric? Do you trust the source?
Specific - detailed enough to illustrate the right product movement. User retention is not the right metric to measure an A/B test. But the frequency and/or type of actions are.
Prioritized - stands out from other metrics as the highest priority. Separate must-have metrics from nice-to-have ones in your reporting.
Balanced - meant to measure positive and negative outcomes. If you notice a traffic increase for using one feature, most likely there is a decrease somewhere else.
Metric types
You probably know this already, but there are four main categories of metrics, each serving a different purpose:
Sums and counts - Daily Active Users, the sum of sales, unique number of unsubscribers, etc.
Distribution (mean, median, mode, percentiles) - average memory used, % of MAU, a median session length, or others.
Probability and rates - if you change a screen layout, you have to measure click-through rate or click-through probability.
Ratios - monthly/annual subscription ratio, male/female usage ratio, etc.
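To see how the four categories differ in practice, here is a small sketch that computes one metric of each type from the same made-up data. The rows and field names are assumptions for illustration only.

```python
from statistics import median

# Hypothetical one-day snapshot: one row per active user (made-up data).
users = [
    {"user": "a", "plan": "monthly", "session_min": 4, "clicked": True},
    {"user": "b", "plan": "monthly", "session_min": 10, "clicked": False},
    {"user": "c", "plan": "annual", "session_min": 6, "clicked": True},
    {"user": "d", "plan": "monthly", "session_min": 8, "clicked": False},
    {"user": "e", "plan": "annual", "session_min": 12, "clicked": False},
]

# 1. Sums and counts: Daily Active Users.
dau = len({u["user"] for u in users})

# 2. Distribution: median session length in minutes.
median_session = median([u["session_min"] for u in users])

# 3. Probability and rates: share of active users who clicked.
click_through_rate = sum(u["clicked"] for u in users) / dau

# 4. Ratios: monthly subscribers per annual subscriber.
monthly = sum(u["plan"] == "monthly" for u in users)
annual = sum(u["plan"] == "annual" for u in users)
monthly_to_annual = monthly / annual

print(dau, median_session, click_through_rate, monthly_to_annual)
```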
Here are a few examples of common metrics across different domains:
Growth & Marketing: Unique Visitors, First Visits, Returning Visitors, Bounce Rate, Installs, Signups, Customer Acquisition Cost (CAC), Click-Through Rate (CTR), Cost Per Impression (CPI), Cost Per Action (CPA), Time to Value, Visitor-to-Signup Rate, Signup-to-Payment Rate, Product or Feature Adoption Rate, Virality, Network Effect Score, Return on Advertising Spend (ROAS), Number of Qualified Leads, Lead Conversion Rate, Average Lead Score, Cost Per Lead (CPL), Unsubscribes.
Revenue: Monthly Recurring Revenue (MRR), Annual Recurring Revenue (ARR), Net Revenue, Net Revenue Retention, Paid Customers, Activated Trials, Free-to-Paid Conversions, Paid-to-Free Downgrades, Revenue Churn, Customer Churn, Monthly/Weekly Customers Completing Their First Order, Daily/Monthly Total Purchase Value, Lifetime Value (LTV), Average Revenue Per Account (ARPA), Upsell-to-Payment Rate, Expansion Revenue, Return on Investment (ROI), and others.
Engagement: MAU, WAU, DAU, Adjacent Users, Day 0, Day 1+, Day 7+, and Day 28 Retention, 1-Year or 2-Year Retention %, Number of Returning Users, Daily/Hourly Number of Actions, Total Watch Time, Total Time Spent, Frequency of Visits, Pages Per Session, Scroll Depth, Average Session Duration, Exit Rate, Product Abandonment Rate, and others.
Customer Success: Customer Satisfaction Score (CSAT), Net Promoter Score (NPS), Customer Health Score, Ticket Resolution Rate, Average Resolution Time, Average Reply Time, Customer Effort Score (CES), First Response Time, Daily/Monthly Ticket Requests.
Platform / Engineering: Product Support Cost, R&D Engineering Cost, Outsourcing Rate, Cost Performance Index (CPI), Schedule Performance Index (SPI), Uptime, Average Downtime per Month/Year, Machine Downtime Rate, % Planned Maintenance, Number of Releases, Running Cost, Number of Bugs, Number of Pull Requests, Capacity Utilization, Memory Usage, Requests Per Minute (RPM), Errors Per Minute, and others.
If you’d like to read more on KPIs:
4. KPIs Done Wrong: Fixing Common Reporting Mistakes
Don’t overthink KPIs. If you’re unsure how to measure an initiative, stick to simple metrics like unique views, CTA clicks, and the % of users with a CTA click. Not everything needs to be tied to LTV or MRR.
Use an effective proxy metric that is both sensitive and independent. If your proxy requires complex calculations, it’s not a good proxy.
Don’t stress about benchmarks - focus on your Signup-to-Paid MoM growth.
Avoid bringing metrics definitions from your previous job into your current project. Every product is unique, with different user lifecycles. Some apps have monthly and annual subscriptions, while others have 18 payment plans. Churn calculations will vary. Develop KPIs that fit this specific business and product.
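The simple metrics from the first tip above (unique views, CTA clicks, and the % of users with a CTA click) need nothing more than an event log. A minimal sketch, assuming events arrive as (user, event name) pairs; the event names and data are invented:

```python
# Hypothetical event log as (user_id, event_name) pairs (made-up data).
events = [
    ("u1", "view"), ("u1", "cta_click"),
    ("u2", "view"),
    ("u3", "view"), ("u3", "cta_click"), ("u3", "cta_click"),
    ("u4", "view"),
]

# Unique views: distinct users who saw the page or feature.
viewers = {user for user, name in events if name == "view"}
unique_views = len(viewers)

# CTA clicks: every click counts, including repeats by the same user.
cta_clicks = sum(1 for _, name in events if name == "cta_click")

# % of users with a CTA click: distinct clickers among viewers.
clickers = {user for user, name in events if name == "cta_click"}
pct_users_with_click = len(clickers & viewers) / unique_views

print(f"Unique views: {unique_views}")
print(f"CTA clicks: {cta_clicks}")
print(f"Users with a CTA click: {pct_users_with_click:.0%}")
```

Note that total clicks and the share of users who clicked answer different questions, so it helps to report both.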
Final Thoughts
There are a lot of skills you’ll start to take for granted as you grow as a data analyst or engineer. But they all came from somewhere.
I hope this newsletter helps you put words to concepts or can be an article you share with someone just joining the data world.
As always, thanks for reading!
Articles Worth Reading
There are thousands of new articles posted daily all over the web! I have spent a lot of time sifting through some of these articles, as well as TechCrunch and company tech blogs, and wanted to share some of my favorites!
Apache Hudi™ at Uber: Engineering for Trillion-Record-Scale Data Lake Operations
Uber operates one of the most diverse and demanding data ecosystems in the world. Every trip taken, order delivered, ad served, or real-time arrival time recalculated generates an unending stream of data. These data points come from hundreds of microservices, thousands of cities, and millions of riders, each with its own velocity, shape, and business-criticality. At the heart of this ecosystem lies Uber’s data lake: a multi-hundred-petabyte repository that fuels operational decisions, machine learning models, experimentation platforms, and real-time business intelligence.
What It Actually Takes to Build a Data Pipeline System
When I first started in the data world, it was common that many data teams would build their own data pipeline solutions. There were still dozens of options in terms of off the shelf tools of course, nevertheless, you’d see custom pipelines developed everywhere.
In 2025, I saw less of this.
In fact, in many cases data teams would go straight to picking tools or solutions.
But let’s say you do want to go down this route. You want to build your own data pipeline solution?
How would you do it?
End Of Day 208
Thanks for checking out our community. We put out 4-5 Newsletters a month discussing data, tech, and start-ups.
If you enjoyed it, consider liking, sharing and helping this newsletter grow.





