Questions to Ask When Analyzing Data
A practical checklist of questions for analysts and decision-makers to define the problem, validate data, choose methods, and translate findings into clear actions.
1. What decision will this analysis inform, and who needs to make it?
Why this works: Anchors the work to a real decision owner and prevents aimless exploration.
2. What is the minimal useful output (metric, threshold, ranking)?
Why this works: Clarifies deliverables so scope, method, and timelines match the business need.
3. How was the data generated, and what biases could it contain?
Why this works: Surfaces selection, measurement, and survivorship biases that distort conclusions.
4. What is the unit of analysis, and what are the time windows?
Why this works: Avoids aggregation errors and mismatched denominators that mislead stakeholders.
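For instance, the same revenue data gives different answers at the event, user, and week level. A minimal pandas sketch; the `events` frame and its columns are hypothetical stand-ins for your own schema:

```python
import pandas as pd

# Hypothetical event log: one row per event, not per user.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2, 3],
    "ts": pd.to_datetime([
        "2024-01-01", "2024-01-08", "2024-01-02",
        "2024-01-03", "2024-01-09", "2024-01-10",
    ]),
    "revenue": [10.0, 5.0, 3.0, 7.0, 2.0, 8.0],
})

# Event-level mean: every event counts equally (denominator = events).
print(events["revenue"].mean())

# User-level mean: aggregate to users first (denominator = users),
# so heavy users no longer dominate.
print(events.groupby("user_id")["revenue"].sum().mean())

# Explicit weekly windows before any period-over-period comparison.
print(events.set_index("ts").resample("W")["revenue"].sum())
```

The three printouts answer three different questions; stating which denominator and window you mean is the whole point of this checklist item.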
5. Where are the missing values, outliers, and duplicates, and why are they there?
Why this works: Forces data-quality triage before modeling, improving the reliability of results.
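A minimal triage pass in pandas might look like the following; `df` and its columns are placeholders for your own dataset:

```python
import numpy as np
import pandas as pd

# Placeholder dataset; substitute your own frame and columns.
df = pd.DataFrame({
    "id": [1, 2, 2, 3, 4],
    "value": [10.0, 12.0, 12.0, np.nan, 900.0],
})

# Missing values: count per column, then ask *why* they are missing.
print(df.isna().sum())

# Duplicates: exact duplicate rows often signal a join or ingestion bug.
print(df[df.duplicated(keep=False)])

# Outliers: a simple IQR fence; flag and investigate, don't silently drop.
q1, q3 = df["value"].quantile([0.25, 0.75])
iqr = q3 - q1
mask = (df["value"] < q1 - 1.5 * iqr) | (df["value"] > q3 + 1.5 * iqr)
print(df[mask])
```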
6. What baseline or prior should we compare against?
Why this works: Frames results relative to expectations or controls, not in isolation.
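One lightweight way to build in a baseline is scikit-learn's DummyClassifier; the synthetic data below stands in for a real labeled set:

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Imbalanced synthetic data: ~90% of labels are the majority class.
X, y = make_classification(n_samples=500, weights=[0.9], random_state=0)

# "Always predict the majority class" scores ~0.9 accuracy here,
# so that is the bar any real model must clear.
baseline = DummyClassifier(strategy="most_frequent")
model = LogisticRegression(max_iter=1000)

print("baseline:", cross_val_score(baseline, X, y, cv=5).mean())
print("model:   ", cross_val_score(model, X, y, cv=5).mean())
```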
7. What assumptions does our method require, and do they hold?
Why this works: Prevents invalid inference by checking normality, independence, stationarity, etc.
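A few quick checks, sketched with scipy and statsmodels on placeholder data; `residuals` and `series` stand in for your model residuals and time series:

```python
import numpy as np
from scipy import stats
from statsmodels.stats.stattools import durbin_watson
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
residuals = rng.normal(size=200)          # stand-in for model residuals
series = np.cumsum(rng.normal(size=200))  # a random walk: non-stationary

# Normality: Shapiro-Wilk (small p-value is evidence against normality).
print(stats.shapiro(residuals))

# Independence: Durbin-Watson near 2 suggests little autocorrelation.
print(durbin_watson(residuals))

# Stationarity: augmented Dickey-Fuller p-value (a large p-value means we
# cannot reject a unit root, i.e. the series may be non-stationary).
print(adfuller(series)[1])
```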
8. What are plausible alternative explanations?
Why this works: Encourages counterfactual thinking and reduces false certainty in causal claims.
9. What sensitivity checks or ablations will we run?
Why this works: Tests robustness by varying inputs, features, or time slices.
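A simple ablation loop, sketched here with scikit-learn on synthetic data, drops one feature at a time and tracks the change in cross-validated score:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=5, noise=10, random_state=0)

full = cross_val_score(Ridge(), X, y, cv=5).mean()
print(f"all features: {full:.3f}")

# Drop each feature in turn; a large score drop marks a load-bearing input.
for i in range(X.shape[1]):
    score = cross_val_score(Ridge(), np.delete(X, i, axis=1), y, cv=5).mean()
    print(f"without feature {i}: {score:.3f} (delta {score - full:+.3f})")
```

The same loop shape works for time slices (refit on each window) or alternative preprocessing choices.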
10. What’s the simplest model that would be good enough?
Why this works: Right-sizes complexity to interpretability and deployment costs.
11. How will we validate results (holdout, cross-validation, backtesting)?
Why this works: Ensures generalization and guards against overfitting to historical noise.
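As a sketch, scikit-learn covers both common cases: shuffled k-fold for exchangeable rows and a time-ordered split for backtesting. The dataset and estimator below are stand-ins:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score

X, y = make_classification(n_samples=400, random_state=0)
model = LogisticRegression(max_iter=1000)

# Shuffled k-fold is appropriate when rows are exchangeable.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
print(cross_val_score(model, X, y, cv=kf).mean())

# For temporal data, always train on the past and test on the future;
# shuffling would leak future information and inflate the score.
print(cross_val_score(model, X, y, cv=TimeSeriesSplit(n_splits=5)).mean())
```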
12. What error bars, confidence intervals, or prediction intervals matter here?
Why this works: Communicates uncertainty transparently for risk-aware decisions.
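When a closed-form interval is awkward, a bootstrap is a serviceable default. A minimal sketch, assuming `data` is a representative sample of the metric you care about:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.exponential(scale=3.0, size=250)  # placeholder sample

# Resample with replacement many times; the middle 95% of the
# resampled means is the confidence interval.
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(5000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {data.mean():.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```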
13. If we’re wrong, how will it fail, and what’s the impact?
Why this works: Focuses on downside risk and mitigation plans upfront.
14. How will we make this explainable to non-technical stakeholders?
Why this works: Prioritizes simple narratives, visuals, and decision-ready artifacts.
15. What decision thresholds trigger action, and who owns the next step?
Why this works: Turns findings into a playbook with clear responsibilities.
16. What data would most improve this analysis next time?
Why this works: Creates a learning loop that compounds value across projects.
17. What are the privacy, security, or ethical considerations?
Why this works: Reduces legal and reputational risk by designing for responsible use.
18. What’s the ROI of acting on these insights?
Why this works: Frames impact in dollars, hours, or risk reduction to prioritize execution.
19. How will we monitor drift or model decay over time?
Why this works: Plans for maintenance, alerts, and retraining to preserve performance.
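One simple monitoring pattern is to compare a live feature's distribution against its training snapshot, for example with a two-sample Kolmogorov-Smirnov test. The data and alert threshold below are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
train_feature = rng.normal(loc=0.0, scale=1.0, size=2000)  # training snapshot
live_feature = rng.normal(loc=0.4, scale=1.0, size=2000)   # drifted live data

# Two-sample KS test: are both samples drawn from the same distribution?
stat, p_value = stats.ks_2samp(train_feature, live_feature)
print(f"KS={stat:.3f}, p={p_value:.2e}")
if p_value < 0.01:  # illustrative alert threshold; tune per feature and volume
    print("drift alert: investigate and consider retraining")
```

Run a check like this on a schedule per feature and per prediction score, and wire the alert into whatever paging or dashboard system the team already uses.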
20. What is the one-slide summary a leader needs to say yes?
Why this works: Forces ruthless synthesis so the recommendation is easy to approve.
From Question to Decision: A Data Analysis Playbook
Expert tips and techniques for getting the most out of these questions.
Analysis That Drives Action
Start With the Decision
Define the decision, its owner, and the deadline before opening a notebook.
Make Assumptions Explicit
List and test assumptions; document what breaks if they fail.
Quantify Uncertainty
Leaders trust ranges more than point estimates; show your error bars.
Stakeholder-Ready Artifacts
Decision Memo Template
Common Pitfalls
Analysis Paralysis
Timebox exploration; ship the minimal useful answer first.
Causal Overreach
Avoid claiming causality without design or instruments to support it.