100 Types of Data Analysis

Category 01

Foundational Analysis

The core analytical hierarchy every data professional must master — from understanding the past to recommending the future.

#01

Descriptive Analysis

Summarizes historical data using measures like mean, median, mode, and standard deviation to understand what has already happened.

Example → A retail chain reports average monthly sales of $2.4M, median transaction value of $47, and identifies that December accounts for 28% of annual revenue.
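
A minimal sketch of these summary measures using Python's standard `statistics` module. The transaction values are hypothetical and chosen only for illustration; they are not the retail chain's data.

```python
import statistics

# Hypothetical transaction values in dollars (illustrative only, not from the example above).
transactions = [12, 47, 47, 35, 60, 47, 150, 28, 90]

mean_value = statistics.mean(transactions)      # arithmetic average
median_value = statistics.median(transactions)  # middle value, robust to the 150 outlier
mode_value = statistics.mode(transactions)      # most frequent value
spread = statistics.stdev(transactions)         # sample standard deviation

print(f"mean={mean_value:.2f} median={median_value} mode={mode_value} stdev={spread:.2f}")
```

In this toy data the mean sits above the median because one large transaction pulls it up, which is one reason to report both.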

#02

Exploratory Data Analysis (EDA)

An open-ended investigation using visual and statistical techniques to uncover patterns, spot anomalies, and test assumptions before formal modeling.

Example → A data scientist creates scatter plots and correlation matrices on healthcare data, discovering BMI and blood pressure are strongly correlated (r=0.74).

#03

Diagnostic Analysis

Drills into data to identify the root cause of a specific outcome. Answers: "Why did this happen?"

Example → Website traffic fell 35% in March. Diagnostic analysis reveals a Google algorithm update penalized thin content, affecting 120 blog posts specifically.

#04

Predictive Analysis

Uses statistical models and machine learning to forecast future outcomes based on historical patterns and trends.

Example → A bank trains a Random Forest model on 5 years of loan data to predict default with 89% accuracy, saving an estimated $12M annually in bad debt.

#05

Prescriptive Analysis

The highest tier — recommends specific actions to achieve desired outcomes, often using optimization and simulation techniques.

Example → An airline's pricing system recommends real-time seat prices to maximize revenue per flight, adjusting 50,000 prices per second across 500 routes.

#06

Inferential Analysis

Draws conclusions about a large population by analyzing a representative sample, using probability theory and statistical tests.

Example → Sampling 2,000 patients from a 500,000-person database infers 23% have pre-diabetic markers, with ±2% margin of error.
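
The margin of error in this example can be checked with the usual normal-approximation formula for a proportion. The sketch below assumes a 95% confidence level (z = 1.96) and simple random sampling, and it ignores the finite-population correction; none of these are stated in the example.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Figures from the example: 23% observed in a sample of 2,000.
moe = margin_of_error(0.23, 2000)
print(f"23% +/- {moe:.1%}")
```

Under these assumptions the half-width comes out at roughly 1.8 percentage points, in line with the quoted ±2%.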

#07

Causal Analysis

Establishes genuine cause-and-effect relationships — not just correlations — using controlled experiments or quasi-experimental methods.

Example → A randomized controlled trial proves that one-click checkout caused an 18% conversion increase, eliminating alternative explanations through proper design.

#08

Mechanistic Analysis

Explains the precise biological, chemical, or physical mechanism behind an observed relationship — common in scientific research.

Example → Researchers trace the exact molecular pathway: drug → inhibits COX-2 enzyme → reduces prostaglandin synthesis → less inflammation.

Category 02

Statistical Analysis

Rigorous mathematical methods for understanding data relationships, distributions, and significance — the backbone of evidence-based decision making.

#09

Regression Analysis

Models the relationship between a dependent variable and one or more independent variables to predict outcomes and quantify relationships.

Example → Linear regression predicts each additional $1,000 of ad spend generates $4,200 in revenue (R²=0.91), establishing a clear ROI model.

#10

Correlation Analysis

Measures the strength and direction of the relationship between two variables, using Pearson's r for linear association or Spearman's rho for monotonic (rank-based) association.

Example → Ice cream sales and drowning incidents show r=0.85 — a classic example of correlation ≠ causation (both driven by hot weather).
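
A small sketch of the two coefficients, written from scratch so the arithmetic is visible. Pearson's r is computed on the raw values and Spearman's rho is Pearson's r computed on ranks. The data is hypothetical, and the `ranks` helper assumes there are no tied values.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Ranks 1..n in ascending order; assumes no ties to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: y rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]
print(f"pearson={pearson_r(x, y):.3f} spearman={spearman_rho(x, y):.3f}")
```

Because y always increases with x, the rank-based coefficient is 1, while Pearson's r is lower because the relationship is curved.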

#11

Hypothesis Testing

Formal statistical framework to test assumptions about data using p-values, t-tests, chi-square tests, ANOVA, and other procedures.

Example → A t-test confirms the new pricing page generated significantly higher conversions (p=0.003), well below the 0.05 significance threshold.

#12

Variance Analysis

Compares actual results to planned or expected results, breaking down differences into price variance, volume variance, and efficiency variance.

Example → Production costs exceeded budget by $80K: $50K is price variance (materials cost more) and $30K is efficiency variance (machine downtime).

#13

Distribution Analysis

Examines the shape, spread, and statistical properties of data — testing for normality, skewness, and kurtosis to inform analytical approach.

Example → Customer spend data shows right-skewed distribution (skewness=2.3) — log transformation normalizes it, enabling valid parametric testing.

#14

Quantile Analysis

Divides data into equal-frequency groups (quartiles, deciles, percentiles) to understand distribution and identify threshold effects.

Example → The top 10% of customers account for 52% of revenue — justifying a dedicated VIP program investment.

#15

Frequency Analysis

Counts occurrences of each value or category, producing frequency tables and histograms to understand data distribution across categories.

Example → Support ticket frequency analysis: 42% billing issues, 28% technical, 18% account access — informing team staffing ratios.

#16

Cross-Tabulation

Creates contingency tables showing joint frequency of two or more categorical variables, often with chi-square independence tests.

Example → Crosstabbing region × product preference: 71% of West customers buy Product A vs. 38% in the East — informing regional marketing.

#17

Autocorrelation Analysis

Measures how a variable correlates with its own past values (lags), essential for identifying temporal patterns before time series modeling.

Example → Web traffic autocorrelation reveals a strong weekly cycle (lag-7 r=0.92), confirming Monday traffic reliably predicts next Monday.
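
A rough sketch of a lag-k autocorrelation estimate, assuming daily observations so that lag 7 corresponds to one week. The traffic series is hypothetical: a single made-up weekly pattern repeated for eight weeks.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (biased estimator, common in ACF plots)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return numer / denom

# Hypothetical daily visits (Mon..Sun), repeated for 8 weeks.
week = [120, 100, 95, 90, 85, 40, 35]
traffic = week * 8

lag7 = autocorrelation(traffic, 7)  # same weekday, one week earlier
lag3 = autocorrelation(traffic, 3)  # misaligned weekdays
print(f"lag-7={lag7:.3f} lag-3={lag3:.3f}")
```

The lag-7 value is high because each day lines up with the same weekday a week earlier, while the lag-3 value is much lower.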

#18

Multicollinearity Analysis

Detects when independent variables in a regression model are highly correlated, inflating standard errors and destabilizing coefficients.

Example → VIF scores reveal "age" and "years of experience" have VIF=12.4 (above threshold of 10) — one is removed before finalizing the salary model.

Category 03

Machine Learning & AI Analysis

Algorithmic approaches that let computers learn patterns from data to make predictions, classifications, and generate insights at scale.

#19

Cluster Analysis

Unsupervised learning that groups data points into clusters based on similarity, with no predefined labels.

Example → K-Means clustering segments 200,000 app users into 6 behavioral groups: power users, casuals, nighttime browsers, weekend warriors, deal-hunters, and dormant.

#20

Classification Analysis

Supervised learning that assigns each data point to one of several predefined categories using trained models like decision trees, SVMs, or neural networks.

Example → A CNN classifies X-ray images as "normal," "pneumonia," or "COVID-19" with 94.2% accuracy, assisting radiologists in high-volume screening.

#21

Anomaly Detection

Identifies data points that deviate significantly from expected patterns — used for fraud detection, quality control, and system monitoring.

Example → Isolation Forest flags 0.3% of transactions as fraudulent, detecting overseas card-not-present transactions following new account creation within 4 hours.

#22

Principal Component Analysis

Dimensionality reduction that transforms correlated high-dimensional data into uncorrelated principal components retaining maximum variance.

Example → PCA compresses a 200-gene expression dataset into 8 principal components explaining 87% of variance, enabling tractable downstream analysis.

#23

Factor Analysis

Identifies underlying latent factors explaining observed correlations — widely used in psychology, social science, and survey research.

Example → Factor analysis of a 25-question employee survey reveals 4 latent factors: management trust, role clarity, work-life balance, and career growth.

#24

Dimensionality Reduction

Techniques like t-SNE, UMAP, and autoencoders that compress high-dimensional data for visualization and modeling.

Example → t-SNE projects 768-dimensional BERT embeddings of 50,000 news articles into 2D, revealing 12 distinct topic clusters without any labels.

#25

Survival Analysis

Analyzes time until an event occurs (death, churn, failure), handling censored data where the event hasn't yet happened.

Example → Kaplan-Meier curves show SaaS customers not using the collaboration feature have 60% 12-month retention vs. 90% for feature users.

#26

Reinforcement Learning

Analyzes outcomes from agents that learn by trial-and-error, optimizing for cumulative reward — used in robotics, game AI, and dynamic pricing.

Example → An RL agent learns optimal inventory restocking policies via supply chain simulation, reducing stockouts by 34% vs. rule-based systems.

#27

Ensemble Analysis

Combines multiple models (bagging, boosting, stacking) to improve prediction accuracy and reduce overfitting beyond any single model.

Example → XGBoost ensemble of 500 decision trees wins a Kaggle churn competition with 96.1% AUC, vs. any individual tree at 85% AUC.

#28

Explainability (XAI)

Interprets "black box" ML models to understand which features drive predictions — critical for regulated industries and trust-building.

Example → SHAP values reveal a mortgage model's top features: credit score (38%), debt-to-income (27%), and loan-to-value ratio (19%), enabling fair lending audits.

Category 04

Behavioral & User Analysis

Techniques focused on understanding how users, customers, and cohorts behave over time — the foundation of modern product and growth analytics.

#29

Cohort Analysis

Groups users by a shared characteristic or start date and tracks behavior over time, revealing how different acquisition periods perform.

Example → Users acquired via organic search have 45% 90-day retention vs. 22% for paid social — revealing the sustainable growth channel.

#30

Funnel Analysis

Maps conversion rates across sequential steps in a user journey, identifying where and why users drop off at each stage.

Example → Visit (100%) → Browse (68%) → Cart (32%) → Checkout (18%) → Purchase (11%). The Cart→Checkout step loses 44% — fixed with one-click checkout.
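
The step-level drop-off can be computed directly from the stage percentages in this example. The helper name below is an illustrative choice, not a standard API.

```python
# Stage percentages taken from the example (share of all visitors reaching each stage).
funnel = [("Visit", 100), ("Browse", 68), ("Cart", 32), ("Checkout", 18), ("Purchase", 11)]

def step_drop_offs(stages):
    """Return (from_stage, to_stage, share lost) for each consecutive pair of stages."""
    drops = []
    for (name_a, pct_a), (name_b, pct_b) in zip(stages, stages[1:]):
        drops.append((name_a, name_b, (pct_a - pct_b) / pct_a))
    return drops

drops = step_drop_offs(funnel)
for a, b, lost in drops:
    print(f"{a} -> {b}: loses {lost:.0%}")
```

On these numbers Cart → Checkout loses about 44% of the users who reach the cart. Browse → Cart loses a somewhat larger share (about 53%), so it is worth computing the rate for every transition before choosing where to intervene.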

#31

Churn Analysis

Identifies customers likely to stop using a product or service and models the key drivers of departure.

Example → Logistic regression finds customers with <2 logins in 30 days, no mobile app, and declining adoption have 78% churn probability within 60 days.

#32

RFM Analysis

Segments customers by Recency, Frequency, and Monetary value to prioritize marketing efforts and personalize outreach.

Example → RFM scoring identifies 3,200 "Champions" — offered exclusive early product access with 65% acceptance rate and 3.8x higher spend vs. average.

#33

Customer Lifetime Value

Projects total net profit from a customer across their entire relationship with the business, informing acquisition and retention investment.

Example → CLV modeling reveals premium subscribers are worth $2,800 over 3 years vs. $380 for free-tier upgraders — justifying higher CAC for premium channels.

#34

Behavioral Analysis

Studies patterns in how users interact with products — clicks, scrolls, session duration, feature usage — to identify friction and opportunity.

Example → Session recording of 10,000 sessions reveals 73% of users who see the pricing table spend >2 minutes comparing plans vs. 8-second average elsewhere.

#35

Clickstream Analysis

Analyzes the sequential path of clicks a user makes through a website or app, revealing navigation patterns and common journeys.

Example → Clickstream analysis reveals a high-converting path: Home → Blog → Case Studies → Pricing → Demo. This "research path" converts at 4.2x the site average.

#36

Customer Journey Analysis

Maps the complete end-to-end experience across all touchpoints and channels from awareness to advocacy.

Example → Journey mapping shows 60% of enterprise customers touch 7+ channels over a 3-month sales cycle, with LinkedIn as first touchpoint for 45% of deals.

#37

Demographic Analysis

Segments data by demographic characteristics to understand how different groups behave differently.

Example → Gen Z users (18–26) have 3x higher daily active usage but 60% lower willingness to pay compared to Millennials (27–42).

#38

Workforce Analytics

Applies data analysis to HR data — hiring, performance, attrition, engagement — to optimize people decisions.

Example → Workforce analytics finds engineers who skip week-3 onboarding activities have 2.3x higher 90-day attrition, enabling targeted redesign.

Category 05

Text & NLP Analysis

Techniques for extracting structured insight from unstructured text, audio, and language data — powering everything from chatbots to market research.

#39

Sentiment Analysis

Uses NLP to determine emotional tone (positive, negative, neutral) of text at document, sentence, or aspect level.

Example → Analyzing 120,000 app reviews finds 81% positive overall, but negative sentiment spikes to 67% for "battery performance" specifically — guiding the engineering roadmap.

#40

Text Mining

Extracts meaningful patterns and structured information from large collections of unstructured text using statistical and linguistic methods.

Example → Mining 5 years of support tickets automatically identifies "slow loading" complaints increased 340% after a platform migration, flagging a performance regression.

#41

Topic Modeling

Unsupervised technique (LDA, NMF) that discovers hidden thematic structure in a document corpus without predefined categories.

Example → LDA on 50,000 customer emails uncovers 8 latent topics: pricing, delivery speed, quality, service, packaging, returns, mobile app, and checkout.

#42

Text Classification

Assigns text documents to predefined categories using trained ML models, automating manual tagging at scale.

Example → BERT-based classifier routes 10,000 daily support emails to 8 departments with 96% accuracy, reducing manual routing from 4 hours to 2 minutes.

#43

Named Entity Recognition

Identifies and classifies named entities (people, organizations, locations, dates) in text — enabling structured extraction from documents.

Example → NER extracts company names, deal values, and dates from 100,000 financial news articles, populating a competitive intelligence database automatically.

#44

Content Analysis

Systematic, replicable technique for categorizing and quantifying content in text, images, or media for research.

Example → Coding 1,000 political news articles as positive/negative/neutral toward a candidate reveals significant media bias differential between outlets.

#45

Social Media Analysis

Mines social platforms for brand mentions, trending topics, influencer networks, and public opinion at scale.

Example → Twitter brand monitoring detects a viral complaint thread 8 minutes after posting, triggering automated customer service escalation before mass coverage.

#46

Speech & Audio Analysis

Extracts insights from audio recordings through speech-to-text, speaker identification, emotion detection, and acoustic feature analysis.

Example → Call center voice analysis detects customer frustration and automatically escalates calls to senior agents, reducing average escalation time by 4 minutes.

#47

Semantic Search Analysis

Uses vector embeddings to find semantically similar content beyond keyword matching, enabling conceptual search across large corpora.

Example → Vector search on a legal database returns contracts that express force majeure concepts even when they never use those exact words, surfacing 340% more relevant results than keyword matching.

#48

Document Summarization

Automatically condenses long documents into key points using extractive or abstractive summarization techniques.

Example → Auto-summarization of 500-page annual reports into 2-page summaries with 94% metric coverage, reviewed by analysts in 8 min vs. 3 hours.

Category 06

Time-Based Analysis

Methods for analyzing data that unfolds across time — identifying trends, seasonality, cycles, and structural changes in temporal datasets.

#49

Time Series Analysis

Analyzes sequentially ordered data to decompose trends, seasonality, and residuals — enabling forecasting and anomaly detection.

Example → SARIMA on 3 years of hourly electricity data decomposes daily cycles, weekly patterns, and annual seasonality — forecasting next week's demand within ±3.2%.

#50

Trend Analysis

Identifies consistent directional movement in data over time, distinguishing genuine trends from noise using moving averages.

Example → 12-month moving average reveals an underlying 8% annual subscription growth trend, despite significant month-to-month noise in raw data.
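
A minimal sketch of a trailing simple moving average. The series is hypothetical, and the window is shortened to 4 periods so the arithmetic is easy to follow; the example's 12-month window works the same way.

```python
def moving_average(values, window):
    """Trailing simple moving average; returns one value per full window."""
    return [sum(values[i - window:i]) / window for i in range(window, len(values) + 1)]

# Hypothetical noisy series with an upward drift.
subscribers = [10, 14, 9, 15, 12, 18, 13, 19]
smoothed = moving_average(subscribers, window=4)
print(smoothed)  # [12.0, 12.5, 13.5, 14.5, 15.5]
```

The raw series moves up and down from one period to the next, while the smoothed series rises steadily, which is the trend the moving average is meant to expose.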

#51

Demand Forecasting

Predicts future demand for products or services using historical data, promotional calendars, economic indicators, and ML models.

Example → Facebook Prophet forecasts daily demand for 2,000 SKUs at 94% accuracy, reducing inventory costs by $2.1M annually through optimized stock levels.

#52

Longitudinal Analysis

Studies the same subjects across multiple time points to track change over long periods, controlling for individual differences.

Example → 10-year longitudinal study of 5,000 employees finds mentorship in Year 1 leads to 31% higher earnings by Year 10, controlling for education and role.

#53

Cross-Sectional Analysis

Studies multiple subjects at a single point in time — a snapshot establishing current-state benchmarks and cross-group comparisons.

Example → Cross-sectional survey of 15,000 companies in Q2 2024 finds firms with data teams of 5+ people are 2.4x more likely to exceed revenue targets.

#54

Panel Data Analysis

Combines longitudinal and cross-sectional dimensions, enabling fixed-effects and random-effects econometric models.

Example → Fixed-effects panel model across 300 retailers over 5 years controls for store-level differences, finding self-checkout reduces staffing costs by 18% net.

#55

Intervention Analysis

Measures the impact of a specific event or policy change on a time series, modeling the step-change effect of an intervention.

Example → Intervention analysis quantifies that a major press mention caused a step-change of +2,400 weekly signups that persisted for 6 weeks.

#56

Lead-Lag Analysis

Studies whether changes in one variable consistently precede changes in another by a predictable time offset — identifying leading indicators.

Example → LinkedIn job postings for "data scientist" lead company stock increases by 6 months (r=0.71) — an alternative investment signal.

#57

Seasonality Analysis

Identifies and quantifies regular periodic patterns in time series data, separating seasonal effects from underlying trend.

Example → December sales are 2.8x the monthly average, justifying a 3x inventory pre-build and dedicated seasonal hiring of 400 staff.

#58

Event-Driven Analysis

Analyzes data patterns triggered by specific events, studying behavior before and after defined trigger points.

Example → Event study around earnings announcements shows stocks drift upward 3 days before positive surprises — a pattern worth investigating for information leakage.

Category 07

Business & Financial Analysis

Quantitative frameworks for evaluating business performance, financial health, strategic decisions, and operational efficiency.

#59

Financial Ratio Analysis

Evaluates financial health by computing liquidity, profitability, efficiency, and leverage ratios from financial statements.

Example → A startup's current ratio of 0.7 (below 1.0) and quick ratio of 0.4 signal critical liquidity risk, triggering a $3M emergency credit line negotiation.

#60

Break-Even Analysis

Calculates the exact point where total revenue equals total costs, establishing the minimum sales volume needed for profitability.

Example → A new restaurant needs 87 covers/day at $42 average spend to break even — informing location selection based on foot traffic data.

#61

Waterfall Analysis

Visualizes how sequential positive and negative values contribute to a final total, tracing the path from starting to ending value.

Example → Revenue waterfall: Q1 base $10M → new customers +$3.2M → upsells +$1.1M → churn -$2.4M → price increases +$0.8M = Q4 total $12.7M.
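
The running totals behind this waterfall can be reproduced with a cumulative sum. The sketch uses the figures from the example, in $M.

```python
from itertools import accumulate

start = 10.0  # Q1 base, $M
steps = [
    ("new customers", 3.2),
    ("upsells", 1.1),
    ("churn", -2.4),
    ("price increases", 0.8),
]

# Running total after each step, beginning with the starting value.
running = list(accumulate((delta for _, delta in steps), initial=start))
end_total = round(running[-1], 1)

for (label, delta), level in zip(steps, running[1:]):
    print(f"{label:>16}: {delta:+.1f} -> {level:.1f}")
print(f"end total: {end_total}")
```

Each intermediate level in `running` is the height at which the next bar of the chart starts.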

#62

Portfolio Analysis

Evaluates a collection of assets, products, or business units to optimize allocation, balance risk, and maximize portfolio-level returns.

Example → BCG matrix classifies 24 products: 3 Stars, 8 Cash Cows, 6 Question Marks, 7 Dogs — informing investment and divestment decisions.

#63

Risk Analysis

Identifies, quantifies, and prioritizes risks using probability-impact matrices, value-at-risk models, and stress testing.

Example → Credit risk model assigns each loan PD × LGD × EAD to compute expected loss — enabling risk-adjusted pricing and capital allocation.
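
A short sketch of the expected-loss formula named in the example, where PD is probability of default, LGD is loss given default, and EAD is exposure at default. The two loans and their parameters are hypothetical.

```python
def expected_loss(pd, lgd, ead):
    """Expected loss = probability of default x loss given default x exposure at default."""
    return pd * lgd * ead

# Hypothetical loans (values invented for illustration).
loans = [
    {"id": "L-001", "pd": 0.02, "lgd": 0.45, "ead": 100_000},
    {"id": "L-002", "pd": 0.10, "lgd": 0.60, "ead": 50_000},
]

losses = {loan["id"]: expected_loss(loan["pd"], loan["lgd"], loan["ead"]) for loan in loans}
portfolio_el = sum(losses.values())
print(losses, round(portfolio_el, 2))
```

The smaller loan carries the larger expected loss here because its default probability and loss severity are both higher.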

#64

Payback Period Analysis

Calculates the time required for an investment to recover its initial cost from net cash inflows. It is a simple, widely used capital budgeting tool.

Example → A $240K CRM generates $60K/year in savings = 4-year payback. The competing $180K system has 2.5-year payback — different decisions by time horizon.

#65

Gap Analysis

Systematically compares current state to desired state, quantifying gaps in performance, capability, or market position.

Example → NPS gap: current score 28 vs. leader's 67 — decomposed into service response (-12 pts), product reliability (-18 pts), pricing perception (-9 pts).

#66

Pareto Analysis

Applies the 80/20 rule to identify the vital few causes responsible for the majority of effects, focusing improvement efforts optimally.

Example → 18% of defect types cause 83% of production losses — these 360 defect types are prioritized for Six Sigma improvement projects.

#67

Benchmark Analysis

Compares performance metrics against industry standards, best-in-class competitors, or internal best performers.

Example → Page load time 4.8s vs. industry median 2.3s and top quartile 1.1s — a performance improvement program targets the 1.5s threshold.

#68

Supply Chain Analytics

Optimizes procurement, inventory, logistics, and supplier relationships using data to reduce costs and increase reliability.

Example → Supplier risk analysis scores 340 vendors on reliability, financial health, and geopolitical exposure — flagging 23 critical single-source suppliers for dual-sourcing.

Category 08

Geospatial & Network Analysis

Techniques for analyzing data with geographic, spatial, or relational network structure — unlocking insights invisible in tabular data.

#69

Geospatial Analysis

Analyzes data with geographic coordinates to reveal spatial patterns, proximity effects, and location-based insights using GIS tools.

Example → Geospatial analysis of store sales + demographics + competitor locations identifies 12 optimal new store sites with predicted revenue within 8%.

#70

Spatial Analysis

Examines relationships between geographic entities — distance, adjacency, containment, and spatial autocorrelation (Moran's I).

Example → Spatial autocorrelation finds COVID infection rates are spatially clustered (Moran's I=0.68), confirming local transmission and informing zone interventions.

#71

Hotspot Analysis

Identifies statistically significant geographic concentrations of events using Kernel Density Estimation and Getis-Ord Gi* statistics.

Example → Crime hotspot analysis identifies 4 census tracts with Gi* z-scores >3.0 — concentrating 28% of incidents in 3% of city area for targeted policing.

#72

Network Analysis

Studies entities (nodes) and their relationships (edges) using centrality, clustering, path length, and community detection metrics.

Example → Social network analysis of LinkedIn identifies 8 "broker" individuals with highest betweenness centrality — optimal targets for viral campaign seeding.

#73

Route Optimization

Uses graph algorithms (Dijkstra, A*, VRP solvers) to find optimal paths through transportation networks under constraints.

Example → Last-mile delivery optimization across 2,000 daily packages reduces total distance by 23% and fuel costs by $180,000 annually vs. manual dispatch.

#74

Image Analysis (CV)

Extracts structured information from images and video using deep learning — classification, detection, segmentation, and tracking.

Example → Satellite image analysis counts vehicles in parking lots of 5,000 retailers weekly, predicting quarterly earnings with r=0.88 — an alternative data signal.

#75

Market Area Analysis

Defines the geographic catchment area of a retail location or service, measuring trade area penetration and competitive overlap.

Example → Gravity model shows new coffee shop's primary trade area is a 0.4-mile radius capturing 60% of customers, with competitor cannibalization reducing revenue by 22%.

#76

Proximity Analysis

Measures distances between geographic features to understand spatial relationships — buffer zones, nearest neighbor, and service area coverage.

Example → Proximity analysis shows 34% of the target population lives >5 miles from the nearest EV charging station — quantifying infrastructure gap for policy planning.

Category 09

Marketing & Product Analysis

Data-driven frameworks for measuring marketing effectiveness, optimizing product experiences, and attributing value across customer touchpoints.

#77

A/B Testing Analysis

Compares two variants through controlled experiments with proper statistical power, significance testing, and effect size estimation.

Example → A/B test with 50,000 users per variant shows red CTA button increases click rate 8.3% → 9.7% (p=0.001, +16.9% relative lift), generating $420K additional annual revenue.
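
One common way to test a difference like this is a pooled two-proportion z-test, sketched below. The choice of test is an assumption, since the example does not say which one was used. The conversion counts (4,150 and 4,850) are derived from the quoted rates and the 50,000 users per variant.

```python
import math

def relative_lift(p_a, p_b):
    """Relative change of variant B's rate over variant A's rate."""
    return (p_b - p_a) / p_a

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

lift = relative_lift(0.083, 0.097)
z, p = two_proportion_z_test(4150, 50_000, 4850, 50_000)
print(f"lift={lift:.1%} z={z:.2f} p<0.001: {p < 0.001}")
```

The relative lift reproduces the +16.9% in the example. With samples this large, z comes out near 7.7, so under this test the p-value would be far below 0.001.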

#78

Multivariate Testing

Tests multiple page elements simultaneously using full or fractional factorial designs to find the best-performing combination.

Example → Testing 3 headlines × 2 hero images × 2 CTAs (12 combinations) identifies headline C + image 1 + CTA B as optimal — 31% better with 95% confidence.

#79

Attribution Analysis

Assigns credit for conversions across marketing touchpoints using models from last-click to data-driven multi-touch attribution.

Example → Markov chain attribution gives email 31% credit (vs. 0% last-click), reshaping budget allocation away from last-click bias.

#80

Market Basket Analysis

Uses association rule mining (Apriori, FP-Growth) to find products frequently purchased together, driving cross-sell strategies.

Example → {pasta, tomato sauce} → {parmesan} with 78% confidence, lift=3.4 — bundled together, parmesan sales increase 45%.
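
A minimal sketch of the three standard association-rule metrics (support, confidence, lift) for a single rule. The eight baskets are hypothetical and will not reproduce the figures in the example. A full Apriori or FP-Growth implementation would also search for frequent itemsets, which this sketch skips.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(baskets)
    n_a = sum(1 for b in baskets if antecedent <= b)
    n_c = sum(1 for b in baskets if consequent <= b)
    n_both = sum(1 for b in baskets if (antecedent | consequent) <= b)
    support = n_both / n               # share of baskets containing every item in the rule
    confidence = n_both / n_a          # P(consequent | antecedent)
    lift = confidence / (n_c / n)      # confidence relative to the consequent's base rate
    return support, confidence, lift

# Hypothetical baskets (illustrative only).
baskets = [
    {"pasta", "tomato sauce", "parmesan"},
    {"pasta", "tomato sauce", "parmesan"},
    {"pasta", "tomato sauce"},
    {"pasta", "tomato sauce", "parmesan", "wine"},
    {"bread", "milk"},
    {"milk", "parmesan"},
    {"bread"},
    {"wine", "milk"},
]

metrics = rule_metrics(baskets, {"pasta", "tomato sauce"}, {"parmesan"})
print(metrics)  # (0.375, 0.75, 1.5)
```

A lift above 1 means the consequent appears more often alongside the antecedent than its overall frequency would suggest.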

#81

Price Elasticity Analysis

Measures how demand changes in response to price changes, informing optimal pricing strategy and revenue maximization.

Example → Elasticity of -1.3 for a software tool means a 10% price increase reduces demand by 13% — elastic demand, so revenue-maximizing price is below current level.
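
The revenue effect in this example follows from simple arithmetic. The sketch assumes the elasticity applies linearly to the quoted price change, which is a simplification for small changes.

```python
def revenue_change(elasticity, price_change):
    """Approximate relative revenue change for a relative price change.

    Assumes quantity changes by elasticity * price_change (linear approximation).
    """
    quantity_change = elasticity * price_change
    return (1 + price_change) * (1 + quantity_change) - 1

up = revenue_change(-1.3, 0.10)     # raise price by 10%
down = revenue_change(-1.3, -0.10)  # cut price by 10%
print(f"price +10%: revenue {up:+.1%}; price -10%: revenue {down:+.1%}")
```

Under these assumptions a 10% increase lowers revenue by about 4.3% and a 10% cut raises it by about 1.7%, consistent with the statement that the revenue-maximizing price lies below the current level.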

#82

Web Analytics

Analyzes website traffic, user behavior, content performance, and conversion paths using tools like GA4, Mixpanel, and Amplitude.

Example → Mobile users have 3.1% conversion rate vs. 5.8% desktop — mobile UX audit identifies 6 friction points, fixing them recovers $1.2M in mobile revenue.

#83

Product Analytics

Studies user interaction with software products — feature adoption, engagement depth, north star metrics, and activation funnels.

Example → Product analytics identifies users who create their first project within 2 hours have 4.7x higher 30-day retention — triggering an "aha moment" onboarding redesign.

#84

Uplift Modeling

Measures the true incremental causal impact of a treatment, enabling targeting of "persuadable" customers only.

Example → Uplift model identifies 40,000 "persuadables". Targeting only these customers instead of all 200K reduces promotion costs by about 80% (assuming cost scales with the number of customers targeted) while maintaining 95% of incremental revenue.

#85

Conjoint Analysis

Reveals how customers value different product attributes by analyzing choices among product profiles and estimating willingness-to-pay per feature.

Example → Conjoint study shows customers value battery life (35%), price (28%), brand (22%), and weight (15%) — roadmap prioritizes battery improvements.

Category 10

Advanced & Specialized Methods

Sophisticated techniques from econometrics, operations research, Bayesian statistics, and simulation — for complex, high-stakes problems.

#86

Bayesian Analysis

Updates probability estimates as new evidence arrives, combining prior beliefs with observed data — ideal for small samples and sequential learning.

Example → Bayesian A/B test allows stopping early when probability of variant B being better exceeds 95% — saving 40% of planned experiment duration.
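
One simple way to estimate the probability that variant B is better is a Beta-Binomial model with posterior sampling, sketched below. The uniform Beta(1, 1) prior and the conversion counts are assumptions made for illustration; the example does not specify either.

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20_000, seed=0):
    """Estimate P(rate_B > rate_A) by sampling from Beta posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Beta(1, 1) prior updated with observed conversions and non-conversions.
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws

# Hypothetical interim data: 100/1,000 conversions for A, 150/1,000 for B.
p_better = prob_b_beats_a(100, 1000, 150, 1000)
print(f"P(B > A) ~ {p_better:.3f}; stop early: {p_better > 0.95}")
```

A stopping rule like the one in the example would compare this probability to the 95% threshold at each planned check.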

#87

Monte Carlo Simulation

Models probability distribution of outcomes by running thousands of random simulations, generating ranges rather than point estimates.

Example → 50,000 Monte Carlo runs show 50% probability of on-time delivery, but only 23% probability of on-time AND on-budget — a far more honest risk picture.
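
A toy sketch of the idea: draw many random project outcomes and count how often each condition holds. The duration and cost model, the deadline, and the budget are all hypothetical, so the output will not reproduce the 50% and 23% figures in the example.

```python
import random

def simulate_project(runs=50_000, seed=42):
    """Return (P(on time), P(on time and on budget)) from random trials."""
    rng = random.Random(seed)
    on_time = on_time_and_budget = 0
    for _ in range(runs):
        # Hypothetical model: duration in weeks (low 20, high 40, most likely 28);
        # cost scales with duration plus independent noise.
        duration = rng.triangular(20, 40, 28)
        cost = 30_000 * duration + rng.gauss(0, 100_000)
        if duration <= 30:             # hypothetical deadline: 30 weeks
            on_time += 1
            if cost <= 800_000:        # hypothetical budget: $800K
                on_time_and_budget += 1
    return on_time / runs, on_time_and_budget / runs

p_time, p_both = simulate_project()
print(f"P(on time)={p_time:.2f}  P(on time and on budget)={p_both:.2f}")
```

The joint probability can never exceed the on-time probability alone, which is why adding a second condition always tightens the picture.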

#88

Sensitivity Analysis

Tests how robust model outputs are to changes in input assumptions, identifying which variables most influence outcomes.

Example → DCF model shows 1% change in discount rate swings valuation by $8M — more than a 15% revenue change — making it the key assumption to stress-test.

#89

Scenario Analysis

Evaluates multiple distinct future scenarios (bull/base/bear) under explicit assumptions, stress-testing strategies against plausible futures.

Example → Bull: TAM +30%, 15% share, NPV=$45M; Base: TAM +12%, 8% share, NPV=$18M; Bear: TAM flat, 3% share, NPV=-$4M.

#90

Econometric Analysis

Applies statistical methods to economic data to establish causal relationships, test economic theories, and evaluate policy impacts.

Example → Difference-in-differences model across 200 counties finds minimum wage increase caused 2.3% employment reduction in restaurants vs. control counties.

#91

Propensity Score Analysis

Controls for selection bias in observational studies by matching treated and control units on their probability to receive treatment.

Example → Propensity score matching pairs 5,000 loyalty card recipients with 5,000 similar non-recipients — finding a true causal effect of +$180 annual spend.

#92

Structural Equation Modeling

Tests complex theoretical models involving multiple dependent variables, latent constructs, and mediated effects simultaneously.

Example → SEM confirms: leadership quality → employee engagement → customer satisfaction → revenue. Engagement mediates 68% of leadership's effect on revenue.

#93

Optimization Analysis

Finds the best solution from a set of feasible alternatives using linear programming, integer programming, or meta-heuristics.

Example → Linear programming determines optimal product mix across 8 products and 5 constraints, increasing total margin by 19% ($2.3M) vs. heuristic planning.

#94

Decision Analysis

Structures complex decisions under uncertainty using decision trees, expected utility theory, and multi-criteria frameworks.

Example → Build factory (60% success, EV=$4.8M) vs. License tech (85% success, EV=$2.55M) vs. Do nothing (EV=$0) → Build factory maximizes expected value.
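
The expected values in this example can be reproduced as probability-weighted payoffs. The success payoffs of $8M and $3M are back-calculated from the quoted EVs under the assumption that failure pays zero; the example does not state them.

```python
# Each option maps to a list of (probability, payoff in $M) outcomes.
# Payoffs are assumptions inferred from the quoted EVs, with failure = $0.
options = {
    "Build factory": [(0.60, 8.0), (0.40, 0.0)],
    "License tech": [(0.85, 3.0), (0.15, 0.0)],
    "Do nothing": [(1.00, 0.0)],
}

def expected_value(outcomes):
    """Probability-weighted average payoff."""
    return sum(p * payoff for p, payoff in outcomes)

evs = {name: expected_value(outcomes) for name, outcomes in options.items()}
best = max(evs, key=evs.get)
print({name: round(ev, 2) for name, ev in evs.items()}, "->", best)
```

Expected value ignores risk appetite: a decision-maker who cannot absorb the 40% failure case might still prefer licensing, which is where expected utility comes in.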

#95

Reliability & Quality Analysis

Uses Statistical Process Control — control charts, capability indices (Cpk), and FMEA — to monitor and improve process quality.

Example → Cpk of 0.87 triggers a Six Sigma DMAIC project, improving Cpk to 1.68 and reducing defect rate from 8,000 to 120 PPM.

#96

Simulation Analysis

Models complex systems with interacting components using discrete-event, agent-based, or system dynamics simulation.

Example → Hospital agent-based simulation finds adding 2 triage nurses reduces ER wait from 47 to 28 minutes — cheaper and faster than adding beds.

#97

Reliability (Cronbach's Alpha)

Assesses the internal consistency and reproducibility of measurement instruments — surveys, tests, and multi-item scales.

Example → Cronbach's alpha=0.91 on 12-item survey; item analysis shows question 7 reduces reliability — dropping it improves α to 0.94.
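
A compact sketch of Cronbach's alpha and the "alpha if item dropped" check, using the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The four-item survey data is hypothetical and much smaller than the 12-item survey in the example.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, each with one entry per respondent."""
    k = len(items)
    item_var_sum = sum(statistics.variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - item_var_sum / statistics.variance(totals))

def alpha_if_item_dropped(items, index):
    """Alpha recomputed without the item at the given index."""
    return cronbach_alpha(items[:index] + items[index + 1:])

# Hypothetical responses from 6 people to 4 items; the last item is deliberately noisy.
survey = [
    [4, 5, 3, 2, 4, 5],
    [4, 4, 3, 2, 5, 5],
    [5, 5, 2, 2, 4, 4],
    [2, 5, 4, 1, 3, 2],
]

print(f"alpha={cronbach_alpha(survey):.2f} "
      f"without item 4={alpha_if_item_dropped(survey, 3):.2f}")
```

In this toy data, removing the noisy item raises alpha, mirroring the item analysis described in the example.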

#98

Signal Detection Analysis

Uses ROC curves and precision-recall trade-offs to tune binary classifiers, balancing false positive vs. false negative costs.

Example → ROC analysis on a fraud model (AUC=0.94) at threshold 0.4: precision=71%, recall=88% — optimal trade-off given missed fraud ($500) is 3x more costly than a false alarm.

#99

Root Cause Analysis

Systematically identifies the deepest root cause of a problem using 5 Whys, fishbone diagrams, fault trees, and Pareto charts.

Example → 5 Whys on server outages: outage → high CPU → memory leak → threading bug → code review gap → no automated performance regression tests. Root cause: missing CI/CD gate.

#100

Weighted Scoring Analysis

Assigns weights to multiple criteria to objectively rank and select among alternatives, bringing rigor to multi-factor decisions.

Example → Vendor selection: Cost (35%), Features (30%), Support (20%), Security (15%) — Vendor A scores 82.5 vs. Vendor B 78.3, providing an auditable procurement decision.
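
A minimal sketch of the weighted sum behind this example. The weights come from the example; the per-criterion scores (0 to 100) are hypothetical and were picked only so that the totals match the quoted 82.5 and 78.3.

```python
weights = {"Cost": 0.35, "Features": 0.30, "Support": 0.20, "Security": 0.15}

# Hypothetical 0-100 criterion scores, chosen so the totals match the example.
vendors = {
    "Vendor A": {"Cost": 80, "Features": 85, "Support": 85, "Security": 80},
    "Vendor B": {"Cost": 90, "Features": 70, "Support": 75, "Security": 72},
}

def weighted_score(scores, weights):
    """Sum of criterion scores multiplied by their weights (weights must sum to 1)."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights should sum to 1"
    return sum(weights[c] * scores[c] for c in weights)

totals = {name: round(weighted_score(s, weights), 1) for name, s in vendors.items()}
winner = max(totals, key=totals.get)
print(totals, "->", winner)
```

Keeping the weights and raw scores in one place is what makes the decision auditable: anyone can rerun the ranking with different weights and see whether the winner changes.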

§ 11 — Quick Reference

Comparison Table

A quick-reference guide to the most important analysis types — their complexity, typical tools, and best use cases.

#	Analysis Type	Category	Complexity	Primary Question	Common Tools
01	Descriptive	Foundational	Low	What happened?	Excel, SQL, Tableau
02	EDA	Foundational	Low	What patterns exist?	Python, R, Jupyter
03	Diagnostic	Foundational	Medium	Why did it happen?	SQL, Python, BI Tools
04	Predictive	Foundational	High	What will happen?	Scikit-learn, XGBoost
05	Prescriptive	Foundational	High	What should we do?	OR-Tools, Gurobi
09	Regression	Statistical	Medium	How are variables related?	R, Python statsmodels
11	Hypothesis Testing	Statistical	Medium	Is this result significant?	R, SciPy, SPSS
19	Clustering	ML & AI	Medium	What groups exist?	Scikit-learn, H2O
21	Anomaly Detection	ML & AI	High	What's unusual?	Isolation Forest, PyOD
29	Cohort Analysis	Behavioral	Low	How do groups change?	SQL, Mixpanel, Amplitude
30	Funnel Analysis	Behavioral	Low	Where do users drop off?	GA4, Mixpanel, SQL
32	RFM Analysis	Behavioral	Low	Who are our best customers?	SQL, Python, Excel
39	Sentiment Analysis	NLP	Medium	How do people feel?	VADER, BERT, HuggingFace
49	Time Series	Time-Based	High	What are the trends?	Prophet, ARIMA, statsmodels
59	Financial Ratio	Business	Low	Is the business healthy?	Excel, SQL, Bloomberg
69	Geospatial	Spatial	Medium	Where is it happening?	QGIS, GeoPandas, ArcGIS
77	A/B Testing	Marketing	Medium	Which variant is better?	Optimizely, VWO, Python
86	Bayesian Analysis	Advanced	High	How do beliefs update?	PyMC, Stan, JAGS
87	Monte Carlo	Advanced	High	What's the range of outcomes?	Python, @RISK, Crystal Ball
100	Weighted Scoring	Advanced	Low	Which option ranks best?	Excel, Python, Decision Tools

The Landscape of Data Analytics

[Charts not reproduced: Distribution by Category; Complexity vs. Insight Value; Analytics Maturity Ladder; Industry Usage Frequency]