Probability and Statistical Inference Questions

Covers fundamental probability theory and statistical inference, from first principles to practical applications. Core probability concepts include sample spaces and events, independence, conditional probability, Bayes' theorem, expected value, variance, and standard deviation. Reviews common probability distributions (normal, binomial, Poisson, uniform, and exponential), their parameters, typical use cases, computation of probabilities, and approximation methods. Explains sampling distributions and the Central Limit Theorem and their implications for estimation and confidence intervals. Presents descriptive statistics and data summary measures, including mean, median, variance, and standard deviation. Details the hypothesis testing workflow, including null and alternative hypotheses, p-values, statistical significance, Type I and Type II errors, power, effect size, and interpretation of results. Reviews commonly used tests and methods, with guidance on test selection and assumption checking, including z-tests, t-tests, chi-square tests, analysis of variance, and basic nonparametric alternatives. Emphasizes practical issues such as correlation versus causation, the impact of sample size and data quality, validation of assumptions, reasoning about rare events and tail risks, and communicating uncertainty. At more advanced levels, expect experimental design and interpretation at scale, including A/B tests, sample size and power calculations, multiple testing and false discovery rate adjustment, and design choices for robust inference in real-world systems.

Medium · Technical

69 practiced

Compare Bonferroni correction and the Benjamini-Hochberg (BH) procedure for multiple testing. Given p-values [0.001, 0.02, 0.03, 0.2] (m=4) and alpha=0.05, identify which hypotheses are rejected by Bonferroni and by BH. Explain when each method is preferable in an applied-science setting.
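
As a starting point for the arithmetic, here is a minimal sketch of both procedures applied to the p-values given in the question. The function names `bonferroni` and `benjamini_hochberg` are illustrative choices, not from any library, and the BH version assumes the standard step-up rule (reject the k smallest p-values, where k is the largest rank with p_(k) <= k * alpha / m).

```python
def bonferroni(pvals, alpha=0.05):
    """Reject H_i when p_i <= alpha / m (controls the family-wise error rate)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """BH step-up: find the largest rank k with p_(k) <= k * alpha / m,
    then reject the k smallest p-values (controls the false discovery rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


pvals = [0.001, 0.02, 0.03, 0.2]
print(bonferroni(pvals))          # [True, False, False, False]
print(benjamini_hochberg(pvals))  # [True, True, True, False]
```

With m=4 the Bonferroni threshold is 0.0125, so only 0.001 is rejected. The BH thresholds are 0.0125, 0.025, 0.0375, and 0.05, so the three smallest p-values are rejected.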

Easy · Technical

49 practiced

You're presenting A/B test results to a product manager who asks: what's the difference between a p-value, a confidence interval, and effect size? Explain each concept in plain language, state what each does and does not tell you, and give an example sentence you would use to summarize results to a non-technical stakeholder.
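
To make the three quantities concrete, the sketch below computes all of them for a two-arm conversion test. It assumes a pooled two-proportion z-test for the p-value and a normal-approximation (Wald) 95% interval for the difference. The function name `two_proportion_summary` and the counts in the usage line are hypothetical, chosen only for illustration.

```python
from math import erfc, sqrt


def two_proportion_summary(x_a, n_a, x_b, n_b, z_crit=1.96):
    """Return (effect size, 95% CI, two-sided p-value) for B minus A.

    x_* are conversion counts and n_* are sample sizes. Uses the normal
    approximation, so it assumes reasonably large samples in both arms.
    """
    p_a, p_b = x_a / n_a, x_b / n_b
    diff = p_b - p_a  # effect size: absolute difference in conversion rate

    # Confidence interval uses the unpooled standard error.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    ci = (diff - z_crit * se, diff + z_crit * se)

    # p-value uses the pooled standard error under H0: p_a == p_b.
    p_pool = (x_a + x_b) / (n_a + n_b)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pool
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return diff, ci, p_value


# Hypothetical counts: 100/1000 conversions in control, 150/1000 in treatment.
diff, ci, p_value = two_proportion_summary(100, 1000, 150, 1000)
print(f"lift={diff:.3f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f}), p={p_value:.4f}")
```

The effect size says how big the change is, the interval says how precisely it was measured, and the p-value says how surprising the data would be if there were no true difference.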

Easy · Technical

67 practiced

Write a numerically stable, single-pass Python function that computes the running mean and unbiased sample variance for a stream of numbers without storing the full stream. Use Welford's algorithm. The function should support merging two summaries (for parallel processing). Provide Python code and a brief explanation.
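
One possible shape for such a summary is sketched below. The class name `RunningStats` and its fields are illustrative assumptions. The update step follows Welford's recurrence, and the merge step uses the pairwise combination formula usually attributed to Chan et al.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    n: int = 0         # number of values seen
    mean: float = 0.0  # running mean
    m2: float = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Welford update: fold one value into the summary in O(1)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)  # uses the updated mean

    def variance(self):
        """Unbiased sample variance; undefined (NaN) for fewer than 2 values."""
        if self.n < 2:
            return float("nan")
        return self.m2 / (self.n - 1)

    def merge(self, other):
        """Combine two summaries as if their streams were concatenated."""
        n = self.n + other.n
        if n == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)


left, right = RunningStats(), RunningStats()
for x in [2, 4, 4]:
    left.update(x)
for x in [4, 5, 5, 7, 9]:
    right.update(x)
total = left.merge(right)
print(total.n, total.mean, total.variance())
```

Tracking `m2` instead of the raw sum of squares avoids the catastrophic cancellation that the naive `E[x^2] - E[x]^2` formula suffers when the mean is large relative to the spread.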

Hard · Technical

60 practiced

In logged observational data, you observe that treatment assignment depends on user behavior. Propose a strategy to estimate the average treatment effect (ATE) of a new personalization algorithm using propensity score methods: describe propensity score estimation, matching, inverse probability weighting (IPW), and doubly robust estimators. List the assumptions (unconfoundedness, overlap) and the diagnostics to check in practice.
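
The two weighting estimators named above can be sketched in a few lines once propensity scores are available. The sketch assumes the scores `e` (and, for the doubly robust version, the outcome-model predictions `mu1` and `mu0`) have already been fitted elsewhere, for example with a logistic regression. The function names and the clipping threshold `eps` are illustrative assumptions, not a prescribed implementation.

```python
def clip(e, eps=0.01):
    """Clip a propensity score away from 0 and 1 to limit extreme weights."""
    return min(max(e, eps), 1.0 - eps)


def ipw_ate(y, t, e, eps=0.01):
    """Normalized (Hajek) IPW estimate of the ATE.

    y: outcomes, t: 0/1 treatment indicators, e: estimated P(T=1 | X).
    """
    e = [clip(ei, eps) for ei in e]
    w1 = [ti / ei for ti, ei in zip(t, e)]              # treated weights
    w0 = [(1 - ti) / (1 - ei) for ti, ei in zip(t, e)]  # control weights
    mean1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    mean0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mean1 - mean0


def aipw_ate(y, t, e, mu1, mu0, eps=0.01):
    """Augmented IPW (doubly robust) estimate of the ATE.

    mu1, mu0: outcome-model predictions under treatment and control.
    Consistent if either the propensity or the outcome model is correct.
    """
    e = [clip(ei, eps) for ei in e]
    terms = [
        m1 - m0 + ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1 - ei)
        for yi, ti, ei, m1, m0 in zip(y, t, e, mu1, mu0)
    ]
    return sum(terms) / len(terms)
```

Clipping trades a small bias for lower variance. Heavy clipping is itself a sign that overlap is weak, so it is worth reporting how many scores were affected alongside the estimate.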

Hard · Technical

53 practiced

You need to control False Discovery Rate (FDR) across thousands of hypothesis tests for feature selection in a machine learning pipeline where features are highly correlated and tests are grouped by feature families. Design a scalable pipeline that computes p-values, adjusts for dependence, optionally uses knockoffs or hierarchical testing, and produces a list of features with controlled FDR. Discuss practical choices, computational shortcuts, and monitoring of FDR in production.
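
One simple building block for the dependence adjustment is the Benjamini-Yekutieli (BY) procedure, which runs the BH step-up rule at level alpha / c(m), with c(m) = 1 + 1/2 + ... + 1/m, and controls FDR under arbitrary dependence. The sketch below is only that one component, not the full pipeline the question asks for, and the function name `benjamini_yekutieli` is an illustrative choice.

```python
def benjamini_yekutieli(pvals, alpha=0.05):
    """BY step-up procedure: BH with alpha deflated by the harmonic sum c(m).

    Returns a list of booleans (True = rejected) in the input order.
    Runs in O(m log m), dominated by the sort.
    """
    m = len(pvals)
    if m == 0:
        return []
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / (m * c_m):
            k_max = rank
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


print(benjamini_yekutieli([0.001, 0.02, 0.03, 0.2]))  # [True, False, False, False]
```

BY is conservative because c(m) grows like ln(m). With thousands of correlated features, that loss of power is one reason to also consider the knockoff or hierarchical approaches mentioned in the question.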
