InterviewStack.io

Privacy-Preserving Experiment Design Questions

Techniques and considerations for designing experiments and data collection strategies that protect privacy. Covers methods such as differential privacy, secure aggregation, federated learning, synthetic data, data minimization, consent management, de-identification, and privacy risk assessment, with emphasis on maintaining data utility and regulatory compliance while enabling robust experimentation.

Easy · Technical
You are designing a consent flow and metadata model for an ML-driven feature experiment that collects behavioral data. List essential legal and UX elements (purpose, retention, scope, opt-in/opt-out), describe how you would store and version consent at scale, and explain how to attach consent metadata to experiment datasets to enforce data minimization and per-user privacy budgets during analysis and model training.
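As a starting point for the storage and enforcement parts of this question, here is a minimal sketch of a versioned consent record and a filter that drops rows lacking valid consent for a given purpose. Every name in it (`ConsentRecord`, `filter_rows`, `charge_budget`, the field names, and the purpose strings) is hypothetical and chosen for illustration; a production system would likely keep consent in an append-only store and enforce it in the query layer.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentRecord:
    # Hypothetical schema: one immutable row per consent event.
    user_id: str
    policy_version: int    # version of the consent text the user saw
    purposes: frozenset    # e.g. {"experimentation", "model_training"}
    granted_at: datetime
    expires_at: datetime   # retention limit for data collected under it
    revoked: bool = False


def allowed(record, purpose, now):
    """True if the record covers `purpose` and is active at time `now`."""
    in_window = record.granted_at <= now < record.expires_at
    return (not record.revoked) and purpose in record.purposes and in_window


def filter_rows(rows, consents, purpose, now):
    """Keep only rows whose user's most recent consent event allows `purpose`."""
    latest = {}
    for c in consents:
        prev = latest.get(c.user_id)
        if prev is None or c.granted_at > prev.granted_at:
            latest[c.user_id] = c
    return [
        r for r in rows
        if r["user_id"] in latest and allowed(latest[r["user_id"]], purpose, now)
    ]


def charge_budget(spent, user_id, epsilon, limit):
    """Record an epsilon charge for a user unless it would exceed `limit`."""
    new_total = spent.get(user_id, 0.0) + epsilon
    if new_total > limit:
        return False
    spent[user_id] = new_total
    return True
```

Storing each consent change as a new immutable record, rather than updating in place, keeps an audit trail of which policy version applied when a given row was collected.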
Medium · Technical
You need to decide whether to share a synthetic version of an internal customer dataset with analysts. Design an evaluation plan that measures utility and privacy for the synthetic data: include predictive model transferability tests, statistical similarity metrics (marginal and joint), propensity-score classifiers, nearest-neighbor disclosure risk, and a decision rule for safe release based on both utility thresholds and measured disclosure risk.
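One piece of such a plan, the nearest-neighbor disclosure check combined with a release rule, could be sketched as follows. This is an illustration under stated assumptions: rows are numeric tuples on comparable scales, Euclidean distance is a reasonable closeness measure, and the function names and the thresholds `min_utility=0.9` and `max_rate=0.05` are hypothetical placeholders rather than recommended values.

```python
import math


def nearest_distance(point, others):
    """Euclidean distance from `point` to its closest row in `others`."""
    return min(math.dist(point, o) for o in others)


def disclosure_rate(synthetic, train, threshold):
    """Fraction of synthetic rows lying within `threshold` of a training row."""
    close = sum(1 for s in synthetic if nearest_distance(s, train) <= threshold)
    return close / len(synthetic)


def safe_to_release(utility_ratio, rate, min_utility=0.9, max_rate=0.05):
    """Release only if utility is high enough AND disclosure risk is low enough.

    utility_ratio: e.g. score of a model trained on synthetic data divided by
    the score of the same model trained on real data (both tested on real data).
    """
    return utility_ratio >= min_utility and rate <= max_rate
```

In practice the distance threshold is often calibrated against a real holdout set: if synthetic rows sit closer to the training rows than unseen real rows do, that suggests memorization rather than generalization.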
Easy · Technical
Describe a simple approach to make web A/B testing privacy-preserving by adding noise to metrics such as click-through rates or mean session time. Explain how to calibrate Laplace or Gaussian noise using sensitivity and epsilon, how noise affects p-values and confidence intervals, and provide practical heuristics for choosing epsilon and adjusting sample sizes to retain statistical power.
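The calibration step can be illustrated with a small sketch of the Laplace mechanism applied to a click-through rate. It assumes each user contributes exactly one binary outcome and that the number of users is public, so changing one user's outcome moves the rate by at most 1/n; the function names are hypothetical. The Laplace noise is drawn as the difference of two exponential variates, which has a Laplace distribution with the same scale.

```python
import math
import random


def laplace_scale(sensitivity, epsilon):
    """Laplace mechanism scale b = sensitivity / epsilon."""
    return sensitivity / epsilon


def noisy_ctr(clicks, users, epsilon, rng):
    """Release a click-through rate with Laplace noise.

    Assumes one binary outcome per user and a public user count,
    so the sensitivity of the rate is 1 / users.
    """
    b = laplace_scale(1.0 / users, epsilon)
    # Difference of two Exp(rate=1/b) draws is Laplace(0, b).
    noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
    return clicks / users + noise


def total_std_error(p, users, epsilon):
    """Standard error combining sampling variance and Laplace noise variance."""
    sampling_var = p * (1 - p) / users
    b = laplace_scale(1.0 / users, epsilon)
    return math.sqrt(sampling_var + 2 * b * b)  # Var[Laplace(b)] = 2 * b^2
```

Because the noise variance shrinks as 1/n² while the sampling variance shrinks as 1/n, the privacy noise matters most in small experiments; comparing `total_std_error` with the sampling-only standard error gives a rough sense of how much extra sample a given epsilon costs.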
Medium · Technical
Explain the Bonawitz et al. secure aggregation protocol at the level of key setup, masking contributions, verifying shares, aggregation, and unmasking. Discuss implementation challenges including pairwise key management, handling client failures, bandwidth and CPU constraints on client devices, and suggestions to optimize for mobile clients with intermittent connectivity.
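The masking and aggregation steps can be illustrated with a toy sketch in which each pair of clients shares a seed, the lower-numbered client adds the derived mask, and the higher-numbered client subtracts it, so the masks cancel in the sum. This is a deliberately reduced illustration, not the full protocol: it assumes the seeds are already agreed (no key exchange), omits the self-masks and secret sharing used for dropout recovery, and uses Python's `random.Random` as a stand-in for a cryptographic PRG. The function names and the modulus are hypothetical.

```python
import random

MODULUS = 2 ** 32  # all arithmetic is done modulo this value


def pairwise_mask(seed, length):
    """Expand a shared seed into a mask vector.

    random.Random is a stand-in here and is NOT cryptographically secure.
    """
    rng = random.Random(seed)
    return [rng.randrange(MODULUS) for _ in range(length)]


def mask_update(client_id, update, pair_seeds):
    """Mask one client's integer vector.

    pair_seeds maps each peer id to the seed shared with that peer.
    The client with the smaller id adds the mask, the other subtracts it.
    """
    masked = [x % MODULUS for x in update]
    for peer_id, seed in pair_seeds.items():
        sign = 1 if client_id < peer_id else -1
        mask = pairwise_mask(seed, len(update))
        masked = [(m + sign * x) % MODULUS for m, x in zip(masked, mask)]
    return masked


def aggregate(masked_updates):
    """Sum masked vectors; pairwise masks cancel if every client reports."""
    return [sum(col) % MODULUS for col in zip(*masked_updates)]
```

The cancellation only works when every client in the round reports. Recovering from a client that drops out after masks are fixed is what the share-based unmasking phase of the full protocol addresses.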
Medium · Technical
Explain how to compute confidence intervals for a mean or a proportion when the published statistic has Gaussian noise added to it under central differential privacy. Provide the formula for adjusted confidence intervals that account for both sampling variance and DP noise variance, and explain how to interpret p-values derived from noisy statistics.
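One common form of the adjustment treats the sampling error and the DP noise as independent, so their variances add: the interval becomes noisy mean ± z · sqrt(s²/n + σ²), where σ is the standard deviation of the Gaussian noise added to the mean itself. The sketch below assumes a normal approximation, a noise scale expressed on the mean (not the sum), and a sample variance that is either public or separately estimated in a privacy-preserving way; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def dp_confidence_interval(noisy_mean, sample_var, n, dp_sigma, level=0.95):
    """Normal-approximation CI that widens for Gaussian DP noise.

    dp_sigma is the standard deviation of the noise added to the mean.
    For a proportion, pass sample_var = p * (1 - p) using a clipped estimate.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = math.sqrt(sample_var / n + dp_sigma ** 2)
    return noisy_mean - z * se, noisy_mean + z * se


def dp_two_sided_p_value(noisy_diff, se_total):
    """Two-sided p-value for a noisy difference.

    se_total must already include the DP noise variance.
    """
    z = abs(noisy_diff) / se_total
    return 2 * (1 - NormalDist().cdf(z))
```

A p-value computed with only the sampling standard error would overstate significance, because part of the observed difference may be injected noise; using the combined standard error keeps the test's nominal error rate approximately valid.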
