InterviewStack.io

Data augmentation and handling distribution shift Questions

Master augmentation techniques (random crops, flips, rotations, color jittering, mixup, CutMix). Understand why augmentation helps. Discuss domain adaptation and techniques for handling domain shift in production systems.

Hard | Technical
70 practiced
Propose a training strategy that combines self-supervised learning (SSL) with augmentation to handle domain shift: specify pretext tasks, augmentation choices appropriate for contrastive learning (e.g., SimCLR), and a fine-tuning schedule for downstream supervised tasks. Provide concrete recommendations for vision and for text, and explain how SSL improves robustness to domain change.
Easy | Technical
88 practiced
Explain the SMOTE algorithm for dealing with imbalanced tabular data. Describe step-by-step how synthetic samples are generated in feature space, mention limitations (e.g., introducing borderline or noisy synthetic samples), and suggest practical checks or downstream model adjustments you would apply in a production ML pipeline after SMOTE.
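As a starting point for the step-by-step part of this question, here is a minimal pure-Python sketch of SMOTE-style interpolation. The function name `smote` and its parameters are illustrative, not taken from any library; it assumes numeric features on comparable scales and uses brute-force Euclidean neighbours, which is only practical for small minority sets.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Step 1: for every minority sample, find its k nearest minority neighbours.
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sq_dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        # Step 2: pick a random minority sample and one of its neighbours.
        i = rng.randrange(len(minority))
        j = rng.choice(neighbours[i])
        # Step 3: place the new point at a random position on the segment
        # between the sample and the chosen neighbour.
        gap = rng.random()
        x, nb = minority[i], minority[j]
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic
```

Because every synthetic point lies on a segment between two existing minority samples, a noisy or borderline sample can seed new points inside the majority region, which is the limitation the question refers to. Oversampling should be applied to the training split only, never before the train/validation split.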
Hard | Technical
91 practiced
Implement importance weighting for covariate shift correction in PyTorch. Given labeled source data (x_s, y_s) and a small unlabeled target set x_t, describe how to train a logistic regression classifier to distinguish source from target, compute importance weights w(x)=p_t(x)/p_s(x) from the classifier outputs, and show how to incorporate these weights into a PyTorch training loop (weighted loss). Mention how normalization and regularization help control the variance of the weights.
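The weight computation at the heart of this question can be sketched without any framework. The sketch below assumes a domain classifier has already produced `p_target`, its estimate of P(target | x) for each source sample; the helper names, the clipping threshold, and the epsilon are illustrative choices. By Bayes' rule, p_t(x)/p_s(x) = [P(target | x) / P(source | x)] * (n_source / n_target).

```python
def importance_weights(p_target, n_source, n_target, clip=10.0, eps=1e-6):
    """Turn domain-classifier probabilities P(target | x) for source
    samples into clipped, mean-normalized importance weights."""
    weights = []
    for p in p_target:
        # Keep probabilities away from 0 and 1 so the odds stay finite.
        p = min(max(p, eps), 1.0 - eps)
        # Density ratio via Bayes' rule, corrected for the domain prior.
        ratio = (n_source / n_target) * p / (1.0 - p)
        # Clipping trades a little bias for much lower variance.
        weights.append(min(ratio, clip))
    # Normalize so the weights average to 1 and the loss scale is unchanged.
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]


def weighted_mean_loss(losses, weights):
    """Importance-weighted average of per-sample losses."""
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)
```

In a PyTorch loop the same idea is usually expressed by computing a per-sample loss with `reduction="none"` (for example `torch.nn.functional.binary_cross_entropy_with_logits`), multiplying by the weight tensor, and taking the mean before calling `backward()`. Regularizing the domain classifier (for example with weight decay) also helps keep the estimated ratios from becoming extreme.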
Medium | Technical
86 practiced
Describe augmentation strategies for multivariate time-series classification and forecasting. Explain jittering, scaling, permutation, time-warping, window slicing, and how to preserve label semantics for forecasting horizons vs classification. Provide a code outline to augment sliding windows while maintaining temporal coherence across channels.
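One possible code outline for the sliding-window part of this question is shown below, in pure Python. It assumes a series stored as `series[t][c]` (time step, then channel); all function names and the default noise levels are hypothetical. Every operation acts on whole time steps, so channels stay aligned, and for forecasting the same per-channel scale factors are applied to the input and the target so their relationship is preserved.

```python
import random


def jitter(window, sigma, rng):
    """Add small Gaussian noise to every value."""
    return [[v + rng.gauss(0.0, sigma) for v in step] for step in window]


def scale(window, factors):
    """Multiply each channel by its own factor, constant over time."""
    return [[v * f for v, f in zip(step, factors)] for step in window]


def window_slice(window, length, rng):
    """Take one contiguous sub-window; suited to classification, where
    the label is assumed to hold for any sufficiently long slice."""
    start = rng.randrange(len(window) - length + 1)
    return window[start:start + length]


def augment_windows(series, window_len, horizon, stride=1,
                    jitter_sigma=0.01, scale_sigma=0.1, seed=0):
    """Build augmented (input, target) pairs for forecasting."""
    rng = random.Random(seed)
    n_channels = len(series[0])
    pairs = []
    for start in range(0, len(series) - window_len - horizon + 1, stride):
        x = series[start:start + window_len]
        y = series[start + window_len:start + window_len + horizon]
        # Shared factors keep input and target on the same scale.
        factors = [rng.gauss(1.0, scale_sigma) for _ in range(n_channels)]
        x_aug = jitter(scale(x, factors), jitter_sigma, rng)
        y_aug = scale(y, factors)  # target is scaled but never jittered
        pairs.append((x_aug, y_aug))
    return pairs
```

Permutation and time-warping are left out of this sketch because they reorder or stretch time; they can be reasonable for classification but generally break the input-to-horizon relationship that forecasting depends on.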
Medium | Technical
76 practiced
Contrast adversarial augmentation (e.g., FGSM/PGD-based perturbations) with random augmentations. Explain when adversarial augmentations are appropriate, how they affect robustness and calibration, and how to balance adversarial examples with natural augmentations during training in production systems.
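To make the contrast concrete, here is a small sketch of FGSM on a logistic regression model, written in pure Python so the gradient is explicit. The helper names and the `adv_fraction` mixing knob are assumptions for illustration. For binary cross-entropy, the gradient of the loss with respect to the input is (sigmoid(w·x + b) - y) * w, and FGSM moves each feature by `eps` in the direction of that gradient's sign.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fgsm_example(x, y, w, b, eps):
    """Return x perturbed by one FGSM step against a logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = sigmoid(z)
    # Gradient of binary cross-entropy with respect to the input x.
    grad = [(p - y) * wi for wi in w]
    # Step of size eps along the sign of the gradient (increases the loss).
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


def mixed_batch(batch, w, b, eps, adv_fraction, rng):
    """Replace roughly adv_fraction of the samples with FGSM versions;
    the rest are left for ordinary random augmentation."""
    out = []
    for x, y in batch:
        if rng.random() < adv_fraction:
            out.append((fgsm_example(x, y, w, b, eps), y))
        else:
            out.append((x, y))
    return out
```

Unlike a random flip or jitter, the perturbation here depends on the current model parameters, so adversarial examples have to be regenerated as training progresses. The mixing fraction is the simplest way to trade clean accuracy against robustness.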
