InterviewStack.io

Python Programming & ML Libraries Questions

Python programming language fundamentals (syntax, data structures, control flow, error handling) with practical usage of machine learning libraries such as NumPy, pandas, scikit-learn, TensorFlow, and PyTorch for data manipulation, model development, training, evaluation, and lightweight ML tasks.

Medium, Technical
Given per-user time-series data, implement rolling-window feature generation that computes the mean, std, min, and max over the past 7 days for each user using pandas. Show an efficient approach using groupby + rolling that avoids creating unnecessary intermediate DataFrames, and explain the index requirements for time-based rolling windows within a groupby. Provide code and discuss performance implications.
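One possible starting point for this question is sketched below. The column names `user_id`, `timestamp`, and `value`, the function name, and the output column prefix are assumptions made for illustration. The sketch relies on the fact that a time-based window such as `"7D"` needs a datetime index that is sorted within each group.

```python
import pandas as pd


def rolling_user_features(
    df: pd.DataFrame,
    user_col: str = "user_id",      # assumed column name
    time_col: str = "timestamp",    # assumed column name (datetime64)
    value_col: str = "value",       # assumed column name
    window: str = "7D",
) -> pd.DataFrame:
    """Per-user rolling mean/std/min/max over a trailing time window."""
    # Sort once so the DatetimeIndex is monotonic within every user group.
    ordered = df.sort_values([user_col, time_col]).set_index(time_col)

    # Select the single value column before rolling so only that Series
    # is aggregated. The default window is (t - 7D, t], which includes the
    # current row; pass closed="left" to rolling() to use strictly past rows.
    rolled = (
        ordered.groupby(user_col)[value_col]
        .rolling(window)
        .agg(["mean", "std", "min", "max"])
    )

    # Result index is (user, timestamp); flatten it back into columns.
    return rolled.add_prefix(f"{value_col}_{window}_").reset_index()
```

On performance: the sort costs O(n log n), and the rolling aggregations run in compiled code rather than through a Python-level `apply` per group, which is usually the main saving. Selecting the value column before `.rolling()` keeps the intermediate result to one Series per statistic.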
Hard, Technical
You need to train a large transformer model on a single GPU with 16 GB of memory. Describe concrete techniques to reduce memory usage: gradient checkpointing, mixed precision, parameter sharding, activation offloading, reducing batch size, and optimizer state optimizations. For each technique, explain the expected memory savings, the performance impact, and how to enable it in PyTorch (conceptually or with libraries).
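When discussing expected savings, it can help to start from a back-of-the-envelope estimate of the per-parameter training state. The sketch below is only that arithmetic, not a profiler: it counts weights, gradients, and optimizer state, and deliberately ignores activations, temporary buffers, fragmentation, and framework overhead. The names `StateBytes` and `training_state_gib` and the byte counts chosen for each preset are illustrative assumptions.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class StateBytes:
    """Bytes stored per parameter for each part of the training state."""
    weights: int
    grads: int
    optimizer: int


def training_state_gib(n_params: int, per_param: StateBytes) -> float:
    """Memory for weights + gradients + optimizer state, in GiB."""
    total_bytes = n_params * (
        per_param.weights + per_param.grads + per_param.optimizer
    )
    return total_bytes / GIB


# fp32 weights and grads, Adam keeps two fp32 moments (2 * 4 bytes).
FP32_ADAM = StateBytes(weights=4, grads=4, optimizer=8)

# One common mixed-precision layout (assumed here): fp16 weights and grads,
# with an fp32 master copy plus two fp32 Adam moments held by the optimizer.
MIXED_ADAM = StateBytes(weights=2, grads=2, optimizer=12)

# Assumed 8-bit optimizer: two moments at 1 byte each, fp32 weights and grads.
FP32_ADAM_8BIT = StateBytes(weights=4, grads=4, optimizer=2)

if __name__ == "__main__":
    for name, preset in [
        ("fp32 + Adam", FP32_ADAM),
        ("mixed precision + Adam", MIXED_ADAM),
        ("fp32 + 8-bit Adam", FP32_ADAM_8BIT),
    ]:
        print(f"{name}: {training_state_gib(1_000_000_000, preset):.2f} GiB")
```

Under these assumptions both Adam presets come to 16 bytes per parameter, so a 1-billion-parameter model needs roughly 14.9 GiB before any activations are stored. This suggests that mixed precision mainly reduces activation memory, while shrinking or offloading the optimizer state is what reduces the model-state footprint.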
Medium, Technical
Implement a PyTorch Dataset and DataLoader to load images from a directory with on-the-fly augmentation using torchvision.transforms. Additionally, implement a collate_fn to handle variable-length labels (e.g., variable-length captions) so the DataLoader returns a batch of images and a list of caption tensors. Comment on using num_workers and pin_memory for performance.
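The collate step is usually the least familiar part of this question, so here is a minimal sketch of it. `ToyCaptionDataset` is a hypothetical in-memory stand-in for the image-directory dataset: a real version would open each file (for example with PIL) and apply a `torchvision.transforms` pipeline in `__getitem__`. The sketch assumes every image has the same shape after the transforms, since `torch.stack` requires that.

```python
from typing import List, Tuple

import torch
from torch.utils.data import DataLoader, Dataset


class ToyCaptionDataset(Dataset):
    """Hypothetical stand-in: yields (image, caption) pairs from memory."""

    def __init__(self, caption_lengths: List[int], image_size: int = 8):
        self.caption_lengths = caption_lengths
        self.image_size = image_size

    def __len__(self) -> int:
        return len(self.caption_lengths)

    def __getitem__(self, idx: int) -> Tuple[torch.Tensor, torch.Tensor]:
        # A real dataset would load an image file and apply augmentation here.
        image = torch.zeros(3, self.image_size, self.image_size)
        # Caption as a 1-D tensor of token ids with a per-sample length.
        caption = torch.arange(self.caption_lengths[idx])
        return image, caption


def caption_collate(
    batch: List[Tuple[torch.Tensor, torch.Tensor]]
) -> Tuple[torch.Tensor, List[torch.Tensor]]:
    """Stack same-shaped images; keep variable-length captions as a list."""
    images, captions = zip(*batch)
    return torch.stack(images, dim=0), list(captions)


loader = DataLoader(
    ToyCaptionDataset([3, 5, 2, 4]),
    batch_size=2,
    shuffle=False,
    num_workers=0,        # raise this to load and augment in worker processes
    collate_fn=caption_collate,
)
```

Setting `num_workers` above 0 moves loading and augmentation into worker processes, which tends to help when decoding images is the bottleneck. `pin_memory=True` is typically worth enabling only when batches are copied to a CUDA device, where page-locked host memory speeds up the transfer.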
Medium, Technical
Implement a function mad_outlier_mask(x: np.ndarray, thresh: float = 3.5) -> np.ndarray that returns a boolean mask marking outliers based on the Median Absolute Deviation (MAD). Use vectorized NumPy operations (no Python loops), handle NaNs by ignoring them in medians, and return a mask of the same shape as x.
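A minimal sketch of one way to answer this is shown below. It assumes the usual modified z-score convention, in which deviations are scaled by 0.6745 before being compared with the threshold; the question does not state the scaling, so treat that constant as an assumption. Returning no outliers when the MAD is zero or undefined is also a design choice made here, not a requirement of the question.

```python
import numpy as np


def mad_outlier_mask(x: np.ndarray, thresh: float = 3.5) -> np.ndarray:
    """Boolean mask of outliers using the Median Absolute Deviation."""
    x = np.asarray(x, dtype=float)

    # nanmedian ignores NaNs when computing both medians.
    med = np.nanmedian(x)
    mad = np.nanmedian(np.abs(x - med))

    # Assumption: with MAD == 0 (over half the values identical) or an
    # all-NaN input, nothing is flagged rather than dividing by zero.
    if not np.isfinite(mad) or mad == 0:
        return np.zeros(x.shape, dtype=bool)

    # Modified z-score; 0.6745 makes the MAD comparable to a standard
    # deviation for normally distributed data.
    modified_z = 0.6745 * (x - med) / mad

    # NaN entries compare as False, so they are never marked as outliers.
    return np.abs(modified_z) > thresh
```

All operations are elementwise or reductions over the whole array, so the mask keeps the input's shape, including for multi-dimensional inputs.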
Easy, Technical
Write a Python function compute_mean_std(nums: List[float]) -> Tuple[float, float] that computes the mean and sample standard deviation in a single pass without using NumPy. Your implementation should be numerically stable for long streams of values (hint: use Welford's algorithm). Provide the function code and briefly explain time and space complexity and why this approach is numerically stable.
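As a reference point, Welford's update can be sketched as follows. The handling of edge cases (raising on empty input, returning NaN for a single value because the sample standard deviation is undefined) is an assumption; the question does not specify it.

```python
import math
from typing import List, Tuple


def compute_mean_std(nums: List[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation in one pass (Welford's algorithm)."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean

    for x in nums:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n         # update the mean incrementally
        m2 += delta * (x - mean)  # uses both the old and the new mean

    if n == 0:
        raise ValueError("compute_mean_std requires at least one value")
    if n == 1:
        # Assumption: sample std is undefined for one value, so return NaN.
        return mean, float("nan")

    return mean, math.sqrt(m2 / (n - 1))
```

The loop runs in O(n) time and O(1) extra space. It is numerically stable because it accumulates squared deviations from the running mean directly, instead of subtracting two large, nearly equal quantities as the sum-of-squares formula does.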
