InterviewStack.io

Python Programming & ML Libraries Questions

Python programming language fundamentals (syntax, data structures, control flow, error handling) with practical usage of machine learning libraries such as NumPy, pandas, scikit-learn, TensorFlow, and PyTorch for data manipulation, model development, training, evaluation, and lightweight ML tasks.

Hard | Technical
Implement a custom autograd Function in PyTorch that computes a numerically stable softmax cross-entropy in a single operation to reduce intermediate memory usage. Provide a class MyXent(torch.autograd.Function) with static forward and backward methods, use the log-sum-exp trick for stability, and describe which tensors must be saved in ctx for the backward pass. Also discuss ignore_index handling and per-sample weighting.
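
One possible shape for such a Function is sketched below. It is a minimal illustration, not a reference answer, and it assumes 2-D logits of shape (N, C), integer targets of shape (N,), mean reduction over the non-ignored targets, and an ignore_index passed explicitly as the third argument. Per-sample weighting is left out to keep the sketch short. The tensors saved in ctx are the softmax probabilities, the (sanitized) targets, the validity mask, and the count of valid targets.

```python
import torch


class MyXent(torch.autograd.Function):
    """Fused softmax cross-entropy, averaged over non-ignored targets (sketch)."""

    @staticmethod
    def forward(ctx, logits, target, ignore_index):
        # Log-sum-exp trick: subtract the row maximum before exponentiating.
        shifted = logits - logits.max(dim=1, keepdim=True).values
        log_probs = shifted - shifted.exp().sum(dim=1, keepdim=True).log()

        # Rows whose target equals ignore_index contribute nothing to the loss.
        valid = target != ignore_index
        safe_target = target.masked_fill(~valid, 0)  # any in-range class index works
        nll = -log_probs.gather(1, safe_target.unsqueeze(1)).squeeze(1)
        n_valid = valid.sum().clamp(min=1)
        loss = (nll * valid).sum() / n_valid

        # Save only what backward needs.
        ctx.save_for_backward(log_probs.exp(), safe_target, valid, n_valid)
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        probs, safe_target, valid, n_valid = ctx.saved_tensors
        # d(loss)/d(logits) = (softmax - one_hot(target)) / n_valid on valid rows.
        grad = probs.clone()
        rows = torch.arange(probs.size(0), device=probs.device)
        grad[rows, safe_target] -= 1.0
        grad = grad * valid.unsqueeze(1) / n_valid
        # One return value per forward input; non-tensor inputs get None.
        return grad * grad_output, None, None
```

It would be called as `loss = MyXent.apply(logits, target, -100)`. Comparing the loss and gradients against `torch.nn.functional.cross_entropy` on small inputs is a reasonable sanity check.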
Hard | Technical
Describe quantization-aware training (QAT) versus post-training quantization (static and dynamic) for neural networks in PyTorch. Provide Python code snippets for applying dynamic quantization to an LSTM and static quantization to a CNN, explain calibration datasets, expected accuracy trade-offs, and hardware support considerations. State when to choose each approach for deployment.
Medium | Technical
Explain how automatic differentiation (autograd) works in PyTorch and TensorFlow. Cover reverse-mode vs forward-mode differentiation, dynamic vs static graph construction, tape-based recording, how gradients are accumulated, and examples of when you would implement a custom backward pass. Give brief code hints for custom autograd in PyTorch and gradient tape usage in TensorFlow.
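
To make the reverse-mode and gradient-accumulation parts of this question concrete, here is a toy scalar autodiff written in plain Python. It is a hand-rolled illustration of the idea only: the `Var` class and its fields are hypothetical names, and neither PyTorch nor TensorFlow is implemented this way internally.

```python
class Var:
    """Scalar value that records how it was computed (a tiny dynamic graph)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each entry is (parent_var, local_derivative_of_self_wrt_parent).
        self.parents = parents

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        # Topologically order the recorded graph, then walk it in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                # Accumulate with += because a value may feed several operations.
                parent.grad += local * node.grad


x = Var(3.0)
y = Var(2.0)
z = x * y + x  # dz/dx = y + 1, dz/dy = x
z.backward()
```

Because `x` is used twice, its gradient is the sum of two contributions, which is the same reason framework gradients accumulate across uses of a tensor.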
Medium | Technical
Given a tabular dataset for binary classification with mixed numeric and high-cardinality categorical features, outline a detailed Python preprocessing and feature-engineering plan before training a gradient-boosted tree model. Cover categorical encoding choices (one-hot, target, hashing), strategies to avoid target leakage, K-fold strategies for target encoding, missing value handling, and lightweight feature selection methods. Reference concrete pandas and scikit-learn APIs you would use.
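
The K-fold target-encoding part of this question can be sketched without any libraries. The function below is a simplified illustration under stated assumptions: `oof_target_encode` and its parameters are hypothetical names, folds are assigned by row index modulo `n_folds` (which presumes the rows are already shuffled), and additive smoothing shrinks each category mean toward the training-fold prior. In practice the split would more likely come from scikit-learn's `KFold`.

```python
from collections import defaultdict


def oof_target_encode(categories, targets, n_folds=3, smoothing=1.0):
    """Out-of-fold mean target encoding with additive smoothing (sketch)."""
    n = len(categories)
    encoded = [0.0] * n
    for fold in range(n_folds):
        # Statistics come only from rows outside the held-out fold,
        # so a row's own target never influences its encoding.
        sums, counts = defaultdict(float), defaultdict(int)
        total, rows = 0.0, 0
        for i in range(n):
            if i % n_folds != fold:
                sums[categories[i]] += targets[i]
                counts[categories[i]] += 1
                total += targets[i]
                rows += 1
        prior = total / rows if rows else 0.0

        for i in range(fold, n, n_folds):
            c = categories[i]
            # Categories unseen in the training folds fall back to the prior.
            encoded[i] = (sums[c] + smoothing * prior) / (counts[c] + smoothing)
    return encoded
```

For example, `oof_target_encode(["x", "y", "y", "y"], [1, 0, 0, 0], n_folds=2)` encodes the lone `"x"` row from the other fold's prior rather than from its own label, which is the leakage the out-of-fold scheme is meant to prevent.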
Easy | Technical
Given a stream or large list of (user_id, score) tuples, implement Python code that computes the top 3 users by total aggregated score in a memory-efficient single pass. The solution must support inputs too large to fit entirely in memory. Provide code using collections.Counter or defaultdict together with heapq to keep the top k, explain the complexity, and discuss how you would modify it for an unbounded (infinite) stream.
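
A minimal sketch of the approach the question describes follows. The function name `top_k_users` is hypothetical. The sketch assumes the per-user totals fit in memory even when the raw input does not, so memory grows with the number of distinct users U rather than with the input length.

```python
import heapq
from collections import defaultdict


def top_k_users(pairs, k=3):
    """Return the k (user_id, total_score) pairs with the largest totals."""
    totals = defaultdict(float)
    # Single pass over any iterable, including a generator reading from disk.
    for user_id, score in pairs:
        totals[user_id] += score
    # nlargest scans the U totals once while tracking only k candidates:
    # roughly O(U log k) time and O(k) extra space.
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])


def sample_stream():
    # Stand-in for a large input source; yields one tuple at a time.
    yield from [("a", 5), ("b", 3), ("a", 2), ("c", 10), ("d", 1), ("b", 3)]


result = top_k_users(sample_stream())
```

For an unbounded stream the totals never become final, so one option is to call `heapq.nlargest` on the running totals whenever a snapshot is needed. If U itself is too large for memory, an approximate counting structure would be needed instead.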
