InterviewStack.io

Python Programming & ML Libraries Questions

Python programming language fundamentals (syntax, data structures, control flow, error handling) with practical usage of machine learning libraries such as NumPy, pandas, scikit-learn, TensorFlow, and PyTorch for data manipulation, model development, training, evaluation, and lightweight ML tasks.

Medium | Technical
You find a Python loop performing element-wise computations on large NumPy arrays that is a performance hotspot. Describe how you would profile the code to find the bottleneck and convert the loop into vectorized NumPy operations. Provide a before-and-after snippet for a concrete example and discuss memory trade-offs (copying vs views).
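One possible before-and-after for this question is sketched below. The function names `scaled_relu_loop` and `scaled_relu_vectorized`, the array size, and the choice of a scaled ReLU as the element-wise computation are illustrative assumptions, not part of the question. In practice a profiler such as `cProfile` or `line_profiler` would first confirm that the loop is the hotspot; here `timeit` is used only for a rough comparison.

```python
import timeit

import numpy as np


def scaled_relu_loop(x, scale):
    """Baseline: explicit Python loop, one interpreter round-trip per element."""
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        out[i] = x[i] * scale if x[i] > 0 else 0.0
    return out


def scaled_relu_vectorized(x, scale):
    """Vectorized equivalent: the element-wise work runs in compiled NumPy code."""
    return np.where(x > 0, x * scale, 0.0)


rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)

loop_result = scaled_relu_loop(x, 2.0)
vec_result = scaled_relu_vectorized(x, 2.0)
same = np.allclose(loop_result, vec_result)

# Rough timing comparison (absolute numbers depend on the machine).
loop_seconds = timeit.timeit(lambda: scaled_relu_loop(x, 2.0), number=3)
vec_seconds = timeit.timeit(lambda: scaled_relu_vectorized(x, 2.0), number=3)

# Memory trade-off: basic slicing returns a view that shares the buffer,
# while boolean (or fancy) indexing allocates a new array.
view = x[::2]
copy = x[x > 0]
view_shares = np.shares_memory(x, view)
copy_shares = np.shares_memory(x, copy)
```

Note that the vectorized version allocates temporaries (`x > 0` and `x * scale`) before building the result, so it trades extra peak memory for speed; in-place operations or the `out=` argument of NumPy ufuncs can reduce that when memory is tight.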
Medium | Technical
Given time-series data per user, implement rolling-window feature generation that computes mean, std, min, and max over the past 7 days per user using pandas. Show an efficient approach using groupby + rolling that avoids creating unnecessary intermediate DataFrames, and explain the index requirements for groupby. Provide code and discuss performance implications.
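A minimal sketch of the groupby + rolling approach follows. The column names `user_id`, `ts`, and `value` and the toy data are assumptions made for illustration. A time-based window such as `"7D"` needs a monotonic `DatetimeIndex`, so the frame is sorted by timestamp before the index is set. With the default window closing, the current row is included and a row exactly 7 days older is excluded.

```python
import pandas as pd

# Toy data (hypothetical): one numeric value per user per timestamp.
df = pd.DataFrame(
    {
        "user_id": ["a", "a", "a", "b", "b"],
        "ts": pd.to_datetime(
            ["2024-01-01", "2024-01-03", "2024-01-10", "2024-01-01", "2024-01-02"]
        ),
        "value": [1.0, 3.0, 5.0, 10.0, 20.0],
    }
)

# Time-based rolling windows require a sorted DatetimeIndex.
features = (
    df.sort_values("ts")
    .set_index("ts")
    .groupby("user_id")["value"]
    .rolling("7D")
    .agg(["mean", "std", "min", "max"])
)
# `features` is indexed by (user_id, ts) with one column per statistic.
```

Selecting the single `value` column before calling `rolling` keeps the intermediate objects narrow; the result can be joined back to the original frame on `user_id` and `ts` if the features are needed alongside the raw rows.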
Hard | Technical
Explain numerical stability issues when implementing softmax and cross-entropy from scratch in NumPy for large logits. Provide stable implementations of softmax and log-softmax and show how to compute cross-entropy loss in a numerically stable way. Briefly discuss the Jacobian of softmax and common operations to avoid overflow/underflow.
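The following is one way to sketch the stable versions, assuming integer class labels and logits of shape `(batch, classes)`. The key step is subtracting the row-wise maximum before exponentiating, which leaves the result unchanged mathematically but keeps every exponent at or below zero. The function names are illustrative choices.

```python
import numpy as np


def softmax(logits):
    """Softmax along the last axis, shifted by the max to avoid overflow."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def log_softmax(logits):
    """Log-softmax via the log-sum-exp trick; never takes log of a tiny softmax."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))


def cross_entropy(logits, labels):
    """Mean negative log-likelihood for integer class labels."""
    log_probs = log_softmax(logits)
    return -np.mean(log_probs[np.arange(labels.shape[0]), labels])


def softmax_jacobian(p):
    """Jacobian of softmax for one probability vector p: diag(p) - p p^T."""
    return np.diag(p) - np.outer(p, p)


# Logits this large would overflow a naive np.exp(logits).
logits = np.array([[1000.0, 1001.0, 1002.0], [-1000.0, 0.0, 1000.0]])
labels = np.array([2, 2])

probs = softmax(logits)
loss = cross_entropy(logits, labels)
jac = softmax_jacobian(probs[0])
```

Computing the loss from `log_softmax` directly, rather than as `np.log(softmax(...))`, avoids taking the logarithm of probabilities that have underflowed to zero.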
Hard | Technical
Describe how you would implement a performance-critical preprocessing step as a Python/C++ extension for speed. Compare pybind11 and Cython: explain the build process, how to pass NumPy arrays without copying, how to manage the GIL, and pitfalls around memory ownership and multi-threading. Include an example function signature and the high-level steps to integrate the extension into a Python pipeline.
Hard | System Design
Design a lightweight A/B testing framework in Python to evaluate two ML models in production. Include components for deterministic bucketing of users, routing logic, logging and instrumentation, statistical significance calculations (t-test, bootstrap), monitoring multiple metrics, and safeguards against optional stopping (peeking). Describe the API and data schema for experiment events.
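The deterministic bucketing component could be sketched as below, using only the standard library. The function name `assign_variant`, its parameters, and the variant labels are hypothetical; hashing the experiment ID together with the user ID is one common way to keep a user's assignment stable within an experiment while staying independent across experiments.

```python
import hashlib


def assign_variant(user_id: str, experiment_id: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a user to 'treatment' or 'control'.

    The same (experiment_id, user_id) pair always yields the same variant,
    with no stored state required.
    """
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and scale it into [0, 1).
    position = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if position < treatment_share else "control"


# Example: assignments for a batch of hypothetical user IDs.
assignments = [assign_variant(f"user-{i}", "model-v2-rollout") for i in range(10_000)]
treatment_fraction = assignments.count("treatment") / len(assignments)
```

Because the assignment is a pure function of its inputs, the routing layer and any offline analysis job can recompute a user's bucket independently, which makes logged experiment events easier to audit.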
