Python Programming & ML Libraries Questions

Covers Python language fundamentals (syntax, data structures, control flow, error handling) together with practical use of machine learning libraries such as NumPy, pandas, scikit-learn, TensorFlow, and PyTorch for data manipulation, model development, training, and evaluation.

Medium | Technical

You find a Python loop performing element-wise computations on large NumPy arrays that is a performance hotspot. Describe how you would profile the code to find the bottleneck and convert the loop into vectorized NumPy operations. Provide a before-and-after snippet for a concrete example and discuss memory trade-offs (copying vs views).
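As a starting point for the before-and-after comparison, here is a minimal sketch, not a definitive answer. The function names and the `(a - b) * scale` computation are illustrative assumptions. In practice the hotspot would first be confirmed with a profiler such as `cProfile` or with `timeit` on the suspect function.

```python
import numpy as np

def scaled_diff_loop(a, b, scale):
    """Before: element-wise Python loop, slow for large arrays."""
    out = np.empty_like(a)
    for i in range(a.shape[0]):
        out[i] = (a[i] - b[i]) * scale
    return out

def scaled_diff_vectorized(a, b, scale, out=None):
    """After: the same computation as whole-array operations."""
    out = np.subtract(a, b, out=out)   # allocates one result array if out is None
    np.multiply(out, scale, out=out)   # in place, so no second temporary
    return out

rng = np.random.default_rng(0)
a = rng.standard_normal(10_000)
b = rng.standard_normal(10_000)

loop_result = scaled_diff_loop(a, b, 0.5)
vec_result = scaled_diff_vectorized(a, b, 0.5)

# Memory trade-off: basic slicing returns a view, fancy indexing returns a copy.
view = a[::2]
copy = a[[0, 2, 4]]
view_shares = np.shares_memory(a, view)
copy_shares = np.shares_memory(a, copy)
```

Writing `(a - b) * scale` directly is shorter but creates an extra temporary array for `a - b`. The `out=` form trades some readability for lower peak memory.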

Medium | Technical

Given per-user time-series data, implement rolling-window feature generation in pandas that computes the mean, std, min, and max over the past 7 days for each user. Show an efficient approach using groupby + rolling that avoids materializing unnecessary intermediate DataFrames, and explain the index requirements for combining groupby with a time-based rolling window. Provide code and discuss performance implications.
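One possible shape for the answer is sketched below. The column names `user_id`, `ts`, and `value` and the toy data are assumptions for illustration. A time-based window such as `"7D"` needs a datetime index that is sorted within each group, so the frame is sorted before the index is set.

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions.
df = pd.DataFrame({
    "user_id": ["a", "a", "a", "b", "b"],
    "ts": pd.to_datetime([
        "2024-01-01", "2024-01-03", "2024-01-10",
        "2024-01-01", "2024-01-02",
    ]),
    "value": [1.0, 3.0, 5.0, 10.0, 20.0],
})

# Offset windows require a monotonic datetime index inside every group.
df = df.sort_values(["user_id", "ts"])

# One grouped rolling pass; no resampling to a dense daily grid is needed.
# The default window is (t - 7 days, t], so it includes the current row.
features = (
    df.set_index("ts")
      .groupby("user_id")["value"]
      .rolling("7D")
      .agg(["mean", "std", "min", "max"])
)
```

The result is indexed by `(user_id, ts)`. Passing `closed="left"` to `rolling` would exclude the current row if only strictly past observations should contribute.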

Hard | Technical

Explain numerical stability issues when implementing softmax and cross-entropy from scratch in NumPy for large logits. Provide stable implementations of softmax and log-softmax and show how to compute cross-entropy loss in a numerically stable way. Briefly discuss the Jacobian of softmax and common operations to avoid overflow/underflow.
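The sketch below illustrates the usual max-subtraction trick under a few assumptions: logits are a 2-D array of shape `(batch, classes)`, labels are integer class indices, and the function names are illustrative. Subtracting the row maximum leaves softmax unchanged but keeps every exponent at or below zero, which avoids overflow.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Softmax with the row maximum subtracted before exponentiating."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def log_softmax(logits, axis=-1):
    """Log-softmax via the log-sum-exp trick; avoids log(0) from underflow."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=axis, keepdims=True))

def cross_entropy(logits, labels):
    """Mean negative log-likelihood for integer labels, from log-softmax."""
    log_probs = log_softmax(logits, axis=-1)
    return -np.mean(log_probs[np.arange(logits.shape[0]), labels])

def softmax_jacobian(p):
    """Jacobian of softmax for one probability vector: diag(p) - p p^T."""
    return np.diag(p) - np.outer(p, p)

# Logits this large would overflow a naive np.exp(logits).
logits = np.array([[1000.0, 1001.0, 1002.0],
                   [-1000.0, 0.0, 1000.0]])
labels = np.array([2, 2])

probs = softmax(logits)
loss = cross_entropy(logits, labels)
jac = softmax_jacobian(probs[0])
```

Computing the loss from `log_softmax` rather than `np.log(softmax(...))` matters when a probability underflows to zero, since the latter would produce `-inf`.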

Hard | Technical

Describe how you would implement a performance-critical preprocessing step as a Python/C++ extension for speed. Compare using pybind11 vs Cython: explain the build process, how to pass NumPy arrays without copying, how to manage the GIL, and pitfalls around memory ownership and multi-threading. Include an example function signature and the high-level steps to integrate it into a Python pipeline.

Hard | System Design

Design a lightweight A/B testing framework in Python to evaluate two ML models in production. Include components for deterministic bucketing of users, routing logic, logging and instrumentation, statistical significance calculations (t-test, bootstrap), monitoring multiple metrics, and safeguards against optional stopping (peeking). Describe the API and data schema for experiment events.
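Two of the components named above, deterministic bucketing and a bootstrap comparison, can be sketched as follows. Everything here is an illustrative assumption rather than a prescribed design: the names `assign_bucket`, `ExperimentEvent`, and `bootstrap_diff_ci`, the event fields, and the choice of SHA-256 for hashing.

```python
import hashlib
from dataclasses import dataclass

import numpy as np

def assign_bucket(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Hash (experiment, user) so the same user always gets the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    fraction = int(digest[:8], 16) / 2**32  # first 32 bits mapped to [0, 1)
    return "treatment" if fraction < treatment_share else "control"

@dataclass(frozen=True)
class ExperimentEvent:
    """Hypothetical event schema; the field names are assumptions."""
    experiment: str
    user_id: str
    variant: str
    metric: str
    value: float
    timestamp: float

def bootstrap_diff_ci(control, treatment, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(treatment) - mean(control)."""
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    treatment = np.asarray(treatment, dtype=float)
    # Resample each arm with replacement, one row of indices per resample.
    c_idx = rng.integers(0, control.size, size=(n_resamples, control.size))
    t_idx = rng.integers(0, treatment.size, size=(n_resamples, treatment.size))
    diffs = treatment[t_idx].mean(axis=1) - control[c_idx].mean(axis=1)
    low, high = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return float(low), float(high)

# Synthetic metric values for demonstration only.
rng = np.random.default_rng(1)
control = rng.normal(0.0, 1.0, 500)
treatment = rng.normal(0.5, 1.0, 500)
ci_low, ci_high = bootstrap_diff_ci(control, treatment)

variant = assign_bucket("user-42", "ranker-v2")
event = ExperimentEvent("ranker-v2", "user-42", variant, "click", 1.0, 0.0)
```

Including the experiment name in the hash input keeps assignments independent across experiments. One simple guard against peeking is to fix the sample size or end date in advance and run the significance calculation only once that point is reached.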
