InterviewStack.io

Scikit Learn, Pandas, and NumPy Usage Questions

These questions test practical proficiency with three core libraries. Pandas: DataFrames, data manipulation, and handling missing values. NumPy: arrays, vectorized operations, and mathematical functions. Scikit-learn: preprocessing, model fitting, evaluation metrics, and pipelines. Candidates should know the standard patterns and APIs and write efficient, readable code with these libraries.

Easy, Technical
71 practiced
Using scikit-learn, write code to compute precision, recall, f1-score, and the confusion matrix for a binary classifier given y_true and y_pred numpy arrays. Explain the conceptual difference between precision and recall and give an example scenario where you would favor precision over recall.
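A minimal sketch of one way to approach this question. The sample labels and the helper name `binary_report` are assumptions made for illustration. Precision is TP / (TP + FP), the share of predicted positives that are correct; recall is TP / (TP + FN), the share of actual positives that were found. A commonly cited case for favoring precision is spam filtering, where sending a legitimate email to the spam folder is usually costlier than letting one spam message through.

```python
import numpy as np
from sklearn.metrics import (
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical labels for illustration; 1 is the positive class.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])


def binary_report(y_true, y_pred):
    """Return the confusion matrix and the main binary metrics."""
    # With labels=[0, 1], rows are true classes and columns are predictions,
    # so the flattened matrix reads tn, fp, fn, tp.
    cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
    tn, fp, fn, tp = cm.ravel()
    return {
        "confusion_matrix": cm,
        "tn": int(tn),
        "fp": int(fp),
        "fn": int(fn),
        "tp": int(tp),
        # zero_division=0 avoids a warning when no positives are predicted.
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
    }


report = binary_report(y_true, y_pred)
print(report["confusion_matrix"])
print(report["precision"], report["recall"], report["f1"])
```

For a per-class text summary, `sklearn.metrics.classification_report(y_true, y_pred)` is another option.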
Hard, Technical
54 practiced
Design a scalable feature selection approach with scikit-learn for 50k features (e.g., TF-IDF features) before training a logistic regression. Compare SelectFromModel using L1-regularized logistic regression, variance thresholding, and mutual information. Provide code using SelectFromModel with LogisticRegression(solver='saga', penalty='l1') and discuss runtime and sparsity trade-offs.
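One possible shape for an answer, sketched on a small synthetic sparse matrix rather than real TF-IDF data. The dataset sizes, `C=1.0`, and `max_iter=200` are assumptions chosen to keep the example fast, not tuned values. A cheap `VarianceThreshold` pass drops constant columns first, then `SelectFromModel` keeps the features whose L1-penalized coefficients are non-zero.

```python
import numpy as np
from scipy import sparse
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel, VarianceThreshold
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Small synthetic stand-in for a wide TF-IDF matrix (assumed sizes).
X_dense, y = make_classification(
    n_samples=200, n_features=500, n_informative=10, random_state=0
)
X = sparse.csr_matrix(X_dense)  # saga accepts sparse input

selector = Pipeline(
    [
        # Cheap first pass: drop zero-variance columns.
        ("variance", VarianceThreshold(threshold=0.0)),
        # L1 penalty drives many coefficients to exactly zero;
        # SelectFromModel keeps the features with non-zero weight.
        (
            "l1",
            SelectFromModel(
                LogisticRegression(
                    solver="saga",
                    penalty="l1",
                    C=1.0,          # smaller C -> stronger penalty -> fewer features
                    max_iter=200,   # may need raising on real data
                    random_state=0,
                )
            ),
        ),
    ]
)

X_selected = selector.fit_transform(X, y)
mask = selector.named_steps["l1"].get_support()
n_selected = int(mask.sum())
print(f"kept {n_selected} of {X.shape[1]} features")
```

On the trade-offs: variance thresholding is a single inexpensive pass but ignores the labels; mutual information (`mutual_info_classif`) scores each feature independently against the target, so it cannot account for redundancy between features; the L1 model considers features jointly, at the cost of fitting a full model whose runtime depends on how quickly `saga` converges. Scaling the features typically helps `saga` converge faster.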
Hard, Technical
75 practiced
Describe how to reliably serialize a scikit-learn Pipeline that contains custom transformer classes and third-party objects so it can be loaded on a different machine for production. Discuss pitfalls of pickle/joblib, define best practices (avoid lambdas, ensure import path for classes, include environment dependencies), and explain when converting to ONNX is advantageous.
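A rough sketch of the save/load round trip, under some assumptions: the data is synthetic, the file names are made up, and `FunctionTransformer(np.log1p)` stands in for a custom transformer. It uses a named, importable function instead of a lambda, because pickle stores functions and classes by import path rather than by value. A real custom transformer class would need to live in a module that is importable under the same path on the target machine.

```python
import json
import os
import tempfile

import joblib
import numpy as np
import sklearn
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.random((40, 3))
y = (X[:, 0] > 0.5).astype(int)

pipe = Pipeline(
    [
        # Named function (np.log1p), not a lambda: lambdas cannot be pickled.
        ("log", FunctionTransformer(np.log1p)),
        ("scale", StandardScaler()),
        ("clf", LogisticRegression()),
    ]
).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "pipeline.joblib")   # hypothetical file name
    meta_path = os.path.join(tmp, "pipeline.meta.json")  # hypothetical file name

    joblib.dump(pipe, model_path)

    # Record library versions next to the artifact so the loading
    # environment can be checked against the training environment.
    with open(meta_path, "w") as fh:
        json.dump(
            {
                "sklearn": sklearn.__version__,
                "numpy": np.__version__,
                "joblib": joblib.__version__,
            },
            fh,
        )

    restored = joblib.load(model_path)
    with open(meta_path) as fh:
        meta = json.load(fh)

same_predictions = bool(np.array_equal(pipe.predict(X), restored.predict(X)))
print(same_predictions, meta)
```

Two pitfalls worth stating in an answer: unpickling runs arbitrary code, so artifacts should only be loaded from trusted sources, and a pickle written with one scikit-learn version is not guaranteed to load correctly in another. Converting to ONNX can be worth considering when the serving environment is not Python or when a version-independent artifact is needed, provided every step in the pipeline has a supported converter.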
Medium, Technical
116 practiced
Using numpy.einsum, implement a function that computes the matrix of pairwise dot products between two 2D arrays A (n x d) and B (m x d). Provide the einsum expression, compare to A.dot(B.T) in simple benchmarks, and explain when einsum is advantageous for more complex contractions.
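A minimal sketch of the einsum approach. The function name `pairwise_dots`, the array sizes, and the timing loop are assumptions for illustration; timings will vary by machine and BLAS build, so none are quoted here. The expression `"id,jd->ij"` sums over the shared feature axis `d` and leaves one row index from each input.

```python
import timeit

import numpy as np


def pairwise_dots(A, B):
    """Return an (n, m) matrix whose [i, j] entry is dot(A[i], B[j])."""
    # i indexes rows of A, j indexes rows of B, d is summed out.
    return np.einsum("id,jd->ij", A, B)


rng = np.random.default_rng(0)
A = rng.standard_normal((200, 64))  # n x d
B = rng.standard_normal((300, 64))  # m x d

G_einsum = pairwise_dots(A, B)
G_dot = A.dot(B.T)
matches = bool(np.allclose(G_einsum, G_dot))

# Simple benchmark; results depend heavily on hardware and the linked BLAS.
t_einsum = timeit.timeit(lambda: pairwise_dots(A, B), number=20)
t_dot = timeit.timeit(lambda: A.dot(B.T), number=20)
print(f"match={matches} einsum={t_einsum:.4f}s dot={t_dot:.4f}s")
```

For a plain matrix product, `A.dot(B.T)` (or `A @ B.T`) is often the faster choice because it dispatches directly to BLAS. `einsum` tends to pay off when a contraction spans several axes or operands, for example a batched product such as `np.einsum("bid,bjd->bij", A3, B3)`, where writing the index pattern once avoids a chain of transposes and reshapes. Passing `optimize=True` lets NumPy choose a contraction order for multi-operand expressions.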
Easy, Technical
55 practiced
Explain the difference between a NumPy view and a copy. Give short Python examples showing slicing that returns a view and when operations produce a copy. Explain memory implications and how to detect whether an array owns its data.
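A short sketch of the behaviors this question asks about; the variable names are illustrative. Basic slicing returns a view that shares the original buffer, while fancy (integer-array) indexing and `.copy()` allocate new memory.

```python
import numpy as np

a = np.arange(10)

# Basic slicing returns a view: no data is copied.
v = a[2:5]
v[0] = 99                      # writes through to `a`
view_writes_through = bool(a[2] == 99)

# Fancy indexing returns a copy with its own memory.
c = a[[2, 3, 4]]
c[0] = -1                      # does not affect `a`
copy_is_independent = bool(a[2] == 99)

# An explicit copy always owns its data.
e = a.copy()

# Ways to inspect ownership and sharing.
print(v.base is a)               # the view's base is the original array
print(v.flags.owndata)           # a view does not own its buffer
print(e.flags.owndata)           # an explicit copy does
print(np.shares_memory(a, v))    # view overlaps the original
print(np.shares_memory(a, c))    # fancy-indexed result does not
```

On memory: a view costs only a small array header, but it keeps the entire base buffer alive, so holding a small slice of a large array prevents the large array from being freed. Calling `.copy()` on the slice releases that dependency at the cost of duplicating the selected data.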
