InterviewStack.io

Classification and Regression Fundamentals Questions

Covers the core concepts and distinctions between classification and regression in supervised learning. Classification predicts discrete categories, either binary or multi-class, while regression predicts continuous numerical values. Candidates should understand how to format and encode target variables for each task, the common algorithms in each family, and the theoretical foundations of representative models such as linear regression and logistic regression.

For regression, know least squares estimation, coefficient interpretation, residual analysis, the assumptions of the linear model, R-squared, and common loss and error measures, including mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE).

For classification, know logistic regression with its sigmoid transformation and probability interpretation, decision trees, k-nearest neighbors, and other basic classifiers. Understand loss functions such as cross-entropy and evaluation metrics including accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (ROC AUC).

Also be prepared to discuss model selection, regularization techniques such as L1 and L2 regularization, handling class imbalance, calibration and probability outputs, feature preprocessing and encoding for targets and inputs, and the trade-offs when choosing an approach based on problem constraints and data characteristics.
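As a concrete anchor for the logistic regression material, the sketch below shows the sigmoid transformation and the binary cross-entropy loss in plain Python. The coefficients (`intercept`, `slope`) and the toy data are assumptions chosen for illustration only; nothing is fitted here.

```python
import math

def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean negative log-likelihood of 0/1 labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical coefficients and inputs, chosen only for illustration.
intercept, slope = -1.0, 2.0
xs = [0.0, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 1]

# Linear score -> sigmoid -> probability of the positive class.
probs = [sigmoid(intercept + slope * x) for x in xs]
loss = binary_cross_entropy(ys, probs)
print(probs, loss)
```

The linear score is zero at `x = 0.5`, so the predicted probability there is exactly 0.5. That point is the decision boundary under the usual 0.5 threshold.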

Hard · Technical
Design a simple A/B test to compare a new model that predicts customer upgrade propensity against the current rule-based approach. Specify the test metric(s), sample size considerations, randomization strategy, and how BI dashboards should report interim results safely.
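For the sample size part of this question, a rough starting point is the standard normal-approximation formula for comparing two proportions. The sketch below assumes the test metric is an upgrade conversion rate; the baseline and target rates, significance level, and power are hypothetical values, not figures from any real test.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate users needed per arm for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical rates: 10% upgrade rate today, hoping to detect a lift to 12%.
n = sample_size_per_arm(0.10, 0.12)
print(n)  # roughly 3,800 users per arm under these assumptions
```

Smaller expected lifts increase the required sample quickly, because the effect size is squared in the denominator. The formula also assumes a single look at the data, which is one reason interim dashboard results need care: repeated peeking without a correction inflates the false positive rate.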
Easy · Technical
Discuss the implications of using accuracy as the primary metric for a classification problem where classes are imbalanced and one class is business-critical. Provide at least three reasons why accuracy can be misleading and suggest alternative metrics with BI-friendly explanations.
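A small numeric illustration can help when answering this one. The sketch below uses a made-up data set with 10 positives out of 1,000 cases and a trivial model that always predicts the majority class; the counts are assumptions chosen to make the effect obvious.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical imbalanced data: 10 business-critical positives, 990 negatives.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # a "model" that always predicts the majority class

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)  # 0.99 0.0 0.0
```

The model scores 99% accuracy while catching none of the business-critical cases, which recall makes visible immediately.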
Medium · Technical
You're choosing between using K-Nearest Neighbors (KNN) and decision trees for a classification task predicting fraud. Describe the trade-offs in terms of interpretability, feature scaling, handling of categorical variables, and performance on sparse high-dimensional data. Which would you prefer for a BI-driven fraud alerting dashboard and why?
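The feature scaling trade-off can be shown with a tiny 1-nearest-neighbor sketch. The two features (a transaction amount and a ratio between 0 and 1), the training points, and the helper names are all hypothetical. Decision trees split on one feature at a time, so they are largely unaffected by this kind of rescaling, whereas distance-based KNN is not.

```python
import math

def nearest_label(train, query):
    """Return the label of the single nearest training point (1-NN, Euclidean)."""
    return min(train, key=lambda row: math.dist(row[0], query))[1]

def fit_min_max(points):
    """Per-feature (min, max) computed from the training points."""
    return [(min(col), max(col)) for col in zip(*points)]

def apply_min_max(point, ranges):
    """Rescale each feature to [0, 1] using the training ranges."""
    return tuple((v - lo) / (hi - lo) for v, (lo, hi) in zip(point, ranges))

# Hypothetical features: (transaction amount in dollars, ratio in [0, 1]).
train = [((1000.0, 0.1), "legit"), ((1300.0, 0.9), "fraud")]
query = (1100.0, 0.9)

raw_prediction = nearest_label(train, query)

ranges = fit_min_max([features for features, _ in train])
scaled_train = [(apply_min_max(f, ranges), label) for f, label in train]
scaled_prediction = nearest_label(scaled_train, apply_min_max(query, ranges))

print(raw_prediction, scaled_prediction)  # legit fraud
```

On raw values the dollar amount dominates the distance, so the query matches the "legit" point. After min-max scaling both features contribute comparably and the nearest neighbor flips to "fraud".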
Hard · Technical
Describe how to incorporate sample weights into training for both regression and classification models when historical sampling favored high-value customers. Explain how sample weighting affects loss functions and how you would validate the weighted model's performance for fairness across customer segments.
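One common way to frame the answer is inverse-probability weighting: each example's loss term is multiplied by a weight, and the total is normalized by the sum of the weights. The sketch below assumes hypothetical sampling rates per segment (0.8 for high-value customers, 0.2 for the rest) and uses made-up targets and predictions.

```python
import math

def weighted_mse(y_true, y_pred, weights):
    """Weighted mean squared error for regression."""
    total = sum(w * (y - p) ** 2 for y, p, w in zip(y_true, y_pred, weights))
    return total / sum(weights)

def weighted_log_loss(y_true, p_pred, weights, eps=1e-12):
    """Weighted binary cross-entropy for classification."""
    total = 0.0
    for y, p, w in zip(y_true, p_pred, weights):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)

# Hypothetical sampling rates: high-value customers were oversampled.
sampling_rate = {"high": 0.8, "low": 0.2}
segments = ["high", "high", "low", "low"]
weights = [1.0 / sampling_rate[s] for s in segments]  # inverse-probability weights

y_true = [10.0, 12.0, 3.0, 4.0]
y_pred = [11.0, 12.0, 5.0, 4.0]

unweighted = weighted_mse(y_true, y_pred, [1.0] * len(y_true))
weighted = weighted_mse(y_true, y_pred, weights)
print(unweighted, weighted)  # 1.25 1.7
```

The weighted error is higher here because the largest miss falls on an under-sampled "low" customer, whose weight is four times that of a "high" customer. For validation, computing the same metrics separately per segment shows whether the weighting improved the under-represented group or simply shifted error around.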
Easy · Technical
Explain the difference between mean absolute error (MAE) and mean squared error (MSE) in terms of business impact. Provide an example where MAE is preferred for reporting to product managers and another where MSE (or RMSE) better reflects the business objective.
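A quick numeric comparison makes the distinction concrete. In the sketch below, two hypothetical forecasts have the same MAE, but one of them concentrates its error in a single large miss; the values are invented for illustration.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average size of the misses."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large misses more heavily."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))

y_true = [100, 100, 100, 100, 100]
forecast_spiky = [101, 99, 101, 99, 110]    # four small misses and one large one
forecast_even = [103, 103, 103, 103, 102]   # consistently moderate misses

print(mae(y_true, forecast_spiky), rmse(y_true, forecast_spiky))  # 2.8, about 4.56
print(mae(y_true, forecast_even), rmse(y_true, forecast_even))    # 2.8, about 2.83
```

MAE reads as "the typical miss is 2.8 units" for both forecasts, which is easy to report. RMSE separates them, which matters when a single large miss is disproportionately costly to the business.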
