InterviewStack.io

Classification and Regression Fundamentals Questions

Covers the core concepts and distinctions between classification and regression in supervised learning. Classification predicts discrete categories, either binary or multiclass, while regression predicts continuous numerical values. Candidates should understand how to format and encode target variables for each task, the common algorithms in each family, and the theoretical foundations of representative models such as linear regression and logistic regression. For regression, know least squares estimation, coefficient interpretation, residual analysis, the assumptions of the linear model, R-squared, and common loss and error measures including mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE). For classification, know logistic regression with its sigmoid transformation and probability interpretation, decision trees, k-nearest neighbors, and other basic classifiers; understand loss functions such as cross-entropy and evaluation metrics including accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (ROC AUC). Also be prepared to discuss model selection, regularization techniques such as L1 and L2 regularization, handling class imbalance, calibration and probability outputs, feature preprocessing and encoding for targets and inputs, and trade-offs when choosing approaches based on problem constraints and data characteristics.
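As a quick refresher on the measures named above, here is a minimal pure-Python sketch of the standard textbook definitions. The function names are illustrative choices, not part of any particular library, and binary labels are assumed to be coded as 0 and 1.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: MSE expressed in the target's own units."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error: average absolute residual."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Share of the target's variance explained: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def sigmoid(z):
    """Logistic function mapping a linear score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average negative log-likelihood of 0/1 labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

RMSE and MAE are in the same units as the target, which usually makes them easier to explain to stakeholders than MSE.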

Hard | Technical
26 practiced
A classification model's probability scores were well calibrated during training, but in production you notice overconfident probabilities (e.g., predicted 0.9 but observed rate ~0.6). Describe steps to diagnose the cause and methods to recalibrate probabilities in a BI scoring pipeline.
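One possible way to approach the diagnosis and recalibration is sketched below in pure Python. It assumes you have a sample of recent production scores together with their observed 0/1 outcomes. It builds a binned reliability table, computes expected calibration error (ECE), and fits a Platt-style logistic recalibration by plain gradient descent. The function names, bin count, learning rate, and step count are illustrative assumptions, not a prescribed implementation; isotonic regression is a common alternative when a monotone, non-parametric mapping is preferred.

```python
import math

def reliability_table(probs, labels, n_bins=10):
    """Equal-width bins; returns (mean predicted, observed rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            rate = sum(y for _, y in b) / len(b)
            rows.append((mean_p, rate, len(b)))
    return rows

def expected_calibration_error(probs, labels, n_bins=10):
    """Count-weighted average gap between predicted and observed rates."""
    n = len(probs)
    table = reliability_table(probs, labels, n_bins)
    return sum(cnt / n * abs(mean_p - rate) for mean_p, rate, cnt in table)

def logit(p, eps=1e-6):
    p = min(max(p, eps), 1.0 - eps)  # clip so the log is finite
    return math.log(p / (1.0 - p))

def fit_platt(probs, labels, lr=0.1, steps=2000):
    """Fit a, b so that sigmoid(a * logit(p) + b) matches the observed labels."""
    a, b = 1.0, 0.0  # start from the identity mapping
    z = [logit(p) for p in probs]
    n = len(z)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for zi, yi in zip(z, labels):
            q = 1.0 / (1.0 + math.exp(-(a * zi + b)))
            grad_a += (q - yi) * zi  # gradient of the log loss w.r.t. a
            grad_b += (q - yi)       # gradient of the log loss w.r.t. b
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def apply_platt(p, a, b):
    """Map a raw score to a recalibrated probability."""
    return 1.0 / (1.0 + math.exp(-(a * logit(p) + b)))
```

In a scoring pipeline, the calibrator would typically be fitted on held-out data with known outcomes, never on the training set, and refitted when the reliability table drifts again.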
Medium | Technical
27 practiced
You observe that your regression model's residuals show a pattern when plotted against fitted values, suggesting non-linearity. As a BI analyst, how would you modify features or the modeling approach to address this, and what visualizations would you add to your report to show improvement?
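To make the symptom concrete, the sketch below fits a simple least-squares line to made-up data that is actually quadratic, then refits after transforming the feature. The data, the helper names, and the choice of a squared term are assumptions for illustration; in practice the right transformation (log, polynomial, spline, or a different model family) depends on the data.

```python
def fit_line(x, y):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(x, y, slope, intercept):
    """Observed minus fitted values."""
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]

# Made-up data following y = 2 * x^2 + 1
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * xi ** 2 + 1.0 for xi in x]

# Straight-line fit on the raw feature: residuals trace a U shape
slope_raw, intercept_raw = fit_line(x, y)
res_raw = residuals(x, y, slope_raw, intercept_raw)

# Same model form on the squared feature: the pattern disappears
x_sq = [xi ** 2 for xi in x]
slope_sq, intercept_sq = fit_line(x_sq, y)
res_sq = residuals(x_sq, y, slope_sq, intercept_sq)
```

Plotting `res_raw` and `res_sq` against their fitted values side by side is one way to show the improvement in a report.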
Hard | Technical
29 practiced
A linear regression model's residuals show autocorrelation (Durbin-Watson statistic indicates non-independence). As a BI analyst predicting weekly sales, explain why this matters and outline practical remedies you would apply in the modeling pipeline.
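For reference, the Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The sketch below computes it in pure Python, along with the lag-1 autocorrelation and a lagged feature, which is one of several possible remedies. The function names are illustrative, and residuals are assumed to be ordered by week.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def lag1_autocorrelation(residuals):
    """Sample correlation between each residual and the one before it."""
    n = len(residuals)
    mean_e = sum(residuals) / n
    num = sum((residuals[t] - mean_e) * (residuals[t - 1] - mean_e) for t in range(1, n))
    den = sum((e - mean_e) ** 2 for e in residuals)
    return num / den

def add_lag_feature(series, lag=1):
    """Pair each value with the value `lag` steps earlier; the first `lag` rows are dropped."""
    return [(series[t - lag], series[t]) for t in range(lag, len(series))]
```

For example, `add_lag_feature(weekly_sales, lag=1)` (with `weekly_sales` a hypothetical list of weekly totals) yields `(previous_week, current_week)` pairs that can feed last week's sales into the model as a predictor.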
Hard | Technical
25 practiced
You suspect a model is exhibiting bias against a particular customer demographic. Outline a checklist of analyses and remediation steps you would take as a BI analyst to detect, quantify, and mitigate bias while maintaining model utility.
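As a starting point for the quantification step, the sketch below computes two commonly used per-group measures: the selection rate (share of positive predictions) and the true positive rate, plus each group's ratio to a reference group. The group labels, function names, and choice of metrics are assumptions for illustration. Which fairness measure is appropriate depends on the business and legal context.

```python
from collections import defaultdict

def group_rates(groups, y_true, y_pred):
    """Per-group selection rate and true positive rate for 0/1 labels and predictions."""
    stats = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "tp": 0})
    for g, t, p in zip(groups, y_true, y_pred):
        s = stats[g]
        s["n"] += 1
        s["pred_pos"] += p
        s["pos"] += t
        s["tp"] += 1 if (t == 1 and p == 1) else 0
    out = {}
    for g, s in stats.items():
        out[g] = {
            "selection_rate": s["pred_pos"] / s["n"],
            # TPR is undefined for a group with no actual positives
            "tpr": s["tp"] / s["pos"] if s["pos"] else float("nan"),
        }
    return out

def disparity(rates, metric, reference_group):
    """Ratio of each group's metric to the reference group's value."""
    ref = rates[reference_group][metric]
    return {g: r[metric] / ref for g, r in rates.items()}
```

Ratios well below 1 for a group flag a gap worth investigating. They do not by themselves establish the cause, so they are best read alongside sample sizes and the base rates in each group.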
Hard | Technical
23 practiced
A BI dashboard shows predicted sales and prediction intervals from a linear regression model. Explain how you would compute 95% prediction intervals for individual predictions and how these differ from 95% confidence intervals for the mean prediction. Give the intuition behind the formulas and describe how you would present this in a dashboard.
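For orientation, the standard ordinary least squares formulas are written out below, assuming independent, normally distributed errors with constant variance. The notation is a choice made here: X is the n-by-p design matrix (including the intercept column), x_0 is the feature vector for the new point, RSS is the residual sum of squares, and t is the 0.975 quantile of the t distribution with n - p degrees of freedom.

```latex
% Point prediction and residual variance estimate
\hat{y}_0 = x_0^\top \hat{\beta}, \qquad
s^2 = \frac{\mathrm{RSS}}{n - p}

% 95% confidence interval for the mean response at x_0
\hat{y}_0 \;\pm\; t_{0.975,\,n-p}\; s \,\sqrt{x_0^\top (X^\top X)^{-1} x_0}

% 95% prediction interval for a single new observation at x_0
\hat{y}_0 \;\pm\; t_{0.975,\,n-p}\; s \,\sqrt{1 + x_0^\top (X^\top X)^{-1} x_0}
```

The extra 1 under the square root is the variance of the new observation's own noise. It makes the prediction interval always wider than the confidence interval, and it does not shrink as the training sample grows.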
