InterviewStack.io LogoInterviewStack.io

Python Programming & ML Libraries Questions

Python programming language fundamentals (syntax, data structures, control flow, error handling) with practical usage of machine learning libraries such as NumPy, pandas, scikit-learn, TensorFlow, and PyTorch for data manipulation, model development, training, evaluation, and lightweight ML tasks.

HardTechnical
0 practiced
Implement a custom autograd Function in PyTorch in Python that computes a numerically stable softmax cross-entropy in one operation to reduce intermediate memory usage. Provide a class MyXent(torch.autograd.Function) with static forward and backward methods, using the log-sum-exp trick for stability, and describe what tensors must be saved in ctx for the backward pass. Also discuss ignore_index handling and per-sample weighting.
MediumTechnical
0 practiced
Write an async Python function using aiohttp and asyncio that concurrently downloads a list of URLs, limits concurrency to N simultaneous requests using a semaphore, reuses a single ClientSession, handles timeouts and cancellations gracefully, and writes responses to disk. Explain when async I/O is preferable for ML pipelines such as fetching many small files from remote stores.
MediumTechnical
0 practiced
Implement in Python an efficient, vectorized function to compute pairwise cosine similarity between rows of a 2D NumPy array X with shape (m, d) and return an (m, m) float32 similarity matrix. Provide code that avoids explicit Python loops, explain memory/time complexity, and describe strategies (chunking, approximate nearest neighbors) to handle very large m (e.g., m = 200k) where a full m x m matrix is infeasible.
EasyTechnical
0 practiced
Write pytest unit tests in Python for a function standardize(array) that returns a zero-mean, unit-variance array per column. Include tests for empty input, constant columns (zero variance), typical numeric arrays, and use pytest.approx for floating tolerance. Also discuss how to mock file I/O or slow network calls in unit tests for data loaders.
EasyTechnical
0 practiced
Given a stream or large list of tuples (user_id, score) implement Python code that computes the top 3 users by total aggregated score in a memory-efficient single pass. The solution must support inputs too large to fit entirely in memory. Provide code using collections.Counter or defaultdict and heapq to keep top-k, explain complexity, and discuss how you would modify it for streaming input (infinite stream).

Unlock Full Question Bank

Get access to hundreds of Python Programming & ML Libraries interview questions and detailed answers.

Sign in to Continue

Join thousands of developers preparing for their dream job.