InterviewStack.io LogoInterviewStack.io

Python Data Structures and Algorithms Questions

Core Python data structure and algorithm knowledge used for manipulating collections and solving common data processing problems. Candidates should know built in types such as lists, dictionaries, sets, and tuples and their performance characteristics; be able to implement and reason about searching, sorting, counting, deduplication, and frequency analysis tasks; and choose appropriate algorithms and data structures for time and space efficiency. Familiarity with Python standard library utilities such as collections.Counter, defaultdict, deque, and heapq is expected, as is writing Pythonic, clear code that handles edge cases. Questions may include algorithmic trade offs, complexity analysis, and applying these techniques to practical data manipulation problems where custom logic is required beyond what pandas or NumPy provide.

HardTechnical
22 practiced
Design and implement a compressed prefix tree (radix / PATRICIA trie) in Python that supports insert(word), search(word), and starts_with(prefix). Focus on memory efficiency and speed when storing millions of short strings (e.g., 10M keys). Discuss trade-offs versus a hash table for prefix queries and how to reduce memory overhead in Python.
MediumTechnical
22 practiced
You need to generate combinations of 10 features taken 3 at a time for feature engineering but some feature pairs are invalid. Implement a Python generator using itertools.combinations that prunes invalid combinations early and stops after N valid combinations. Provide code and explain how pruning reduces work and memory use.
MediumTechnical
17 practiced
Explain Python's built-in sorted() stability and the properties of Timsort. In a pipeline where you need to sort records by (primary_key, timestamp) and maintain stability, describe an efficient approach that minimizes memory overhead and respects ordering. When is it better to sort once with a composite key vs multiple stable sorts?
EasyTechnical
24 practiced
Write a Python function top_k(file_path: str, k: int) -> List[int] that reads a file where each line contains a single integer and returns the k largest integers. The file is too large to load into memory. Use heapq and design for O(n log k) time and O(k) memory. Handle k <= 0 and malformed lines gracefully.
MediumTechnical
21 practiced
You have 1000 shard log files each with user IDs per line. Implement a memory-efficient Python approach to compute the top 100 most frequent user_ids across all shards. Consider both a single-machine approach and a distributed/MapReduce-style approach. Use Counters/local top-k and merging strategies to avoid keeping all counts in memory at once.

Unlock Full Question Bank

Get access to hundreds of Python Data Structures and Algorithms interview questions and detailed answers.

Sign in to Continue

Join thousands of developers preparing for their dream job.