Assessment of a candidate's practical proficiency across the technology stack and tools relevant to their role. This includes the ability to list and explain hands-on experience with programming languages, frameworks, libraries, cloud platforms, data and machine-learning tooling, analytics and visualization tools, and design and prototyping software. Candidates should demonstrate depth, not just familiarity, by describing specific problems they solved with each tool, trade-offs between alternatives, integration points, deployment and operational considerations, and examples of end-to-end workflows. The description covers developer and data-scientist stacks such as Python and C++, machine-learning frameworks like TensorFlow and PyTorch, cloud providers such as Amazon Web Services, Google Cloud Platform, and Microsoft Azure, as well as design and research tools such as Figma and Adobe Creative Suite. Interviewers may probe for evidence of hands-on tasks, configuration and troubleshooting, performance or cost trade-offs, versioning and collaboration practices, and how the candidate keeps skills current.
Hard · System Design
Product requires sub-second sessionization lookups for personalization while analytics needs hourly aggregated views. Propose an architecture combining stream processing, an online feature store (or low-latency key-value store), and batch materialization for analytics. Address freshness, consistency between online and offline features, cost trade-offs, and monitoring to ensure SLAs.
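One way to reason about this question is with a toy sketch of the two paths: a stream processor updating an online key-value store per event, and a batch job aggregating the same raw events for analytics. Everything here is hypothetical (the in-memory dict stands in for a real store such as Redis or DynamoDB, and the event shape is invented); the point is the consistency check between online and offline views.

```python
from collections import defaultdict

class OnlineFeatureStore:
    """Toy in-memory key-value store standing in for a real online store."""
    def __init__(self):
        self._features = {}

    def update(self, user_id, feature, value):
        self._features.setdefault(user_id, {})[feature] = value

    def get(self, user_id):
        # Sub-second lookup path used by the personalization service.
        return self._features.get(user_id, {})

def process_event(store, event):
    """Streaming step: update session features as each event arrives."""
    user = event["user_id"]
    current = store.get(user).get("session_events", 0)
    store.update(user, "session_events", current + 1)
    store.update(user, "last_event_ts", event["ts"])

def materialize_hourly(events):
    """Batch step: aggregate the same raw events for the analytics view."""
    counts = defaultdict(int)
    for e in events:
        counts[e["user_id"]] += 1
    return dict(counts)

events = [
    {"user_id": "u1", "ts": 1},
    {"user_id": "u1", "ts": 2},
    {"user_id": "u2", "ts": 3},
]
store = OnlineFeatureStore()
for e in events:
    process_event(store, e)

# Consistency check: online counters should match the batch aggregate.
assert store.get("u1")["session_events"] == materialize_hourly(events)["u1"]
```

In an interview answer, the interesting part is what this sketch hides: late or duplicate events make the two paths drift, which is why monitoring usually compares online counters against the batch materialization and alerts on divergence.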
Hard · System Design
Design a logging, tracing, and metrics architecture for a distributed data platform that includes microservices, ETL jobs, and streaming processors. Cover log aggregation (Fluentd/Logstash -> ELK or managed logging), distributed tracing (OpenTelemetry), correlation IDs across systems, sampling strategies, retention policies, cost controls, and methods to trace a single record end-to-end through the platform.
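The correlation-ID part of this question can be illustrated with a minimal sketch: mint one ID at the edge, pass it through every processing step, and attach it to every log line so a single record can be traced end-to-end. This is a hypothetical stand-in; a real system would carry the ID in HTTP or Kafka headers and export spans via OpenTelemetry rather than plain logging.

```python
import logging
import uuid

# Include the correlation ID in every log line so aggregated logs
# (e.g. in ELK) can be filtered down to one record's journey.
logging.basicConfig(format="%(levelname)s %(correlation_id)s %(message)s")
log = logging.getLogger("platform")
log.setLevel(logging.INFO)

def handle_ingest(record, correlation_id=None):
    # Mint the ID at the system edge if the caller did not supply one.
    cid = correlation_id or str(uuid.uuid4())
    log.info("ingested record", extra={"correlation_id": cid})
    return transform(record, cid)

def transform(record, cid):
    # Downstream steps reuse, never regenerate, the same ID.
    log.info("transformed record", extra={"correlation_id": cid})
    return {**record, "correlation_id": cid}

out = handle_ingest({"value": 42})
```

The design choice worth calling out: generating the ID only at the edge (and treating a missing ID downstream as a bug) is what makes end-to-end tracing reliable across microservices, ETL jobs, and stream processors.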
Medium · Technical
A BigQuery table is partitioned by date and clustered by user_id, but queries scanning the last 30 days still read many bytes and cost more than expected. Provide concrete SQL rewrites and table-structure changes to reduce scanned bytes: examples should include partition filters, avoiding non-deterministic expressions on partition columns, column pruning, and when to use materialized views or denormalization.
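A sketch of the kind of rewrite an answer might include, assuming a hypothetical table `project.dataset.events` partitioned on an `event_date` DATE column and clustered by `user_id` (all names invented for illustration):

```sql
-- Anti-pattern: wrapping the partition column in a function defeats
-- partition pruning, and SELECT * reads every column.
SELECT *
FROM project.dataset.events
WHERE FORMAT_DATE('%Y%m%d', event_date)
      >= FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY));

-- Rewrite: compare the partition column directly (prunes partitions) and
-- select only the columns needed (column pruning in columnar storage).
SELECT user_id, event_type, event_ts
FROM project.dataset.events
WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
  AND user_id = @target_user;  -- filtering on the clustering column narrows blocks scanned
```

A strong answer would also note when repeated 30-day aggregations justify a materialized view or a pre-aggregated denormalized table instead of re-scanning raw events.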
Easy · Technical
Design a simple daily data quality pipeline using Great Expectations (or custom validators). Describe important checks (non-null, uniqueness, distributional checks, schema conformance), how you'd schedule and store test results, how to route alerts and remediation, and how to gate downstream pipeline runs when critical checks fail.
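The custom-validator variant of this question can be sketched in a few lines: each check returns pass/fail, and a failed critical check gates downstream runs. The column names and gating rule here are invented for illustration; Great Expectations packages the same ideas as expectation suites, validation results, and checkpoints.

```python
def check_non_null(rows, column):
    # Every row must have a non-null value in `column`.
    return all(r.get(column) is not None for r in rows)

def check_unique(rows, column):
    # No duplicate values allowed in `column`.
    values = [r[column] for r in rows]
    return len(values) == len(set(values))

def check_schema(rows, expected_columns):
    # Every row must have exactly the expected set of columns.
    return all(set(r) == expected_columns for r in rows)

def run_suite(rows):
    """Run all checks; a failed critical check blocks downstream pipelines."""
    results = {
        "order_id_non_null": check_non_null(rows, "order_id"),
        "order_id_unique": check_unique(rows, "order_id"),
        "schema_conforms": check_schema(rows, {"order_id", "amount"}),
    }
    # In a real pipeline this flag would be written alongside the results
    # (e.g. to a results table) and checked by the orchestrator before
    # triggering downstream jobs.
    results["gate_passed"] = all(results.values())
    return results

rows = [{"order_id": 1, "amount": 9.5}, {"order_id": 2, "amount": 3.0}]
report = run_suite(rows)
```

Distributional checks (row-count drift, mean/quantile shifts against a baseline) follow the same shape but need stored history from prior runs, which is why persisting results matters.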
Hard · Technical
Design a governance and compliance framework for data pipelines that handle PII and PHI. Cover discovery and classification of sensitive fields, masking/tokenization techniques, audit trails, RBAC and KMS-based encryption, retention and deletion workflows (right-to-be-forgotten), and how to validate and demonstrate compliance during audits.
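The masking/tokenization portion of this question can be illustrated with a minimal sketch. The key name and field formats are hypothetical; in practice the key would come from a KMS (never hard-coded) and tokenization might use a dedicated vault service instead of keyed hashing.

```python
import hashlib
import hmac

SECRET_KEY = b"rotate-me-via-kms"  # placeholder: fetch from a KMS in practice

def tokenize(value):
    """Deterministic keyed token: joinable across tables, not reversible
    without the key, so analysts can link records without seeing raw PII."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

def mask_email(email):
    """Partial masking for human-facing views: keep the domain, hide
    the local part."""
    local, _, domain = email.partition("@")
    return local[0] + "***@" + domain

record = {"email": "alice@example.com", "ssn": "123-45-6789"}
safe = {
    "email_token": tokenize(record["email"]),
    "email_masked": mask_email(record["email"]),
    "ssn_token": tokenize(record["ssn"]),
}
```

Deterministic tokens support right-to-be-forgotten workflows indirectly: deleting the key (or the vault mapping) renders all tokens for a subject unlinkable, which is a common answer to the deletion-workflow part of the question.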