InterviewStack.io

Airbnb-Specific Data Patterns Questions

Domain-specific data modeling and analytics patterns used in Airbnb-scale product analytics. Covers data schema design, event and transaction patterns, feature engineering templates for predictive models, cohort and lifecycle analytics, geospatial and temporal data patterns, price and demand forecasting signals, A/B testing data patterns, and data quality, governance, and lineage considerations relevant to Airbnb data.

Medium · Technical
For a bookings fact table ingesting ~5TB/day of Parquet data, recommend partitioning and clustering strategies to speed queries commonly filtered by city, date, and host_id. Discuss partition key selection, file size targets, clustering/ordering to improve pruning, and maintenance tasks such as compaction and repartitioning.
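As a rough starting point for the file-size part of an answer, the arithmetic below estimates how many Parquet files one daily partition would hold at a few target file sizes, and shows one way to derive a stable bucket from host_id. The date-only partition key, the bucket count of 256, the candidate file sizes, and all helper names are illustrative assumptions, not values given in the question.

```python
import math
import zlib

TB = 1024 ** 4
MB = 1024 ** 2

DAILY_BYTES = 5 * TB   # from the question: ~5TB/day
NUM_BUCKETS = 256      # assumed bucket count, purely illustrative


def files_per_day(daily_bytes: int, target_file_bytes: int) -> int:
    """Number of files in one daily partition at a given target file size."""
    return math.ceil(daily_bytes / target_file_bytes)


def host_bucket(host_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Stable bucket for a host_id (crc32 is deterministic across runs)."""
    return zlib.crc32(host_id.encode("utf-8")) % num_buckets


def partition_path(booking_date: str, host_id: str) -> str:
    """Hypothetical layout: partition by date, bucket by host_id."""
    return f"booking_date={booking_date}/bucket={host_bucket(host_id):03d}"


if __name__ == "__main__":
    for target_mb in (128, 512, 1024):
        n = files_per_day(DAILY_BYTES, target_mb * MB)
        print(f"{target_mb} MB target -> {n} files per daily partition")
    print(partition_path("2024-03-30", "host_123"))
```

City is left out of the partition key here on the assumption that per-city volume is heavily skewed. Sorting rows by city within each file is one alternative, since Parquet row-group min/max statistics can then prune on it.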
Medium · Technical
Describe how you would implement deduplication and late-arrival handling for booking events using Kafka as the source and Spark Structured Streaming as the processor. Include watermark settings, state TTLs, idempotent writes to sinks (e.g., Delta), and strategies when required reprocessing window exceeds the watermark.
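To make the interaction between the watermark and deduplication state concrete, here is a plain-Python simulation. It is not Spark's API: the class name, the per-event watermark update (Spark advances the watermark per micro-batch), and the 10-minute lateness in the usage example are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class WatermarkDeduper:
    """Toy model of event-time dedup with a watermark and bounded state."""

    allowed_lateness: timedelta
    max_event_time: Optional[datetime] = None
    seen: Dict[str, datetime] = field(default_factory=dict)

    def watermark(self) -> Optional[datetime]:
        # Watermark trails the largest event time observed so far.
        if self.max_event_time is None:
            return None
        return self.max_event_time - self.allowed_lateness

    def process(self, event_id: str, event_time: datetime) -> str:
        wm = self.watermark()
        if wm is not None and event_time < wm:
            # Too late for streaming state; route to a side path instead.
            return "late"
        if event_id in self.seen:
            return "duplicate"
        self.seen[event_id] = event_time
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        self._evict()
        return "emitted"

    def _evict(self) -> None:
        # Drop state older than the watermark so memory stays bounded.
        wm = self.watermark()
        expired = [k for k, t in self.seen.items() if t < wm]
        for k in expired:
            del self.seen[k]


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0)
    d = WatermarkDeduper(allowed_lateness=timedelta(minutes=10))
    print(d.process("a", t0))
    print(d.process("a", t0))
    print(d.process("b", t0 + timedelta(minutes=30)))
    print(d.process("a", t0))
```

Events classified as "late" are the ones the streaming job can no longer deduplicate from state. One option, under these assumptions, is to land them in a separate table and reconcile with a keyed batch upsert on event_id, so that the sink write stays idempotent.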
Medium · Technical
Write SQL to compute one-year cohort LTV per acquisition cohort using transactions(user_id, amount, occurred_at). Define cohort_month as the month of the user's first transaction and produce: cohort_month, month_since_cohort, cohort_revenue, cumulative_revenue_per_user, and average_LTV. Explain handling of refunds and incomplete cohorts.
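One way to sketch an answer is shown below, with the SQL run against an in-memory SQLite database so that it can be checked end to end. Several things are assumptions rather than part of the question: refunds are modeled as negative amount rows netted in the month they occur, average_LTV is taken as total 12-month revenue divided by cohort size, occurred_at is stored as ISO-8601 text, and the sample rows are made up. Incomplete cohorts are not filtered out here. The date functions are SQLite's and would need translating for other engines, and the window functions require SQLite 3.25 or newer.

```python
import sqlite3

COHORT_LTV_SQL = """
WITH first_tx AS (
    SELECT user_id, MIN(occurred_at) AS first_at
    FROM transactions
    GROUP BY user_id
),
cohort_size AS (
    SELECT strftime('%Y-%m', first_at) AS cohort_month, COUNT(*) AS users
    FROM first_tx
    GROUP BY cohort_month
),
monthly AS (
    SELECT
        strftime('%Y-%m', f.first_at) AS cohort_month,
        (CAST(strftime('%Y', t.occurred_at) AS INTEGER)
           - CAST(strftime('%Y', f.first_at) AS INTEGER)) * 12
        + (CAST(strftime('%m', t.occurred_at) AS INTEGER)
           - CAST(strftime('%m', f.first_at) AS INTEGER)) AS month_since_cohort,
        SUM(t.amount) AS cohort_revenue   -- refunds are negative rows
    FROM transactions AS t
    JOIN first_tx AS f ON f.user_id = t.user_id
    GROUP BY cohort_month, month_since_cohort
)
SELECT
    m.cohort_month,
    m.month_since_cohort,
    m.cohort_revenue,
    SUM(m.cohort_revenue) OVER (
        PARTITION BY m.cohort_month ORDER BY m.month_since_cohort
    ) * 1.0 / c.users AS cumulative_revenue_per_user,
    SUM(m.cohort_revenue) OVER (
        PARTITION BY m.cohort_month
    ) * 1.0 / c.users AS average_ltv
FROM monthly AS m
JOIN cohort_size AS c ON c.cohort_month = m.cohort_month
WHERE m.month_since_cohort BETWEEN 0 AND 11   -- one-year horizon
ORDER BY m.cohort_month, m.month_since_cohort
"""


def demo_connection() -> sqlite3.Connection:
    """In-memory database with a few made-up transactions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions (user_id TEXT, amount REAL, occurred_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?, ?)",
        [
            ("u1", 100.0, "2023-01-15 10:00:00"),
            ("u1", 50.0, "2023-02-10 10:00:00"),
            ("u2", 200.0, "2023-01-20 10:00:00"),
            ("u2", -20.0, "2023-03-05 10:00:00"),  # refund
            ("u3", 80.0, "2023-02-01 10:00:00"),
        ],
    )
    return conn


def cohort_ltv(conn: sqlite3.Connection) -> list:
    return conn.execute(COHORT_LTV_SQL).fetchall()


if __name__ == "__main__":
    for row in cohort_ltv(demo_connection()):
        print(row)
```

For incomplete cohorts, one option is to report only months that every user in the cohort has had time to reach, or to flag cohorts younger than twelve months rather than comparing their average_ltv with mature ones.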
Hard · Technical
Discuss trade-offs between using a Delta-like upsert (MERGE / CDC) approach versus an append-only immutable event store for bookings and pricing data at Airbnb. Consider query latency, storage costs, backfills, auditability, schema evolution, reconciliation, and regulatory audits. Provide scenarios where each approach is preferable.
Easy · Technical
Given an events table with schema: events(event_id STRING PRIMARY KEY, user_id STRING, event_name STRING, occurred_at TIMESTAMP, listing_id STRING, properties JSON), write a SQL query (Postgres / BigQuery style) to compute Daily Active Users (DAU) and unique guests per listing per day for the last 30 days. Ensure that multiple events by the same user on the same day count as one active user. Explain your query choices and performance considerations.
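A minimal sketch of both queries follows, run against in-memory SQLite so the results can be verified. The sample rows, the :as_of parameter (used instead of the current date to keep the example deterministic), and the choice to treat any user with an event on a listing as a "guest" are assumptions. SQLite's date() stands in for DATE(occurred_at) in BigQuery or occurred_at::date in Postgres.

```python
import sqlite3

# 30-day window ending on :as_of (inclusive), expressed as a range on
# occurred_at so an index or partition on that column can still be used.
DAU_SQL = """
SELECT date(occurred_at) AS activity_date,
       COUNT(DISTINCT user_id) AS dau
FROM events
WHERE occurred_at >= date(:as_of, '-29 days')
  AND occurred_at <  date(:as_of, '+1 day')
GROUP BY activity_date
ORDER BY activity_date
"""

GUESTS_PER_LISTING_SQL = """
SELECT date(occurred_at) AS activity_date,
       listing_id,
       COUNT(DISTINCT user_id) AS unique_guests
FROM events
WHERE occurred_at >= date(:as_of, '-29 days')
  AND occurred_at <  date(:as_of, '+1 day')
  AND listing_id IS NOT NULL
GROUP BY activity_date, listing_id
ORDER BY activity_date, listing_id
"""


def demo_connection() -> sqlite3.Connection:
    """In-memory database with a few made-up events."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE events (
               event_id TEXT PRIMARY KEY, user_id TEXT, event_name TEXT,
               occurred_at TEXT, listing_id TEXT, properties TEXT)"""
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("e1", "u1", "view", "2024-03-30 09:00:00", "l1", "{}"),
            ("e2", "u1", "view", "2024-03-30 18:00:00", "l1", "{}"),
            ("e3", "u2", "view", "2024-03-30 10:00:00", "l1", "{}"),
            ("e4", "u2", "search", "2024-03-30 11:00:00", None, "{}"),
            ("e5", "u3", "view", "2024-03-31 08:00:00", "l2", "{}"),
            ("e6", "u1", "view", "2024-01-01 08:00:00", "l1", "{}"),
        ],
    )
    return conn


def daily_active_users(conn: sqlite3.Connection, as_of: str) -> list:
    return conn.execute(DAU_SQL, {"as_of": as_of}).fetchall()


def guests_per_listing(conn: sqlite3.Connection, as_of: str) -> list:
    return conn.execute(GUESTS_PER_LISTING_SQL, {"as_of": as_of}).fetchall()


if __name__ == "__main__":
    conn = demo_connection()
    print(daily_active_users(conn, "2024-03-31"))
    print(guests_per_listing(conn, "2024-03-31"))
```

COUNT(DISTINCT user_id) is what collapses repeated events by the same user on the same day into one active user. Filtering on the raw occurred_at column, rather than on date(occurred_at), keeps the predicate usable for index or partition pruning.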
