This track is for interview rounds where you are asked to code, not just explain concepts.
It complements the rest of the repo by focusing on:
- Python problem solving under time pressure
- SQL query writing for analytics and data workflows
- algorithm selection and complexity trade-offs
- practice topics that show up in ML engineer, data engineer, analytics engineer, and AI platform interviews
For Python, prioritize these topics in order:
- Arrays, strings, hash maps, sets
- Sorting, heaps, binary search
- Stacks, queues, recursion
- Trees and graphs with BFS and DFS
- Sliding window, two pointers, prefix sums
- Dynamic programming basics
- Matrix and tabular manipulation patterns
- Practical data-processing utilities in Python
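As one illustration of the sliding-window pattern from the list above, here is a minimal sketch that finds the length of the longest substring without repeated characters. The function name `longest_unique_substring` is hypothetical and chosen only for this example; it is one common way to write the pattern, not the only one.

```python
def longest_unique_substring(s: str) -> int:
    """Return the length of the longest substring with no repeated characters."""
    last_seen = {}  # character -> index of its most recent occurrence
    start = 0       # left edge of the current window
    best = 0
    for i, ch in enumerate(s):
        # If ch already occurs inside the window, move the left edge past it.
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = i
        best = max(best, i - start + 1)
    return best


if __name__ == "__main__":
    print(longest_unique_substring("abcabcbb"))
```

Each character is visited once and the dictionary gives constant-time lookups on average, so the scan is O(n) time with O(k) extra space for k distinct characters.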
For SQL, prioritize these topics in order:
- `SELECT`, `WHERE`, `GROUP BY`, `HAVING`
- Joins and null handling
- Window functions
- CTEs and layered query design
- Deduplication and ranking
- Time-series aggregation and date bucketing
- Funnel, retention, and cohort patterns
- Performance reasoning and data-model awareness
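The deduplication and ranking item above can be practiced locally with Python's built-in `sqlite3` module. The sketch below keeps the most recent row per user with `ROW_NUMBER()` inside a CTE. The table `user_updates`, its columns, and the helper `latest_row_per_user` are hypothetical names made up for this example, and it assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Rank rows within each user by recency, then keep only the top-ranked row.
DEDUP_SQL = """
WITH ranked AS (
    SELECT
        user_id,
        email,
        updated_at,
        ROW_NUMBER() OVER (
            PARTITION BY user_id
            ORDER BY updated_at DESC
        ) AS rn
    FROM user_updates
)
SELECT user_id, email, updated_at
FROM ranked
WHERE rn = 1
ORDER BY user_id
"""


def latest_row_per_user(rows):
    """Load (user_id, email, updated_at) tuples and return one row per user."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE user_updates (user_id INTEGER, email TEXT, updated_at TEXT)"
        )
        conn.executemany("INSERT INTO user_updates VALUES (?, ?, ?)", rows)
        return conn.execute(DEDUP_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        (1, "a@old.example", "2024-01-01"),
        (1, "a@new.example", "2024-03-01"),
        (2, "b@example.com", "2024-01-15"),
        (2, "b@example.com", "2024-02-01"),
    ]
    for row in latest_row_per_user(sample):
        print(row)
```

If two rows for the same user share the same `updated_at`, `ROW_NUMBER()` picks one arbitrarily, so in an interview it is worth stating a tie-breaker column in the `ORDER BY`.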
Use this track with these concept-heavy guides:
- Statistics & Probability Guide
- Classical ML Overview
- Feature Engineering
- Time Series
- DuckDB Complete Guide
- dbt Interview Q&A
- Backend System Design Interview Guide
Why these matter:
- Python challenges often test the same thinking you need for feature engineering, preprocessing, metrics, and pipeline code.
- SQL challenges often mirror warehouse analytics, dbt modeling, experimentation analysis, and event data work.
- Backend interview loops regularly combine algorithmic reasoning with system trade-offs.
| Role | Likely coding focus | What to master |
|---|---|---|
| ML Engineer | Python, arrays, metrics, data transforms | hash maps, heaps, matrices, feature logic, complexity |
| AI / Applied AI Engineer | Python, APIs, parsing, async workflows | strings, recursion, queues, graph traversal, data shaping |
| Data Engineer | SQL and Python scripting | joins, windows, CTEs, dedup, batching, file transforms |
| Analytics Engineer | SQL-heavy | aggregations, windows, date logic, dbt-style layered queries |
| MLOps / Platform | Python and systems-flavored coding | queues, retries, parsing, state handling, complexity trade-offs |
- Solve one easy-to-medium array or string problem
- Solve one medium graph, heap, or interval problem
- Review complexity and edge cases out loud
- Write one aggregation query
- Write one window-function query
- Rewrite one query with a cleaner CTE structure
- revisit mistakes
- classify them as logic, syntax, edge case, or complexity mistakes
- write the shortest correct explanation you would give in an interview
You do not need the fanciest solution first. You need to show:
- a correct baseline
- awareness of time and space complexity
- clarity on edge cases
- ability to improve the solution step by step
- readable Python or SQL under pressure
In practice:
- start with the brute-force version if needed
- state its complexity clearly
- then optimize
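The brute-force-then-optimize sequence above can be sketched with the classic pair-sum problem: find indices of two numbers that add up to a target. The function names `two_sum_brute` and `two_sum_hash` are hypothetical and exist only for this illustration.

```python
def two_sum_brute(nums, target):
    """Baseline: check every pair. O(n^2) time, O(1) extra space."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return (i, j)
    return None  # no pair sums to target


def two_sum_hash(nums, target):
    """Optimized: remember seen values in a dict. O(n) average time, O(n) space."""
    seen = {}  # value -> index where it was first needed as a complement
    for j, x in enumerate(nums):
        i = seen.get(target - x)
        if i is not None:
            return (i, j)
        seen[x] = j
    return None


if __name__ == "__main__":
    print(two_sum_brute([2, 7, 11, 15], 9))
    print(two_sum_hash([2, 7, 11, 15], 9))
```

When several pairs are valid, the two versions may return different ones, which is a good edge case to mention out loud before optimizing.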
You are in good shape if you can do the following without much hesitation:
- use a dictionary or set immediately when lookup speed matters
- explain when to choose sorting versus a heap
- write BFS and DFS without searching for syntax
- use `row_number()`, `rank()`, `lag()`, and running aggregates in SQL
- deduplicate rows with a window function
- reason about time-bucketed analytics queries
If not, start with the two practice guides in this track and follow the linked concept guides.
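As a quick self-check against the Python items in the checklist above, the following sketch shows BFS, an iterative DFS, and the sorting-versus-heap choice for a top-k query. The graph representation (a dict of adjacency lists) and all function names are assumptions made for this example.

```python
import heapq
from collections import deque


def bfs_distances(graph, source):
    """Shortest hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:  # first visit is the shortest path in BFS
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def dfs_order(graph, source):
    """Iterative depth-first preorder using an explicit stack."""
    seen = set()
    order = []
    stack = [source]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push in reverse so neighbors are explored in their listed order.
        for nxt in reversed(graph.get(node, [])):
            if nxt not in seen:
                stack.append(nxt)
    return order


def top_k_sorted(values, k):
    """Sort everything: O(n log n). Fine when k is close to n."""
    return sorted(values, reverse=True)[:k]


def top_k_heap(values, k):
    """Heap-based selection: O(n log k). Better when k is much smaller than n."""
    return heapq.nlargest(k, values)


if __name__ == "__main__":
    g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    print(bfs_distances(g, "a"))
    print(dfs_order(g, "a"))
    print(top_k_heap([5, 1, 9, 3, 7], 2))
```

Being able to write these from memory, and to say why a `deque` is used for BFS and why the heap wins for small k, covers several checklist items at once.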