Interview questions

SQL Practical Interview Questions: 25 Answers Using a Think-Aloud Script

July 3, 2025Updated May 9, 202619 min read
Top 30 Most Common Sql Practical Interview Questions You Should Prepare For

Master SQL practical interview questions with a think-aloud script for joins, NULLs, ranking, and window functions in live interviews.

Most SQL interview failures aren't knowledge failures. You know what a JOIN does. You've written GROUP BY a hundred times. The problem is that the moment a prompt gets slightly unfamiliar — a three-table schema, a ranking question with ties, a left join that produces unexpected NULLs — the gap between knowing the answer and being able to explain it live becomes a chasm. SQL practical interview questions don't test whether you've memorized syntax. They test whether you can think through a data problem out loud while someone is watching.

This article teaches you a repeatable think-aloud script for answering sql practical interview questions clearly, even when you don't immediately know the final query. Every section builds one piece of that script: how to restate the question, how to choose the right SQL shape, how to explain edge cases without bluffing, and how to handle follow-ups before the interviewer asks them.

The script isn't complicated. But most candidates have never practiced it deliberately, which is why so many strong SQL writers come across as weak SQL thinkers.

Start with the Business Question, Not the Query

The most common mistake in SQL interviews isn't a syntax error. It's opening with SELECT before you've confirmed what you're actually trying to compute. Interviewers notice immediately — not because jumping to code is wrong in principle, but because it signals that you're pattern-matching to a memorized answer rather than reasoning about the specific problem in front of you.

How Do You Restate the Ask Without Sounding Lost?

The move is simple: translate the prompt into one plain-English sentence and confirm the grain of the result before you write a single character of SQL. The grain is the level of detail in the output — one row per customer, one row per customer per month, one row per order. Getting this wrong means your query produces the right shape but the wrong answer, and interviewers who've seen hundreds of candidates know exactly when that's happened.

A weak answer to "find our top customers by monthly revenue" looks like this: the candidate immediately writes `SELECT customer_id, SUM(revenue)` and starts adding clauses. A strong answer sounds like: "So we want one row per customer per month, ranked by total revenue within that month — is that right? I'm assuming we're working from an orders table with a customer ID, an order date, and a revenue column." That's it. Two sentences. You've named the grain, named the tables, and confirmed the output shape. You haven't written any SQL yet, and you're already ahead of most candidates. According to SHRM's guidance on structured interviewing, interviewers consistently rate communication of reasoning as a stronger signal of job readiness than solution correctness alone.

What Does a Clean Think-Aloud Intro Actually Sound Like?

Take the prompt: "Find the top 3 customers by monthly revenue." Here's the opening script:

"Okay — so the output I'm imagining is a list of customers, grouped by month, with their total revenue for that month, and I want to keep only the top 3 per month. I'll need to aggregate by customer and month first, then rank within each month, then filter. Let me start with the aggregation step."

That's the whole intro. You've named the time window, the grouping level, the aggregation, and the filtering step — all before touching the keyboard. The interviewer now knows you understand what you're building. Everything after this is mechanics.

Choose the Right SQL Shape Before You Touch the Keyboard

One of the most common SQL interview questions for data analyst roles involves a multi-step problem where several approaches could technically work — JOIN, subquery, HAVING, window function — and the interviewer is watching to see if you know when each one is appropriate. The wrong move is to reach for whichever one you practiced most recently.

When Is JOIN the Right Move, and When Is It Overkill?

JOIN is the right move when you need columns from two different tables in the same output row. A customer's name lives in a `customers` table; their order total lives in an `orders` table. You need both. JOIN is the answer.

JOIN becomes overkill when you only need to check for existence or filter based on a condition in another table. If you want customers who have ever placed an order, you don't need to JOIN and then DISTINCT away duplicates — you need EXISTS or a subquery. The JOIN version works, but it creates an intermediate result set that's wider and messier than the problem requires. In an interview, saying "I could JOIN here, but since I only need to check existence, I'd use EXISTS instead — it's cleaner and the intent is clearer" is worth more than the syntactically correct but verbose JOIN version.

When Should You Reach for a Subquery, HAVING, or a Window Function?

Use this prompt as a test: "Find customers whose monthly spend is above the average monthly spend." Three approaches, three different shapes.

A subquery approach calculates the average first in an inner query, then filters against it in the outer query. This is readable and works well when the average is a single value. HAVING works when you want to filter after aggregation in the same query — `HAVING SUM(revenue) > AVG(SUM(revenue))` — but this gets awkward quickly and isn't supported the same way across all SQL dialects. A window function approach calculates the average alongside each row, which makes the logic transparent but adds a layer of nesting when you need to filter on a window result.

The strong interview answer doesn't pick one and pretend the others don't exist. It says: "I'd start with the subquery approach because it's the most readable and the intent is obvious. If the dataset is large and we're running this repeatedly, I'd look at whether a window function version could reduce the number of passes over the data." You've explained the tradeoff out loud. That's what interviewers are listening for.

How Do You Explain the Grain Before You Write GROUP BY?

Grain confusion is the source of more wrong SQL answers than any other single error. A sales-by-month query that groups on `customer_id` and `month` produces one row per customer per month. A query that groups on just `month` produces one row per month. A query that doesn't group at all produces row-level detail. These are three completely different result sets, and the difference matters enormously.

Before writing GROUP BY, say the grain out loud: "I want one row per customer per month, so I'll group on customer_id and the month extracted from order_date." This forces you to confirm the output shape before you commit to the clause, and it gives the interviewer a clear signal that you're not just copying a pattern. The PostgreSQL documentation on GROUP BY makes this explicit: every column in SELECT that isn't an aggregate must appear in GROUP BY, which means the grain is baked directly into the syntax.

Explain Joins, Duplicates, and NULLs Without Sounding Nervous

Practical SQL interview questions about joins, NULLs, and duplicates are where memorized answers break down fastest, because the interviewer can always add one detail that changes the answer. The goal isn't to have a perfect answer ready. It's to have a framework for reasoning through the edge case out loud.

Why Do Duplicate Rows Show Up After a Join?

Duplicates after a join are almost always a grain mismatch, not a mistake. If a customer has three orders and you join the `customers` table to the `orders` table on `customer_id`, you get three rows for that customer — one per order. That's correct behavior. The "duplication" is the one-to-many relationship expressing itself in the result set.

The interview-safe explanation is: "The join multiplied rows because the orders table has multiple rows per customer. If I want one row per customer, I need to aggregate before or after the join, depending on what I'm computing." This tells the interviewer you understand what multiplication is happening, which is the actual question behind the question.

How Do You Talk Through NULLs Without Bluffing?

A LEFT JOIN between customers and orders will produce NULL in the order columns for any customer who has never placed an order. If an interviewer asks what those NULLs mean, the answer matters. NULL doesn't mean zero. It means the join found no matching row — the data is absent, not zero-valued. A customer with NULL in the revenue column didn't generate zero revenue; they generated no record at all.

The practical consequence: `SUM(revenue)` will ignore NULLs, which might be what you want. But `COUNT(revenue)` will also ignore NULLs, while `COUNT()` will not. If you're counting customers who placed orders, `COUNT(order_id)` and `COUNT()` will give different answers after a LEFT JOIN. Say that out loud in the interview. "I'm using COUNT(order_id) here rather than COUNT(*) because I want to count only customers who have a matching order — the NULLs from the left join should be excluded."

Why Is NOT IN Such a Trap in Interviews?

`NOT IN` is one of the most dangerous constructs in SQL when the subquery can return NULLs, and it's a classic practical SQL interview question precisely because so many candidates don't know why. If the subquery inside `NOT IN` returns even one NULL, the entire NOT IN condition returns no rows. This is because SQL uses three-valued logic: a value compared to NULL is neither true nor false, it's unknown — and `NOT IN` propagates that unknown to the whole result.

The safer alternatives are `NOT EXISTS` or a LEFT JOIN with a `WHERE foreign_key IS NULL` filter. Both handle NULLs predictably. In an interview, the answer is: "I'd avoid NOT IN here because if the subquery returns any NULLs, the whole query returns nothing — which is probably not what we want. I'd use NOT EXISTS instead." The MySQL documentation on NULL comparison behavior documents this behavior explicitly and it's consistent across major dialects.

Use Ranking Patterns Without Getting Cute

SQL query interview questions about ranking and top-N problems are where the gap between memorized syntax and real understanding is most visible. The classic "second highest salary" question has been asked in interviews for fifteen years, and interviewers still use it because the naive answer breaks on ties.

What Is the Cleanest Way to Answer Top-N Questions?

For a simple top-3 customers by total revenue with no grouping by time period, `ORDER BY total_revenue DESC LIMIT 3` is correct and you should say so. Don't reach for a window function when a sort and a limit solves the problem cleanly. The think-aloud version: "I'll aggregate revenue by customer, sort descending, and take the top 3. LIMIT 3 is enough here because there's no grouping dimension — I want the top 3 globally."

The moment you add a time dimension — top 3 per month — LIMIT no longer works. You need to rank within each partition, which means a window function. Say the transition out loud: "Now that we're looking at top 3 per month, I can't use LIMIT because that only applies to the whole result set. I'll use RANK() or DENSE_RANK() partitioned by month."

How Do You Handle the Second-Highest Salary Question?

Three approaches, each with a different failure mode. The MAX subquery (`SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees)`) is readable and works in most cases, but breaks if there are no distinct second-highest values. The DISTINCT approach (`SELECT DISTINCT salary ORDER BY salary DESC LIMIT 1 OFFSET 1`) is cleaner and handles duplicates correctly. The window function approach (`SELECT salary FROM (SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk FROM employees) WHERE rnk = 2`) is the most explicit and the easiest to extend to any nth value.

In an interview, lead with the window function version and explain why: "I'd use DENSE_RANK here because it handles ties — if two people share the highest salary, I still want the actual second-distinct salary level, not the second row." Then mention that the subquery version works for simple cases. You've shown you know the tradeoffs, not just the syntax.

What Do You Say When the Interviewer Adds Ties and Duplicates?

This is the follow-up that breaks memorized answers. If two employees share the highest salary, `ROW_NUMBER()` assigns them different ranks — one of them gets rank 1, the other gets rank 2, and the "second highest" answer changes depending on arbitrary row ordering. `RANK()` gives them both rank 1 and skips rank 2. `DENSE_RANK()` gives them both rank 1 and makes the next distinct salary rank 2.

The answer for most business problems is `DENSE_RANK()`, because it reflects the intuitive meaning of "second highest" — the second distinct value, not the second row. Say that reasoning out loud. "With ties, ROW_NUMBER breaks down because the ordering is arbitrary. DENSE_RANK gives me the second distinct salary level, which is what the question is actually asking for." The PostgreSQL window function documentation covers the behavioral differences between all three in detail.

Solve Real Interview Prompts the Way Strong Candidates Do

How Do You Answer Top 3 Customers by Monthly Revenue?

Full think-aloud arc: "I want one row per customer per month, ranked by total revenue within that month, keeping only the top 3. Step one: aggregate revenue by customer and month. Step two: rank within each month using DENSE_RANK() partitioned by month, ordered by revenue descending. Step three: wrap that in a subquery or CTE and filter where rank is 3 or less."

The SQL follows naturally from the narration. The interviewer has been following your reasoning the entire time, so even if you make a small syntax error, they already know you understand the problem. That's the whole point of the script.

How Do You Answer Second-Highest Salary With Edge Cases?

Weak version: `SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees)`. This is a one-liner that works on clean data and breaks when there are ties at the top or when the table has only one distinct salary value.

Strong version: state the edge cases first, then write the query. "I want the second-distinct salary level, so I'll use DENSE_RANK to handle ties. I'll also need to handle the case where there is no second salary — if everyone has the same salary, this should return NULL rather than an error." The window function version with a NULL-safe outer filter handles both. Saying that out loud before writing a single line of code is what separates a strong answer from a memorized one.

How Do You Explain a Messy Multi-Table Business Question End to End?

Scenario: you have `orders`, `refunds`, and `customers` in separate tables. You need net revenue per customer for the last 90 days. The think-aloud: "I'll start by joining orders to refunds on order_id — that's a LEFT JOIN because not every order has a refund. Then I'll compute net revenue as order_amount minus COALESCE(refund_amount, 0), because refund_amount will be NULL for orders with no refund and I want to treat that as zero. Then I'll join to customers to get the customer name, filter on order_date in the last 90 days, and group by customer_id. Let me sketch the CTE structure before I write the full query."

You've narrated joins, NULL handling, date filtering, and aggregation in sequence. The interviewer can follow every decision. That's the goal.

Answer the Follow-Ups Before the Interviewer Asks Them

SQL problem-solving questions rarely end with the first query. The follow-up is where the real evaluation happens, and strong candidates anticipate it.

What If They Ask Why You Chose That Query Shape?

Defend the choice in plain language: speed, readability, and edge-case safety. "I used a CTE instead of a nested subquery because the logic has three steps and a CTE makes each step readable on its own. Performance-wise they're equivalent in most modern optimizers, but the CTE is easier to debug if the output looks wrong." That's a complete answer. You don't need to cite benchmarks — you need to show that your choice was deliberate.

What If They Ask How Your Query Behaves on Bad Data?

Name the failure modes directly. Missing customer IDs will cause LEFT JOIN rows to appear without a name — I'd add a filter or a COALESCE on the name column. Duplicated order IDs in the refunds table would inflate the refund total — I'd deduplicate refunds before the join. Partial refunds where refund_amount exceeds order_amount would produce negative net revenue — I'd add a GREATEST(0, net_revenue) guard if the business rule says net revenue can't be negative.

You don't have to solve every problem. You have to show that you thought about them.

What If They Push on Performance or EXPLAIN?

Keep it honest and high-level. "I'd look at the EXPLAIN output to see if we're doing a sequential scan on a large table where an index scan would be faster. The most common fix is an index on the join key or the filter column — in this case, an index on order_date would help the 90-day filter significantly." You're not expected to be a DBA in a data analyst interview. You are expected to know that EXPLAIN exists, that sequential scans on large tables are a flag, and that indexes on filter and join columns are the first lever to pull.

Practice Speaking SQL Out Loud Until It Stops Feeling Weird

What Is the 10-Minute Drill That Builds the Habit Fastest?

Pick one prompt. Spend one minute restating it in plain English and naming the grain. Spend five minutes drafting the query. Spend the remaining four minutes explaining every clause out loud as if the interviewer just asked "why did you write it that way?" — including the parts that feel obvious. The goal isn't a perfect query in five minutes. The goal is to make the narration feel natural, because right now it probably doesn't. Research on deliberate practice from Anders Ericsson's work on skill acquisition shows that focused repetition of the specific skill — not just general study — is what builds fluency under pressure.

How Do You Compare a Weak Answer With a Strong One?

Weak: candidate reads the prompt, immediately types `SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id ORDER BY 2 DESC LIMIT 3`, says "that's it," and waits.

Strong: candidate says "one row per customer, total revenue, top 3 globally — no time dimension, so LIMIT works here. I'm using SUM(amount) and I'll alias it as total_revenue for clarity. The ORDER BY is on the alias descending. One thing I'd check: are there any NULLs in the amount column? If so, SUM will ignore them, which is probably correct, but worth confirming."

Same query. Completely different interview. The second candidate has given the interviewer three additional signals: they understand the grain, they chose LIMIT deliberately, and they thought about NULLs without being asked. That's what the think-aloud script produces.

How Verve AI Can Help You Ace Your Coding Interview With SQL

The structural problem in SQL interview prep isn't that candidates don't practice queries — it's that they practice in silence. They write the SQL, check if it runs, and move on. What they never practice is the narration: explaining the grain, naming the edge cases, defending the query shape out loud while someone is evaluating them. That gap is what the think-aloud script addresses, and it's also where Verve AI Coding Copilot fits in.

Verve AI Coding Copilot reads your screen during live technical rounds and practice sessions, seeing the exact prompt you're looking at and responding to what you're actually doing — not a canned script. For SQL problems, that means Verve AI Coding Copilot can surface the edge case you're about to miss, suggest the grain clarification you forgot to state, or flag that your NOT IN might return empty results if the subquery produces NULLs. It works across LeetCode, HackerRank, CodeSignal, and live technical rounds, so the environment where you practice matches the environment where you're evaluated.

The Secondary Copilot feature is particularly useful for multi-step SQL problems — the kind where you need to hold the aggregation logic, the join structure, and the ranking pattern in your head simultaneously. Verve AI Coding Copilot tracks your reasoning step by step and stays invisible to screen share, so you can use it in a live interview without disruption. If the interviewer adds a twist — ties in the ranking, NULLs in the refund table, a third table you didn't expect — Verve AI Coding Copilot responds to the updated prompt in real time, not to whatever you prepped for last week.

The real-time screen-aware capability is what makes this different from a flashcard app or a static question bank. You're not reviewing what you already know. You're getting live support on the exact problem in front of you, at the moment you need it.

Conclusion

The interview room doesn't care how many SQL queries you've written. It cares whether you can explain one query clearly under pressure, with someone watching, on a problem you've never seen before. The think-aloud script this article lays out — restate the question, name the grain, choose the SQL shape, explain the edge cases, anticipate the follow-ups — isn't a trick. It's just the structured version of how strong SQL thinkers already work.

Before your next practice session, pick the hardest prompt you can find and speak it first. Say the grain out loud. Say which tables you'll join and why. Say what happens if there are NULLs. Write the SQL last. That sequence is the part that changes how you come across — and it's the part that almost nobody practices deliberately.

AT

Avery Thompson

Interview Guidance

Ace your live interviews with AI support!

Get Started For Free

Available on Mac, Windows and iPhone