Interview questions

Create Table SQL Primary Key: From Simple Syntax to Interview Answer

August 5, 2025Updated May 15, 202616 min read
Why Understanding How To Create Table Sql With Primary Key Is Crucial For Database Design

Use create table sql primary key syntax to define unique, NOT NULL rows, compare single vs composite keys, and answer interview questions clearly.

Most people who search "create table sql primary key" already know what a primary key is supposed to do. The gap isn't vocabulary — it's that they've never seen the syntax written out cleanly, and they've never had to explain why the constraint works the way it does. This article closes both gaps. The same example that teaches you the correct CREATE TABLE statement will also give you the reasoning you need to answer the interview question without sounding like you memorized a definition off a flashcard.

Start simple. Get the syntax right. Then understand what the database is actually enforcing — and you'll have both the code and the explanation ready.

What a Primary Key Actually Guarantees in a Table

What the database is protecting you from

A primary key in SQL does one structural job: it gives every row in a table a stable, unique identity. That sounds obvious until you think about what happens without it. Two rows with the same customer ID. An order with no identifier at all. A join that returns three rows when you expected one. These aren't edge cases — they're the natural result of a table that has no enforced identity rule.

The primary key constraint stops this at the database level, not at the application level. You don't need to write a check in your code that says "don't insert if this ID already exists." The database refuses the insert for you. That's the real value: the rule is structural, not procedural.

What this looks like in practice

Take a `users` table. Every user needs to be identifiable — by the app, by other tables that reference them, and by any query that tries to update or delete a specific record. If you store `user_id` as a plain integer column with no constraint, nothing stops two rows from having `user_id = 42`. Now every JOIN that references that ID returns two rows, your UPDATE hits both, and your data is silently broken.

Adding `PRIMARY KEY` to `user_id` means the database will reject any insert or update that would create a duplicate value. It also means the column cannot be NULL — a row with no identifier is just as dangerous as a duplicate, and the constraint blocks both. According to the PostgreSQL documentation, a primary key constraint is technically equivalent to having both a `UNIQUE` constraint and a `NOT NULL` constraint on the same column — the database enforces both simultaneously.

That dual guarantee — uniqueness and presence — is what makes a primary key the right anchor for every other table that needs to reference this one.

Write the Simplest Valid CREATE TABLE Statement First

The cleanest single-column version

The CREATE TABLE primary key syntax has two valid forms for a single column: inline and out-of-line. Both work. The inline version is shorter and easier to read for beginners.

That's the minimum. `user_id` is the primary key. It cannot be NULL, it cannot repeat, and the database will enforce both rules on every insert and update. Nothing else is required to make this a valid, working table.

What this looks like in practice

Here's the same table written with the out-of-line constraint syntax, which is useful when you want the constraint named explicitly or when you're defining a composite key (covered next):

Both versions were tested in PostgreSQL 15 and MySQL 8.0 — the syntax is identical across both dialects for a single-column key. The MySQL reference manual documents both inline and table-level PRIMARY KEY definitions. One practical note: if you're working in MySQL and want the primary key column to auto-increment, you'll add `AUTO_INCREMENT` to the column definition (PostgreSQL uses `SERIAL` or `GENERATED ALWAYS AS IDENTITY` instead). The constraint itself is the same; the auto-generation syntax is dialect-specific.

Start with the inline version. It's readable, it works, and you can always refactor to the named constraint form when the schema gets more complex.

Use a Composite Primary Key When the Row Only Makes Sense as a Pair

Why one column is sometimes the wrong answer

A single-column primary key works when one attribute uniquely identifies a row. But some tables represent relationships, not entities — and in those cases, no single column is the real identity. A student can appear in many courses. A course can have many students. Neither `student_id` alone nor `course_id` alone is unique in an `enrollments` table. The unique fact is the combination: this student is enrolled in this course.

Using a surrogate `enrollment_id` as the primary key would technically work, but it would allow the same student to be enrolled in the same course twice, because the database would see two different `enrollment_id` values and consider them distinct rows. The composite primary key is the right tool here because it encodes the actual business rule directly in the schema.

What this looks like in practice

In this schema, `student_id = 1, course_id = 101` can appear exactly once. A second attempt to insert the same pair fails with a primary key violation. Neither column is unique on its own — `student_id = 1` can appear many times (one row per course), and `course_id = 101` can appear many times (one row per student). The combination is what's unique, and the composite primary key reflects that precisely.

The PostgreSQL documentation on composite keys confirms this pattern: a primary key can span multiple columns, and the uniqueness constraint applies to the combination of values across all listed columns. This is the standard approach for junction tables and many-to-many relationships.

Stop Treating PRIMARY KEY Like a Label — It Also Means Unique and NOT NULL

Why the constraint is stricter than people expect

The most common beginner misconception is that `PRIMARY KEY` is just a way to name the important column. It isn't. It's an active constraint that the database engine enforces on every write operation. The moment you define a column as a primary key, the database creates a unique index on it and adds a NOT NULL restriction — automatically, without you writing either of those constraints explicitly.

This matters because it changes what you can and can't insert. If you're coming from a language where "primary key" is just metadata, the database's refusal to accept a NULL or duplicate value can feel surprising. It shouldn't. The constraint is doing exactly what it was designed to do.

What this looks like in practice

Run these two inserts against the `users` table from Section 2:

The duplicate insert returns something like `ERROR: duplicate key value violates unique constraint "users_pkey"` in PostgreSQL. The NULL insert returns `ERROR: null value in column "user_id" of relation "users" violates not-null constraint`. Both errors are immediate — the database doesn't write a partial row and clean up later. The constraint fires before the row is committed.

The first time you see those errors, they're confusing. The second time, they're reassuring. The database is telling you that your identity rule is working.

Use ALTER TABLE When the Table Already Exists, But Don't Assume It Will Save Bad Data

Why retrofitting a key is more awkward than creating one cleanly

`ALTER TABLE` lets you add a primary key to a table that was created without one. This is genuinely useful — schemas evolve, migrations happen, and sometimes a table gets created without a key during a prototype phase. The syntax is straightforward. The catch is that the database will validate the existing data against the new constraint before accepting it. If any existing row has a NULL in the target column, or if any two rows share the same value, the `ALTER TABLE` command fails.

The constraint doesn't fix bad data. It refuses to exist alongside it.

What this looks like in practice

Before running this in a live database, check for problems first:

If either query returns rows, you have cleanup to do before the `ALTER TABLE` will succeed. In practice, this means either deleting the bad rows, assigning correct IDs, or deduplicating — none of which is automatic. The MySQL ALTER TABLE documentation notes that adding a primary key triggers a full table rebuild in InnoDB, so on large tables this is also a performance consideration worth scheduling during low-traffic windows.

The lesson: define the primary key at creation time whenever possible. Retrofitting works, but it requires the data to already be clean.

Choose Between an Auto-Increment ID and a Natural Key on Purpose

Why the easy default is not always the best design

An auto-increment ID — `SERIAL` in PostgreSQL, `AUTO_INCREMENT` in MySQL — is the path of least resistance. The database generates the value, you never have to think about it, and every row gets a guaranteed-unique integer. That's genuinely convenient, and for most application tables it's the right call.

But "convenient" and "correct" aren't always the same thing. A natural key is a column (or combination of columns) that is already unique in the real world: an email address, a Social Security number, an ISBN, a country code. When that attribute is stable, meaningful, and truly unique, using it as the primary key has real advantages — your queries can join directly on the business attribute, your data is self-documenting, and you avoid maintaining a separate unique constraint on top of a surrogate ID.

What this looks like in practice

Auto-increment ID is cleaner here:

An order has no natural unique identifier in the real world. The database-generated integer is the right anchor.

Natural key is genuinely useful here:

ISO country codes are stable, globally standardized, and already unique. Using `country_code` as the primary key means every foreign key reference to this table is human-readable, and there's no need for a separate `country_id` integer that adds a layer of indirection without adding information.

The tradeoff: natural keys can change. An email address used as a primary key in a `users` table is a common mistake — users change their email, and now you have to cascade that update through every table that references it. Surrogate keys never change because they have no real-world meaning to update. A good engineering rule of thumb, supported by database design literature like Martin Fowler's work on enterprise patterns, is to use surrogate keys for entities that users can edit, and natural keys only for stable reference data.

Remember That Foreign Keys Inherit the Shape of Your Primary Key Choice

Why a primary key decision spreads through the schema

When you define a primary key, you're not just making a decision about one table. Every table that references this table with a foreign key will point at that primary key. If you later decide to change the primary key — add a column, switch from a natural key to a surrogate — you have to update every referencing table too. The primary key choice is a schema-wide commitment.

This is why the surrogate-vs-natural decision matters beyond the immediate table. A composite primary key in a parent table means every child table's foreign key also has to be composite. That's not wrong, but it adds columns and complexity to every join.

What this looks like in practice

The `user_id` column in `orders` must match the data type and constraint of `user_id` in `users`. If you change `users.user_id` to a composite key, `orders.user_id` has to become composite too — and so does every other table that references `users`. The PostgreSQL foreign key documentation is explicit about this: the referenced column must be a primary key or have a unique constraint, and referential integrity is enforced on every insert, update, and delete.

Keep primary keys simple and stable. The foreign key implications are the strongest argument for surrogate integer keys in most application schemas.

Say It Out Loud Like Someone Who Actually Understands It

The 20-second answer that doesn't sound memorized

An interview explanation of primary keys fails in one of two ways: it's too abstract ("it's a unique identifier for a record") or it's too mechanical ("it enforces uniqueness and NOT NULL"). Neither answer shows understanding. The good answer names what the database is protecting you from and connects it to something the interviewer cares about — data integrity and query reliability.

What this looks like in practice

If an interviewer asks "What is a primary key and why does it matter?", here's a version that works:

"A primary key gives every row in a table a stable, unique identity. The database enforces two things automatically: the value has to be unique across all rows, and it can't be NULL — so you can never have a row with no identifier or two rows that look identical from the key's perspective. That matters in practice because other tables reference this key with foreign keys, and if the key isn't stable and unique, your joins return wrong results and your updates hit the wrong rows. I'd default to an auto-increment integer for most tables, but if there's a natural attribute that's already stable and unique — like an ISO country code — I'd use that instead and skip the extra surrogate."

That answer is about 90 words. It covers definition, constraint behavior, practical consequence, and a design tradeoff. It doesn't sound rehearsed because it's built around the reasoning, not the vocabulary. Check it against the SQL you've written in this article — every claim in that answer maps directly to something you can demonstrate in code.

Frequently Asked Questions

Q: What is the simplest valid CREATE TABLE statement that includes a primary key?

The minimum is a table with one column marked `PRIMARY KEY`: `CREATE TABLE users (user_id INT PRIMARY KEY, email VARCHAR(255) NOT NULL);`. That's a complete, working statement in both PostgreSQL and MySQL. The constraint is inline, the column type is explicit, and nothing else is required for the key to be enforced.

Q: Why does a primary key have to be unique and not NULL?

Because a primary key's job is to identify a row unambiguously. A NULL value means "unknown" — you can't use an unknown value as an identifier. A duplicate value means two rows share the same identity, which breaks joins, updates, and deletes that target a specific row. The database enforces both rules simultaneously so you don't have to check for them in application code.

Q: How do I define a primary key on a single column versus multiple columns?

For a single column, use the inline syntax: `column_name INT PRIMARY KEY`. For multiple columns, use a table-level constraint: `CONSTRAINT pk_name PRIMARY KEY (column_a, column_b)`. The composite version is the right choice when no single column uniquely identifies a row — only the combination does, as in a junction table that represents a many-to-many relationship.

Q: How do I add a primary key to an existing table with ALTER TABLE?

Use `ALTER TABLE table_name ADD CONSTRAINT pk_name PRIMARY KEY (column_name);`. Before running it, check that the target column has no NULL values and no duplicate values — the database will reject the command if either condition exists. Clean the data first, then add the constraint.

Q: When should I use an auto-increment ID instead of a natural key like email or employee number?

Use an auto-increment surrogate ID when the natural attribute can change (email addresses, names, phone numbers) or when no single natural attribute is guaranteed unique across all future rows. Use a natural key when the attribute is externally standardized, stable, and already unique — ISO codes, fixed reference data, or identifiers that never get edited. The risk with natural keys is cascading updates: if the key value changes, every foreign key reference has to change with it.

Q: What changes should I watch for when a primary key is referenced by foreign keys?

Any change to the primary key — adding a column, changing its data type, switching from surrogate to natural — requires matching changes in every table that references it. Referential integrity means the database will block inserts into child tables that reference a non-existent parent key, and it will block deletes of parent rows that have dependent child rows (unless you've defined `ON DELETE CASCADE`). Audit your foreign key relationships before modifying any primary key.

Q: How can I explain primary keys clearly in an interview without sounding memorized?

Build the answer around what the database is protecting you from, not around the definition. Say that the key enforces uniqueness and NOT NULL automatically, that this matters because other tables join on it, and that you'd choose between a surrogate integer and a natural key based on whether the attribute is stable. That structure — problem, constraint, consequence, design choice — covers everything the interviewer is testing without sounding like a textbook.

How Verve AI Can Help You Ace Your Coding Interview With SQL Primary Keys

The part of a technical interview that trips most candidates up isn't the syntax — it's the follow-up. You write the `CREATE TABLE` statement correctly, and then the interviewer asks why you chose a surrogate key over a natural one, or what happens if you need to add a primary key to a table that already has data. Those questions test reasoning, not recall, and they're hard to prepare for with static flashcards or reading alone.

Verve AI Coding Copilot is built for exactly that gap. It reads your screen in real time — whether you're working through a SQL problem on HackerRank, a live technical round, or a take-home schema design — and surfaces contextual suggestions based on what you've actually written, not a generic prompt. If you're mid-query and blanking on the composite key syntax, Verve AI Coding Copilot can prompt the right form. If you're explaining your schema choice out loud and drifting, it can anchor you back to the reasoning. The Secondary Copilot feature keeps you focused on one problem at a time without the context-switching that kills momentum in sustained technical rounds. It works across LeetCode, HackerRank, CodeSignal, and live interview environments, and it stays invisible while it does — so the help is there when you need it without breaking your concentration or your interviewer's trust.

Conclusion

You started with a phrase — create table sql primary key — and you now have both halves of what that phrase actually requires: the exact syntax to write it, and the reasoning to explain it. The `users` table from Section 2 is still the right starting point. Write it from memory. Add the constraint inline. Then try inserting a duplicate and watch the database refuse. That five-minute exercise will cement the behavior faster than rereading any explanation.

Then say the 20-second answer out loud. Not to a mirror — to someone, or at least to yourself at normal speaking pace. The gap between understanding a concept and being able to explain it under light pressure is exactly where interview prep falls apart. You've got the SQL. Now practice the sentence.

CR

Casey Rivera

Interview Guidance

Ace your live interviews with AI support!

Get Started For Free

Available on Mac, Windows and iPhone