What is the difference between a primary key and a unique key?

SQL · hard · 2 min read

Reviewed by Aditya Kumar · Last reviewed 2026-03-24

Why This Question Matters

This hard-level SQL question comes up in data engineering interviews at companies such as Accenture, Cognizant, and EPAM. Although it is not among the most common questions, it tests deeper understanding that distinguishes strong candidates. Mastering the underlying concepts (Spark, SQL) will help you answer variations of this question confidently.

How to Approach This

This is a senior-level question that tests architectural thinking. Lead with the high-level distinction, then drill into specifics. Discuss trade-offs explicitly; there is rarely one correct answer. Show awareness of scale, fault tolerance, and operational complexity. The expert answer includes a code example that demonstrates the implementation pattern.

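A primary key uniquely identifies each row in a table, disallows NULL values, and can be defined only once per table (though it may span several columns). A unique key also enforces uniqueness across the specified columns but permits NULLs, and a table can have multiple unique keys. How many NULLs a unique column may hold depends on the engine: SQL Server allows only one, while PostgreSQL, MySQL, Oracle, and SQLite allow several by default.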

Mechanics and Purpose

* Primary Key (PK): Its fundamental purpose is to define the identity of a record. It is central to referential integrity, because other tables link to it via foreign keys. A PK implies both NOT NULL and UNIQUE constraints. Relational databases back the PK with a unique index automatically. In SQL Server (by default) and MySQL InnoDB that index is clustered, so it dictates the physical storage order of data rows, which benefits range queries and joins on the key. PostgreSQL, by contrast, stores rows in a heap and creates an ordinary unique B-tree index for the PK.
* Unique Key (UK): Its purpose is to ensure that all values in the specified column(s) are distinct, serving as an "alternate key." A UK is backed by a unique index, which also speeds up lookups. Unlike a PK, a unique key column can contain NULL. Because NULL is not considered equal to itself under SQL's three-valued logic, most engines (PostgreSQL, MySQL, Oracle, SQLite) accept multiple NULLs in a unique column; SQL Server treats NULLs as equal for this purpose and allows only one. This is useful for columns like email_address or SSN where uniqueness is required but a value might occasionally be absent.
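The behavior described above can be checked with a small experiment. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; SQLite is chosen here only because it needs no setup, and the `users` table and `try_insert` helper are hypothetical names for illustration. Other engines may differ, particularly in how many NULLs a unique column accepts.

```python
import sqlite3

# In-memory SQLite database: an assumption made purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        email   TEXT UNIQUE
    )
""")


def try_insert(user_id, email):
    """Return True if the row was accepted, False on a constraint violation."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (user_id, email) VALUES (?, ?)",
                (user_id, email),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "first_row": try_insert(1, "a@example.com"),
    "duplicate_pk": try_insert(1, "b@example.com"),     # same user_id
    "duplicate_email": try_insert(2, "a@example.com"),  # same email
    "first_null_email": try_insert(3, None),
    "second_null_email": try_insert(4, None),           # a second NULL
}

null_count = conn.execute(
    "SELECT COUNT(*) FROM users WHERE email IS NULL"
).fetchone()[0]
```

In SQLite both duplicate inserts are rejected while both NULL emails are accepted. Running the equivalent statements on SQL Server would be expected to reject the second NULL.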

Trade-offs and Real-world Implications

The choice impacts data integrity, physical storage, and query performance. The primary key is the conventional target for foreign keys, which makes it central to relational database design, although most engines also let a foreign key reference columns covered by a unique constraint. Both create indexes that add overhead to write operations (inserts, updates, deletes); where the PK doubles as the clustering key, it can also significantly affect read performance for certain query patterns.
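On the foreign key point, many engines accept a parent key that is either a PRIMARY KEY or a UNIQUE column. The sketch below illustrates this with Python's sqlite3 module; the `users` and `logins` tables are hypothetical, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, so treat it as an illustration of one engine rather than a general guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    -- The child table references the UNIQUE column, not the primary key.
    CREATE TABLE logins (
        login_id INTEGER PRIMARY KEY,
        username TEXT NOT NULL REFERENCES users (username)
    );
    INSERT INTO users (user_id, username) VALUES (1, 'alice');
""")


def try_login(username):
    """Return True if the child row was accepted, False if the FK check failed."""
    try:
        with conn:
            conn.execute("INSERT INTO logins (username) VALUES (?)", (username,))
        return True
    except sqlite3.IntegrityError:
        return False


known_user_ok = try_login("alice")   # parent row exists
unknown_user_ok = try_login("bob")   # no matching parent row
```

Referencing the primary key remains the usual choice, since it is guaranteed non-NULL and is the key other tooling tends to assume.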

In modern data platforms and distributed systems (e.g., Spark, Delta Lake, Snowflake), a "primary key" is often a logical constraint rather than a physically enforced one. For instance, Delta Lake's transaction log ensures atomicity, but global uniqueness across distributed writes might be enforced at the application layer during ingestion or through batch validation jobs. Similarly, Snowflake accepts PRIMARY KEY and UNIQUE declarations on standard tables but does not enforce them. Physical optimizations such as Snowflake's micro-partitions and clustering keys, or Delta Lake's Z-ordering, can align with a logical PK, but the uniqueness itself is typically guaranteed by upstream ETL processes or MERGE statements.
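A batch validation job of the kind mentioned above boils down to counting rows per key and flagging any key that appears more than once. The following is a minimal sketch in plain Python rather than Spark; the `find_duplicate_keys` function and the sample `batch` are hypothetical and stand in for whatever ingestion framework is actually in use.

```python
from collections import Counter


def find_duplicate_keys(rows, key_columns):
    """Return {key_tuple: count} for every logical key seen more than once."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical batch in which user_id is the logical (unenforced) primary key.
batch = [
    {"user_id": 1, "email": "a@example.com"},
    {"user_id": 2, "email": "b@example.com"},
    {"user_id": 1, "email": "c@example.com"},
]

duplicates = find_duplicate_keys(batch, ["user_id"])
```

Here `duplicates` maps the key `(1,)` to a count of 2, so a pipeline could fail the load or route the offending rows to a quarantine table before they reach the target.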

CREATE TABLE Users (
    user_id INT PRIMARY KEY,
    username VARCHAR(50) NOT NULL UNIQUE,
    email VARCHAR(100) UNIQUE, -- NULL allowed; how many NULLs are permitted varies by DBMS
    registration_date DATE
);

In the interview, also mention the distinction between logical constraints (for data modeling and application logic) and physical enforcement mechanisms, especially when discussing distributed data systems.
