How do Primary Keys and Unique Keys differ in a relational database? Question For - Mid Level Developer

Question

How do Primary Keys and Unique Keys differ in a relational database? Question For - Mid Level Developer

Brief Answer

Both Primary Keys (PKs) and Unique Keys (UKs) are crucial constraints in relational databases that enforce uniqueness on column(s). However, they serve distinct purposes and have key differences:

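NULLability: A Primary Key cannot contain any NULL values, ensuring every record has a complete and unique identifier. In contrast, a Unique Key column may be nullable. How many NULLs it accepts depends on the DBMS: SQL Server allows at most one NULL per unique column, while most other systems (PostgreSQL, MySQL, Oracle, SQLite) allow any number, because NULLs are not considered equal to each other. This is useful for unique attributes that might not always be present (e.g., an employee's email address must be unique if present, but some employees might not have one yet).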
Number per Table: A table can have only one Primary Key, as it’s designed to be the single, definitive identifier for each row (entity integrity). Conversely, a table can have multiple Unique Keys, applied to other attributes that also require uniqueness (e.g., Social Security Number, employee ID from another system).
Indexing Behavior: Both constraints are enforced through an index, but the kind of index depends on the DBMS. In SQL Server, a Primary Key creates a clustered index by default (unless the table already has one), so the data rows are stored in Primary Key order, which makes lookups and range scans on the PK efficient. MySQL's InnoDB engine likewise clusters the table on the Primary Key. A Unique Key in SQL Server creates a non-clustered index by default: a separate structure that points to the data rows, enforcing uniqueness and supporting fast searches without affecting the physical order of the table. PostgreSQL and Oracle (with default heap tables) back both constraints with an ordinary unique index.
Foreign Key Relationships: Primary Keys are the standard and most common target for Foreign Keys in other tables, ensuring strong referential integrity. Most major DBMSs also let a Foreign Key reference a Unique Key, but this is less common: a Unique Key column may be nullable, and rows whose key is NULL can never be referenced, so the Primary Key is usually the safer and more stable target.

In summary, the Primary Key uniquely identifies each record and forms the core of entity integrity, while Unique Keys provide additional uniqueness constraints on other important attributes within the table.

Super Brief Answer

Both Primary Keys and Unique Keys enforce uniqueness, but differ as follows:

NULLs: A Primary Key cannot contain any NULL values. A Unique Key can contain NULLs (only one in SQL Server; any number in most other DBMSs).
Count: Only one Primary Key per table. Multiple Unique Keys are allowed per table.
Indexing: In SQL Server, a Primary Key creates a clustered index by default (physical data order) and a Unique Key creates a non-clustered index. Other DBMSs differ (e.g., PostgreSQL uses an ordinary unique index for both).
Purpose: Primary Key is the main record identifier (entity integrity). Unique Keys enforce uniqueness on alternative attributes.

Detailed Answer

In relational database design, both Primary Keys and Unique Keys are crucial constraints used to enforce data integrity and uniqueness. While they share the common goal of ensuring no duplicate values exist in the constrained column(s), they serve distinct purposes and have key differences in their behavior and application. Understanding these distinctions is fundamental for any mid-level developer working with databases.

Understanding Primary Keys vs. Unique Keys in Relational Databases

Both Primary Keys and Unique Keys enforce uniqueness on column(s) within a relational database table. However, a Primary Key cannot contain any NULL values and there can be only one per table; in SQL Server it creates a clustered index by default. In contrast, a Unique Key allows NULLs (one in SQL Server, any number in most other DBMSs), multiple Unique Keys can exist per table, and in SQL Server they create non-clustered indexes by default.

Key Differences Between Primary Keys and Unique Keys

While both enforce uniqueness, their core distinctions lie in:

NULLability
Number per Table
Indexing Behavior
Role in Foreign Key Relationships

1. NULLability

This is arguably the most significant difference. A Primary Key column or set of columns cannot contain any NULL values. This ensures that every record in the table has a unique, non-null identifier, which is essential for entity integrity.

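A Unique Key, on the other hand, can be defined on a nullable column. The number of NULLs it accepts depends on the DBMS: SQL Server treats NULL like any other value for this purpose and allows only one, whereas PostgreSQL, MySQL, Oracle, and SQLite allow any number of NULLs because NULLs are not considered equal to each other. Either way, this supports scenarios where a unique identifier might not always be available for every record, but you still want to prevent duplicates among the non-null values. For example, an employee might not yet have an assigned email address, but if they do, that email address must be unique across all employees.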
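The NULL rules are easy to check empirically. The following is a minimal sketch, assuming Python's bundled `sqlite3` module and an in-memory database purely for convenience; the trimmed-down `Employees` table and the `null_handling_demo` helper are illustrative names, not part of any particular system. SQLite, like PostgreSQL and MySQL, accepts repeated NULLs in a unique column, so the second NULL goes through here, while SQL Server would reject it.

```python
import sqlite3


def null_handling_demo():
    """Return which inserts an in-memory SQLite database accepts or rejects."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Employees (
            EmployeeID   INTEGER PRIMARY KEY,  -- primary identifier
            EmailAddress TEXT UNIQUE           -- optional, unique when present
        )
        """
    )
    attempts = [
        ("first NULL email", (1, None)),
        ("second NULL email", (2, None)),
        ("first real email", (3, "a@example.com")),
        ("duplicate email", (4, "a@example.com")),
        ("duplicate id", (1, "b@example.com")),
    ]
    results = {}
    for label, row in attempts:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO Employees (EmployeeID, EmailAddress) VALUES (?, ?)",
                    row,
                )
            results[label] = "accepted"
        except sqlite3.IntegrityError:
            results[label] = "rejected"
    conn.close()
    return results


results = null_handling_demo()
for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

On SQLite, both NULL emails and the first real email are accepted, while the duplicate email and the duplicate `EmployeeID` are rejected. Running the same statements against SQL Server should reject the second NULL as well, which is worth verifying on whichever DBMS you actually use.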

2. Number per Table

A relational table is designed to represent a single entity (e.g., `Employees`, `Products`). Therefore, a table can have only one Primary Key. This single Primary Key uniquely identifies each instance (row) of that entity.

In contrast, a table can have multiple Unique Keys. These are used to enforce uniqueness on other candidate keys or attributes that should be unique but are not the primary identifier. For instance, in an `Employees` table, `EmployeeID` would be the Primary Key, while `SocialSecurityNumber` or `EmailAddress` could be Unique Keys.
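As a quick check of the "one Primary Key, many Unique Keys" rule, here is a small sketch, again assuming Python's `sqlite3` module with an in-memory database. The `key_count_demo` helper and the `Broken` table are hypothetical names used only for illustration, and the `PRAGMA index_list` call is SQLite-specific.

```python
import sqlite3


def key_count_demo():
    """Report how SQLite reacts to several unique keys and to a second primary key."""
    conn = sqlite3.connect(":memory:")

    # One Primary Key plus two Unique Keys on the same table is accepted.
    conn.execute(
        """
        CREATE TABLE Employees (
            EmployeeID           INTEGER PRIMARY KEY,
            SocialSecurityNumber TEXT UNIQUE,
            EmailAddress         TEXT UNIQUE
        )
        """
    )
    # Each UNIQUE constraint is backed by its own automatic index (origin 'u').
    index_rows = conn.execute("PRAGMA index_list('Employees')").fetchall()
    unique_key_count = sum(1 for row in index_rows if row[3] == "u")

    # Declaring PRIMARY KEY twice is rejected when the table is created.
    try:
        conn.execute(
            "CREATE TABLE Broken (A INTEGER PRIMARY KEY, B TEXT PRIMARY KEY)"
        )
        second_pk_allowed = True
    except sqlite3.Error:
        second_pk_allowed = False

    conn.close()
    return unique_key_count, second_pk_allowed


unique_key_count, second_pk_allowed = key_count_demo()
print(unique_key_count, second_pk_allowed)
```

Note that a Primary Key may still span several columns (a composite key); the limit is one Primary Key constraint per table, not one column.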

3. Indexing Behavior

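In SQL Server, a Primary Key creates a clustered index on the table by default (unless the table already has one). A clustered index determines the storage order of the data rows: the table's data is kept sorted by the Primary Key values, which makes lookups and range scans on the Primary Key efficient. MySQL's InnoDB engine also clusters each table on its Primary Key. This is not universal, however: PostgreSQL and Oracle (with default heap tables) simply create an ordinary unique index for the Primary Key.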

A Unique Key in SQL Server creates a non-clustered index by default. A non-clustered index is a separate data structure that contains the key values and pointers to the actual data rows. It does not affect the physical order of the data in the table but still allows for fast searches and enforces uniqueness. Think of a clustered index like the ordered pages of a book, and a non-clustered index like an index at the back of the book pointing to specific pages.
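The book analogy can be made concrete with a toy model. This sketch is not how any real engine stores data; `rows`, `email_index`, and the two lookup helpers are hypothetical names chosen for illustration. It only shows the structural difference: the clustered structure is the data itself, kept in key order, while the non-clustered structure is a separate sorted list whose entries point back to the rows.

```python
from bisect import bisect_left

# "Clustered" storage: the rows themselves are kept sorted by EmployeeID.
rows = [
    (1, "john.doe@example.com"),
    (2, "jane.smith@example.com"),
    (5, "peter.jones@example.com"),
]

# "Non-clustered" index: a separate sorted list of (email, position in rows).
email_index = sorted((email, pos) for pos, (_, email) in enumerate(rows))


def find_by_id(employee_id):
    """Binary-search the rows directly; the match is the row itself."""
    # (employee_id,) sorts just before any (employee_id, email) tuple.
    i = bisect_left(rows, (employee_id,))
    if i < len(rows) and rows[i][0] == employee_id:
        return rows[i]
    return None


def find_by_email(email):
    """Binary-search the separate index, then follow its pointer to the row."""
    i = bisect_left(email_index, (email,))
    if i < len(email_index) and email_index[i][0] == email:
        return rows[email_index[i][1]]
    return None


print(find_by_id(5))
print(find_by_email("jane.smith@example.com"))
print(find_by_id(3))
```

The lookup by email needs one extra hop (index entry, then row), which mirrors why a non-clustered lookup typically costs slightly more than a clustered one.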

4. Role in Foreign Key Relationships

Primary Keys are commonly used to establish relationships with Foreign Keys in other tables. A Foreign Key in a child table references the Primary Key of a parent table, enforcing referential integrity (ensuring that references between tables are valid).

Most major database systems also allow a Foreign Key to reference a Unique Key, but doing so is less common. A Unique Key column may be nullable, and a parent row whose key is NULL can never be referenced by a child row, so the Primary Key (always non-null and usually the most stable identifier) is the preferred target. When a Unique Key is used as a Foreign Key target, it is good practice to declare its columns NOT NULL.
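To illustrate a Foreign Key that targets a Unique Key rather than the Primary Key, here is a hedged sketch using Python's `sqlite3` module. The `Departments` table, its `DeptCode` column, and the `fk_to_unique_demo` helper are assumptions introduced for this example. SQLite only enforces Foreign Keys after `PRAGMA foreign_keys = ON`; other DBMSs enforce them by default.

```python
import sqlite3


def fk_to_unique_demo():
    """Show a child table referencing a UNIQUE (non-primary) parent column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

    conn.execute(
        """
        CREATE TABLE Departments (
            DepartmentID INTEGER PRIMARY KEY,
            DeptCode     TEXT NOT NULL UNIQUE  -- Unique Key used as FK target
        )
        """
    )
    conn.execute(
        """
        CREATE TABLE Employees (
            EmployeeID INTEGER PRIMARY KEY,
            DeptCode   TEXT REFERENCES Departments (DeptCode)
        )
        """
    )
    with conn:
        conn.execute("INSERT INTO Departments VALUES (101, 'ENG')")

    attempts = [
        ("existing code", (1, "ENG")),
        ("unknown code", (2, "OPS")),
        ("NULL code", (3, None)),
    ]
    outcomes = {}
    for label, row in attempts:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO Employees VALUES (?, ?)", row)
            outcomes[label] = "accepted"
        except sqlite3.IntegrityError:
            outcomes[label] = "rejected"
    conn.close()
    return outcomes


outcomes = fk_to_unique_demo()
print(outcomes)
```

The child row with an unknown code is rejected, while the row with a NULL code is accepted because a NULL Foreign Key value is simply not checked. That behavior applies to the child column; keeping the parent's `DeptCode` NOT NULL avoids parent rows that no child could ever reference.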

Summary Table: Primary Key vs. Unique Key

Feature	Primary Key	Unique Key
NULL Values	Cannot contain any NULL values.	Can contain NULLs (one in SQL Server; any number in most other DBMSs).
Number per Table	Only one per table.	Multiple per table are allowed.
Default Index Type (SQL Server)	Clustered Index (determines physical order).	Non-Clustered Index (separate structure, points to data).
Foreign Key Reference	Commonly referenced by Foreign Keys.	Can be referenced in most DBMSs, but less commonly.
Purpose	Uniquely identifies each record; enforces entity integrity.	Enforces uniqueness on alternative candidate keys.

SQL Code Sample


-- Creating a table with a Primary Key and a Unique Key

CREATE TABLE Employees (
    -- EmployeeID is the Primary Key (not null, unique; clustered index by default in SQL Server)
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50) NOT NULL,
    LastName VARCHAR(50) NOT NULL,
    -- SocialSecurityNumber is a Unique Key (unique when present; SQL Server allows one NULL, most other DBMSs allow many; non-clustered index by default in SQL Server)
    SocialSecurityNumber VARCHAR(20) UNIQUE,
    DepartmentID INT
);

-- Example of another table with a Unique constraint on a nullable column

CREATE TABLE Contacts (
    ContactID INT PRIMARY KEY,
    EmailAddress VARCHAR(100) UNIQUE,  -- Nullable: one NULL in SQL Server, any number in most other DBMSs
    PhoneNumber VARCHAR(20)
);

-- Inserting data to demonstrate NULLability
INSERT INTO Employees (EmployeeID, FirstName, LastName, SocialSecurityNumber, DepartmentID)
VALUES (1, 'John', 'Doe', '123-45-6789', 101);

INSERT INTO Employees (EmployeeID, FirstName, LastName, SocialSecurityNumber, DepartmentID)
VALUES (2, 'Jane', 'Smith', NULL, 102); -- OK: a Unique Key column accepts NULL

-- INSERT INTO Employees (EmployeeID, FirstName, LastName, SocialSecurityNumber, DepartmentID)
-- VALUES (3, 'Peter', 'Jones', NULL, 103); -- Fails in SQL Server (second NULL in a unique column); succeeds in PostgreSQL, MySQL, Oracle, and SQLite

-- INSERT INTO Employees (EmployeeID, FirstName, LastName, SocialSecurityNumber, DepartmentID)
-- VALUES (1, 'Duplicate', 'ID', '987-65-4321', 104); -- Fails: Primary Key must be unique

INSERT INTO Contacts (ContactID, EmailAddress, PhoneNumber)
VALUES (101, 'john.doe@example.com', '555-1234');

INSERT INTO Contacts (ContactID, EmailAddress, PhoneNumber)
VALUES (102, NULL, '555-5678'); -- OK: a Unique Key column accepts NULL for EmailAddress

-- INSERT INTO Contacts (ContactID, EmailAddress, PhoneNumber)
-- VALUES (103, NULL, '555-9012'); -- Fails in SQL Server (second NULL for EmailAddress); succeeds in PostgreSQL, MySQL, Oracle, and SQLite

Interview Tips for Mid-Level Developers

When discussing Primary Keys and Unique Keys in an interview, aim to clearly articulate the core differences and demonstrate a deeper understanding beyond just definitions:

Start with the Essentials: Immediately highlight the NULLability difference and the “one Primary Key vs. many Unique Keys” aspect.
Indexing Matters: Mention the default index types in SQL Server (clustered for Primary Key, non-clustered for Unique Key), note that other DBMSs behave differently, and briefly explain the impact on physical storage and lookup performance. Use the "book pages vs. book index" analogy if it feels natural.
Provide Practical Examples: Use a simple, relatable example like an `Employee` table where `EmployeeID` is the Primary Key, and `SocialSecurityNumber` or `EmailAddress` serves as Unique Keys. Explain why each is appropriate in its role.
Discuss Relationships: Briefly touch upon the common use of Primary Keys as Foreign Key targets, and note that Unique Keys can usually be referenced too but are a less common choice (a nullable Unique Key leaves rows that can never be referenced).
Emphasize Purpose: Conclude by stating that Primary Keys ensure entity integrity (each record has a unique, non-null identifier), while Unique Keys provide alternative unique identifiers for other attributes.