How do the set operators UNION , INTERSECT , and MINUS differ in their functionality when comparing and combining result sets in SQL?Expertise Level: Senior Level Developer
Question
How do the set operators UNION , INTERSECT , and MINUS differ in their functionality when comparing and combining result sets in SQL?Expertise Level: Senior Level Developer
Brief Answer
SQL’s set operators (UNION, INTERSECT, MINUS/EXCEPT) combine or compare results from multiple SELECT statements. They differ fundamentally in how they handle duplicates and which rows they return. A critical prerequisite for all is that the queries must have the same number of columns and compatible data types in the same order.
1. UNION
- Function: Combines distinct rows from two or more queries. Automatically removes duplicates.
- When to Use: To get a consolidated list of unique entries from multiple sources (e.g., all unique customers from online and physical stores).
- Note: Slower than UNION ALL due to duplicate elimination.
2. UNION ALL
- Function: Combines ALL rows from two or more queries, including duplicates.
- When to Use: When you need to combine all records and performance is critical, as it skips duplicate removal.
- Performance: Generally faster than UNION.
3. INTERSECT
- Function: Returns only the common rows found in all involved queries.
- When to Use: To identify records that exist in all specified datasets (e.g., customers who purchased both online and in-store).
4. MINUS / EXCEPT
- Function: Returns rows present in the result set of the first query but absent from the second query’s result set. (MINUS in Oracle, EXCEPT in ANSI SQL/SQL Server/PostgreSQL/MySQL 8+).
- When to Use: To find differences between two sets (e.g., customers who shopped online but never in a physical store).
- Key Point: This operator is not commutative (A EXCEPT B is different from B EXCEPT A).
Key Considerations for Interviews:
- Compatibility: Emphasize that all queries must have the same number of columns and compatible data types.
- Performance: Always prefer
UNION ALLoverUNIONif duplicates are acceptable, due to performance benefits. - Directionality: Highlight that
MINUS/EXCEPTis directional/non-commutative. - Syntax: Mention both
MINUS(Oracle) andEXCEPT(ANSI standard) for comprehensive knowledge.
Super Brief Answer
SQL set operators combine or compare result sets, requiring compatible columns across queries:
- UNION: Combines distinct rows from multiple queries.
- UNION ALL: Combines all rows, including duplicates (faster than UNION).
- INTERSECT: Returns only common rows found in all queries.
- MINUS/EXCEPT: Returns rows in the first query’s set but not in the second’s. (Note: Non-commutative).
Detailed Answer
SQL’s set operators — UNION, INTERSECT, and MINUS (or EXCEPT) — are powerful tools for combining and comparing result sets from multiple SELECT statements. They differ primarily in how they handle duplicates and which rows they return:
UNION: Combines distinct rows from two or more queries.INTERSECT: Returns only the common (shared) rows found in all queries.MINUS(orEXCEPT): Retrieves rows that are present in the first query’s result set but absent from the second query’s result set.
All set operations require that the corresponding columns in each SELECT statement have compatible data types and the same number of columns, in the same order.
In SQL, set operators allow you to combine the results of two or more queries into a single result set. This functionality is analogous to set theory in mathematics, enabling operations like combining sets, finding common elements, or identifying unique elements. Mastering these operators is crucial for advanced data manipulation, reporting, and analytics.
Understanding SQL Set Operators
Let’s illustrate the functionality of each operator using two hypothetical tables: Customers_Online and Customers_Physical, representing customer data from different sales channels.
-- Sample Tables for Demonstration
CREATE TABLE Customers_Online (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(100)
);
INSERT INTO Customers_Online (CustomerID, CustomerName) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie'),
(5, 'Eve');
CREATE TABLE Customers_Physical (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(100)
);
INSERT INTO Customers_Physical (CustomerID, CustomerName) VALUES
(2, 'Bob'),
(3, 'Charlie'),
(4, 'David'),
(6, 'Frank');
1. UNION
The UNION operator combines the result sets of two or more SELECT statements into a single result set. The key characteristic of UNION is that it automatically removes duplicate rows from the final output.
When to Use:
Use UNION when you need a consolidated list of unique entries from multiple sources. For instance, to get a complete list of all unique customer names from both online and physical stores.
Example:
SELECT CustomerID, CustomerName FROM Customers_Online
UNION
SELECT CustomerID, CustomerName FROM Customers_Physical;
Expected Result:
This query would return a combined list of all unique customers (Alice, Bob, Charlie, David, Eve, Frank), with Bob and Charlie appearing only once, even though they are present in both tables.
2. UNION ALL
Similar to UNION, UNION ALL combines the result sets of two or more SELECT statements. However, unlike UNION, it does not remove duplicate rows. Every row from each query is included in the final result.
When to Use:
Use UNION ALL when you need to combine all records, including duplicates, and performance is a concern. It’s generally faster than UNION because it skips the resource-intensive process of scanning and eliminating duplicate rows.
Example:
SELECT CustomerID, CustomerName FROM Customers_Online
UNION ALL
SELECT CustomerID, CustomerName FROM Customers_Physical;
Expected Result:
This query would return all customers from Customers_Online followed by all customers from Customers_Physical. Bob and Charlie would appear twice (once from each table).
3. INTERSECT
The INTERSECT operator returns only the rows that are present in the result sets of all involved SELECT statements. It effectively finds the common elements between the queries.
When to Use:
Use INTERSECT to identify records that exist in multiple, distinct data sets. For example, to find customers who have purchased both online and in a physical store.
Example:
SELECT CustomerID, CustomerName FROM Customers_Online
INTERSECT
SELECT CustomerID, CustomerName FROM Customers_Physical;
Expected Result:
This query would return only the customers who appear in both lists (Bob, Charlie).
4. MINUS / EXCEPT
The MINUS operator (used in Oracle and some other databases) or EXCEPT operator (ANSI standard, used in SQL Server, PostgreSQL, MySQL 8+) returns rows that are present in the result set of the first SELECT statement but not in the result set of the second SELECT statement. It identifies unique rows from the first query that are absent from the second.
When to Use:
Use MINUS/EXCEPT to find differences between two sets of data. For instance, to identify customers who have shopped online but have never visited a physical store.
Example (using EXCEPT for ANSI standard):
SELECT CustomerID, CustomerName FROM Customers_Online
EXCEPT
SELECT CustomerID, CustomerName FROM Customers_Physical;
Expected Result:
This query would return customers present in Customers_Online but not in Customers_Physical (Alice, Eve).
Example (using MINUS for Oracle compatibility):
SELECT CustomerID, CustomerName FROM Customers_Online
MINUS
SELECT CustomerID, CustomerName FROM Customers_Physical;
Expected Result:
Same as above: customers present in Customers_Online but not in Customers_Physical (Alice, Eve).
Key Considerations & Interview Insights
When working with SQL set operators, especially in an interview context, demonstrating a deeper understanding of their nuances is crucial:
-
Data Type and Column Compatibility
All set operators require that the number of columns and their data types (or compatible data types that allow implicit conversion) must be identical and in the same order across all participating
SELECTstatements. Failure to meet these requirements will result in a SQL error. Be aware that implicit conversions can sometimes lead to unexpected results; explicit casting (e.g., usingCAST()orCONVERT()) is often a safer practice.-- Example of incompatible data types (will error) SELECT CustomerName FROM Customers_Online -- VARCHAR UNION ALL SELECT CustomerID FROM Customers_Physical; -- INT -
Performance Implications: UNION vs. UNION ALL
Always prefer
UNION ALLoverUNIONif you don’t need to eliminate duplicates. The duplicate removal process inUNIONrequires additional sorting and comparison, which can be computationally expensive and significantly impact performance, especially with large datasets. -
Non-Commutativity of MINUS / EXCEPT
Unlike
UNIONandINTERSECT(which are commutative, meaning A UNION B is the same as B UNION A),MINUS/EXCEPTis not commutative.(Query A) EXCEPT (Query B)will yield different results from(Query B) EXCEPT (Query A). This directional aspect is critical for accurate data comparison.-- A EXCEPT B (Customers_Online MINUS Customers_Physical) SELECT CustomerID, CustomerName FROM Customers_Online EXCEPT SELECT CustomerID, CustomerName FROM Customers_Physical; -- Result: (1, Alice), (5, Eve) -- B EXCEPT A (Customers_Physical MINUS Customers_Online) SELECT CustomerID, CustomerName FROM Customers_Physical EXCEPT SELECT CustomerID, CustomerName FROM Customers_Online; -- Result: (4, David), (6, Frank) -
ANSI Standard vs. Vendor-Specific Syntax
While Oracle uses
MINUS,EXCEPTis the ANSI SQL standard. UsingEXCEPTdemonstrates a broader knowledge of SQL standards and promotes query portability across different relational database management systems (RDBMS) like PostgreSQL, SQL Server, and MySQL (version 8+). -
Real-World Problem Solving
Be ready to articulate scenarios where each operator is best suited. For example, finding customers who participated in ‘Summer Sale’ AND ‘Holiday Promotion’ (
INTERSECT), or customers who joined ‘Summer Sale’ BUT NOT ‘Holiday Promotion’ (EXCEPT/MINUS).

