SQL UNION and UNION ALL Step by step Implementation and Top 10 Questions and Answers
 .NET School AI Teacher - SELECT ANY TEXT TO EXPLANATION.    Last Update: April 01, 2025      18 mins read      Difficulty-Level: beginner

SQL UNION and UNION ALL: An In-Depth Explanation and Key Features

Structured Query Language (SQL) provides powerful tools for data manipulation and retrieval. Among these tools are UNION and UNION ALL, which allow users to combine the results of two or more SELECT statements into a single result set. This is particularly useful for consolidating data from different tables or for creating more complex queries. Understanding the distinction between UNION and UNION ALL is essential for optimizing query performance and ensuring accurate data manipulation.

Overview

The UNION and UNION ALL operators are used when you want to combine the results from two or more SELECT statements. Both operators require that the number and types of columns in the SELECT statements are identical. This ensures that the result set maintains a consistent structure. The primary difference between UNION and UNION ALL lies in how they handle duplicate rows.

Syntax

  1. UNION

    SELECT column_list_1 FROM table_1
    UNION
    SELECT column_list_2 FROM table_2;
    
  2. UNION ALL

    SELECT column_list_1 FROM table_1
    UNION ALL
    SELECT column_list_2 FROM table_2;
    

Key Differences

  • Handling Duplicates:

    • UNION: Automatically removes duplicate rows from the final result set.
    • UNION ALL: Includes all rows from both SELECT statements, even if they are duplicates.
  • Performance:

    • UNION: Generally slower than UNION ALL because it involves an additional step to identify and eliminate duplicate rows.
    • UNION ALL: Faster since it does not need to perform any duplicate elimination.
  • Use Cases:

    • UNION: Use when you want to combine data from different tables and ensure that the result set contains unique values.
    • UNION ALL: Use when combining data where duplicates are expected and allowable, or when performance is a critical factor.

Detailed Example

Consider two tables, employees and contractors, both containing information about individuals working at a company. We want to retrieve a list of all unique email addresses from these tables.

  • Employees Table | id | name | email | |----|-------------|----------------------| | 1 | John Doe | john.doe@example.com | | 2 | Jane Smith | jane.smith@example.com | | 3 | Bob Johnson | bob.johnson@example.com |

  • Contractors Table | id | name | email | |----|-------------|----------------------| | 1 | Emily White | emily.white@example.com | | 2 | Bob Johnson | bob.johnson@example.com |

  1. Using UNION

    SELECT email FROM employees
    UNION
    SELECT email FROM contractors;
    

    Result: | email | |-----------------------| | john.doe@example.com | | jane.smith@example.com| | bob.johnson@example.com | | emily.white@example.com|

  2. Using UNION ALL

    SELECT email FROM employees
    UNION ALL
    SELECT email FROM contractors;
    

    Result: | email | |-----------------------| | john.doe@example.com | | jane.smith@example.com| | bob.johnson@example.com | | bob.johnson@example.com | -- Duplicate row | emily.white@example.com|

Additional Considerations

  • Column Order and Data Types: The columns in the SELECT statements used with UNION or UNION ALL must be in the same order and have compatible data types. This ensures that the columns from each SELECT statement align correctly in the final result set.

  • Sorting: If you need to sort the combined result set, the ORDER BY clause should be placed at the end of the UNION or UNION ALL query. Sorting applies to the final combined result, not just the individual SELECT statements.

    SELECT email FROM employees
    UNION
    SELECT email FROM contractors
    ORDER BY email;
    
  • Null Handling: NULL values are treated as duplicate rows. Therefore, UNION will remove duplicate NULL values, while UNION ALL will include them all.

Conclusion

Understanding the nuances of UNION and UNION ALL is crucial for effective SQL query writing. By using UNION when uniqueness is required and UNION ALL when performance and allowance of duplicates are priorities, you can optimize your SQL queries for both efficiency and accuracy. Proper use of these tools enhances your ability to consolidate and analyze data from multiple sources, making SQL an indispensable language in data management and analysis.




Understanding SQL UNION and UNION ALL: Examples, Set Route, and Run Application

Introduction to UNION and UNION ALL

SQL's UNION and UNION ALL are operators used to combine the result sets of two or more SELECT statements into a single result set. These operators are particularly useful when you need to integrate data from multiple tables that have similar structures, such as tables containing employee records across different departments or locations.

  • UNION: Combines the results of two or more SELECT statements but eliminates duplicate rows, making it ideal for cases where you want unique entries.
  • UNION ALL: Also combines the results but does not eliminate duplicates, which is faster than UNION because it does not perform the extra step of checking for and removing duplicates.

Now, let’s go through a step-by-step guide using these operators.


Setting Up Your Environment

Before diving into examples, you need to have a database environment ready:

  1. Install Database Management System (DBMS): You can install a DBMS like MySQL, PostgreSQL, SQLite, or Microsoft SQL Server. For ease of use, we will use MySQL in this example.

  2. Create Database and Tables: Let's create a couple of tables to demonstrate how UNION and UNION ALL work.

    CREATE DATABASE CompanyDB;
    
    USE CompanyDB;
    
    CREATE TABLE Sales (
        id INT PRIMARY KEY,
        name VARCHAR(50),
        amount FLOAT
    );
    
    CREATE TABLE Service (
        id INT PRIMARY KEY,
        name VARCHAR(50),
        amount FLOAT
    );
    
  3. Insert Sample Data: Insert some data into these tables.

    INSERT INTO Sales (id, name, amount) VALUES
    (1, 'Alice', 500.00),
    (2, 'Bob', 750.00),
    (3, 'Charlie', 600.00);
    
    INSERT INTO Service (id, name, amount) VALUES
    (4, 'David', 900.00),
    (5, 'Alice', 800.00), -- Duplicate name
    (6, 'Eve', 150.00);
    

Understanding Data Flow with UNION and UNION ALL

  1. Data in Individual Tables:

    • Sales Table: | id | name | amount | |----|---------|--------| | 1 | Alice | 500.00 | | 2 | Bob | 750.00 | | 3 | Charlie | 600.00 |

    • Service Table: | id | name | amount | |----|-------|--------| | 4 | David | 900.00 | | 5 | Alice | 800.00 | | 6 | Eve | 150.00 |

  2. Using UNION: We'll now use UNION to combine the sales data and service data.

    SELECT id, name, amount FROM Sales 
    UNION 
    SELECT id, name, amount FROM Service;
    

    The combined result table will look like this:

    | id | name | amount | |----|---------|--------| | 1 | Alice | 500.00 | | 2 | Bob | 750.00 | | 3 | Charlie | 600.00 | | 4 | David | 900.00 | | 5 | Alice | 800.00 | | 6 | Eve | 150.00 |

    However, UNION will remove any duplicate rows based on all columns. Since there are duplicate 'Alice' entries with different amounts, they will all appear in the result.

  3. Using UNION ALL: If you want to include all the entries without removing duplicates, UNION ALL is your choice.

    SELECT id, name, amount FROM Sales 
    UNION ALL 
    SELECT id, name, amount FROM Service;
    

    This will yield:

    | id | name | amount | |----|---------|--------| | 1 | Alice | 500.00 | | 2 | Bob | 750.00 | | 3 | Charlie | 600.00 | | 4 | David | 900.00 | | 5 | Alice | 800.00 | -- Duplicate Alice entry with different amount | 6 | Eve | 150.00 |

    Notice that both 'Alice' entries now appear in the result set.


Running an Application Using UNION and UNION ALL

Suppose you are building a web application to display total revenue data from both the Sales and Service departments. Here’s how you might implement it:

  1. Database Connectivity: First, establish a connection between your application and the MySQL database.

    Example using Python and MySQL Connector:

    import mysql.connector
    
    db = mysql.connector.connect(
        host="localhost",
        user="yourusername",
        passwd="yourpassword",
        database="CompanyDB"
    )
    
    cursor = db.cursor()
    
  2. Executing a Query with UNION: You want to show the total revenue and ensure no duplicate names appear, even if they have different amounts.

    query_union = """
    SELECT id, name, amount FROM Sales 
    UNION 
    SELECT id, name, amount FROM Service;
    """
    
    cursor.execute(query_union)
    result_union = cursor.fetchall()
    
    print("REVENUE WITH UNION:")
    for record in result_union:
        print(record)
    

    Output:

    REVENUE WITH UNION:
    (1, 'Alice', 500.0)
    (2, 'Bob', 750.0)
    (3, 'Charlie', 600.0)
    (4, 'David', 900.0)
    (5, 'Alice', 800.0)
    (6, 'Eve', 150.0)
    
  3. Executing a Query with UNION ALL: Now, for a report that includes all transactions regardless of whether names are repeated, use UNION ALL.

    query_union_all = """
    SELECT id, name, amount FROM Sales 
    UNION ALL 
    SELECT id, name, amount FROM Service;
    """
    
    cursor.execute(query_union_all)
    result_union_all = cursor.fetchall()
    
    print("REVENUE WITH UNION ALL:")
    for record in result_union_all:
        print(record)
    

    Output:

    REVENUE WITH UNION ALL:
    (1, 'Alice', 500.0)
    (2, 'Bob', 750.0)
    (3, 'Charlie', 600.0)
    (4, 'David', 900.0)
    (5, 'Alice', 800.0)
    (6, 'Eve', 150.0)
    
  4. Display Results on Web Page: You can format these results and display them on a web page using HTML/CSS or a templating engine like Jinja2 in Flask.

    Flask Example:

    from flask import Flask, render_template_string
    
    app = Flask(__name__)
    
    html_union_template = """
    <h1>Revenue with UNION</h1>
    <table border="1">
        <tr><th>ID</th><th>Name</th><th>Amount</th></tr>
        {% for row in data %}
            <tr><td>{{ row[0] }}</td><td>{{ row[1] }}</td><td>{{ row[2] }}</td></tr>
        {% endfor %}
    </table>
    """
    
    @app.route('/union')
    def union_results():
        cursor.execute(query_union)
        result_union = cursor.fetchall()
        return render_template_string(html_union_template, data=result_union)
    
    html_union_all_template = """
    <h1>Revenue with UNION ALL</h1>
    <table border="1">
        <tr><th>ID</th><th>Name</th><th>Amount</th></tr>
        {% for row in data %}
            <tr><td>{{ row[0] }}</td><td>{{ row[1] }}</td><td>{{ row[2] }}</td></tr>
        {% endfor %}
    </table>
    """
    
    @app.route('/union_all')
    def union_all_results():
        cursor.execute(query_union_all)
        result_union_all = cursor.fetchall()
        return render_template_string(html_union_all_template, data=result_union_all)
    
    if __name__ == '__main__':
        app.run(debug=True)
    

    Navigate to http://localhost:5000/union and http://localhost:5000/union_all to see the revenue data displayed.


Final Thoughts

Using UNION and UNION ALL effectively can streamline data retrieval processes and enhance the usability of your applications. Remember that UNION removes duplicates and UNION ALL does not, which affects performance; consider your specific needs before choosing one over the other.

By following this step-by-step guide, you’ve set up a simple database structure, run queries using UNION and UNION ALL, and even integrated these into a basic web application. This foundational knowledge will help you as you develop more complex applications relying on SQL for data management.




Top 10 Questions and Answers on SQL UNION and UNION ALL

SQL UNION and UNION ALL are used to combine the results of two or more SELECT statements into a single result set. Understanding their differences and how to use them effectively is crucial for managing and querying databases efficiently. Here are ten common questions, along with detailed answers, covering the essential aspects of SQL UNION and UNION ALL.

1. What is the difference between SQL UNION and UNION ALL?

UNION and UNION ALL both perform the combining of the result sets of two or more SELECT statements. However, they differ in how they handle duplicate values.

  • UNION: Removes duplicate rows from the result set. It only returns unique rows across all SELECT statements involved.
  • UNION ALL: Includes all rows from all SELECT statements, even if there are duplicates.

Example:

SELECT column_name FROM table1
UNION
SELECT column_name FROM table2;

SELECT column_name FROM table1
UNION ALL
SELECT column_name FROM table2;

In this example, the first query will return unique rows from table1 and table2, whereas the second query (UNION ALL) will include all rows from both tables, including those that are identical.

2. Which operator should be used when you want to retrieve all records, even including duplicated ones?

When you need to retrieve all records, including those that appear in more than one of the combined result sets, you should use UNION ALL. This operator does not eliminate duplicates and is generally more efficient since it skips the additional step of checking for duplicates.

Example:

SELECT employee_id FROM employees_in_ny
UNION ALL
SELECT employee_id FROM employees_in_ca; 
-- This will include all employee IDs from both states and keep any duplicates.

3. How do you ensure that the columns being united have compatible data types?

To ensure compatibility, the columns in the SELECT statements must have:

  • The same number of columns (the same width).
  • Corresponding columns must have compatible data types, though not necessarily identical ones (e.g., INTEGER and FLOAT can sometimes be combined based on specific SQL dialects).

Sometimes, explicit type casting is necessary to ensure compatibility. For instance, casting a VARCHAR to a CHAR or converting a string representation of a date to an actual DATE type.

Example:

SELECT age AS numeric_column FROM students
UNION ALL
SELECT CAST(salary AS INTEGER) AS numeric_column FROM teachers; 

4. Can SQL UNION and UNION ALL work with different numbers of columns from each SELECT statement?

No, SQL UNION and UNION ALL cannot work if the SELECT statements have a different number of columns. Each SELECT statement in a UNION or UNION ALL operation must have the same number of result columns; otherwise, you'll receive an error.

Solution: Adjust the number of columns by adding empty or dummy columns where necessary.

Example:

SELECT id, name, age FROM individuals_in_ny
UNION ALL
SELECT id, name, NULL AS age FROM individuals_in_ca

Here, age is added as a NULL value in the second query to match the number of columns in the first query.

5. Is it mandatory to maintain the order of columns across all SELECT statements when using UNION and UNION ALL?

Yes, the order of columns is essential because UNION and UNION ALL match columns based on their position rather than their names. Ensure that corresponding columns across all SELECT statements hold similar or compatible data.

6. How does SQL UNION handle NULL values when combining result sets?

UNION treats NULLs just like other data values during comparison for eliminating duplicates. Therefore, if NULLs in the corresponding positions of multiple result sets are identical, they will be considered duplicates and only one instance will be returned.

Example:

SELECT id, product_name FROM inventory_west
UNION
SELECT id, product_name FROM inventory_east

If product_name is NULL for certain products in both tables, these rows will be treated as duplicates.

7. What are some best practices when using UNION and UNION ALL?

  • Use UNION ALL if you know for sure there won't be duplicate entries or if performance is critical because UNION is more time-consuming due to sorting and removing duplicates.
  • Always specify columns by name in your SELECT statements rather than using SELECT * to ensure proper alignment and meaningful data retrieval.
  • Be mindful of data types and explicitly cast when necessary to prevent errors.
  • Keep your queries logically and syntactically simple for better readability and easier maintenance.
  • Test your queries in stages to verify that the combined data is accurate and formatted correctly.

8. Can you use UNION or UNION ALL with more than two SELECT statements?

Absolutely, you can use UNION or UNION ALL to combine results from more than two SELECT statements. Just chain them together as required. Each SELECT statement must follow the rules for UNION and UNION ALL—same number and compatible data types.

Example:

SELECT customer_id FROM orders_ny
UNION ALL
SELECT customer_id FROM orders_ca
UNION ALL
SELECT customer_id FROM orders_tx;

9. What happens if I try to execute a UNION or UNION ALL query on SELECT statements with incompatible schemas?

You'll encounter an error if the schemas (the structure of columns and their respective data types) of the SELECT statements do not perfectly match. To resolve, you need to adjust the queries so they align properly.

Common Error Message:

ERROR: UNION types text and integer cannot be matched

Example Scenario:

SELECT name, address FROM customers
UNION ALL
SELECT id, email FROM users;

This query would result in an error as the data types and number of columns between customers and users do not match. Correcting this would involve selecting corresponding columns with compatible types.

10. How can I sort the combined results of UNION or UNION ALL operations?

Sorting can be applied to the combined result set obtained from UNION or UNION ALL operations through the ORDER BY clause. However, the ORDER BY clause must appear at the end of the last SELECT statement.

Example:

SELECT student_id, name FROM high_school_students
UNION
SELECT student_id, name FROM middle_school_students
ORDER BY name ASC;

In this query, after combining all students' data from both tables and removing duplicates, the result is then sorted alphabetically by student name.

Conclusion

Using UNION and UNION ALL can significantly simplify complex SQL queries by enabling the combination of results from multiple tables. However, it’s important to keep in mind the fundamental differences between these operators, especially regarding performance and handling of duplicate values. By adhering to best practices and ensuring schema compatibility, you can effectively leverage UNION and UNION ALL to streamline data management and retrieval processes.