A Complete Guide - MongoDB Use Cases for Aggregation Pipelines

Last Updated: 03 Jul, 2025

YOU NEED ANY HELP? THEN SELECT ANY TEXT.

MongoDB Use Cases for Aggregation Pipelines

1. Data Transformation and Filtering

Use Case: Data cleaning and preprocessing are crucial stages in data analysis. Aggregation pipelines can perform these tasks efficiently by filtering out unwanted data and transforming the existing data into a suitable format.

Important Information:

$match: Filters documents that pass the specified condition.
$project: Reshapes each document in the stream, such as by adding new fields or removing existing fields.
$redact: Restricts the documents to the set of fields that match the given condition.
$replaceRoot: Replaces the input document with the specified document.

Example:

db.sales.aggregate([
  { $match: { status: "A" } },
  { $project: { _id: 0, item: 1, price: 1, quantity: 1 } }
])

This pipeline filters documents with a status of "A" and projects only the item, price, and quantity fields.

2. Data Aggregation and Grouping

Use Case: Summarizing and aggregating large datasets is essential for generating insights and driving business decisions. Aggregation pipelines can perform complex grouping operations.

Important Information:

$group: Groups input documents by a specified expression and outputs to the next stage a document for each distinct grouping.
$accumulator: Adds user-defined accumulators to the $group stage.
$bucket: Categorizes incoming documents into groups based on specified boundaries.

Example:

db.sales.aggregate([
  { $group: { _id: "$item", totalQuantity: { $sum: "$quantity" } } }
])

This pipeline groups documents by the item field and sums up the quantity for each item.

3. Data Sorting and Limiting

Use Case: Sorting and limiting results are common requirements in data analysis to focus on top-performing or bottom-performing items.

Important Information:

$sort: Sorts all input documents and returns them in sorted order.
$limit: Limits the number of documents passed to the next stage in the pipeline.
$skip: Skips a specified number of documents and passes the remaining documents to the next stage.

Example:

db.sales.aggregate([
  { $sort: { totalQuantity: -1 } },
  { $limit: 5 }
])

This pipeline sorts the documents by totalQuantity in descending order and limits the results to the top 5.

4. Joining Data from Different Collections

Use Case: Data often resides in multiple collections or databases. Aggregation pipelines can perform lookups to combine data from different collections.

Important Information:

$lookup: Performs a left outer join to another collection in the same database to filter in documents from the “joined” collection for processing.
$graphLookup: Recursively joins subdocuments to the current document from the same collection.

Example:

db.orders.aggregate([
  {
    $lookup:
      {
        from: "customers",
        localField: "cust_id",
        foreignField: "customer_id",
        as: "customer_info"
      }
   }
])

This pipeline joins the orders collection with the customers collection based on the cust_id and customer_id fields.

5. Calculating Running Totals and Moving Averages

Use Case: Running totals and moving averages are useful for analyzing trends over time, such as stock price movements or sales performance.

Important Information:

$addFields: Adds new fields to documents. $addFields outputs documents that contain all existing fields from the input documents and newly added fields.
$window: Performs operations on a set of documents in a defined window and outputs the result for each document individually. $window can be used to calculate running totals and moving averages.

Example:

db.sales.aggregate([
  {
    $window:
      {
        output:
          {
            runningTotal: { $sum: "$value" },
            movingAvg: { $avg: "$value", window: { documents: [-2, 2] } }
          }
      }
  }
])

This pipeline calculates a running total and a moving average with a window of 5 documents centered on the current document.

6. Geospatial Aggregation

Use Case: Geospatial data is critical in location-based services, logistics, and urban planning. Aggregation pipelines can perform geospatial operations directly within MongoDB.

Important Information:

$geoNear: Outputs documents in order of nearest to farthest from a specified point.
$geoWithin: Selects documents with geospatial data within a specified shape.
$geoIntersects: Selects documents with geospatial data that intersect with a specified shape.

Example:

db.places.aggregate([
  {
    $geoNear:
      {
        near: { type: "Point", coordinates: [ -73.99279 , 40.719296 ] },
        distanceField: "dist.calculated",
        maxDistance: 2000,
        spherical: true
      }
  }
])

This pipeline retrieves locations within a maximum distance of 2000 meters from the specified point.

7. Data Analysis and Reporting

Use Case: Generating business reports and performing data analysis are vital for making informed decisions. Aggregation pipelines can provide the required data in real-time.

Important Information:

$facet: Performs multiple aggregation operations in parallel.
$bucketAuto: Categorizes incoming documents into a specific number of groups based on a specified expression.
$out: Writes the result of the aggregation pipeline to a specified collection.

Example:

db.orders.aggregate([
  {
    $facet:
      {
        byCategory:
          [
            { $group: { _id: "$category", totalSales: { $sum: "$amount" } } },
            { $sort: { totalSales: -1 } },
            { $limit: 5 }
          ],
        salesByDay:
          [
            { $group: { _id: { $dayOfYear: "$orderDate" }, totalSales: { $sum: "$amount" } } },
            { $sort: { _id: 1 } },
            { $limit: 12 }
          ]
      }
  }
])

This pipeline generates two reports: the top 5 categories by total sales and the total sales by day for the last 12 days.

Conclusion

Online Code run

🔔 Note: Select your programming language to check or run code at

💻 Run Code Compiler

Step-by-Step Guide: How to Implement MongoDB Use Cases for Aggregation Pipelines

Use Case 1: Calculating Total Sales for a Product

Scenario: Suppose you have a collection named orders with the following documents. You want to calculate the total sales for each product.

{ "_id": 1, "product_id": "A101", "quantity": 3, "price": 10 }
{ "_id": 2, "product_id": "B202", "quantity": 5, "price": 20 }
{ "_id": 3, "product_id": "A101", "quantity": 2, "price": 10 }
{ "_id": 4, "product_id": "C303", "quantity": 4, "price": 15 }
{ "_id": 5, "product_id": "B202", "quantity": 3, "price": 20 }

Objective: Calculate the total sales (quantity * price) for each product.

Steps:

Match: Filter the documents (if needed, but in this case, all documents are relevant).
Group: Group by product_id and calculate the total sales.
Project: Format the output.

Aggregation Pipeline:

db.orders.aggregate([
  {
    $group: {
      _id: "$product_id",
      totalSales: { $sum: { $multiply: ["$quantity", "$price"] } }
    }
  },
  {
    $project: {
      _id: 0,
      product_id: "$_id",
      totalSales: 1
    }
  }
])

Output:

{ "product_id": "A101", "totalSales": 50 }
{ "product_id": "B202", "totalSales": 170 }
{ "product_id": "C303", "totalSales": 60 }

Use Case 2: Filtering and Sorting Data

Scenario: Suppose you have a collection named transactions with the following documents. You want to filter transactions for a specific year, sort them by amount in descending order, and limit the results to the top 5.

{ "_id": 1, "account_id": "123", "year": 2021, "amount": 150 }
{ "_id": 2, "account_id": "456", "year": 2022, "amount": 200 }
{ "_id": 3, "account_id": "789", "year": 2021, "amount": 250 }
{ "_id": 4, "account_id": "123", "year": 2021, "amount": 300 }
{ "_id": 5, "account_id": "456", "year": 2021, "amount": 350 }
{ "_id": 6, "account_id": "789", "year": 2022, "amount": 100 }

Objective: Filter transactions for the year 2021, sort them by amount in descending order, and limit the results to the top 5.

Steps:

Match: Filter documents for the year 2021.
Sort: Sort the documents by amount in descending order.
Limit: Limit the results to the top 5.

Aggregation Pipeline:

db.transactions.aggregate([
  {
    $match: { year: 2021 }
  },
  {
    $sort: { amount: -1 }
  },
  {
    $limit: 5
  }
])

Output:

{ "_id": 5, "account_id": "456", "year": 2021, "amount": 350 }
{ "_id": 4, "account_id": "123", "year": 2021, "amount": 300 }
{ "_id": 3, "account_id": "789", "year": 2021, "amount": 250 }
{ "_id": 1, "account_id": "123", "year": 2021, "amount": 150 }

Use Case 3: Grouping by Multiple Fields and Using Accumulators

Scenario: Suppose you have a collection named sales with the following documents. You want to group sales by region and product, calculate the total sales amount, and find the average sales amount per product.

{ "_id": 1, "region": "North", "product": "Widget", "amount": 200 }
{ "_id": 2, "region": "South", "product": "Widget", "amount": 150 }
{ "_id": 3, "region": "North", "product": "Gadget", "amount": 250 }
{ "_id": 4, "region": "South", "product": "Gadget", "amount": 100 }
{ "_id": 5, "region": "North", "product": "Widget", "amount": 300 }
{ "_id": 6, "region": "South", "product": "Widget", "amount": 175 }

Objective: Group sales by region and product, calculate the total sales amount, and find the average sales amount per product.

Steps:

Group: Group by region and product, and calculate total and average sales.
Project: Format the output.

Aggregation Pipeline:

db.sales.aggregate([
  {
    $group: {
      _id: { region: "$region", product: "$product" },
      totalSales: { $sum: "$amount" },
      averageSales: { $avg: "$amount" }
    }
  },
  {
    $project: {
      _id: 0,
      region: "$_id.region",
      product: "$_id.product",
      totalSales: 1,
      averageSales: 1
    }
  }
])

Output:

{ "region": "North", "product": "Widget", "totalSales": 500, "averageSales": 250 }
{ "region": "South", "product": "Widget", "totalSales": 325, "averageSales": 162.5 }
{ "region": "North", "product": "Gadget", "totalSales": 250, "averageSales": 250 }
{ "region": "South", "product": "Gadget", "totalSales": 100, "averageSales": 100 }

Use Case 4: Joining Collections (Lookup)

Scenario: Suppose you have two collections named orders and customers. You want to join these collections to retrieve customer information for each order.

Collections:

orders:

{ "_id": 1, "customer_id": "C100", "product": "Widget", "amount": 200 }
{ "_id": 2, "customer_id": "C101", "product": "Gadget", "amount": 150 }
{ "_id": 3, "customer_id": "C100", "product": "Widget", "amount": 300 }

customers:

{ "_id": "C100", "name": "Alice", "email": "alice@example.com" }
{ "_id": "C101", "name": "Bob", "email": "bob@example.com" }

Objective: Join the orders and customers collections to retrieve customer information for each order.

Steps:

Lookup: Join the customers collection with the orders collection based on customer_id.
Project: Format the output.

Aggregation Pipeline:

db.orders.aggregate([
  {
    $lookup: {
      from: "customers",
      localField: "customer_id",
      foreignField: "_id",
      as: "customer_info"
    }
  },
  {
    $unwind: "$customer_info"
  },
  {
    $project: {
      _id: 0,
      order_id: "$_id",
      product: 1,
      amount: 1,
      customer_name: "$customer_info.name",
      customer_email: "$customer_info.email"
    }
  }
])

Output:

YOU NEED ANY HELP? THEN SELECT ANY TEXT.

Top 10 Interview Questions & Answers on MongoDB Use Cases for Aggregation Pipelines

1. Data Transformation and Formatting

Question: How can MongoDB Aggregation Pipelines be used to format and transform raw data into a more readable format? Answer: The $project stage is used to select and reshape fields to be returned from the pipeline. You can also use $addFields to add new fields or modify existing ones.

Example: Suppose you have a collection employees with document fields firstName, lastName, and salary. You want to create a full name field combining firstName and lastName.

db.employees.aggregate([
    {
        $project: {
            _id: 0, // exclude _id
            fullName: { $concat: ["$firstName", " ", "$lastName"] },
            salary: 1
        }
    }
]);

2. Data Filtering

Question: How can Aggregation Pipelines filter documents to return only those that meet specific criteria? Answer: The $match stage is used to filter documents based on query conditions. Similar to a find() query, it is useful for narrowing down your dataset.

Example: To find all employees earning more than $70,000.

db.employees.aggregate([
    {
        $match: { salary: { $gt: 70000 } }
    }
]);

3. Grouping Data

Question: How does MongoDB handle data grouping within a pipeline? Answer: The $group stage groups documents by a certain key and can optionally aggregate data based on various accumulators like $sum, $avg, $max, etc.

Example: Group employees by department and find the average salary in each department.

db.employees.aggregate([
    {
        $group: {
            _id: "$department",
            avgSalary: { $avg: "$salary" }
        }
    }
]);

4. Sorting and Limiting Results

Question: Can Aggregation Pipelines sort and limit the number of documents returned? Answer: Yes, the $sort and $limit stages serve these purposes. $sort organizes documents in a specified order while $limit restricts the number of documents passed to the next stage.

Example: List the top 5 highest-paid employees.

db.employees.aggregate([
    {
        $sort: { salary: -1 }
    },
    {
        $limit: 5
    }
]);

5. Data Lookup and Joining

Question: How can Aggregation Pipelines perform complex lookups and joins between different collections? Answer: The $lookup stage facilitates left outer joins, while $unwind can deconstruct arrays into multiple documents. These are useful for combining data from related collections.

Example: List all employees along with their associated projects from the projects collection.

db.employees.aggregate([
    {
        $lookup: {
            from: "projects",
            localField: "_id",
            foreignField: "employeeId",
            as: "assignedProjects"
        }
    }
]);

6. Conditional Logic and Computations

Question: How can Aggregation Pipelines incorporate conditional logic and computations? Answer: Operators like $cond, $switch, $ifNull, and $add can be used to perform logic and arithmetic operations.

Example: Add a new field to indicate whether an employee is high earning (over $100,000).

db.employees.aggregate([
    {
        $addFields: {
            isHighEarning: { $cond: { if: { $gt: ["$salary", 100000] }, then: true, else: false } }
        }
    }
]);

7. Handling Null and Missing Values

Question: How can Aggregation Pipelines manage null or missing values? Answer: Operators such as $ifNull and $filter allow for handling missing or null values efficiently.

Example: Set salary to 0 for any null or missing value.

db.employees.aggregate([
    {
        $addFields: {
            salary: { $ifNull: ["$salary", 0] }
        }
    }
]);

8. Complex Data Extraction Using `$unwind` and `$group`

Question: How can Aggregation Pipelines handle arrays and extract complex data? Answer: $unwind splits the documents into multiple ones based on an array field. After $unwind, operations like $group can aggregate data across the expanded documents.

Example: Find out how many projects each employee is assigned to.

db.projects.aggregate([
    {
        $unwind: "$employeeIds"
    },
    {
        $group: {
            _id: "$employeeIds",
            projectCount: { $sum: 1 }
        }
    }
]);

9. Pipeline Optimization and Indexing

Question: How can you improve the performance of MongoDB Aggregation Pipelines? Answer: Proper indexing can significantly boost the performance of $match stages. Utilizing indexes on fields that are frequently used in filtering, sorting, and grouping operations can substantially speed up the aggregation processes.

Example: Ensure there is an index on department for the grouping query from earlier.

db.employees.createIndex({ department: 1 });

10. Data Export and Analysis

Question: Can Aggregation Pipelines aid in exporting and analyzing data? Answer: Aggregation Pipelines can output data into various formats for further analysis in other tools via $out to another collection, $merge to merge with another collection, or $export (in newer versions) to export results to a file or other destinations.

Example: Export the aggregated data to a CSV file using the MongoDB Shell.

A Complete Guide - MongoDB Use Cases for Aggregation Pipelines

MongoDB Use Cases for Aggregation Pipelines

1. Data Transformation and Filtering

2. Data Aggregation and Grouping

3. Data Sorting and Limiting

4. Joining Data from Different Collections

5. Calculating Running Totals and Moving Averages

6. Geospatial Aggregation

7. Data Analysis and Reporting

Conclusion

Online Code run

Use Case 1: Calculating Total Sales for a Product

Use Case 2: Filtering and Sorting Data

Use Case 3: Grouping by Multiple Fields and Using Accumulators

Use Case 4: Joining Collections (Lookup)

1. Data Transformation and Formatting

2. Data Filtering

3. Grouping Data

4. Sorting and Limiting Results

5. Data Lookup and Joining

6. Conditional Logic and Computations

7. Handling Null and Missing Values

8. Complex Data Extraction Using $unwind and $group

9. Pipeline Optimization and Indexing

10. Data Export and Analysis

You May Like This Related .NET Topic

8. Complex Data Extraction Using `$unwind` and `$group`