Commit 33e528c

DOCSP 32572 - adding performance and considerations to $lookup (#4771) (#5005)
* DOCSP-32572 adding performance and considerations to lookup * DOCSP-32572 adding performance and considerations to lookup * DOCSP-32572 formatting errors * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 performance and considerations for lookup * DOCSP-32572 fixing bullet spacing * DOCSP-32572 adding general strategies * DOCSP-32572 fixing spacing * DOCSP-32572 adding embedded data modeling reference * DOCSP-32572 adding embedded data modeling reference * DOCSP-32572 copy edits from Jeff * DOCSP-32572 copy edits from Jeff * DOCSP-32572 copy edits from Jeff * DOCSP-32572 copy edits from Jeff * DOCSP-32572 copy edits round 2 * DOCSP-32572 copy edits round 2 * DOCSP-32572 copy edits round 2 * DOCSP-32572 copy edits round 2 * DOCSP-32572 copy edits round 2 * DOCSP-32572 copy edits round 2 * DOCSP-32572 tech edit * DOCSP-32572 fixing line * DOCSP-32572 fixing line * DOCSP-32572 tech edit
1 parent 8fe63a7 commit 33e528c

2 files changed: +98 -2 lines changed

source/core/data-modeling-introduction.txt

Lines changed: 2 additions & 0 deletions
@@ -60,6 +60,8 @@ revolves around the structure of documents and how the application
 represents relationships between data. MongoDB allows related data to
 be embedded within a single document.
 
+.. _embedded-data-modeling:
+
 Embedded Data
 ~~~~~~~~~~~~~
 
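As a rough illustration of what the newly labeled "Embedded Data" section describes, the sketch below contrasts a referenced layout with an embedded one. Every field name and value here is hypothetical and chosen only for illustration; none of them come from the changed files.

```javascript
// Referenced layout (hypothetical fields): the order stores only a key,
// so reading the product details requires a join such as $lookup.
const referencedOrder = { _id: 1, item: "almonds", quantity: 2 };
const referencedInventoryItem = { sku: "almonds", description: "product 1" };

// Embedded layout (hypothetical fields): the related data lives inside
// the order document, so one read returns everything.
const embeddedOrder = {
  _id: 1,
  quantity: 2,
  item: { sku: "almonds", description: "product 1" }
};
```

Whether embedding is appropriate depends on how the data is read and updated; this sketch only shows the shape of the two options.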

source/reference/operator/aggregation/lookup.txt

Lines changed: 96 additions & 2 deletions
@@ -42,6 +42,8 @@ Syntax
 
 The :pipeline:`$lookup` stage has the following syntaxes:
 
+.. _lookup-single-equality:
+
 Equality Match with a Single Join Condition
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 

@@ -129,7 +131,7 @@ The operation would correspond to the following pseudo-SQL statement:
 
 See these examples:
 
-- :ref:`lookup-single-equality`
+- :ref:`lookup-single-equality-example`
 - :ref:`unwind-example`
 - :ref:`lookup-mergeObjects`
 

@@ -506,10 +508,93 @@ in the ``from`` parameter of :pipeline:`$lookup` stages.
 
 For more information, see :ref:`agg-lookup-optimization-sbe`.
 
+.. _lookup-performance-considerations:
+
+Performance Considerations
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``$lookup`` performance depends on the type of operation performed.
+Refer to the following table for performance considerations for
+different ``$lookup`` operations.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 20 80
+
+   * - ``$lookup`` Operation
+     - Performance Considerations
+
+   * - :ref:`Equality Match with a Single Join
+       <lookup-single-equality-example>`
+
+     - .. _equality-match-performance:
+
+       - ``$lookup`` operations that perform equality matches with a
+         single join typically perform better when the foreign collection
+         contains an index on the ``foreignField``.
+
+   * - :ref:`Uncorrelated Subqueries <lookup-uncorrelated-subquery>`
+
+     - .. _uncorrelated-subqueries-performance:
+
+       - ``$lookup`` operations that contain uncorrelated subqueries
+         typically perform better when the inner pipeline can reference
+         an index on the foreign collection.
+
+       - MongoDB only needs to run the ``$lookup`` subquery once before
+         caching the query because there is no relationship between the
+         source and foreign collections. The ``$lookup`` subquery is not
+         based on any value in the source collection. This behavior
+         improves performance for subsequent executions of this query.
+
+
+   * - :ref:`Correlated Subqueries <lookup-concise-correlated-subquery>`
+
+     - .. _correlated-subqueries-performance:
+
+       - ``$lookup`` operations that contain correlated subqueries
+         typically perform better when the following conditions apply:
+
+         - The source collection contains an index on the
+           ``localField``.
+
+         - The foreign collection contains an index on the
+           ``foreignField``.
+
+         - The foreign collection contains an index that supports the
+           inner pipeline.
+
+       - If your pipeline passes a large number of documents to the
+         ``$lookup`` query, the following strategies may improve
+         performance:
+
+         - Reduce the number of documents that MongoDB passes to the
+           ``$lookup`` query. For example, set a stricter filter
+           during the ``$match`` stage.
+
+         - Run the inner pipeline of the ``$lookup`` subquery as a
+           separate query and use ``$out`` to create a temporary
+           collection. Then, run an :ref:`equality match with a single
+           join <lookup-single-equality>`.
+
+         - Reconsider the data's schema to ensure it is optimal for the
+           use case.
+
+For general performance strategies, see :ref:`Indexing Strategies
+<manual-indexing-strategies>` and :ref:`Query Optimization
+<read-operations-indexing>`.
+
+.. important::
+
+   Excessive use of ``$lookup`` within a query may slow down
+   performance. To avoid multiple ``$lookup`` stages, consider an
+   :ref:`embedded data model <embedded-data-modeling>` to optimize query
+   performance.
+
 Examples
 --------
 
-.. _lookup-single-equality:
+.. _lookup-single-equality-example:
 
 Perform a Single Equality Join with ``$lookup``
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -599,6 +684,9 @@ The operation corresponds to this pseudo-SQL statement:
       WHERE sku = orders.item
    );
 
+For more information, see
+:ref:`Equality Match Performance Considerations <equality-match-performance>`.
+
 .. _unwind-example:
 
 Use ``$lookup`` with an Array
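As a rough illustration of the equality-match guidance that the hunk above links to, the sketch below pairs a single-join `$lookup` with an index on the foreign collection's join field. The `orders.item` and `sku` names come from the pseudo-SQL context lines; the `inventory` collection name, the `inventory_docs` output field, and the variable names are assumptions made for illustration. The pipeline is built as plain JavaScript data so it runs without a database, and the commented calls show where mongosh would use it.

```javascript
// Index key pattern for the join field of the foreign collection.
// In mongosh (assumed collection name): db.inventory.createIndex(foreignIndexSpec)
const foreignIndexSpec = { sku: 1 };

// Single-join equality match: orders.item is compared with inventory.sku.
const equalityLookupPipeline = [
  {
    $lookup: {
      from: "inventory",        // assumed foreign collection name
      localField: "item",       // field in the source (orders) documents
      foreignField: "sku",      // indexed field in the foreign collection
      as: "inventory_docs"      // hypothetical output array field
    }
  }
];

// In mongosh: db.orders.aggregate(equalityLookupPipeline)
```

The index belongs on the collection named in `from`, because that is the collection MongoDB searches once per input document.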
@@ -972,6 +1060,9 @@ The operation corresponds to this pseudo-SQL statement:
       WHERE year = 2018
    );
 
+For more information, see
+:ref:`Uncorrelated Subquery Performance Considerations <uncorrelated-subqueries-performance>`.
+
 .. _lookup-concise-correlated-subquery:
 
 Perform a Concise Correlated Subquery with ``$lookup``
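The uncorrelated case referenced in the hunk above can be sketched as follows. The `year = 2018` filter comes from the pseudo-SQL context line; the `holidays` and `absences` collection names, the output field, and the variable names are assumptions for illustration. The inner pipeline uses no `let` variables and no `localField`/`foreignField` pair, so it does not depend on any source document, which is what makes it uncorrelated.

```javascript
// Index the inner pipeline's $match could use on the foreign collection.
// In mongosh (assumed collection name): db.holidays.createIndex(innerMatchIndexSpec)
const innerMatchIndexSpec = { year: 1 };

// Uncorrelated subquery: the inner pipeline is identical for every
// source document because it references no source values.
const uncorrelatedLookupPipeline = [
  {
    $lookup: {
      from: "holidays",                       // assumed foreign collection name
      pipeline: [{ $match: { year: 2018 } }], // filter taken from the pseudo-SQL
      as: "holidays"                          // hypothetical output array field
    }
  }
];

// In mongosh (assumed source collection): db.absences.aggregate(uncorrelatedLookupPipeline)
```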
@@ -1136,3 +1227,6 @@ The previous examples correspond to this pseudo-SQL statement:
       WHERE restaurants.name = orders.restaurant_name
       AND restaurants.beverages = orders.drink
    );
+
+For more information, see
+:ref:`Correlated Subquery Performance Considerations <correlated-subqueries-performance>`.
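Two of the strategies listed for correlated subqueries in this commit can be sketched as below. The `restaurants`, `orders`, `restaurant_name`, `name`, `beverages`, and `drink` names come from the pseudo-SQL context lines; the `"lemonade"` filter value, the `restaurants_tmp` collection, the `matches` output field, and the variable names are assumptions for illustration. The second strategy only applies when the inner filter can be rewritten without values from the source documents, which is assumed here.

```javascript
// Strategy 1: narrow the input with $match before the correlated $lookup.
const filteredCorrelatedPipeline = [
  { $match: { drink: "lemonade" } },            // hypothetical stricter filter
  {
    $lookup: {
      from: "restaurants",
      localField: "restaurant_name",
      foreignField: "name",
      let: { orders_drink: "$drink" },          // value taken from each order
      pipeline: [
        { $match: { $expr: { $in: ["$$orders_drink", "$beverages"] } } }
      ],
      as: "matches"                             // hypothetical output field
    }
  }
];

// Strategy 2, step 1: run the inner filter as its own query and write
// the result to a temporary collection. $out must be the last stage.
const materializePipeline = [
  { $match: { beverages: "lemonade" } },        // assumed uncorrelated rewrite
  { $out: "restaurants_tmp" }                   // hypothetical temporary collection
];

// Strategy 2, step 2: single-join equality match against that collection.
const equalityOnTempPipeline = [
  {
    $lookup: {
      from: "restaurants_tmp",
      localField: "restaurant_name",
      foreignField: "name",
      as: "matches"
    }
  }
];

// In mongosh:
//   db.restaurants.aggregate(materializePipeline)
//   db.orders.aggregate(equalityOnTempPipeline)
```

The temporary collection trades extra storage and a refresh step for a simpler join, so it tends to fit cases where the foreign data changes rarely.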
