Commit b97812a

Aggregation pipeline docs (#11)
* DOCSP-8674 Doc for lookup stage
* DOCSP-9965 Doc for out stage
1 parent 4285a87 commit b97812a

5 files changed: +533 -0 lines changed

source/reference.txt

Lines changed: 1 addition & 0 deletions
@@ -21,3 +21,4 @@ This section contains information on the following:
 
 /reference/api/datalakes-api
 /reference/cli/datalake-cli
+/reference/pipeline/aggr-pipeline

source/reference/examples/path-syntax-examples.txt

Lines changed: 2 additions & 0 deletions
@@ -88,6 +88,8 @@ to the following:
 In the above example, ``startDate`` is interpreted as a string. For more
 information on all supported data types, see :ref:`datalake-path-attribute-types`.
 
+.. _datalake-path-syntax-egs:
+
 Examples
 --------
 

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
1+
.. _adl-aggregation-pipeline:
2+
3+
===========================
4+
Aggregation Pipeline Stages
5+
===========================
6+
7+
.. default-domain:: mongodb
8+
9+
.. include:: /includes/fact-data-lake-beta.rst
10+
11+
{+data-lake+} supports the MongoDB :manual:`pipeline
12+
</core/aggregation-pipeline/>` stages described in
13+
:ref:`data-lake-mql-support`. In addition, {+data-lake-short+}
14+
introduces alternate syntax for the following stages:
15+
16+
- :ref:`adl-lookup-stage`
17+
- :ref:`adl-out-stage`
18+
19+
.. class:: hidden
20+
21+
.. toctree::
22+
:titlesonly:
23+
24+
/reference/pipeline/lookup-stage
25+
/reference/pipeline/out
Lines changed: 215 additions & 0 deletions
@@ -0,0 +1,215 @@
1+
.. _adl-lookup-stage:
2+
3+
===========
4+
``$lookup``
5+
===========
6+
7+
.. default-domain:: mongodb
8+
9+
.. include:: /includes/fact-data-lake-beta.rst
10+
11+
.. contents:: On this page
12+
:local:
13+
:backlinks: none
14+
:depth: 2
15+
:class: singlecol
16+
17+
``$lookup`` performs a left outer join to an unsharded collection in the
same database to filter in documents from the "joined" collection for
processing. For more information, see
:manual:`$lookup </reference/operator/aggregation/lookup/>`. In
{+data-lake+}, you can also use ``$lookup`` to join collections from
different databases.
20+
21+
.. _adl-lookup-syntax:
22+
23+
Syntax
24+
------
25+
26+
The ``$lookup``
:manual:`syntax </reference/operator/aggregation/lookup/#syntax>` is
described in the MongoDB server manual. In {+data-lake-short+}, the ``from``
field in ``$lookup`` also accepts an object that contains an optional
database name and a required collection name:
31+
32+
.. tabs::
33+
34+
.. tab:: Equality Match
35+
:tabid: equalitymatch
36+
37+
.. code-block:: json

   {
     $lookup: {
       localField: "<fieldName>",
       from: <collection-to-join>|{db: <db>, coll: <collection-to-join>},
       foreignField: "<fieldName>",
       as: "<output-array-field>"
     }
   }
47+
48+
.. tab:: Join Conditions and Uncorrelated Sub-queries
49+
:tabid: joincondition
50+
51+
.. code-block:: json

   {
     $lookup: {
       from: <collection-to-join>|{db: <db>, coll: <collection-to-join>},
       let: { <var_1>: <expression>, …, <var_n>: <expression> },
       pipeline: [ <pipeline to execute on the collection to join> ],
       as: <output array field>
     }
   }
61+
62+
``from`` Field Object
63+
---------------------
64+
65+
.. list-table::
66+
:header-rows: 1
67+
:widths: 10 10 70 10
68+
69+
* - Field
70+
- Type
71+
- Description
72+
- Necessity
73+
74+
* - ``db``
75+
- string
76+
- The database name.
77+
78+
If the database name is specified, data is read from the collection in
79+
the specified database. If you specify a database name that is different
80+
from the database upon which the command is operating, all nested
81+
:manual:`$lookup
82+
</reference/operator/aggregation/lookup/#pipe._S_lookup>` stages **must**
83+
also specify a database name.
84+
85+
If the database name is not specified within a :manual:`$lookup
86+
</reference/operator/aggregation/lookup/#pipe._S_lookup>` stage,
87+
collections in the stage inherit the database name specified in the closest
88+
parent :manual:`$lookup </reference/operator/aggregation/lookup/#pipe._S_lookup>`
89+
stage if it exists or the database upon which the command is operating.
90+
- Conditional
91+
92+
* - ``coll``
93+
- string
94+
- The collection name.
95+
- Required
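
The database inheritance rule in the table above can be sketched in plain JavaScript. This is only an illustration: ``resolveFrom`` and ``listLookupNamespaces`` are hypothetical helpers, not part of MongoDB or {+data-lake-short+}, and the sketch models only the inheritance rule. It does not enforce the requirement that nested stages name a database when a parent stage targets a different one.

```javascript
// Hypothetical helper (an assumption for illustration, not a MongoDB API).
// Normalizes a $lookup `from` value, which may be a collection name string
// or an object of the form { db: <optional>, coll: <required> }.
function resolveFrom(from, inheritedDb) {
  if (typeof from === "string") {
    return { db: inheritedDb, coll: from };
  }
  if (from === null || typeof from !== "object" || typeof from.coll !== "string") {
    throw new Error("`from` must be a string or an object with a `coll` string");
  }
  // An omitted `db` falls back to the inherited database name.
  return { db: from.db !== undefined ? from.db : inheritedDb, coll: from.coll };
}

// Hypothetical helper: walks a pipeline and lists the namespace that each
// $lookup stage reads from. `currentDb` starts as the database the command
// runs against; nested stages inherit the database of their closest parent.
function listLookupNamespaces(pipeline, currentDb, out = []) {
  for (const stage of pipeline) {
    if (!stage.$lookup) continue;
    const ns = resolveFrom(stage.$lookup.from, currentDb);
    out.push(`${ns.db}.${ns.coll}`);
    listLookupNamespaces(stage.$lookup.pipeline || [], ns.db, out);
  }
  return out;
}

const pipeline = [
  {
    $lookup: {
      from: { db: "sourceDB2", coll: "catalog" },
      pipeline: [
        {
          $lookup: {
            from: { db: "sourceDB3", coll: "warehouses" },
            pipeline: [],
            as: "wh"
          }
        }
      ],
      as: "inventory"
    }
  },
  // A plain string resolves against the database the command runs on.
  { $lookup: { from: "orders", localField: "item", foreignField: "item", as: "same_item" } }
];

const namespaces = listLookupNamespaces(pipeline, "sourceDB1");
console.log(namespaces);
// prints [ 'sourceDB2.catalog', 'sourceDB3.warehouses', 'sourceDB1.orders' ]
```

Passing the resolved database down the recursion is what makes a nested stage without ``db`` pick up its closest parent's database rather than the command's database.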
96+
97+
Examples
98+
--------
99+
100+
Suppose there are three databases named ``sourceDB1``, ``sourceDB2``, and
101+
``sourceDB3`` with the following collections:
102+
103+
.. tabs::
104+
105+
.. tab:: sourceDB1
106+
:tabid: sourceDB1
107+
108+
.. code-block:: json
109+
110+
db.orders.insert([
111+
{ "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
112+
{ "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 },
113+
{ "_id" : 3 }
114+
])
115+
116+
.. tab:: sourceDB2
117+
:tabid: sourceDB2
118+
119+
.. code-block:: json
120+
121+
db.catalog.insert([
122+
{ "_id" : 1, "sku" : "almonds", description: "product 1" },
123+
{ "_id" : 2, "sku" : "bread", description: "product 2" },
124+
{ "_id" : 3, "sku" : "cashews", description: "product 3" },
125+
{ "_id" : 4, "sku" : "pecans", description: "product 4" },
126+
{ "_id" : 5, "sku": null, description: "Incomplete" },
127+
{ "_id" : 6 }
128+
])
129+
130+
.. tab:: sourceDB3
131+
:tabid: sourceDB3
132+
133+
.. code-block:: json
134+
135+
db.warehouses.insert([
136+
{ "_id" : 1, "stock_item" : "almonds", warehouse: "A", "instock" : 120 },
137+
{ "_id" : 2, "stock_item" : "pecans", warehouse: "A", "instock" : 70 },
138+
{ "_id" : 3, "stock_item" : "cashews", warehouse: "B", "instock" : 60 },
139+
{ "_id" : 4, "stock_item" : "bread", warehouse: "B", "instock" : 80 },
140+
{ "_id" : 5, "stock_item" : "cookies", warehouse: "A", "instock" : 80 }
141+
])
142+
143+
The following examples use the :manual:`$lookup
</reference/operator/aggregation/lookup/#pipe._S_lookup>` aggregation stage
to join documents from one collection with documents from collections in
other databases.
147+
148+
Basic Example
149+
~~~~~~~~~~~~~
150+
151+
The following aggregation operation on the ``sourceDB1.orders`` collection joins the documents from the ``orders`` collection with the documents
152+
from the ``sourceDB2.catalog`` collection using the ``item`` field from the ``orders`` collection and the ``sku`` field from the ``catalog`` collection:
153+
154+
.. code-block:: json

   // Run the aggregation against sourceDB1.orders and join each order
   // with the matching documents in sourceDB2.catalog.
   db.getSiblingDB("sourceDB1").orders.aggregate(
     [
       {
         $lookup: {
           from: { db: "sourceDB2", coll: "catalog" },
           localField: "item",
           foreignField: "sku",
           as: "inventory_docs"
         }
       }
     ]
   )
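
To make the shape of the joined output concrete, the following is a minimal in-memory sketch of the equality match above, using the sample documents from the ``sourceDB1`` and ``sourceDB2`` tabs. ``lookupEquality`` is a hypothetical stand-in written for this illustration, not how {+data-lake-short+} executes the stage. It assumes scalar field values and treats a missing field like ``null``, following the behavior the MongoDB server manual describes for equality-match ``$lookup``.

```javascript
// Sample documents copied from the sourceDB1.orders and sourceDB2.catalog tabs.
const orders = [
  { _id: 1, item: "almonds", price: 12, quantity: 2 },
  { _id: 2, item: "pecans", price: 20, quantity: 1 },
  { _id: 3 }
];

const catalog = [
  { _id: 1, sku: "almonds", description: "product 1" },
  { _id: 2, sku: "bread", description: "product 2" },
  { _id: 3, sku: "cashews", description: "product 3" },
  { _id: 4, sku: "pecans", description: "product 4" },
  { _id: 5, sku: null, description: "Incomplete" },
  { _id: 6 }
];

// Hypothetical in-memory stand-in for an equality-match $lookup.
// Every local document is kept (left outer join) and gains an array field
// holding the foreign documents whose field value equals its own.
function lookupEquality(localDocs, foreignDocs, { localField, foreignField, as }) {
  // Assumption: a missing field compares as null; only scalars are handled.
  const valueOf = (doc, field) => (doc[field] === undefined ? null : doc[field]);
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter(
      (foreign) => valueOf(foreign, foreignField) === valueOf(doc, localField)
    )
  }));
}

const joined = lookupEquality(orders, catalog, {
  localField: "item",
  foreignField: "sku",
  as: "inventory_docs"
});

console.log(JSON.stringify(joined, null, 2));
```

In this sketch, order ``1`` is paired with catalog document ``1``, order ``2`` with catalog document ``4``, and order ``3``, which has no ``item`` field, with catalog documents ``5`` and ``6``.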
166+
167+
Nested Example
168+
~~~~~~~~~~~~~~
169+
170+
The following aggregation operation on the ``sourceDB1.orders`` collection
joins documents from the ``orders`` collection with documents from the
``sourceDB2.catalog`` collection and the ``sourceDB3.warehouses`` collection.
The join uses the ``item`` field from ``orders``, the ``sku`` field from
``catalog``, and the ``stock_item`` and ``instock`` fields from ``warehouses``:
171+
172+
.. code-block:: json

   db.getSiblingDB("sourceDB1").orders.aggregate(
     [
       {
         $lookup: {
           // Outer join: read from the catalog collection in sourceDB2.
           from: { db: "sourceDB2", coll: "catalog" },
           let: { "order_sku": "$item" },
           pipeline: [
             {
               $match: {
                 $expr: {
                   $eq: ["$sku", "$$order_sku"]
                 }
               }
             },
             {
               $lookup: {
                 // Nested join: the database name is specified again.
                 from: { db: "sourceDB3", coll: "warehouses" },
                 pipeline: [
                   {
                     $match: {
                       $expr: {
                         $eq: ["$stock_item", "$$order_sku"]
                       }
                     }
                   },
                   {
                     $project: { instock: 1, _id: 0 }
                   }
                 ],
                 as: "wh"
               }
             },
             { "$unwind": "$wh" },
             {
               $project: { description: 1, instock: "$wh.instock", _id: 0 }
             }
           ],
           as: "inventory"
         }
       }
     ]
   )
