Commit 58348ac

DOCSP-30546: aggregation fundamentals (#10)
1 parent 4dbf1ca commit 58348ac

5 files changed

+328
-2
lines changed

source/fundamentals.txt

Lines changed: 1 addition & 1 deletion
@@ -10,6 +10,7 @@ Fundamentals
1010

1111
/fundamentals/connections
1212
/fundamentals/crud
13+
/fundamentals/aggregation
1314
/fundamentals/run-command
1415

1516
..
@@ -19,7 +20,6 @@ Fundamentals
1920
/fundamentals/auth
2021
/fundamentals/enterprise-auth
2122
/fundamentals/bson
22-
/fundamentals/aggregation
2323
/fundamentals/indexes
2424
/fundamentals/transactions
2525
/fundamentals/logging

source/fundamentals/aggregation.txt

Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
1+
.. _rust-aggregation:
2+
3+
===========
4+
Aggregation
5+
===========
6+
7+
.. contents:: On this page
8+
:local:
9+
:backlinks: none
10+
:depth: 2
11+
:class: singlecol
12+
13+
Overview
14+
--------
15+
16+
In this guide, you can learn how to perform **aggregation operations** in
17+
the {+driver-short+}.
18+
19+
Aggregation operations process data in your MongoDB collections based on
20+
specifications you can set in an **aggregation pipeline**. An aggregation
21+
pipeline consists of one or more **stages**. Each stage performs an
22+
operation based on its expression operators. After the driver executes
23+
the aggregation pipeline, it returns an aggregated result.
24+
25+
Analogy
26+
~~~~~~~
27+
28+
Aggregation operations function similarly to car factories with assembly
29+
lines. The assembly lines have stations with specialized tools to
30+
perform specific tasks. For example, when building a car, the assembly
31+
line begins with the frame. Then, as the car frame moves through the
32+
assembly line, each station assembles a separate part. The result is a
33+
transformed final product, the finished car.
34+
35+
The assembly line represents the *aggregation pipeline*, the individual
36+
stations represent the *aggregation stages*, the specialized tools
37+
represent the *expression operators*, and the finished product
38+
represents the *aggregated result*.
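
As a rough illustration of the analogy, the following standard-library sketch models a pipeline as an ordered list of stage functions. It does not use the driver: the ``Stage`` type, ``run_pipeline()``, and ``example_stages()`` are hypothetical names, and plain ages stand in for documents.

```rust
/// A stage receives the documents emitted by the previous stage and returns
/// the documents for the next one. Plain ages stand in for documents here.
type Stage = Box<dyn Fn(Vec<u32>) -> Vec<u32>>;

/// Passes the input through each stage in order, like a frame moving
/// from station to station on the assembly line.
fn run_pipeline(input: Vec<u32>, stages: &[Stage]) -> Vec<u32> {
    stages.iter().fold(input, |docs, stage| stage(docs))
}

/// Builds three illustrative stations that loosely mirror the
/// `$match`, `$sort`, and `$limit` stages.
fn example_stages() -> Vec<Stage> {
    let match_adults: Stage = Box::new(|docs: Vec<u32>| -> Vec<u32> {
        docs.into_iter().filter(|age| *age >= 21).collect()
    });
    let sort_descending: Stage = Box::new(|mut docs: Vec<u32>| -> Vec<u32> {
        docs.sort_unstable_by(|a, b| b.cmp(a));
        docs
    });
    let limit_two: Stage = Box::new(|docs: Vec<u32>| -> Vec<u32> {
        docs.into_iter().take(2).collect()
    });
    vec![match_adults, sort_descending, limit_two]
}

fn main() {
    let result = run_pipeline(vec![23, 45, 56, 18], &example_stages());
    println!("{:?}", result); // prints [56, 45]
}
```

The order of the stages matters: moving the limit ahead of the filter would change the result, just as reordering stations would change the finished car.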
39+
40+
Compare Aggregation and Find Operations
41+
---------------------------------------
42+
43+
The following table lists the different tasks you can perform with find
44+
operations, compared to what you can achieve with aggregation
45+
operations. The aggregation framework provides expanded functionality
46+
that allows you to transform and manipulate your data.
47+
48+
.. list-table::
49+
:header-rows: 1
50+
:widths: 50 50
51+
52+
* - Find Operations
53+
- Aggregation Operations
54+
55+
* - | Select *certain* documents to return
56+
| Select *which* fields to return
57+
| Sort the results
58+
| Limit the results
59+
| Count the results
60+
- | Select *certain* documents to return
61+
| Select *which* fields to return
62+
| Sort the results
63+
| Limit the results
64+
| Count the results
65+
| Rename fields
66+
| Compute new fields
67+
| Summarize data
68+
| Connect and merge data sets
69+
70+
Limitations
71+
-----------
72+
73+
When performing aggregation operations, consider the following
74+
limitations:
75+
76+
- Returned documents must not violate the :manual:`BSON document size
77+
limit </reference/limits/#BSON-Document-Size>` of 16 megabytes.
78+
- Pipeline stages have a memory limit of 100 megabytes by default. If
79+
required, you can exceed this limit by setting the `allow_disk_use
80+
<{+api+}/options/struct.AggregateOptions.html#structfield.allow_disk_use>`__
81+
field to ``true`` in your ``AggregateOptions``.
82+
- The :manual:`$graphLookup
83+
</reference/operator/aggregation/graphLookup/>` stage
84+
has a strict memory limit of 100 megabytes and ignores
85+
the ``allow_disk_use`` setting.
86+
87+
Examples
88+
--------
89+
90+
.. TODO decide if we should use structs on this page. might get long
91+
92+
To run the examples in this section, load the sample data into a
93+
collection called ``site_users`` in the ``db`` database with the following code:
94+
95+
.. literalinclude:: /includes/fundamentals/code-snippets/aggregation.rs
96+
:start-after: begin-insert
97+
:end-before: end-insert
98+
:language: rust
99+
:dedent:
100+
101+
.. include:: /includes/fundamentals/automatic-creation.rst
102+
103+
Each document represents a user of a book-reviewing website and contains
104+
information about their name, age, genre interests, and the date that they were
105+
last active on the website.
106+
107+
Age Insights by Genre
108+
~~~~~~~~~~~~~~~~~~~~~
109+
110+
The following example calculates the average, minimum, and maximum age of users
111+
interested in each genre.
112+
113+
The aggregation pipeline contains the following stages:
114+
115+
- An ``$unwind`` stage to separate each array entry in the
116+
``genre_interests`` field into a new document.
117+
- A ``$group`` stage to group documents by the value of the
118+
``genre_interests`` field. This stage finds the average, minimum, and
119+
maximum age of users in each group by using the ``$avg``, ``$min``, and ``$max``
120+
operators, respectively.
121+
122+
.. io-code-block::
123+
124+
.. input:: /includes/fundamentals/code-snippets/aggregation.rs
125+
:start-after: begin-age-agg
126+
:end-before: end-age-agg
127+
:language: rust
128+
:dedent:
129+
130+
.. output::
131+
:language: console
132+
:visible: false
133+
134+
* { "_id": "memoir", "avg_age": 25.8, "min_age": 18, "max_age": 39 }
135+
* { "_id": "sci-fi", "avg_age": 42, "min_age": 18, "max_age": 66 }
136+
* { "_id": "fiction", "avg_age": 33.333333333333336, "min_age": 16, "max_age": 66 }
137+
* { "_id": "nonfiction", "avg_age": 53.5, "min_age": 31, "max_age": 76 }
138+
* { "_id": "self help", "avg_age": 56, "min_age": 56, "max_age": 56 }
139+
* { "_id": "poetry", "avg_age": 39, "min_age": 39, "max_age": 39 }
140+
* { "_id": "literary", "avg_age": 49.5, "min_age": 21, "max_age": 76 }
141+
* { "_id": "fantasy", "avg_age": 34.666666666666664, "min_age": 18, "max_age": 66 }
142+
* { "_id": "mystery", "avg_age": 24.666666666666668, "min_age": 20, "max_age": 31 }
143+
* { "_id": "theory", "avg_age": 33, "min_age": 21, "max_age": 45 }
144+
* { "_id": "art", "avg_age": 39, "min_age": 39, "max_age": 39 }
145+
* { "_id": "sports", "avg_age": 22.5, "min_age": 16, "max_age": 29 }
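
To make the effect of the two stages concrete, the following standard-library sketch reproduces the same calculation in memory, without the driver or a server. The ``AgeStats`` struct and the ``age_stats_by_genre()`` and ``sample_users()`` functions are hypothetical names used only for illustration. The ages and genres are copied from the sample data in this guide.

```rust
use std::collections::BTreeMap;

/// Mirrors the `avg_age`, `min_age`, and `max_age` fields of each result document.
#[derive(Debug, PartialEq)]
struct AgeStats {
    avg_age: f64,
    min_age: u32,
    max_age: u32,
}

/// Hypothetical stand-in for the pipeline: "unwinds" each user's genres,
/// groups the ages by genre, and then computes the three accumulators.
fn age_stats_by_genre(users: &[(u32, Vec<&str>)]) -> BTreeMap<String, AgeStats> {
    let mut ages_by_genre: BTreeMap<String, Vec<u32>> = BTreeMap::new();
    for (age, genres) in users {
        // Like `$unwind`: treat each array entry as if it were its own document.
        for genre in genres {
            // Like `$group` with `_id: "$genre_interests"`.
            ages_by_genre.entry(genre.to_string()).or_default().push(*age);
        }
    }

    ages_by_genre
        .into_iter()
        .map(|(genre, ages)| {
            let sum: u32 = ages.iter().sum();
            let stats = AgeStats {
                avg_age: f64::from(sum) / ages.len() as f64,
                // A group always holds at least one age, so the defaults never apply.
                min_age: ages.iter().copied().min().unwrap_or_default(),
                max_age: ages.iter().copied().max().unwrap_or_default(),
            };
            (genre, stats)
        })
        .collect()
}

/// The ages and genre interests from the sample data in this guide.
fn sample_users() -> Vec<(u32, Vec<&'static str>)> {
    vec![
        (23, vec!["fiction", "mystery", "memoir"]),
        (45, vec!["fiction", "literary", "theory"]),
        (56, vec!["literary", "self help"]),
        (18, vec!["sci-fi", "fantasy", "memoir"]),
        (76, vec!["literary", "nonfiction"]),
        (29, vec!["fiction", "sports", "memoir"]),
        (16, vec!["fiction", "sports"]),
        (39, vec!["poetry", "art", "memoir"]),
        (20, vec!["mystery", "fantasy", "memoir"]),
        (31, vec!["mystery", "nonfiction"]),
        (21, vec!["theory", "literary", "fiction"]),
        (66, vec!["sci-fi", "fantasy", "fiction"]),
    ]
}

fn main() {
    for (genre, stats) in age_stats_by_genre(&sample_users()) {
        println!("{genre}: {stats:?}");
    }
}
```

Unlike the server, which returns groups in no guaranteed order unless you add a ``$sort`` stage, this sketch uses a ``BTreeMap`` so that it prints genres alphabetically.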
146+
147+
Group by Time Component
148+
~~~~~~~~~~~~~~~~~~~~~~~
149+
150+
The following example finds how many users were last active in each
151+
month.
152+
153+
The aggregation pipeline contains the following stages:
154+
155+
- A ``$project`` stage to extract the month from the ``last_active``
156+
field as a number into the ``month_last_active`` field.
157+
- A ``$group`` stage to group documents by the ``month_last_active``
158+
field and count the number of documents for each month.
159+
- A ``$sort`` stage to set an ascending sort on the month.
160+
161+
.. io-code-block::
162+
163+
.. input:: /includes/fundamentals/code-snippets/aggregation.rs
164+
:start-after: begin-lastactive-agg
165+
:end-before: end-lastactive-agg
166+
:language: rust
167+
:dedent:
168+
169+
.. output::
170+
:language: console
171+
:visible: false
172+
173+
* { "_id": { "month_last_active": 1 }, "number": 3 }
174+
* { "_id": { "month_last_active": 5 }, "number": 4 }
175+
* { "_id": { "month_last_active": 6 }, "number": 1 }
176+
* { "_id": { "month_last_active": 7 }, "number": 1 }
177+
* { "_id": { "month_last_active": 8 }, "number": 2 }
178+
* { "_id": { "month_last_active": 11 }, "number": 1 }
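
The same grouping can be sketched in memory with only the standard library. This is an illustration under assumptions, not driver code: ``users_per_month()`` is a hypothetical helper, and ``(year, month, day)`` tuples stand in for the ``last_active`` dates from the sample data.

```rust
use std::collections::BTreeMap;

/// Counts how many dates fall in each month.
/// Each tuple is a (year, month, day) stand-in for a `last_active` value.
fn users_per_month(last_active: &[(i32, u32, u32)]) -> BTreeMap<u32, usize> {
    let mut counts = BTreeMap::new();
    // Like `$project` with `$month`: keep only the month of each date.
    for month in last_active.iter().map(|&(_, month, _)| month) {
        // Like `$group` with `"$sum": 1`: add one per document in the group.
        *counts.entry(month).or_insert(0) += 1;
    }
    // A BTreeMap iterates in ascending key order, which plays the role of `$sort`.
    counts
}

fn main() {
    // Dates copied from the sample data in this guide.
    let last_active = [
        (2019, 5, 13), (2019, 5, 25), (2019, 5, 31), (2019, 7, 1),
        (2019, 6, 11), (2019, 1, 23), (2019, 1, 3), (2019, 5, 30),
        (2019, 1, 30), (2019, 8, 4), (2019, 8, 19), (2019, 11, 27),
    ];
    for (month, number) in users_per_month(&last_active) {
        println!("month {month}: {number}");
    }
}
```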
179+
180+
Calculate Popular Genres
181+
~~~~~~~~~~~~~~~~~~~~~~~~
182+
183+
The following example finds the three most popular genres based on how
184+
often they appear in users' interests.
185+
186+
The aggregation pipeline contains the following stages:
187+
188+
- An ``$unwind`` stage to separate each array entry in the
189+
``genre_interests`` field into a new document.
190+
- A ``$group`` stage to group documents by the ``genre_interests``
191+
field and count the number of documents for each genre.
192+
- A ``$sort`` stage to set a descending sort on the genre popularity.
193+
- A ``$limit`` stage to show only the first three genres.
194+
195+
.. io-code-block::
196+
197+
.. input:: /includes/fundamentals/code-snippets/aggregation.rs
198+
:start-after: begin-popular-agg
199+
:end-before: end-popular-agg
200+
:language: rust
201+
:dedent:
202+
203+
.. output::
204+
:language: console
205+
:visible: false
206+
207+
* { "_id": "fiction", "number": 6 }
208+
* { "_id": "memoir", "number": 5 }
209+
* { "_id": "literary", "number": 4 }
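
For comparison, the following standard-library sketch performs the same count, sort, and limit steps in memory. The ``top_genres()`` and ``sample_interests()`` functions are hypothetical names. Breaking ties alphabetically is an assumption made so the output is deterministic; the pipeline above does not specify a tie-breaking order.

```rust
use std::collections::HashMap;

/// Returns the `limit` most common genres with their counts.
fn top_genres(users: &[Vec<&str>], limit: usize) -> Vec<(String, usize)> {
    // Like `$unwind` followed by `$group` with `"$sum": 1`.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for genre in users.iter().flatten() {
        *counts.entry(*genre).or_insert(0) += 1;
    }

    let mut ranked: Vec<(String, usize)> = counts
        .into_iter()
        .map(|(genre, number)| (genre.to_string(), number))
        .collect();
    // Like `$sort` on `number` descending; ties fall back to the genre name.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    // Like `$limit`.
    ranked.truncate(limit);
    ranked
}

/// The genre interests from the sample data in this guide.
fn sample_interests() -> Vec<Vec<&'static str>> {
    vec![
        vec!["fiction", "mystery", "memoir"],
        vec!["fiction", "literary", "theory"],
        vec!["literary", "self help"],
        vec!["sci-fi", "fantasy", "memoir"],
        vec!["literary", "nonfiction"],
        vec!["fiction", "sports", "memoir"],
        vec!["fiction", "sports"],
        vec!["poetry", "art", "memoir"],
        vec!["mystery", "fantasy", "memoir"],
        vec!["mystery", "nonfiction"],
        vec!["theory", "literary", "fiction"],
        vec!["sci-fi", "fantasy", "fiction"],
    ]
}

fn main() {
    for (genre, number) in top_genres(&sample_interests(), 3) {
        println!("{genre}: {number}");
    }
}
```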
210+
211+
Additional Information
212+
----------------------
213+
214+
To learn more about the terms mentioned, see the following
215+
guides:
216+
217+
- :manual:`Expression Operators </reference/operator/aggregation/>`
218+
- :manual:`Aggregation Pipeline </core/aggregation-pipeline/>`
219+
- :manual:`Aggregation Stages </meta/aggregation-quick-reference/#stages>`
220+
- :manual:`Operator Expressions </meta/aggregation-quick-reference/#operator-expressions>`
221+
- :manual:`Aggregation Pipeline Limits </core/aggregation-pipeline-limits/>`
222+
223+
.. TODO To view more aggregation examples, see the following guides:
224+
..
225+
.. - :ref:`Count <>`
226+
.. - :ref:`Limit <>`
227+
.. - :ref:`Skip <>`
228+
.. - :ref:`Text <>`
229+
230+
.. TODO To learn more about the ``aggregate()`` method and its behavior, see
231+
.. :ref:`Retrieve Data <>`.
232+
233+
API Documentation
234+
~~~~~~~~~~~~~~~~~
235+
236+
To learn more about any of the methods or types discussed in this
237+
guide, see the following API documentation:
238+
239+
- `aggregate() <{+api+}/struct.Collection.html#method.aggregate>`__
240+
- `AggregateOptions <{+api+}/options/struct.AggregateOptions.html>`__

source/includes/fundamentals-sections.rst

Lines changed: 1 addition & 1 deletion
@@ -3,6 +3,7 @@ Fundamentals section:
33

44
- :ref:`Connect to MongoDB <rust-connection>`
55
- :ref:`Read from and Write to MongoDB <rust-crud>`
6+
- :ref:`Perform Aggregations <rust-aggregation>`
67
- :ref:`Run A Database Command <rust-run-command>`
78

89
..
@@ -11,7 +12,6 @@ Fundamentals section:
1112
- :ref:`Authenticate to MongoDB <rust-authentication-mechanisms>`
1213
- :ref:`Connect with Enterprise Authentication Mechanisms <rust-enterprise-authentication-mechanisms>`
1314
- :ref:`Convert Data to and from BSON <rust-bson>`
14-
- :ref:`Perform Aggregations <rust-aggregation>`
1515
- :ref:`Construct Indexes <rust-indexes>`
1616
- :ref:`Specify Collations to Order Results <rust-collations>`
1717
- :ref:`Record Log Messages <rust-logging>`
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
.. tip:: Non-existent Databases and Collections
2+
3+
If a database or collection doesn't exist when
4+
you perform a write operation on it, the server automatically
5+
creates it.
Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
1+
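use chrono::{ TimeZone, Utc }; // TimeZone provides with_ymd_and_hms()
2+
use futures::TryStreamExt; // provides try_next() on the cursor
3+
use mongodb::{ bson::{ self, doc, Document }, Client, Collection };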
4+
5+
#[tokio::main]
6+
async fn main() -> mongodb::error::Result<()> {
7+
let uri: &str = "<connection string>";
8+
9+
let client: Client = Client::with_uri_str(uri).await?;
10+
11+
// begin-insert
12+
let my_coll: Collection<Document> = client.database("db").collection("site_users");
13+
14+
let docs = vec![
15+
doc! { "name": "Sonya Mehta", "age": 23, "genre_interests": vec!["fiction", "mystery", "memoir"], "last_active": Utc.with_ymd_and_hms(2019, 5, 13, 0, 0, 0).unwrap() },
16+
doc! { "name": "Selena Sun", "age": 45, "genre_interests": vec!["fiction", "literary", "theory"], "last_active": Utc.with_ymd_and_hms(2019, 5, 25, 0, 0, 0).unwrap() },
17+
doc! { "name": "Carter Johnson", "age": 56, "genre_interests": vec!["literary", "self help"], "last_active": Utc.with_ymd_and_hms(2019, 5, 31, 0, 0, 0).unwrap() },
18+
doc! { "name": "Rick Cortes", "age": 18, "genre_interests": vec!["sci-fi", "fantasy", "memoir"], "last_active": Utc.with_ymd_and_hms(2019, 7, 1, 0, 0, 0).unwrap() },
19+
doc! { "name": "Belinda James", "age": 76, "genre_interests": vec!["literary", "nonfiction"], "last_active": Utc.with_ymd_and_hms(2019, 6, 11, 0, 0, 0).unwrap() },
20+
doc! { "name": "Corey Saltz", "age": 29, "genre_interests": vec!["fiction", "sports", "memoir"], "last_active": Utc.with_ymd_and_hms(2019, 1, 23, 0, 0, 0).unwrap() },
21+
doc! { "name": "John Soo", "age": 16, "genre_interests": vec!["fiction", "sports"], "last_active": Utc.with_ymd_and_hms(2019, 1, 3, 0, 0, 0).unwrap() },
22+
doc! { "name": "Lisa Ray", "age": 39, "genre_interests": vec!["poetry", "art", "memoir"], "last_active": Utc.with_ymd_and_hms(2019, 5, 30, 0, 0, 0).unwrap() },
23+
doc! { "name": "Kiran Murray", "age": 20, "genre_interests": vec!["mystery", "fantasy", "memoir"], "last_active": Utc.with_ymd_and_hms(2019, 1, 30, 0, 0, 0).unwrap() },
24+
doc! { "name": "Beth Carson", "age": 31, "genre_interests": vec!["mystery", "nonfiction"], "last_active": Utc.with_ymd_and_hms(2019, 8, 4, 0, 0, 0).unwrap() },
25+
doc! { "name": "Thalia Dorn", "age": 21, "genre_interests": vec!["theory", "literary", "fiction"], "last_active": Utc.with_ymd_and_hms(2019, 8, 19, 0, 0, 0).unwrap() },
26+
doc! { "name": "Arthur Ray", "age": 66, "genre_interests": vec!["sci-fi", "fantasy", "fiction"], "last_active": Utc.with_ymd_and_hms(2019, 11, 27, 0, 0, 0).unwrap() }
27+
];
28+
29+
my_coll.insert_many(docs, None).await?;
30+
// end-insert
31+
32+
// begin-age-agg
33+
let age_pipeline = vec![
34+
doc! { "$unwind": doc! { "path": "$genre_interests" } },
35+
doc! { "$group": doc! {
36+
"_id": "$genre_interests",
37+
"avg_age": doc! { "$avg": "$age" },
38+
"min_age": doc! { "$min": "$age" },
39+
"max_age": doc! { "$max": "$age" }
40+
} }
41+
];
42+
43+
let mut results = my_coll.aggregate(age_pipeline, None).await?;
44+
while let Some(result) = results.try_next().await? {
45+
let doc: Document = bson::from_document(result)?;
46+
println!("* {}", doc);
47+
}
48+
// end-age-agg
49+
50+
// begin-lastactive-agg
51+
let last_active_pipeline = vec![
52+
doc! { "$project": { "month_last_active" : doc! { "$month" : "$last_active" } } },
53+
doc! { "$group": doc! { "_id" : doc! {"month_last_active": "$month_last_active"} ,
54+
"number" : doc! { "$sum" : 1 } } },
55+
doc! { "$sort": { "_id.month_last_active" : 1 } }
56+
];
57+
58+
let mut results = my_coll.aggregate(last_active_pipeline, None).await?;
59+
while let Some(result) = results.try_next().await? {
60+
let doc: Document = bson::from_document(result)?;
61+
println!("* {}", doc);
62+
}
63+
// end-lastactive-agg
64+
65+
// begin-popular-agg
66+
let popularity_pipeline = vec![
67+
doc! { "$unwind" : "$genre_interests" },
68+
doc! { "$group" : doc! { "_id" : "$genre_interests" , "number" : doc! { "$sum" : 1 } } },
69+
doc! { "$sort" : doc! { "number" : -1 } },
70+
doc! { "$limit": 3 }
71+
];
72+
73+
let mut results = my_coll.aggregate(popularity_pipeline, None).await?;
74+
while let Some(result) = results.try_next().await? {
75+
let doc: Document = bson::from_document(result)?;
76+
println!("* {}", doc);
77+
}
78+
// end-popular-agg
79+
80+
Ok(())
81+
}
