Commit 0e9802e

DOCSP-15743 Enable count param in collStats agg stage (#150)
* DOCSP-15743 Enable count param in collStats agg stage
* DOCSP-15743 updates for review feedback
1 parent 62ceb80 commit 0e9802e

1 file changed

+100
-46
lines changed

source/reference/pipeline/collstats.txt

Lines changed: 100 additions & 46 deletions
@@ -12,32 +12,54 @@
1212
:depth: 2
1313
:class: singlecol
1414

15-
``$collStats`` returns statistics for a given collection. ``$collstats`` must
16-
be the first stage in the aggregation pipeline. For more information,
17-
see :manual:`$collStats </reference/operator/aggregation/collStats/>`. In
18-
{+data-lake+}, ``$collStats`` can only be used to retrieve information about the
15+
``$collStats`` returns statistics for a given collection.
16+
``$collStats`` must be the first stage in the aggregation pipeline. For
17+
more information, see :manual:`$collStats
18+
</reference/operator/aggregation/collStats/>`. In {+dl+},
19+
``$collStats`` can only be used to retrieve information about the
1920
partitions for a given collection or view.
2021

2122
.. _adl-collstats-syntax:
2223

2324
Syntax
2425
------
2526

26-
In {+adl+}, :manual:`$collStats </reference/operator/aggregation/collStats/>`
27-
accepts an empty document. It does not support any of the optional fields
28-
supported by the MongoDB server and returns an error if an unsupported option
29-
is specified.
27+
In {+adl+}, :manual:`$collStats
28+
</reference/operator/aggregation/collStats/>` accepts an empty
29+
document. It supports only the optional ``count`` field and returns
30+
an error if an unsupported option is specified.
3031

3132
.. code-block:: sh
3233

33-
db.<collection-name>|<view-name>.aggregate([{ "$collStats" : {} }])
34+
db.<collection-name>|<view-name>.aggregate([{ "$collStats" : { "count" : {} } }])
35+
36+
.. _adl-collstats-fields:
37+
38+
Fields
39+
------
40+
41+
.. list-table::
42+
:header-rows: 1
43+
:widths: 10 10 70 10
44+
45+
* - Field
46+
- Type
47+
- Description
48+
- Necessity
49+
50+
* - ``count``
51+
- document
52+
- Adds the total number of documents in the partitions to the
53+
return document.
54+
- Optional
3455

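As a rough illustration of the rule in the Fields table, the stage document can be built and checked on the client before it is sent. This is a minimal sketch, not part of any MongoDB driver: ``buildCollStatsStage`` and ``SUPPORTED_OPTIONS`` are hypothetical names, and the check only mirrors what this page states (an empty document or ``count`` is accepted, anything else is an error).

```javascript
// Hypothetical client-side helper (an assumption for illustration only).
// It mirrors the documented rule: only "count" is accepted as an option.
const SUPPORTED_OPTIONS = new Set(["count"]);

function buildCollStatsStage(options = {}) {
  for (const key of Object.keys(options)) {
    if (!SUPPORTED_OPTIONS.has(key)) {
      // The service reports its own error; this only fails earlier, locally.
      throw new Error(`Unsupported $collStats option: ${key}`);
    }
  }
  return { $collStats: { ...options } };
}

// Pipelines that could be passed to db.<collection-name>.aggregate(...).
const partitionsOnly = [buildCollStatsStage()];
const withCount = [buildCollStatsStage({ count: {} })];
```

Passing a server-only option such as ``latencyStats`` to this helper throws before any request is made, which can make a misconfigured pipeline easier to spot.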
3556
.. _adl-collstats-output:
3657

3758
Output
3859
------
3960

40-
``$collStats`` returns the following fields in the document for each partition:
61+
``$collStats`` returns the following fields in the document for each
62+
partition:
4163

4264
.. list-table::
4365
:header-rows: 1
@@ -47,37 +69,44 @@ Output
4769
- Type
4870
- Description
4971

72+
* - ``count``
73+
- number
74+
- The total number of documents in the partition. This is returned
75+
only if you specify the ``count`` option.
76+
5077
* - ``ns``
5178
- string
5279
- The namespace of the current collection or view in the format
5380
``[database].[collection|view]``.
5481

5582
* - ``partition``
5683
- document
57-
- The details about the partition such as the source, format, size, and
58-
:ref:`partition attributes <datalake-path-attribute-types>`, if any.
84+
- The details about the partition such as the source, format,
85+
size, and :ref:`partition attributes
86+
<datalake-path-attribute-types>`, if any.
5987

6088
* - ``partition.format``
6189
- string
6290
- The format of the file. Value can be any of the
63-
:ref:`data-lake-data-formats` for data in |s3| bucket or ``MONGO`` for
64-
data in the |service| cluster.
91+
:ref:`data-lake-data-formats` for data in |s3| bucket or
92+
``MONGO`` for data in the |service| cluster.
6593

6694
* - ``partition.attributes``
6795
- document
68-
- The :ref:`partition attributes <datalake-path-attribute-types>` for this
69-
partition defined in the
70-
:datalakeconf:`~databases.[n].collections.[n].dataSources.[n].path` for
71-
|s3| partitions. An empty document indicates that there are no partition
72-
attributes in the partition's data source.
96+
- The :ref:`partition attributes <datalake-path-attribute-types>`
97+
for this partition defined in the
98+
:datalakeconf:`~databases.[n].collections.[n].dataSources.[n].path`
99+
for |s3| partitions. An empty document indicates that
100+
there are no partition attributes in the partition's data source.
73101

74102
* - ``partition.size``
75103
- int
76104
- The size of the partition.
77105

78106
* - ``partition.source``
79107
- string
80-
- The source for the partition. The value can be one of the following:
108+
- The source for the partition. The value can be one of the
109+
following:
81110

82111
- The path to the file on |s3|.
83112
- The cluster name for partitions on |service|.
@@ -87,50 +116,75 @@ Output
87116
Examples
88117
--------
89118

90-
The following example shows :manual:`$collStats
91-
</reference/operator/aggregation/collStats/>` syntax for retrieving the
92-
partitions from a ``s3Db.abc`` collection with 3 files in an |s3|
93-
{+data-lake-store+}:
119+
.. tabs::
94120

95-
.. code-block:: sh
121+
.. tab:: Basic Example
122+
:tabid: basic
96123

97-
use s3Db
98-
db.abc.aggregate([ {$collStats: {}} ])
124+
The following example shows :manual:`$collStats
125+
</reference/operator/aggregation/collStats/>` syntax for
126+
retrieving the partitions from a ``s3Db.abc`` collection with 3
127+
files in an |s3| {+data-lake-store+}:
99128

100-
The preceding command returns the following output:
129+
.. code-block:: sh
101130

102-
.. code-block:: json
103-
:copyable: false
131+
use s3Db
132+
db.abc.aggregate([ {$collStats: {}} ])
104133

105-
{ "ns" : "s3Db.abc", "partition" : { "format" : "JSON", "attributes" : { "year" : NumberLong(2018) }, "size" : 139, "source" : "s3://my-bucket/s3Db/abc/2018/1.json?delimiter=%2F&region=us-east-1" } }
106-
{ "ns" : "s3Db.abc", "partition" : { "format" : "JSON", "attributes" : { "year" : NumberLong(2017) }, "size" : 124, "source" : "s3://my-bucket/s3Db/abc/2017/1.json?delimiter=%2F&region=us-east-1" } }
107-
{ "ns" : "s3Db.abc", "partition" : { "format" : "JSON", "attributes" : { "year" : NumberLong(2017) }, "size" : 130, "source" : "s3://my-bucket/s3Db/abc/2017/2.json?delimiter=%2F&region=us-east-1" } }
134+
The preceding command returns the following output:
108135

109-
The following example shows :manual:`$collStats
110-
</reference/operator/aggregation/collStats/>` syntax for retrieving the
111-
partitions from the ``atlasDb.sampleColl`` collection in the |service| cluster
112-
named ``mySandboxCluster``:
136+
.. code-block:: json
137+
:copyable: false
113138

114-
.. code-block:: sh
139+
{ "ns" : "s3Db.abc", "partition" : { "format" : "JSON", "attributes" : { "year" : NumberLong(2018) }, "size" : 139, "source" : "s3://my-bucket/s3Db/abc/2018/1.json?delimiter=%2F&region=us-east-1" } }
140+
{ "ns" : "s3Db.abc", "partition" : { "format" : "JSON", "attributes" : { "year" : NumberLong(2017) }, "size" : 124, "source" : "s3://my-bucket/s3Db/abc/2017/1.json?delimiter=%2F&region=us-east-1" } }
141+
{ "ns" : "s3Db.abc", "partition" : { "format" : "JSON", "attributes" : { "year" : NumberLong(2017) }, "size" : 130, "source" : "s3://my-bucket/s3Db/abc/2017/2.json?delimiter=%2F&region=us-east-1" } }
115142

116-
use atlasDb
117-
db.sampleColl.aggregate([ {$collStats: {}} ])
143+
The following example shows :manual:`$collStats
144+
</reference/operator/aggregation/collStats/>` syntax for
145+
retrieving the partitions from the ``atlasDb.sampleColl``
146+
collection in the |service| cluster named ``mySandboxCluster``:
118147

119-
The preceding command returns the following output:
148+
.. code-block:: sh
120149

121-
.. code-block:: json
122-
:copyable: false
150+
use atlasDb
151+
db.sampleColl.aggregate([ {$collStats: {}} ])
152+
153+
The preceding command returns the following output:
154+
155+
.. code-block:: json
156+
:copyable: false
157+
158+
{ "ns" : "atlasDb.sampleColl", "partition" : { "format" : "MONGO", "attributes" : { }, "size" : 94362191, "source" : "mySandboxCluster" } }
159+
160+
.. tab:: Count Example
161+
:tabid: count
162+
163+
The following example shows :manual:`$collStats
164+
</reference/operator/aggregation/collStats/>` syntax for
165+
retrieving the total number of documents in the partitions:
166+
167+
.. code-block:: sh
168+
169+
use atlasDb
170+
db.sampleColl.aggregate([ {$collStats: {"count" : {} }} ])
171+
172+
The preceding command returns the following output:
173+
174+
.. code-block:: json
175+
:copyable: false
123176

124-
{ "ns" : "atlasDb.sampleColl", "partition" : { "format" : "MONGO", "attributes" : { }, "size" : 94362191, "source" : "mySandboxCluster" } }
177+
{ "ns" : "atlasDb.sampleColl", "partition" : { "format" : "MONGO", "attributes" : { }, "size" : 94362191, "source" : "mySandboxCluster" }, "count" : 23530}
125178

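Because ``$collStats`` returns one document per partition, totals for a collection have to be computed from the individual results. The sketch below shows one way to do that. It assumes the results have been collected into an array and that numeric fields are plain JavaScript numbers; ``summarizePartitions`` is a hypothetical helper, and the sample documents are copied from the outputs above with the ``source`` field omitted for brevity.

```javascript
// Sample results copied from the examples on this page ("source" omitted).
const s3Partitions = [
  { ns: "s3Db.abc", partition: { format: "JSON", attributes: { year: 2018 }, size: 139 } },
  { ns: "s3Db.abc", partition: { format: "JSON", attributes: { year: 2017 }, size: 124 } },
  { ns: "s3Db.abc", partition: { format: "JSON", attributes: { year: 2017 }, size: 130 } },
];
const atlasPartitions = [
  { ns: "atlasDb.sampleColl", partition: { format: "MONGO", attributes: {}, size: 94362191 }, count: 23530 },
];

// Hypothetical helper: sums partition sizes and, when every document
// carries a numeric "count" field, the per-partition document counts.
function summarizePartitions(docs) {
  const totalSize = docs.reduce((sum, d) => sum + d.partition.size, 0);
  const hasCount = docs.length > 0 && docs.every((d) => typeof d.count === "number");
  const totalCount = hasCount ? docs.reduce((sum, d) => sum + d.count, 0) : null;
  return { partitions: docs.length, totalSize, totalCount };
}

const s3Summary = summarizePartitions(s3Partitions);
const atlasSummary = summarizePartitions(atlasPartitions);
console.log(s3Summary);    // { partitions: 3, totalSize: 393, totalCount: null }
console.log(atlasSummary); // { partitions: 1, totalSize: 94362191, totalCount: 23530 }
```

``totalCount`` is ``null`` when the ``count`` option was not requested, so callers can distinguish "not requested" from a collection with zero documents.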
126179
.. _adl-collstats-errors:
127180

128181
Errors
129182
------
130183

131184
An error similar to the following is returned if the :manual:`$collStats
132-
</reference/operator/aggregation/collStats/>` argument document contains
133-
any of the options allowed by the MongoDB server but not by {+adl+}.
185+
</reference/operator/aggregation/collStats/>` argument document
186+
contains any of the options allowed by the MongoDB server but not by
187+
{+adl+}.
134188

135189
.. code-block:: json
136190
:copyable: false
