Commit 4153574

DOCSP-30550: GridFS guide (#89)
(cherry picked from commit 7a5b04c)
1 parent 51d29ab commit 4153574

4 files changed

+464
-1
lines changed

source/fundamentals.txt

Lines changed: 1 addition & 1 deletion
@@ -27,10 +27,10 @@ Fundamentals
 /fundamentals/monitoring
 /fundamentals/collations
 /fundamentals/geo
+/fundamentals/gridfs

 ..
 Connect to MongoDB Atlas from AWS Lambda <https://www.mongodb.com/docs/atlas/manage-connections-aws-lambda/>
-/fundamentals/gridfs
 /fundamentals/encrypt-fields

 .. include:: /includes/fundamentals-sections.rst

source/fundamentals/gridfs.txt

Lines changed: 374 additions & 0 deletions
@@ -0,0 +1,374 @@
1+
.. _rust-gridfs:
2+
3+
======
4+
GridFS
5+
======
6+
7+
.. facet::
8+
:name: genre
9+
:values: reference
10+
11+
.. meta::
12+
:keywords: large files, storage, code example
13+
14+
.. contents:: On this page
15+
:local:
16+
:backlinks: none
17+
:depth: 2
18+
:class: singlecol
19+
20+
Overview
21+
--------
22+
23+
In this guide, you can learn how to store and retrieve large files in
24+
MongoDB by using **GridFS**. GridFS is a specification that describes how
25+
to split files into chunks during storage and reassemble them during retrieval.
26+
The {+driver-short+} implementation of GridFS manages the operations and
27+
organization of the file storage.
28+
29+
Use GridFS if the size of your file exceeds the BSON document size limit of
30+
16 MB. GridFS also helps you access files without loading the entire file into
31+
memory. For more detailed information about whether GridFS is suitable for your
32+
use case, see the :manual:`GridFS </core/gridfs>` page in the Server manual.
33+
34+
To learn more about GridFS, navigate to the following sections in this guide:
35+
36+
- :ref:`How GridFS Works <rust-gridfs-info>`
37+
38+
- :ref:`Reference a GridFS Bucket <rust-gridfs-create>`
39+
40+
- :ref:`Upload Files <rust-gridfs-upload>`
41+
42+
- :ref:`Download Files <rust-gridfs-download>`
43+
44+
- :ref:`Retrieve File Information <rust-gridfs-retrieve>`
45+
46+
- :ref:`Rename Files <rust-gridfs-rename>`
47+
48+
- :ref:`Delete Files <rust-gridfs-delete>`
49+
50+
- :ref:`Delete a GridFS Bucket <rust-gridfs-drop>`
51+
52+
- :ref:`Additional Information <rust-gridfs-addtl-info>`
53+
54+
.. _rust-gridfs-info:
55+
56+
How GridFS Works
57+
----------------
58+
59+
GridFS organizes files in a **bucket**, which is a group of MongoDB collections
60+
containing file chunks and descriptive information. Buckets contain the
61+
following collections, named according to the convention defined in the GridFS
62+
specification:
63+
64+
- ``chunks``, which stores the binary file chunks
65+
- ``files``, which stores the file metadata
66+
67+
When you create a new GridFS bucket, the {+driver-short+} performs the following
68+
actions:
69+
70+
- Creates the ``chunks`` and ``files`` collections, prefixed with the default bucket
71+
name ``fs``, unless you specify a different name
72+
73+
- Creates an index on each collection to ensure efficient retrieval of files and related
74+
metadata
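
The naming rule above can be sketched in plain Rust. This is a minimal illustration that does not use the driver; the function name ``bucket_collection_names`` is hypothetical, and the ``.`` separator between the bucket name and the collection suffix follows the GridFS convention (for example, ``fs.files``).

```rust
/// Builds the names of the two collections that back a GridFS bucket.
/// Passing `None` models the default bucket name, `fs`.
/// (Hypothetical helper for illustration; not part of the driver API.)
fn bucket_collection_names(bucket_name: Option<&str>) -> (String, String) {
    let prefix = bucket_name.unwrap_or("fs");
    (format!("{}.files", prefix), format!("{}.chunks", prefix))
}
```

For example, ``bucket_collection_names(Some("images"))`` yields ``images.files`` and ``images.chunks``, while ``None`` yields ``fs.files`` and ``fs.chunks``.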
75+
76+
You can create a reference to a GridFS bucket by following the steps in the :ref:`rust-gridfs-create`
77+
section of this page. However, the driver does not create a new GridFS bucket and its
78+
indexes until the first write operation. For more information on GridFS indexes, see the
79+
:manual:`GridFS Indexes </core/gridfs/#gridfs-indexes>` section of the GridFS page in the Server manual.
80+
81+
When storing a file in a GridFS bucket, the {+driver-short+} creates the following documents:
82+
83+
- One document in the ``files`` collection that stores a unique file ID, file name, and other
84+
file metadata
85+
86+
- One or more documents in the ``chunks`` collection that store the content of the file, which
87+
the driver splits into smaller pieces
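
To make the splitting step concrete, the following sketch models it in plain Rust without the driver. The ``Chunk`` struct is a simplified, hypothetical stand-in for a document in the ``chunks`` collection (real chunk documents also reference the owning file), and ``split_into_chunks`` is an illustrative helper, not a driver method.

```rust
/// Default GridFS chunk size: 255 KB expressed in bytes.
const DEFAULT_CHUNK_SIZE_BYTES: usize = 255 * 1024;

/// Simplified stand-in for a document in the `chunks` collection.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    /// Position of the chunk within the file, starting at 0.
    n: u32,
    /// The bytes stored in this chunk.
    data: Vec<u8>,
}

/// Splits file contents into numbered chunks of at most `chunk_size` bytes.
/// Only the final chunk may be shorter than `chunk_size`.
fn split_into_chunks(file_bytes: &[u8], chunk_size: usize) -> Vec<Chunk> {
    assert!(chunk_size > 0, "chunk size must be positive");
    file_bytes
        .chunks(chunk_size)
        .enumerate()
        .map(|(i, piece)| Chunk {
            n: i as u32,
            data: piece.to_vec(),
        })
        .collect()
}
```

Under these assumptions, a 600 KB file split with the default chunk size produces three chunks: two full chunks of 261,120 bytes and a final chunk of 92,160 bytes.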
88+
89+
The following diagram describes how GridFS splits files when uploading to a bucket:
90+
91+
.. figure:: /includes/figures/GridFS-upload.png
92+
:alt: A diagram that shows how GridFS uploads a file to a bucket
93+
94+
When retrieving files, GridFS fetches the metadata from the ``files`` collection in the
95+
specified bucket and uses the information to reconstruct the file from documents in the
96+
``chunks`` collection. You can read the file into memory or output it to a stream.
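
The reconstruction step can be sketched the same way. The following plain-Rust model assumes a hypothetical ``Chunk`` struct and a ``reassemble`` helper; it orders chunks by their sequence number and checks the result against the length recorded in the file metadata. It illustrates the idea only and is not the driver's implementation.

```rust
/// Simplified stand-in for a document in the `chunks` collection.
#[derive(Debug, Clone)]
struct Chunk {
    /// Position of the chunk within the file, starting at 0.
    n: u32,
    /// The bytes stored in this chunk.
    data: Vec<u8>,
}

/// Rebuilds a file from its chunks, which may arrive in any order.
/// `expected_length` models the length stored in the `files` document.
fn reassemble(mut chunks: Vec<Chunk>, expected_length: usize) -> Result<Vec<u8>, String> {
    // Order the chunks by their sequence number before concatenating.
    chunks.sort_by_key(|c| c.n);

    let mut out = Vec::with_capacity(expected_length);
    for (i, chunk) in chunks.iter().enumerate() {
        // After sorting, chunk numbers must run 0, 1, 2, ... without gaps.
        if chunk.n as usize != i {
            return Err(format!("missing chunk {}", i));
        }
        out.extend_from_slice(&chunk.data);
    }

    if out.len() != expected_length {
        return Err(format!(
            "expected {} bytes, got {}",
            expected_length,
            out.len()
        ));
    }
    Ok(out)
}
```

Checking the sequence numbers and the total length catches a missing or truncated chunk before the caller uses the data.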
97+
98+
.. _rust-gridfs-create:
99+
100+
Reference a GridFS Bucket
101+
-------------------------
102+
103+
Before storing files in a GridFS bucket, create a bucket reference or get a reference to
104+
an existing bucket.
105+
106+
The following example calls the ``gridfs_bucket()`` method on a database instance, which
107+
creates a reference to either a new or existing GridFS bucket:
108+
109+
.. literalinclude:: /includes/fundamentals/code-snippets/gridfs.rs
110+
:start-after: start-create
111+
:end-before: end-create
112+
:language: rust
113+
:copyable:
114+
:dedent:
115+
116+
You can specify a custom bucket name and other options in a ``GridFsBucketOptions``
117+
struct instance. To begin building a ``GridFsBucketOptions`` instance, call the
118+
``GridFsBucketOptions::builder()`` method.
119+
120+
.. note:: Instantiating Options
121+
122+
The {+driver-short+} implements the Builder design pattern for the
123+
creation of many different types, including ``GridFsBucketOptions``. You
124+
can use the ``builder()`` method to construct an instance of each type
125+
by chaining option builder methods.
126+
127+
The following table describes the builder methods that you can use to set fields of a
128+
``GridFsBucketOptions`` instance:
129+
130+
.. list-table::
   :widths: 1 1 2
   :stub-columns: 1
   :header-rows: 1

   * - Method
     - Possible Values
     - Description

   * - | ``bucket_name()``
     - | Any ``String`` value
     - | Specifies a bucket name, which is set to ``fs`` by default

   * - ``chunk_size_bytes()``
     - Any ``u32`` value
     - Specifies the chunk size used to break the file into chunks, which is 255 KB by default

   * - | ``write_concern()``
     - | ``WriteConcern::w``,
       | ``WriteConcern::w_timeout``,
       | ``WriteConcern::journal``
     - | Specifies the bucket's write concern, which is set to the database's write concern by default

   * - ``read_concern()``
     - ``ReadConcernLevel::Local``,
       ``ReadConcernLevel::Majority``,
       ``ReadConcernLevel::Linearizable``,
       ``ReadConcernLevel::Available``,
       ``ReadConcernLevel::Snapshot``
     - Specifies the bucket's read concern, which is set to the database's read concern by default

   * - | ``selection_criteria()``
     - | ``SelectionCriteria::ReadPreference``,
       | ``SelectionCriteria::Predicate``
     - | Specifies which servers are suitable for a bucket operation, which is set to the database's selection
       | criteria by default
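
For readers new to the Builder pattern, the following self-contained sketch shows the general shape of chaining option methods and falling back to defaults. The ``BucketSettings`` and ``BucketSettingsBuilder`` types are hypothetical and hand-written for illustration; they are not the driver's ``GridFsBucketOptions`` type, and only the ``fs`` and 255 KB defaults are taken from the table above.

```rust
/// Hypothetical, simplified settings type used only to illustrate the pattern.
#[derive(Debug, Clone, PartialEq)]
struct BucketSettings {
    bucket_name: String,
    chunk_size_bytes: u32,
}

/// Collects optional values before the final settings are built.
#[derive(Default)]
struct BucketSettingsBuilder {
    bucket_name: Option<String>,
    chunk_size_bytes: Option<u32>,
}

impl BucketSettings {
    /// Starts a builder in which every option is unset.
    fn builder() -> BucketSettingsBuilder {
        BucketSettingsBuilder::default()
    }
}

impl BucketSettingsBuilder {
    /// Sets a custom bucket name and returns the builder for chaining.
    fn bucket_name(mut self, name: impl Into<String>) -> Self {
        self.bucket_name = Some(name.into());
        self
    }

    /// Sets a custom chunk size in bytes and returns the builder for chaining.
    fn chunk_size_bytes(mut self, size: u32) -> Self {
        self.chunk_size_bytes = Some(size);
        self
    }

    /// Finishes the chain, applying defaults to any option left unset.
    fn build(self) -> BucketSettings {
        BucketSettings {
            bucket_name: self.bucket_name.unwrap_or_else(|| "fs".to_string()),
            chunk_size_bytes: self.chunk_size_bytes.unwrap_or(255 * 1024),
        }
    }
}
```

In this model, ``BucketSettings::builder().bucket_name("images").build()`` overrides only the name and keeps the 255 KB chunk size.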
166+
167+
The following example specifies options in a ``GridFsBucketOptions`` instance to configure a custom
168+
bucket name and a five-second time limit for write operations:
169+
170+
.. literalinclude:: /includes/fundamentals/code-snippets/gridfs.rs
171+
:start-after: start-create-opts
172+
:end-before: end-create-opts
173+
:language: rust
174+
:copyable:
175+
:dedent:
176+
177+
.. _rust-gridfs-upload:
178+
179+
Upload Files
180+
------------
181+
182+
You can upload a file to a GridFS bucket by opening an upload stream and
183+
writing your file to the stream. Call the ``open_upload_stream()`` method on
184+
your bucket instance to open the stream. This method returns an instance of
185+
``GridFsUploadStream`` to which you can write the file contents. To upload
186+
the file contents to the ``GridFsUploadStream``, call the ``write_all()`` method
187+
and pass your file bytes as a parameter.
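
As a rough mental model of what happens to the bytes you write, the sketch below implements a synchronous, in-memory writer that cuts its input into fixed-size chunks. It is an assumption-laden illustration: ``ChunkingWriter`` is a hypothetical type built on ``std::io::Write``, whereas the driver's ``GridFsUploadStream`` is asynchronous and stores chunks in MongoDB.

```rust
use std::io::{self, Write};

/// Hypothetical in-memory model of an upload stream.
struct ChunkingWriter {
    chunk_size: usize,
    /// Bytes written so far that do not yet fill a whole chunk.
    buffer: Vec<u8>,
    /// Completed chunks, in order.
    chunks: Vec<Vec<u8>>,
}

impl ChunkingWriter {
    fn new(chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk size must be positive");
        ChunkingWriter {
            chunk_size,
            buffer: Vec::new(),
            chunks: Vec::new(),
        }
    }

    /// Emits any trailing partial chunk and returns all chunks.
    fn close(mut self) -> Vec<Vec<u8>> {
        if !self.buffer.is_empty() {
            let tail = std::mem::take(&mut self.buffer);
            self.chunks.push(tail);
        }
        self.chunks
    }
}

impl Write for ChunkingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.buffer.extend_from_slice(buf);
        // Move every complete chunk out of the buffer as soon as it fills.
        while self.buffer.len() >= self.chunk_size {
            let rest = self.buffer.split_off(self.chunk_size);
            let full = std::mem::replace(&mut self.buffer, rest);
            self.chunks.push(full);
        }
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

With a chunk size of 4, writing ``hello `` and then ``world`` and calling ``close()`` yields the chunks ``hell``, ``o wo``, and ``rld``. Chunk boundaries depend only on the total byte stream, not on how the writes were divided.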
188+
189+
.. tip:: Import the Required Trait

   The ``GridFsUploadStream`` struct implements the ``futures_io::AsyncWrite`` trait.
   To use the ``AsyncWrite`` write methods, such as ``write_all()``, import the
   ``AsyncWriteExt`` extension trait into your application file with the following
   ``use`` declaration:

   .. code-block:: rust

      use futures_util::io::AsyncWriteExt;
199+
200+
The following example uses an upload stream to upload a file called ``"example.txt"``
201+
to a GridFS bucket:
202+
203+
.. literalinclude:: /includes/fundamentals/code-snippets/gridfs.rs
204+
:start-after: start-upload
205+
:end-before: end-upload
206+
:language: rust
207+
:copyable:
208+
:dedent:
209+
210+
.. _rust-gridfs-download:
211+
212+
Download Files
213+
--------------
214+
215+
You can download a file from a GridFS bucket by opening a download stream and
216+
reading from the stream. Call the ``open_download_stream()`` method on
217+
your bucket instance, specifying the desired file's ``_id`` value as a parameter.
218+
This method returns an instance of ``GridFsDownloadStream`` from which you can access
219+
the file. To read the file from the ``GridFsDownloadStream``, call the ``read_to_end()``
220+
method and pass a vector as a parameter.
221+
222+
.. tip:: Import the Required Trait

   The ``GridFsDownloadStream`` struct implements the ``futures_io::AsyncRead`` trait.
   To use the ``AsyncRead`` read methods, such as ``read_to_end()``, import the
   ``AsyncReadExt`` extension trait into your application file with the following
   ``use`` declaration:

   .. code-block:: rust

      use futures_util::io::AsyncReadExt;
232+
233+
The following example uses a download stream to download a file with an ``_id`` value
234+
of ``3289`` from a GridFS bucket:
235+
236+
.. literalinclude:: /includes/fundamentals/code-snippets/gridfs.rs
237+
:start-after: start-download
238+
:end-before: end-download
239+
:language: rust
240+
:copyable:
241+
:dedent:
242+
243+
.. note::
244+
245+
The GridFS streaming API cannot load partial chunks. When a download
246+
stream needs to pull a chunk from MongoDB, it pulls the entire chunk
247+
into memory. The 255 KB default chunk size is usually sufficient, but
248+
you can reduce the chunk size to reduce memory overhead.
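
The trade-off in the preceding note can be estimated with simple arithmetic. The following sketch is a back-of-the-envelope model rather than a measurement of the driver: ``DownloadProfile`` and ``download_profile`` are hypothetical names, and the model assumes that one chunk is held in memory per fetch.

```rust
/// Rough per-download figures for a file of a given length.
#[derive(Debug, PartialEq)]
struct DownloadProfile {
    /// Number of chunk documents the stream pulls for the whole file.
    chunk_fetches: u64,
    /// Size in bytes of the largest single chunk pulled into memory.
    max_chunk_bytes: u64,
}

/// Estimates fetch count and per-chunk memory for `length` bytes of file data.
fn download_profile(length: u64, chunk_size: u64) -> DownloadProfile {
    assert!(chunk_size > 0, "chunk size must be positive");
    DownloadProfile {
        // Ceiling division: a trailing partial chunk still costs one fetch.
        chunk_fetches: (length + chunk_size - 1) / chunk_size,
        // A file smaller than one chunk never needs a full-size buffer.
        max_chunk_bytes: length.min(chunk_size),
    }
}
```

Under this model, a 10 MiB file takes 41 fetches of up to 261,120 bytes with the default chunk size, or 160 fetches of 65,536 bytes with a 64 KiB chunk size. A smaller chunk size lowers the memory used per chunk but increases the number of chunks to fetch.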
249+
250+
.. _rust-gridfs-retrieve:
251+
252+
Retrieve File Information
253+
-------------------------
254+
255+
You can retrieve information about the files stored in the ``files`` collection of
256+
the GridFS bucket. Each file is stored as an instance of the ``FilesCollectionDocument``
257+
type, which includes the following fields that represent file information:
258+
259+
- ``_id``: the file ID
260+
- ``length``: the file size
261+
- ``chunk_size_bytes``: the size of the file's chunks
262+
- ``upload_date``: the file's upload date and time
263+
- ``filename``: the name of the file
264+
- ``metadata``: a document that stores user-specified metadata
265+
266+
Call the ``find()`` method on a GridFS bucket instance to retrieve
267+
files from the bucket. The method returns a cursor instance
268+
from which you can access the results.
269+
270+
The following example retrieves and prints the length of each file in a GridFS bucket:
271+
272+
.. literalinclude:: /includes/fundamentals/code-snippets/gridfs.rs
273+
:start-after: start-retrieve
274+
:end-before: end-retrieve
275+
:language: rust
276+
:copyable:
277+
:dedent:
278+
279+
.. tip::
280+
281+
To learn more about the ``find()`` method, see the :ref:`rust-retrieve-guide`
282+
guide. To learn more about retrieving data from a cursor, see the :ref:`rust-cursor-guide`
283+
guide.
284+
285+
.. _rust-gridfs-rename:
286+
287+
Rename Files
288+
------------
289+
290+
You can update the name of a GridFS file in your bucket by calling the ``rename()`` method
291+
on a bucket instance. Pass the target file's ``_id`` value and the new file name as
292+
parameters to the ``rename()`` method.
293+
294+
.. note::
295+
The ``rename()`` method only supports updating the name of one file at
296+
a time. To rename multiple files, retrieve a list of files matching the
297+
file name from the bucket, extract the ``_id`` field from the files you
298+
want to rename, and pass each value in separate calls to the ``rename()``
299+
method.
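
The multi-file workflow in the preceding note follows a find-then-loop shape, sketched below against a hypothetical in-memory bucket. ``InMemoryBucket``, ``FileEntry``, and ``rename_all`` are illustrative names only; with the driver, the equivalent steps are asynchronous calls to ``find()`` and ``rename()`` on a bucket instance.

```rust
/// Simplified stand-in for a document in the `files` collection.
#[derive(Debug, Clone, PartialEq)]
struct FileEntry {
    id: u32,
    filename: String,
}

/// Hypothetical in-memory bucket used only to show the control flow.
struct InMemoryBucket {
    files: Vec<FileEntry>,
}

impl InMemoryBucket {
    /// Renames exactly one file, identified by its ID.
    fn rename(&mut self, id: u32, new_name: &str) -> Result<(), String> {
        match self.files.iter_mut().find(|f| f.id == id) {
            Some(file) => {
                file.filename = new_name.to_string();
                Ok(())
            }
            None => Err(format!("no file with id {}", id)),
        }
    }

    /// Collects the IDs of all files that currently have the given name.
    fn find_ids_by_name(&self, name: &str) -> Vec<u32> {
        self.files
            .iter()
            .filter(|f| f.filename == name)
            .map(|f| f.id)
            .collect()
    }
}

/// Renames every file called `old_name` with one `rename()` call per ID.
/// Returns the number of files renamed.
fn rename_all(bucket: &mut InMemoryBucket, old_name: &str, new_name: &str) -> Result<usize, String> {
    let ids = bucket.find_ids_by_name(old_name);
    for id in &ids {
        bucket.rename(*id, new_name)?;
    }
    Ok(ids.len())
}
```

Collecting the IDs first and renaming afterward keeps the lookup separate from the updates, so each ``rename()`` call targets a single, known file.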
300+
301+
The following example updates the ``filename`` field of the file that has an ``_id`` value
302+
of ``3289`` to ``"new_file_name"``:
303+
304+
.. literalinclude:: /includes/fundamentals/code-snippets/gridfs.rs
305+
:start-after: start-rename
306+
:end-before: end-rename
307+
:language: rust
308+
:copyable:
309+
:dedent:
310+
311+
.. _rust-gridfs-delete:
312+
313+
Delete Files
314+
------------
315+
316+
You can use the ``delete()`` method to remove a file from your bucket. To remove a
317+
file, call ``delete()`` on your bucket instance and pass the file's ``_id`` value
318+
as a parameter.
319+
320+
.. note::
321+
322+
The ``delete()`` method only supports deleting one file at a time. To
323+
delete multiple files, retrieve the files from the bucket, extract
324+
the ``_id`` field from the files you want to delete, and pass each ``_id``
325+
value in separate calls to the ``delete()`` method.
326+
327+
The following example deletes the file in which the value of the ``_id`` field is
328+
``3289``:
329+
330+
.. literalinclude:: /includes/fundamentals/code-snippets/gridfs.rs
331+
:start-after: start-delete-file
332+
:end-before: end-delete-file
333+
:language: rust
334+
:copyable:
335+
:dedent:
336+
337+
.. _rust-gridfs-drop:
338+
339+
Delete a GridFS Bucket
340+
----------------------
341+
342+
You can use the ``drop()`` method to delete a bucket, which removes a bucket's
343+
``files`` and ``chunks`` collections. To delete the bucket, call ``drop()`` on
344+
your bucket instance.
345+
346+
The following example deletes a GridFS bucket:
347+
348+
.. literalinclude:: /includes/fundamentals/code-snippets/gridfs.rs
349+
:start-after: start-delete-bucket
350+
:end-before: end-delete-bucket
351+
:language: rust
352+
:copyable:
353+
:dedent:
354+
355+
.. _rust-gridfs-addtl-info:
356+
357+
Additional Information
358+
----------------------
359+
360+
API Documentation
361+
~~~~~~~~~~~~~~~~~
362+
363+
To learn more about any of the methods or types mentioned in this
364+
guide, see the following API documentation:
365+
366+
- `gridfs_bucket() <{+api+}/struct.Database.html#method.gridfs_bucket>`__
367+
- `GridFsBucketOptions <{+api+}/options/struct.GridFsBucketOptions.html>`__
368+
- `open_upload_stream() <{+api+}/gridfs/struct.GridFsBucket.html#method.open_upload_stream>`__
369+
- `open_download_stream() <{+api+}/gridfs/struct.GridFsBucket.html#method.open_download_stream>`__
370+
- `FilesCollectionDocument <{+api+}/gridfs/struct.FilesCollectionDocument.html>`__
371+
- `rename() <{+api+}/gridfs/struct.GridFsBucket.html#method.rename>`__
372+
- `delete() <{+api+}/gridfs/struct.GridFsBucket.html#method.delete>`__
373+
- `drop() <{+api+}/gridfs/struct.GridFsBucket.html#method.drop>`__
374+