Commit e8ccaa5

(DOCSP-21735) Add fullDocument config option (#119)

1 parent 195e5c7

File tree

2 files changed (+102, -0 lines)

source/configuration/read.txt

Lines changed: 47 additions & 0 deletions
@@ -293,6 +293,53 @@ This partitioner is not compatible with hashed shard keys.
**Default:** ``64``

.. _spark-change-stream-conf:

Change Streams
--------------

.. note::

   If you use ``SparkConf`` to set the connector's change stream
   configurations, prefix ``spark.mongodb.change.stream.`` to each
   property.

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Property name
     - Description

   * - ``lookup.full.document``

     - Determines what values your change stream returns on update
       operations.

       The default setting returns the differences between the original
       document and the updated document.

       The ``updateLookup`` setting returns the differences between the
       original document and updated document as well as a copy of the
       entire updated document.

       .. tip::

          For more information on how this change stream option works,
          see the MongoDB server manual guide
          :manual:`Lookup Full Document for Update Operations </changeStreams/#lookup-full-document-for-update-operations>`.

       **Default:** "default"

   * - ``publish.full.document.only``

     - If ``true``, this property returns only the changed document
       instead of the full change stream document. The connector
       automatically sets the ``lookup.full.document`` property to
       ``updateLookup`` to receive the updated documents.

       **Default:** ``false``
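The note above says that change stream properties take a ``spark.mongodb.change.stream.`` prefix when set through ``SparkConf``. The following is a minimal plain-Python sketch of that prefixing rule (no Spark required); the ``spark_conf_key`` helper is hypothetical and only illustrates how the table's property names map to fully qualified configuration keys:

```python
# Sketch: map the change stream property names from the table above to
# fully qualified SparkConf keys. The helper is hypothetical; only the
# prefix and the property names come from the documentation.
CHANGE_STREAM_PREFIX = "spark.mongodb.change.stream."

def spark_conf_key(property_name: str) -> str:
    """Prepend the connector's change stream prefix to a property name."""
    return CHANGE_STREAM_PREFIX + property_name

# Example: request full updated documents and publish only the document.
conf = {
    spark_conf_key("lookup.full.document"): "updateLookup",
    spark_conf_key("publish.full.document.only"): "true",
}
print(conf[spark_conf_key("lookup.full.document")])  # updateLookup
```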
.. _configure-input-uri:

``uri`` Configuration Setting

source/structured-streaming.txt

Lines changed: 55 additions & 0 deletions
@@ -92,6 +92,61 @@ Configuring a Write Stream to MongoDB
        For a complete list of methods, see the
        `pyspark Structured Streaming reference <https://spark.apache.org/docs/latest/api/python/reference/pyspark.ss.html>`__.

   - id: scala
     content: |

        Specify write stream configuration settings on your streaming
        Dataset or DataFrame using the ``writeStream`` property. You
        must specify the following configuration settings to write
        to MongoDB:

        .. list-table::
           :header-rows: 1
           :stub-columns: 1
           :widths: 10 40

           * - Setting
             - Description

           * - ``writeStream.format()``
             - The format to use for write stream data. Use
               ``mongodb``.

           * - ``writeStream.option()``
             - Use the ``option`` method to specify your MongoDB
               deployment connection string with the
               ``spark.mongodb.connection.uri`` option key.

               You must specify a database and collection, either as
               part of your connection string or with additional
               ``option`` methods using the following keys:

               - ``spark.mongodb.database``
               - ``spark.mongodb.collection``

           * - ``writeStream.outputMode()``
             - The output mode to use. To view a list of all supported
               output modes, see `the pyspark outputMode documentation <https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.streaming.DataStreamWriter.outputMode.html#pyspark.sql.streaming.DataStreamWriter.outputMode>`__.

        The following code snippet shows how to use the preceding
        configuration settings to stream data to MongoDB:

        .. code-block:: python
           :copyable: true
           :emphasize-lines: 3-4, 7

           <streaming Dataset/ DataFrame> \
              .writeStream \
              .format("mongodb") \
              .option("spark.mongodb.connection.uri", <mongodb-connection-string>) \
              .option("spark.mongodb.database", <database-name>) \
              .option("spark.mongodb.collection", <collection-name>) \
              .outputMode("append")

        For a complete list of methods, see the
        `pyspark Structured Streaming reference <https://spark.apache.org/docs/latest/api/python/reference/pyspark.ss.html>`__.
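As a plain-Python sketch of the required-settings rule in the table above, the snippet below assembles the ``writeStream`` option map and checks that the connection string, database, and collection keys are all present before they would be handed to Spark. The ``build_write_options`` helper is hypothetical and not part of the connector API; only the option key names come from the documentation:

```python
# Hypothetical helper: collect the write stream options described in the
# table above and fail fast if a required one is missing or empty.
REQUIRED_KEYS = (
    "spark.mongodb.connection.uri",
    "spark.mongodb.database",
    "spark.mongodb.collection",
)

def build_write_options(uri: str, database: str, collection: str) -> dict:
    """Return the option map for writeStream, validating required keys."""
    options = {
        "spark.mongodb.connection.uri": uri,
        "spark.mongodb.database": database,
        "spark.mongodb.collection": collection,
    }
    missing = [key for key in REQUIRED_KEYS if not options.get(key)]
    if missing:
        raise ValueError(f"missing required write stream options: {missing}")
    return options

# Example usage with placeholder values:
opts = build_write_options("mongodb://localhost:27017", "events", "raw")
```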
.. _read-structured-stream:
.. _continuous-processing:
