@@ -46,10 +46,6 @@ You can configure the following properties when reading data from MongoDB in str
    * - ``collection``
      - | **Required.**
        | The collection name configuration.
-       | You can specify multiple collections by separating the collection names
-         with a comma.
-       |
-       | To learn more about specifying multiple collections, see :ref:`spark-specify-multiple-collections`.

    * - ``comment``
      - | The comment to append to the read operation. Comments appear in the
@@ -172,7 +168,7 @@ You can configure the following properties when reading a change stream from Mon
        omit the ``fullDocument`` field and publishes only the value of the
        field.
-       If you don't specify a schema, the connector infers the schema
-       from the change stream document.
+       If you don't specify a schema, the connector infers the schema
+       from the change stream document rather than from the underlying collection.

        **Default**: ``false``

@@ -207,91 +203,4 @@ You can configure the following properties when reading a change stream from Mon
 Specifying Properties in ``connection.uri``
 -------------------------------------------

-.. include:: /includes/connection-read-config.rst
-
-.. _spark-specify-multiple-collections:
-
-Specifying Multiple Collections in the ``collection`` Property
---------------------------------------------------------------
-
-You can specify multiple collections in the ``collection`` change stream
-configuration property by separating the collection names
-with a comma. Do not add a space between the collections unless the space is a
-part of the collection name.
-
-Specify multiple collections as shown in the following example:
-
-.. code-block:: java
-
-   ...
-   .option("spark.mongodb.collection", "collectionOne,collectionTwo")
-
-If a collection name is "*", or if the name includes a comma or a backslash (\\),
-you must escape the character as follows:
-
-- If the name of a collection used in your ``collection`` configuration
-  option contains a comma, the {+connector-short+} treats it as two different
-  collections. To avoid this, you must escape the comma by preceding it with
-  a backslash (\\). Escape a collection named "my,collection" as follows:
-
-  .. code-block:: java
-
-     "my\,collection"
-
-- If the name of a collection used in your ``collection`` configuration
-  option is "*", the {+connector-short+} interprets it as a specification
-  to scan all collections. To avoid this, you must escape the asterisk by
-  preceding it with a backslash (\\). Escape a collection named "*" as follows:
-
-  .. code-block:: java
-
-     "\*"
-
-- If the name of a collection used in your ``collection`` configuration
-  option contains a backslash (\\), the
-  {+connector-short+} treats the backslash as an escape character, which
-  might change how it interprets the value. To avoid this, you must escape
-  the backslash by preceding it with another backslash. Escape a collection
-  named "\\collection" as follows:
-
-  .. code-block:: java
-
-     "\\collection"
-
-  .. note::
-
-     When specifying the collection name as a string literal in Java, you must
-     further escape each backslash with another one. For example, escape a
-     collection named "\\collection" as follows:
-
-     .. code-block:: java
-
-        "\\\\collection"
-
-You can stream from all collections in the database by passing an
-asterisk (*) as a string for the collection name.
-
-Specify all collections as shown in the following example:
-
-.. code-block:: java
-
-   ...
-   .option("spark.mongodb.collection", "*")
-
-If you create a collection while streaming from all collections, the new
-collection is automatically included in the stream.
-
-You can drop collections at any time while streaming from multiple collections.
-
-.. important:: Inferring the Schema with Multiple Collections
-
-   If you set the ``change.stream.publish.full.document.only``
-   option to ``true``, the {+connector-short+} infers the schema of a
-   ``DataFrame`` by using the schema of the scanned documents.
-
-   Schema inference happens at the beginning of streaming, and does not take
-   into account collections that are created during streaming.
-
-   When streaming from multiple collections and inferring the schema, the
-   connector samples each collection sequentially. Streaming from a large
-   number of collections can cause the schema inference to have noticeably
-   slower performance. This performance impact occurs only while inferring
-   the schema.
+.. include:: /includes/connection-read-config.rst
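The escaping rules described in the removed section can be illustrated in code. The sketch below is a hypothetical helper, ``parseCollectionSpec``, written only against the rules stated in this diff (an unescaped comma separates names, a backslash makes the next character literal, an unescaped "*" means all collections); it is not the connector's actual parser or API.

```java
import java.util.ArrayList;
import java.util.List;

public class CollectionSpec {

    // Hypothetical helper illustrating the documented escaping rules for
    // the comma-separated ``collection`` value. NOT the connector's
    // actual implementation.
    static List<String> parseCollectionSpec(String spec) {
        List<String> names = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (char c : spec.toCharArray()) {
            if (escaped) {
                // The previous character was a backslash, so this
                // character is taken literally (e.g. an escaped comma
                // or asterisk is part of the collection name).
                current.append(c);
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
            } else if (c == ',') {
                // An unescaped comma separates collection names.
                names.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        names.add(current.toString());
        return names;
    }

    public static void main(String[] args) {
        // Two collections separated by an unescaped comma.
        System.out.println(parseCollectionSpec("collectionOne,collectionTwo"));
        // One collection literally named "my,collection". Note the doubled
        // backslash in the Java string literal, as the note in the diff
        // explains.
        System.out.println(parseCollectionSpec("my\\,collection"));
        // An unescaped "*" passes through and means all collections.
        System.out.println(parseCollectionSpec("*"));
    }
}
```

Running ``main`` shows that ``"my\\,collection"`` parses to a single name containing a comma, while the unescaped forms split or pass through as the documentation describes.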