Commit 558b8a7

Rindblack15/gh 2812 update of the transactions section (#2857)
* Delete atomic.rst
* Add files via upload
* Create test
* Add files via upload
* Delete test
* Change formatting according to the style guide
* Update atomic.rst Add an introduction
* Update atomic.rst Update an introduction

Co-authored-by: Patience Daur <[email protected]>
1 parent f5d1569 commit 558b8a7

6 files changed

+292
-286
lines changed


doc/book/box/atomic.rst

Lines changed: 20 additions & 286 deletions
@@ -1,286 +1,20 @@
1-
.. _atomic-atomic_execution:
2-
3-
================================================================================
4-
Transactions
5-
================================================================================
6-
7-
Transactions in Tarantool occur in **fibers** on a single **thread**.
8-
That is why Tarantool has a guarantee of execution atomicity.
9-
That requires emphasis.
10-
11-
Since :tarantool-release:`2.10.0-beta1`, Tarantool supports streams and interactive transactions over them.
12-
See :ref:`Streams <box_stream>`.
13-
14-
.. _atomic-threads_fibers_yields:
15-
16-
--------------------------------------------------------------------------------
17-
Threads, fibers and yields
18-
--------------------------------------------------------------------------------
19-
20-
How does Tarantool process a basic operation? As an example, let's take this
21-
query:
22-
23-
.. code-block:: tarantoolsession
24-
25-
tarantool> box.space.tester:update({3}, {{'=', 2, 'size'}, {'=', 3, 0}})
26-
27-
This is equivalent to the following SQL statement for a table that stores
28-
primary keys in ``field[1]``:
29-
30-
.. code-block:: SQL
31-
32-
UPDATE tester SET "field[2]" = 'size', "field[3]" = 0 WHERE "field[1]" = 3
33-
34-
Assuming this query is received by Tarantool via network,
35-
it will be processed with three operating system **threads**:
36-
37-
1. The **network thread** on the server side receives the query, parses
38-
the statement, checks if it's correct, and then transforms it into a special
39-
structure--a message containing an executable statement and its options.
40-
41-
2. The network thread ships this message to the instance's
42-
**transaction processor thread** using a lock-free message bus.
43-
Lua programs execute directly in the transaction processor thread,
44-
and do not require parsing and preparation.
45-
46-
The instance's transaction processor thread uses the primary-key index on
47-
field[1] to find the location of the tuple. It determines that the tuple
48-
can be updated (not much can go wrong when you're merely changing an
49-
unindexed field value).
50-
51-
3. The transaction processor thread sends a message to the
52-
:ref:`write-ahead logging (WAL) thread <internals-wal>` to commit the
53-
transaction. When done, the WAL thread replies with a COMMIT or ROLLBACK
54-
result to the transaction processor which gives it back to the network thread,
55-
and the network thread returns the result to the client.
56-
57-
Notice that there is only one transaction processor thread in Tarantool.
58-
Some people are used to the idea that there can be multiple threads operating
59-
on the database, with (say) thread #1 reading row #x, while thread #2 writes
60-
row #y. With Tarantool, no such thing ever happens.
61-
Only the transaction processor thread can access the database, and there is
62-
only one transaction processor thread for each Tarantool instance.
63-
64-
Like any other Tarantool thread, the transaction processor thread can handle
65-
many :ref:`fibers <fiber-fibers>`. A fiber is a set of computer instructions
66-
that may contain "**yield**" signals. The transaction processor thread will
67-
execute all computer instructions until a yield, then switch to execute the
68-
instructions of a different fiber. Thus (say) the thread reads row #x for the
69-
sake of fiber #1, then writes row #y for the sake of fiber #2.
70-
71-
Yields must happen, otherwise the transaction processor thread would stick
72-
permanently on the same fiber. There are two types of yields:
73-
74-
* :ref:`implicit yields <atomic-implicit-yields>`: every data-change operation
75-
or network-access causes an implicit yield, and every statement that goes
76-
through the Tarantool client causes an implicit yield.
77-
78-
* explicit yields: in a Lua function, you can (and should) add
79-
:ref:`"yield" <fiber-yield>` statements to prevent hogging. This is called
80-
**cooperative multitasking**.
81-
82-
.. _atomic-cooperative_multitasking:
83-
84-
--------------------------------------------------------------------------------
85-
Cooperative multitasking
86-
--------------------------------------------------------------------------------
87-
88-
Cooperative multitasking means: unless a running fiber deliberately yields
89-
control, it is not preempted by some other fiber. But a running fiber will
90-
deliberately yield when it encounters a “yield point”: a transaction commit,
91-
an operating system call, or an explicit :ref:`"yield" <fiber-yield>` request.
92-
Any system call which can block will be performed asynchronously, and any running
93-
fiber which must wait for a system call will be preempted, so that another
94-
ready-to-run fiber takes its place and becomes the new running fiber.
95-
96-
This model makes all programmatic locks unnecessary: cooperative multitasking
97-
ensures that there will be no concurrency around a resource, no race conditions,
98-
and no memory consistency issues. The way to achieve this is quite simple:
99-
in critical sections, don't use yields, explicit or implicit, and no one
100-
can interfere with the code execution.
101-
102-
When requests are small, for example simple UPDATE or INSERT or DELETE or SELECT,
103-
fiber scheduling is fair: it takes only a little time to process the request,
104-
schedule a disk write, and yield to a fiber serving the next client.
105-
106-
However, a function might perform complex computations or might be written in
107-
such a way that yields do not occur for a long time. This can lead to
108-
unfair scheduling, when a single client throttles the rest of the system, or to
109-
apparent stalls in request processing. Avoiding this situation is
110-
the responsibility of the function’s author.
111-
112-
.. _atomic-transactions:
113-
114-
--------------------------------------------------------------------------------
115-
Transactions
116-
--------------------------------------------------------------------------------
117-
118-
In the absence of transactions, any function that contains yield points may see
119-
changes in the database state caused by fibers that preempt.
120-
Multi-statement transactions exist to provide **isolation**: each transaction
121-
sees a consistent database state and commits all its changes atomically.
122-
At :doc:`commit </reference/reference_lua/box_txn_management/commit>` time,
123-
a yield happens and all transaction changes
124-
are written to the :ref:`write ahead log <internals-wal>` in a single batch.
125-
Or, if needed, transaction changes can be rolled back --
126-
:doc:`completely </reference/reference_lua/box_txn_management/rollback>` or to
127-
a specific
128-
:doc:`savepoint </reference/reference_lua/box_txn_management/rollback_to_savepoint>`.
129-
130-
In Tarantool, `transaction isolation level <https://en.wikipedia.org/wiki/Isolation_(database_systems)#Isolation_levels>`_
131-
is *serializable* with the clause "if no failure during writing to WAL". In
132-
case of such a failure that can happen, for example, if the disk space
133-
is over, the transaction isolation level becomes *read uncommitted*.
134-
135-
In :ref:`vinyl <engines-chapter>`, to implement isolation Tarantool uses a simple optimistic scheduler:
136-
the first transaction to commit wins. If a concurrent active transaction
137-
has read a value modified by a committed transaction, it is aborted.
138-
139-
The cooperative scheduler ensures that, in absence of yields,
140-
a multi-statement transaction is not preempted and hence is never aborted.
141-
Therefore, understanding yields is essential to writing abort-free code.
142-
143-
Sometimes while testing the transaction mechanism in Tarantool you can notice
144-
that yielding after ``box.begin()`` but before any read/write operation does not
145-
cause an abort as it should according to the description. This happens because
146-
actually ``box.begin()`` does not start a transaction. It is a mark telling
147-
Tarantool to start a transaction after some database request that follows.
148-
149-
In memtx, if an instruction that implies yields, explicit or implicit, is
150-
executed during a transaction, the transaction is fully rolled back. In vinyl,
151-
we use a more complex transactional manager that allows yields.
152-
153-
.. note::
154-
155-
You can’t mix storage engines in a transaction today.
156-
157-
.. _atomic-implicit-yields:
158-
159-
--------------------------------------------------------------------------------
160-
Implicit yields
161-
--------------------------------------------------------------------------------
162-
163-
The only explicit yield requests in Tarantool are :ref:`fiber.sleep() <fiber-sleep>`
164-
and :ref:`fiber.yield() <fiber-yield>`, but many other requests "imply" yields
165-
because Tarantool is designed to avoid blocking.
166-
167-
Database requests imply yields if and only if there is disk I/O.
168-
For memtx, since all data is in memory, there is no disk I/O during a read request.
169-
For vinyl, since some data may not be in memory, there may be disk I/O
170-
for a read (to fetch data from disk) or for a write (because a stall
171-
may occur while waiting for memory to be free).
172-
For both memtx and vinyl, since data-change requests must be recorded in the WAL,
173-
there is normally a commit.
174-
A commit happens automatically after every request in default "autocommit" mode,
175-
or a commit happens at the end of a transaction in "transaction" mode,
176-
when a user deliberately commits by calling :doc:`/reference/reference_lua/box_txn_management/commit`.
177-
Therefore for both memtx and vinyl, because there can be disk I/O,
178-
some database operations may imply yields.
179-
180-
Many functions in modules :ref:`fio <fio-section>`, :ref:`net_box <net_box-module>`,
181-
:ref:`console <console-module>` and :ref:`socket <socket-module>`
182-
(the "os" and "network" requests) yield.
183-
184-
That is why executing separate commands such as ``select()``, ``insert()``,
185-
``update()`` in the console inside a transaction will cause an abort. This is
186-
due to implicit yield happening after each chunk of code is executed in the console.
187-
188-
**Example #1**
189-
190-
* *Engine = memtx* |br|
191-
The sequence ``select() insert()`` has one yield, at the end of insertion, caused by
192-
implicit commit; ``select()`` has nothing to write to the WAL and so does not
193-
yield.
194-
195-
* *Engine = vinyl* |br|
196-
The sequence ``select() insert()`` has one to three yields, since ``select()``
197-
may yield if the data is not in cache, ``insert()`` may yield waiting for
198-
available memory, and there is an implicit yield at commit.
199-
200-
* The sequence ``begin() insert() insert() commit()`` yields only at commit
201-
if the engine is memtx, and can yield up to 3 times if the engine is vinyl.
202-
203-
**Example #2**
204-
205-
Assume that in the memtx space ‘tester’ there are tuples in which the third field
206-
represents a positive dollar amount. Let's start a transaction, withdraw
207-
from tuple#1, deposit in tuple#2, and end the transaction, making its
208-
effects permanent.
209-
210-
.. code-block:: tarantoolsession
211-
212-
tarantool> function txn_example(from, to, amount_of_money)
213-
> box.begin()
214-
> box.space.tester:update(from, {{'-', 3, amount_of_money}})
215-
> box.space.tester:update(to, {{'+', 3, amount_of_money}})
216-
> box.commit()
217-
> return "ok"
218-
> end
219-
---
220-
...
221-
tarantool> txn_example({999}, {1000}, 1.00)
222-
---
223-
- "ok"
224-
...
225-
226-
If :ref:`wal_mode <cfg_binary_logging_snapshots-wal_mode>` = ‘none’, then
227-
implicit yielding at commit time does not take place, because there are
228-
no writes to the WAL.
229-
230-
If a task is interactive -- sending requests to the server and receiving responses --
231-
then it involves network I/O, and therefore there is an implicit yield, even if the
232-
request that is sent to the server is not itself an implicit yield request.
233-
Therefore, the following sequence
234-
235-
.. cssclass:: highlight
236-
.. parsed-literal::
237-
238-
conn.space.test:select{1}
239-
conn.space.test:select{2}
240-
conn.space.test:select{3}
241-
242-
causes yields three times sequentially when sending requests to the network
243-
and awaiting the results. On the server side, the same requests are executed
244-
in common order possibly mixing with other requests from the network and
245-
local fibers. Something similar happens when using clients that operate
246-
via telnet, via one of the connectors, or via the
247-
:ref:`MySQL and PostgreSQL rocks <dbms_modules>`, or via the interactive mode when
248-
:ref:`using Tarantool as a client <admin-using_tarantool_as_a_client>`.
249-
250-
After a fiber has yielded and then has regained control, it immediately issues
251-
:ref:`testcancel <fiber-testcancel>`.
252-
253-
.. _atomic-transactional-manager:
254-
255-
--------------------------------------------------------------------------------
256-
Transactional manager
257-
--------------------------------------------------------------------------------
258-
259-
Since version :doc:`2.6.1 </release/2.6.1>`,
260-
Tarantool has another option for transaction behavior that
261-
allows yielding inside a memtx transaction. This is controlled by
262-
the *transactional manager*.
263-
264-
The transactional manager is designed for isolation of concurrent transactions
265-
and provides *serializable* `transaction isolation level <https://en.wikipedia.org/wiki/Isolation_(database_systems)#Isolation_levels>`_.
266-
It consists of two parts:
267-
268-
* *MVCC engine*
269-
* *conflict manager*.
270-
271-
The MVCC engine provides personal read views for transactions if necessary.
272-
The conflict manager tracks transactions' changes and determines their correctness
273-
in serialization order. Of course, once yielded, a transaction could interfere
274-
with other transactions and could be aborted due to conflict.
275-
276-
Another important thing to mention is that the transaction manager
277-
provides a non-classic snapshot isolation level. It means that a transaction
278-
can get a consistent snapshot of the database (that is common) but this snapshot
279-
is not necessarily bound to the moment of the beginning of the transaction
280-
(that is not common).
281-
The conflict manager makes decisions on whether and when each transaction gets
282-
which snapshot. This avoids some conflicts compared with the classical
283-
snapshot isolation approach.
284-
285-
The transactional manager can be switched on and off by the ``box.cfg`` option
286-
:ref:`memtx_use_mvcc_engine <cfg_basic-memtx_use_mvcc_engine>`.
1+
:noindex:
2+
:fullwidth:
3+
4+
.. _atomic-atomic_execution:
5+
6+
Transactions
7+
============
8+
9+
Transactions allow users to perform multiple operations atomically.
10+
11+
For more information on how transactions work in Tarantool, see the following sections:
12+
13+
.. toctree::
14+
:maxdepth: 2
15+
16+
atomic/atomic-threads_fibers_yields
17+
atomic/atomic-cooperative_multitasking
18+
atomic/atomic-transactions
19+
atomic/atomic-implicit-yields
20+
atomic/atomic-transactional-manager
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
.. _atomic-cooperative_multitasking:
2+
3+
Cooperative multitasking
4+
========================
5+
6+
Cooperative multitasking means that unless a running fiber deliberately yields
7+
control, it is not preempted by some other fiber. But a running fiber will
8+
deliberately yield when it encounters a “yield point”: a transaction commit,
9+
an operating system call, or an explicit :ref:`"yield" <fiber-yield>` request.
10+
Any system call which can block will be performed asynchronously, and any running
11+
fiber which must wait for a system call will be preempted, so that another
12+
ready-to-run fiber takes its place and becomes the new running fiber.
13+
14+
This model makes all programmatic locks unnecessary: cooperative multitasking
15+
ensures that there will be no concurrency around a resource, no race conditions,
16+
and no memory consistency issues. The way to achieve this is simple:
17+
use no yields, explicit or implicit, in critical sections, and no one can
18+
interfere with code execution.
19+
20+
When dealing with small requests, such as simple UPDATE or INSERT or DELETE or
21+
SELECT, fiber scheduling is fair: it takes only a little time to process the
22+
request, schedule a disk write, and yield to a fiber serving the next client.
23+
24+
However, a function may perform complex computations or be written in
25+
such a way that yields take a long time to occur. This can lead to
26+
unfair scheduling when a single client throttles the rest of the system, or to
27+
apparent stalls in request processing. Avoiding this situation is
28+
the responsibility of the function’s author.
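
The scheduling behavior described above can be sketched as follows. This is
only an illustration, assuming the code runs inside Tarantool so that the
built-in ``fiber`` module is available. The ``worker`` function and the
``log`` table are hypothetical names introduced for this example. Each fiber
yields explicitly after every step, so neither one can hold the transaction
processor thread for long:

.. code-block:: lua

    -- Assumption: executed by Tarantool, which provides the fiber module.
    local fiber = require('fiber')

    -- Hypothetical shared table that records the order of work steps.
    local log = {}

    -- Hypothetical worker: performs `steps` units of work and yields
    -- after each one, so that other ready fibers get a chance to run.
    local function worker(name, steps)
        for i = 1, steps do
            table.insert(log, name .. ':' .. i)
            fiber.yield()  -- explicit yield point
        end
    end

    -- fiber.new() creates a fiber without yielding; it starts running
    -- only once the current fiber yields (here, inside join()).
    local a = fiber.new(worker, 'a', 2)
    local b = fiber.new(worker, 'b', 2)
    a:set_joinable(true)
    b:set_joinable(true)

    -- join() yields until the corresponding fiber has finished.
    a:join()
    b:join()

If the ``fiber.yield()`` call were removed, each worker would run all of its
steps without interruption. That is the behavior a critical section relies on,
and it is also how a long-running function can stall the other fibers.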
