Description
What problem would the feature you're requesting solve? Please describe.
When transferring parameters to/from the out-of-proc workers (input/output bindings), the data flows over gRPC which does not provide high throughput. Since the Functions host and the worker processes are generally on the same VM, it would make more sense to share parameters between the two over shared memory.
Describe the solution you'd like
- The Azure Functions Host receives a function invocation request. It goes through the binding process and obtains a `Stream` to the input being read.
  - One of the following two options can be taken:
    - The `Stream` is written directly into a `MemoryMappedFile` (lower memory footprint, faster, trickier to implement)
    - The `Stream` is read into a `byte[]`, which is then written into a `MemoryMappedFile` (slower, easier to implement for a first draft)
- A unique name is generated for the `MemoryMappedFile` using one of the following two options:
  - A `GUID` is generated
    - Much more generic for all binding types
  - The `InvokeString` of the object or the `URI` is used
- When sending the `InvocationRequest` message to the worker, the `ParameterBinding` has a new `TypedData` field called `SharedMemoryData` with the following sub-fields:
  - `MemoryMappedFileName`
  - `Offset` (offset within the specified `MemoryMappedFile` to start reading data from)
  - `Count` (number of bytes to read, starting from the offset)
  - `ConversionType` (how to convert the bytes after reading them and before passing them to the function, e.g. convert to a `string` or leave as a `byte[]`)
- Upon receiving the `InvocationRequest`, the worker reads each parameter as it does currently. If it encounters a parameter whose `TypedData` is `SharedMemoryData`, it opens the specified `MemoryMappedFile`, reads `Count` bytes starting at `Offset`, and converts the bytes according to `ConversionType` before passing them to the function.
- The worker will produce an output which will also be transferred over shared memory, so almost the same process that the Functions Host performed for `InvocationRequest` will now be performed by the worker for `InvocationResponse`.
  - The worker will also generate a new `MemoryMappedFile` (called `mmap` in the case of the Python worker) with a unique name, write the output data into it, and then respond to the Functions Host with the same details in a `SharedMemoryData` field.
  - Upon receiving the `InvocationResponse`, the Functions Host will open the `MemoryMappedFile`, read the output data, and write it to the bound output `Stream` so it can be persisted (e.g. to Azure Blob Storage).
Sequence of Events
- Invocation comes to the Functions Host
- Downloads content (i.e. binding process)
- Writes to `MemoryMappedFile`
- Holds a reference open to the `MemoryMappedFile`
- As part of the invocation message, sends the `MemoryMappedFile` name/offset/count
- Python reads from `MemoryMappedFile`
- Function invocation happens
- If any outputs are produced, Python writes to `MemoryMappedFile`
  - Python also holds a reference open to the `MemoryMappedFile` it produces (until the Functions Host is done reading it; otherwise the `MemoryMappedFile` gets cleaned up if all references go away)
- Functions Host gets back the response of the function invocation from the worker
- Frees the `MemoryMappedFile` that the Functions Host was holding (inputs)
- Sends another message (using gRPC) to Python to now also free the `MemoryMappedFile` it produced (outputs). This can be done in the background (in a fire-and-forget manner) so it adds no latency to the function invocation compared to what we have today
- Done
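The ownership rules in this sequence (each side keeps its own map open until the other side has finished reading) can be sketched as follows. This is an illustrative sketch only: `SharedMemoryManager`, `read`, and `run_invocation` are hypothetical names, the maps are assumed to be backed by temp files, and the final gRPC "free" message is replaced by a direct method call.

```python
import mmap
import os
import tempfile
import uuid


class SharedMemoryManager:
    """Hypothetical helper: keeps the maps it created open until freed."""

    def __init__(self):
        self._held = {}  # name -> (open mmap, backing file path)

    def put(self, payload: bytes) -> str:
        if not payload:
            raise ValueError("an empty payload cannot be memory-mapped")
        name = f"shm_{uuid.uuid4().hex}"
        path = os.path.join(tempfile.gettempdir(), name)
        with open(path, "w+b") as f:
            f.truncate(len(payload))
            mm = mmap.mmap(f.fileno(), len(payload))
        mm[:] = payload
        self._held[name] = (mm, path)  # hold a reference open
        return name

    def free(self, name: str) -> bool:
        entry = self._held.pop(name, None)
        if entry is None:
            return False  # unknown or already freed
        mm, path = entry
        mm.close()
        os.remove(path)
        return True

    def is_held(self, name: str) -> bool:
        return name in self._held


def read(name: str, offset: int, count: int) -> bytes:
    path = os.path.join(tempfile.gettempdir(), name)
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        return mm[offset:offset + count]


def run_invocation(host, worker, input_bytes, function):
    in_name = host.put(input_bytes)            # host writes and holds the input
    arg = read(in_name, 0, len(input_bytes))   # worker reads the input
    result = function(arg)                     # function invocation happens
    out_name = worker.put(result)              # worker writes and holds the output
    output = read(out_name, 0, len(result))    # host reads the output
    host.free(in_name)                         # host frees the inputs
    worker.free(out_name)                      # stands in for the gRPC free message
    return output, in_name, out_name
```

In a real host the last `free` would be sent asynchronously, as the issue proposes, so the invocation result can be returned before the worker releases its output map.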