This repository was archived by the owner on Aug 7, 2025. It is now read-only.

Dynamic batching? #1132

@johann-petrak

Description


TorchServe mentions that it is derived from the Multi Model Server: https://github.com/awslabs/multi-model-server

As far as I remember, MMS allows dynamic batching: the method that processes instances always receives an array of instances.
Depending on the configuration, if the server receives multiple requests within a configurable timespan, these requests are dynamically collected into batches of up to BATCHSIZE, run through the model together, and the results are returned to the callers individually.
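
For reference, the batching behaviour described above can be sketched in plain Python. This is only an illustration of the scheme, not TorchServe's or MMS's actual API; the names (`DynamicBatcher`, `handler`, `batch_size`, `max_batch_delay`) are invented for the example:

```python
import queue
import threading
import time


class DynamicBatcher:
    """Illustrative sketch of dynamic batching (not the TorchServe API).

    A batch is dispatched when either `batch_size` requests have arrived
    or `max_batch_delay` seconds have passed since the first request of
    the current batch.
    """

    def __init__(self, handler, batch_size=8, max_batch_delay=0.1):
        self.handler = handler            # fn: list of inputs -> list of outputs
        self.batch_size = batch_size
        self.max_batch_delay = max_batch_delay
        self._queue = queue.Queue()

    def submit(self, request):
        """Enqueue one request; returns an Event plus a result holder."""
        done = threading.Event()
        slot = {"result": None}
        self._queue.put((request, done, slot))
        return done, slot

    def run_once(self):
        """Collect one batch (size or timeout), run the model once, fan out results."""
        batch = [self._queue.get()]       # block until at least one request arrives
        deadline = time.monotonic() + self.max_batch_delay
        while len(batch) < self.batch_size:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break                     # timespan elapsed: dispatch a partial batch
            try:
                batch.append(self._queue.get(timeout=remaining))
            except queue.Empty:
                break
        outputs = self.handler([req for req, _, _ in batch])  # one model call
        for (_, done, slot), out in zip(batch, outputs):
            slot["result"] = out          # each caller gets its own result back
            done.set()
```

A caller would `submit()` a request from its own thread and wait on the returned event, while a worker loop calls `run_once()`; this is the "collect, run through the model, return individually" shape the question refers to.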

This is a crucial feature for models where running single instances through the model is highly inefficient.

I could not figure out whether, or how, this is already supported by TorchServe, and I could not find anything about it in the documentation either.

Could somebody confirm that this is actually missing from TorchServe, or point me to where it is documented if it is already implemented?

Metadata

Labels: triaged_wait (Waiting for the Reporter's resp)
