RFC: Adding configurable Filesystems #277

samikama · 2020-08-06T19:05:39Z

This RFC proposes adding two new methods to the Filesystem API that give users handles for modifying the tuning parameters of persistent storage access, such as in-memory buffer sizes, the number of open handles, and the number of threads. These parameters have hardcoded defaults that work for most cases but are not necessarily optimal under all circumstances. With this change, users will be able to modify whichever parameters developers choose to expose through the configuration API.

Status	Accepted
RFC #	NNN (update when you have community PR #)
Author(s)	Sami Kama ([email protected])
Sponsor	Mihai Maruseac ([email protected])
Updated	2020-08-04

Objective

The aim of this RFC is to extend the filesystem API so that users can pass configuration parameters to tune the behavior of an implementation to their use cases.

Motivation

There are many FileSystem implementations in TensorFlow that enable interaction with various storage solutions. Most of these implementations have internal parameters that suit the generic use case but are not necessarily optimal for all cases. For example, accessing a remote filesystem through multiple threads can improve throughput when there is a high-bandwidth connection to the remote, so increasing the number of connections might be beneficial. On the other hand, if the connection is slow, a higher number of threads will just waste resources and may even reduce throughput. Depending on the resources available during execution, users should be able to alter some of the Filesystem parameters to improve performance. This can be especially useful when execution is bound by data I/O.

samikama · 2020-08-06T19:06:00Z

Tagging @mihaimaruseac

martinwicke · 2020-08-06T19:08:17Z

rfcs/20200804-configurable-filesystems.md

+}
+```
+
+Since each filesystem will likely to have different set of tunable parameters, a `FilesystemConfig` object can be used to unify the API and allow discovery of existing tunable parameters at runtime. We propose a protobuf object with the following schema


I'd strongly prefer not to add protos to the API. Let's go with a plain struct, which is easier to make ABI safe?

Hi Martin, thanks for the first review. The main use case of the protobuf is to handle serialization and deserialization, to allow structured and simple editing from the user side in both C++ and Python, to potentially expose the documentation, and to get a schema for free. The protobuf will cross the ABI boundary serialized to prototxt, so it will be human-readable plain text. This ensures that there are no ABI issues or problems caused by small differences between the protobuf libraries used in the plugin and in TensorFlow. A plain struct cannot do this easily unless we restrict the data types and arguments and define how the data is serialized and deserialized in an ABI-compatible way. Does this alleviate some of your concerns about protobuf?

How about

    typedef struct key_val {
      char *key;  // null terminated
      int version;
      int type_tag;
      union {
        int64 inv_val;
        double real_val;
        struct {
          char* buf;
          int buf_length;
        } buffer_val;
      } value;
    } key_val;

And then we can send an array of key_val elements.
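For a quick feel of how such a tagged record could be packed and read back, here is a minimal Python sketch that mirrors the struct with `ctypes`. The tag constants (`TAG_INT`, `TAG_REAL`, `TAG_BUFFER`), the helper names, and the option keys are assumptions for illustration; the thread does not define them.

```python
import ctypes

# Hypothetical tag values; the proposal does not assign numbers to type_tag.
TAG_INT, TAG_REAL, TAG_BUFFER = 0, 1, 2


class BufferVal(ctypes.Structure):
    _fields_ = [("buf", ctypes.c_char_p), ("buf_length", ctypes.c_int)]


class Value(ctypes.Union):
    # Field names follow the struct sketched in the comment above.
    _fields_ = [
        ("inv_val", ctypes.c_int64),
        ("real_val", ctypes.c_double),
        ("buffer_val", BufferVal),
    ]


class KeyVal(ctypes.Structure):
    _fields_ = [
        ("key", ctypes.c_char_p),
        ("version", ctypes.c_int),
        ("type_tag", ctypes.c_int),
        ("value", Value),
    ]


def make_option(key, value):
    """Pack a Python int, float or bytes value into a tagged KeyVal."""
    kv = KeyVal(key=key, version=1)
    if isinstance(value, int):
        kv.type_tag = TAG_INT
        kv.value.inv_val = value
    elif isinstance(value, float):
        kv.type_tag = TAG_REAL
        kv.value.real_val = value
    else:
        kv.type_tag = TAG_BUFFER
        kv.value.buffer_val.buf = value
        kv.value.buffer_val.buf_length = len(value)
    return kv


def read_option(kv):
    """Use type_tag to decide which union member is valid."""
    if kv.type_tag == TAG_INT:
        return kv.value.inv_val
    if kv.type_tag == TAG_REAL:
        return kv.value.real_val
    return kv.value.buffer_val.buf[: kv.value.buffer_val.buf_length]


# An array of key_val elements, as suggested above.
options = (KeyVal * 3)(
    make_option(b"num_threads", 8),
    make_option(b"timeout_seconds", 2.5),
    make_option(b"cache_policy", b"LRU"),
)
decoded = {kv.key.decode(): read_option(kv) for kv in options}
```

The reader must consult `type_tag` before touching the union, which is the main cost of this layout compared with typed accessor functions.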

It seems the option values can be integers, reals, or buffers (always null terminated?). In that case, we can also have this API:

    int64 get_int_option(const char* key);
    double get_double_option(const char* key);
    const char* get_char_option(const char* key);
    void set_int_option(const char* key, int64 val);
    void set_double_option(const char* key, double val);
    void set_char_option(const char* key, const char* val);
    int get_plugin_options(char** keys);

(modulo status codes). The second API proposal has the downside that the user first needs to ask the plugin which options are available and then iterate over the returned array to set them. However, it is much more extensible (both API and ABI compatible by default, as already handled by the filesystem ABI/API compatibility layer). In contrast, the struct-union variant from the beginning of this comment allows getting all the config options at once but requires adding a new version parameter.

Actually, the second API allows setting options per path by adding a path argument to the methods.

Bools can be passed in via int options; lists can be passed via multiple calls or by adding 3 more API entries.
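A minimal Python sketch of how the typed get/set entry points could behave on the plugin side, assuming a plugin keeps its options in a simple table. The `OptionStore` class, the option names, and their defaults are hypothetical; only the accessor naming follows the proposal above, and status codes are replaced by Python exceptions.

```python
class OptionStore:
    """Toy stand-in for a plugin's option table (hypothetical, not TF API)."""

    def __init__(self, defaults):
        self._values = dict(defaults)

    def get_plugin_options(self):
        # Discovery step: the caller first asks which keys exist.
        return sorted(self._values)

    def _get(self, key, expected_type):
        value = self._values[key]  # raises KeyError for unknown options
        if not isinstance(value, expected_type):
            raise TypeError(f"{key!r} is not a {expected_type.__name__} option")
        return value

    def _set(self, key, value, expected_type):
        self._get(key, expected_type)  # validates the key and its stored type
        self._values[key] = expected_type(value)

    def get_int_option(self, key):
        return self._get(key, int)

    def get_double_option(self, key):
        return self._get(key, float)

    def get_char_option(self, key):
        return self._get(key, str)

    def set_int_option(self, key, val):
        self._set(key, val, int)

    def set_double_option(self, key, val):
        self._set(key, val, float)

    def set_char_option(self, key, val):
        self._set(key, val, str)


store = OptionStore(
    {"num_threads": 4, "timeout_seconds": 30.0, "cache_policy": "LRU", "use_cache": 1}
)
store.set_int_option("num_threads", 16)
store.set_int_option("use_cache", int(False))  # bools travel as int options
```

Calling a getter of the wrong type fails here, which hints at the gap in the C version: the key name alone does not tell the caller which accessor to use.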

We can have the function API proposed above between plugins and core TF (the C layer). Then, inside TF we can use a class/struct to hold them together and present the info to user in a class wrapper too (Python).

The C++/Python classes don't need to be backwards compatible from the point of view of the plugin, so we can change them as needed. But for the plugin interface, we need compatibility (both API and ABI).

The only people seeing the functional API above are plugin implementers and people who contribute to the filesystem layer in core TF.

Using protos forces users to use protobuf, which has been a source of issues (it's not exactly a light dependency). Passing only serialized protos removes the ABI concerns, but using proto for configuration isn't something we love to do (maybe surprisingly, since it's all over our APIs, but that's just why we decided it might not be the best idea).

I agree that a set/get API is a bit of an overhead, but a struct would be fairly straightforward, I believe?

@mihaimaruseac Yes I agree. Though your get_* api needs extension since having the name of the key doesn't tell you its type. With the struct approach, memory management might require some work but it is doable. I just believe some existing infrastructure like protobuf or alternatively json/yaml/xml serialized format would simplify things and reduce amount of code needed when developing plugins.

@martinwicke I just believe that using a library and human-readable serialized data would be the best approach. It doesn't have to be protobuf. The only reason protobuf was suggested is that it is already used in TF. I am perfectly happy with other standard formats such as json/yaml/xml. The only advantages of protobuf over these are binary data and type information, which could easily be implemented in other formats as well.

I agree we can also use a struct. But we need to address:

- Memory ownership. Since there needs to be a query mechanism for existing values, we will get the structs from plugins and need to manage the memory. The existing plugin API has methods for this, although I believe they are not documented in the plugin RFC. We can use these methods to manage memory and to define lifetimes and ownership clearly.
- Multiple entries. We need to support passing a vector/list of values. It is possible to encode maps and other structures with lists.
- The documentation. We all know that developers hardly write documentation. Having a key and its value type is usually not sufficient to figure out what it means. We need a description field in the struct that documents the key.

It would have been nice if everybody didn't have to implement Struct generation/parsing code from scratch but I can't see a way around it right now.

We can use the following in the plugin interface

    typedef struct TF_Filesystem_Option_Value {
      int type_tag;
      int num_values;
      union {
        int64 inv_val;
        double real_val;
        struct {
          char* buf;
          int buf_length;
        } buffer_val;
      } *values;  // owned
    } TF_Filesystem_Option_Value;

    typedef struct TF_Filesystem_Option {
      char* name;         // null terminated, owned
      char* description;  // null terminated, owned
      int per_file;       // bool actually, but bool is not a C type
      TF_Filesystem_Option_Value *value;  // owned
    } TF_Filesystem_Option;

This handles both documentation (provided by the plugin) and list values (I'm not convinced we need maps here, but it would be possible to extend the union to add them). Plus, since the above is in the filesystem interface, it is used by core TF, and we can provide functions in the filesystem interface to manage these structs. This way, plugins only need to know the layout of the structs and read/write to them directly.

Memory allocation will have to come from the plugin's side, using the same routines plugins currently use for the other filesystem operations. It's the only way this can work on Windows, so we cannot choose anything else.

We probably need a new ABI number but adding it to the filesystem registration function would be an ABI breakage. We can handle this by increasing existing ABI numbers from 0 to 1 or we can use the options ABI number only in the function that core TF uses to get filesystem's default options.

Plugins will need to fill these structs in with all the options they support, change them (if plugin allows changing them), and read from them when implementing operations that depend on these options. Core TF only needs to propagate these options up to the C++/Python layers and display them to users.

I would expect that a user trying a new filesystem plugin will do something like

    options = tf.io.get_filesystem_options_for_scheme("new_uri_scheme://")
    tf.io.display_filesystem_options(options)  # will print current options and the help text too
    tf.io.set_option(options, key, value)
    ...
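The workflow above can be mocked without TensorFlow to show what self-describing options buy the user. This is only a sketch: `Option`, `format_options`, `set_option`, and the sample `num_threads` and `cache_policy` entries are hypothetical and are not part of `tf.io`.

```python
from dataclasses import dataclass
from typing import List, Union

Scalar = Union[int, float, bytes]


@dataclass
class Option:
    """Python mirror of the proposed option record (hypothetical)."""

    name: str
    description: str
    per_file: bool
    values: List[Scalar]  # a single-element list models a scalar option


def format_options(options):
    """Render current values together with the plugin-provided help text."""
    lines = []
    for opt in options:
        scope = "per-file" if opt.per_file else "global"
        lines.append(f"{opt.name} = {opt.values} [{scope}]: {opt.description}")
    return "\n".join(lines)


def set_option(options, name, values):
    """Replace the values of a known option, keeping the element type fixed."""
    for opt in options:
        if opt.name == name:
            expected = type(opt.values[0])
            if not all(isinstance(v, expected) for v in values):
                raise TypeError(f"{name} expects {expected.__name__} values")
            opt.values = list(values)
            return
    raise KeyError(name)


options = [
    Option("num_threads", "Threads used for parallel reads.", False, [4]),
    Option("cache_policy", "Block cache eviction policy.", True, [b"LRU"]),
]
set_option(options, "num_threads", [16])
print(format_options(options))
```

Because each record carries its own description and `per_file` flag, a generic display function works for any plugin without knowing its options in advance.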

We have a lot of pointer chasing in this structure but since this is IO I doubt it will result in significant performance degradation.

martinwicke · 2020-08-06T19:11:01Z

rfcs/20200804-configurable-filesystems.md

+
+### Alternatives Considered
+
+Alternative to this proposal is to use a side-channel such as an environment variable to modify the internal parameters. However this is cumbersome, error prone and may not be possible to use at all under certain circumstances.


There's another alternative I can imagine: some of these parameters might be useful to set per file (e.g., caching policies). Should this be possible? Or is a global per filesystem switch enough? What are the use cases?

We leave this to the filesystem implementations. This proposal is about exposing such information. If a filesystem can support per-file configuration, it can expose it, and the user can then make use of it. For example, for networked filesystems I was thinking of exposing file-size and thread-count thresholds, so that the number of parallel download requests below the file-size threshold differs from the number above it. A specific example could be a single thread if <1MB and 10 threads if >1GB. How does this sound?
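The threshold idea can be sketched in a few lines of Python. The function name, the parameter names, and the thread count used between the two thresholds are assumptions; the comment only fixes the 1MB and 1GB examples.

```python
MB = 1 << 20
GB = 1 << 30


def pick_num_threads(
    file_size,
    small_limit=MB,
    large_limit=GB,
    small_threads=1,
    large_threads=10,
    default_threads=4,  # assumed; the thread does not say what happens in between
):
    """Choose a download thread count from the file size in bytes."""
    if file_size < small_limit:
        return small_threads
    if file_size > large_limit:
        return large_threads
    return default_threads
```

With the proposed API, a filesystem could expose the two limits and the thread counts as tunable options instead of hardcoding them.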

I'm thinking of a user-facing API of something like:

f1 = tf.io.gfile.Open("crazyfs://rw_file", {'cache_policy': 'LRU', 'cache_size': 4096})
f2 = tf.io.gfile.Open("crazyfs://append_only_file", {'cache_policy': 'off'})

How would you implement something like this using only the end-points proposed here?

It is possible, but uniform behavior would be tough unless all filesystems agree on some convention. Assuming none of these is an issue, one way to implement it, in pseudocode, would be:

    def Open(fname, mode, conf_dict):
        FS = env.GetFileSystemForFile(fname)
        fsconfig = FS.GetConfiguration()
        newconfig = FileConfiguration()
        newconfig.CopyFrom(fsconfig)
        # Override only the options this filesystem actually exposes.
        for o, v in conf_dict.items():
            if o in newconfig.options:
                newconfig.options[o] = v
        FS.SetConfiguration(newconfig)
        if mode == read_write:
            return FS.NewWritableFile(fname)
        elif mode == appendable_file:
            return FS.NewAppendOnlyFile(fname)

Didn't check the actual method names and signatures but this could be a way to implement per file arguments if Filesystem implementation supports it.

terrytangyuan · 2020-08-06T19:11:24Z

rfcs/20200804-configurable-filesystems.md

+
+## Motivation
+
+There are many FileSystem implementations in Tensorflow that enable interaction with various storage solutions. Most of these implementations have internal parameters that are suitable for generic use case but not necessarily optimal for all cases. For example accessing remote filesystems through multiple threads can improve the throughput if there is a high bandwidth connection to the remote thus increasing number of connections might be beneficial. On the other hand if the connection is slow, a higher number of threads will just waste resources and may even reduce the throughput. Depending on the resources available during the execution, users should be able to alter some of the parameters of the Filesystems to improve the performance of their execution. This can be especially useful for the the cases where the execution is data i/o bound.


What's the relation between this and SIG IO's filesystem extensions? @mihaimaruseac @samikama @tensorflow/sig-io-maintainers

@terrytangyuan Are you talking about plugin based filesystem or something else? Could you please be more specific?

I mean the existing file system extensions in TF IO kernels, e.g. gstpu, oss, azure, etc. For example, we can use Azure like the following (full tutorial):

    pathname = 'az://{}/aztest'.format(account_name)
    tf.io.gfile.mkdir(pathname)

Hi @terrytangyuan,

Sorry if the document was not clear. This proposal does not change existing behavior. It extends the existing API so that you can do things like what @martinwicke mentioned above.

Got it. Perhaps mentioning the relationship with TF IO would avoid some confusion. Thanks.

martinwicke · 2020-08-06T20:10:42Z

That implicitly assumes that all these options only affect new files, and do not represent global state which would change the behaviors of already open files. I guess that does answer my question: if these config options only affect newly opened files, this pattern is just fine. I suppose I'd like it if that was obvious (e.g. by having two configs, one for true globals, and one for flags to the subsequent opens), but I can see that this API can fulfill both.
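The distinction can be made concrete with a small in-memory model. In this sketch, which uses hypothetical `FakeFilesystem` and `FakeFile` classes rather than any TensorFlow type, each open takes a snapshot of the filesystem options merged with per-open flags, so later changes to the filesystem options do not reach files that are already open.

```python
class FakeFile:
    def __init__(self, name, config):
        self.name = name
        self.config = config  # private snapshot taken at open time


class FakeFilesystem:
    def __init__(self, **defaults):
        self.config = dict(defaults)

    def _check(self, options):
        unknown = set(options) - set(self.config)
        if unknown:
            raise KeyError(f"unsupported options: {sorted(unknown)}")

    def set_configuration(self, **updates):
        """Change filesystem-wide defaults used by subsequent opens."""
        self._check(updates)
        self.config.update(updates)

    def open(self, name, **per_open_flags):
        """Open a file with the current defaults plus per-open overrides."""
        self._check(per_open_flags)
        merged = {**self.config, **per_open_flags}
        return FakeFile(name, merged)


fs = FakeFilesystem(cache_policy="LRU", cache_size=4096)
f1 = fs.open("crazyfs://rw_file")
fs.set_configuration(cache_policy="off")  # affects only later opens
f2 = fs.open("crazyfs://append_only_file")
f3 = fs.open("crazyfs://other_file", cache_size=1024)  # per-open flag
```

Here `f1` keeps `LRU` even after the default changes, and the per-open flag on `f3` never leaks back into the filesystem defaults.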

mihaimaruseac · 2020-09-15T22:51:52Z

Notes from the design meeting:

There are 5 options to pass options:

1. as a struct
2. as key-value
3. mix between 1 and 2
4. using a serialization format (proto, xml, etc.)
5. use a <fs>-meta:// meta filesystem registered in parallel
   - This will make it impossible to write filesystem agnostic configuration code
   - This will require some changes to the modular filesystem API/ABI interface

The meta filesystem is very unconstrained, though very powerful. A meta filesystem is like the /proc filesystem in Linux. The first 4 options require language implementers to write bindings to more APIs. The first 4 options allow documenting the most common options.
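To illustrate the /proc analogy, here is a toy sketch in which options are read and written as if they were small files under a `<fs>-meta://` scheme. The path layout (`options/<name>`), the `MetaFilesystem` class, and the `crazyfs` option names are all assumptions; the notes do not specify how the meta filesystem would be organized.

```python
class MetaFilesystem:
    """Toy /proc-like view over a plugin's options (hypothetical layout)."""

    def __init__(self, scheme, options):
        self._prefix = f"{scheme}-meta://options/"
        self._options = dict(options)

    def _name(self, path):
        if not path.startswith(self._prefix):
            raise FileNotFoundError(path)
        name = path[len(self._prefix):]
        if name not in self._options:
            raise FileNotFoundError(path)
        return name

    def list(self):
        return [self._prefix + name for name in sorted(self._options)]

    def read(self, path):
        # Every value is exposed as text, like a file under /proc.
        return str(self._options[self._name(path)])

    def write(self, path, text):
        name = self._name(path)
        # Parse the text back into the type of the existing value.
        self._options[name] = type(self._options[name])(text)


meta = MetaFilesystem("crazyfs", {"cache_size": 4096, "cache_policy": "LRU"})
meta.write("crazyfs-meta://options/cache_size", "8192")
current = meta.read("crazyfs-meta://options/cache_size")
```

Since everything is plain file reads and writes, no extra language bindings are needed, but nothing obliges two plugins to agree on option names or paths, which is the filesystem-agnostic concern raised in the notes.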

Adam (@ajbouh) will write a short RFC to document the meta filesystem approach, since that has a lot of value. But we will go with option 3 for this RFC.

Another idea: this API could be used for performance counters too, so maybe we should update the API names. ControlPlane{Key,Value} might be a better name. Or, for this, we could use the meta filesystem approach, since it is more suitable. We will leave profiling API to a different RFC, as TBD

samikama · 2020-12-09T01:02:48Z

Sorry for not getting back to this earlier but is there anything else for this to be approved/merged? @ewilderj @ematejska @theadactyl

mihaimaruseac · 2020-12-09T17:37:53Z

It should be good to merge.

theadactyl

LGTM!

samikama · 2020-12-09T20:12:26Z

Thanks. I will start opening PRs shortly

Adding configurable Filesystems RFC

6f7d10b

samikama requested review from ematejska, ewilderj, martinwicke and theadactyl as code owners August 6, 2020 19:05

googlebot added the cla: yes label Aug 6, 2020

martinwicke reviewed Aug 6, 2020

terrytangyuan reviewed Aug 6, 2020

ematejska changed the title ~~RFC - Adding configurable Filesystems~~ RFC: Adding configurable Filesystems Aug 21, 2020

Add discussion updates.

a803e32

ematejska added the RFC: Proposed RFC Design Document label Aug 26, 2020

mihaimaruseac mentioned this pull request Sep 10, 2020

Implement __reduce__ on tf.keras.Model to copy.deepcopy and pickle tensorflow/tensorflow#39609

Closed

mihaimaruseac approved these changes Sep 16, 2020

theadactyl approved these changes Dec 9, 2020

theadactyl merged commit a799e8d into tensorflow:master Dec 9, 2020

theadactyl added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Dec 9, 2020

samikama deleted the filesystem_config branch December 9, 2020 20:12

samikama mentioned this pull request Dec 10, 2020

Adding Configurable Filesystem API part 1 tensorflow/tensorflow#45563

Merged

jsimsa mentioned this pull request Apr 26, 2021

Add preliminary methods to env class tensorflow/tensorflow#42642

Merged

copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 27, 2021

Reverting tensorflow/community#245 in favor of tensorflow/community#277.

ab4922b

PiperOrigin-RevId: 370587801 Change-Id: I04f1b2e30be7ef010a4445aaf657ed0de70e3af6

copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 27, 2021

Reverting tensorflow/community#245 in favor of tensorflow/community#277.

fc2adf1

PiperOrigin-RevId: 370681107 Change-Id: I06d48c974357b494f875b6f1247dd4032119e9f9

