RFC: Adding configurable Filesystems #277
Conversation
Tagging @mihaimaruseac
Since each filesystem will likely have a different set of tunable parameters, a `FilesystemConfig` object can be used to unify the API and allow discovery of the available tunable parameters at runtime. We propose a protobuf object with the following schema.
I'd strongly prefer not to add protos to the API. Let's go with a plain struct, which is easier to make ABI safe?
Hi Martin, thanks for the first review. The main use case of the protobuf is to handle serialization/deserialization, allow structured and simple editing from the user side in both the C++ and Python domains, potentially expose the documentation, and keep the schema free to evolve. The ABI boundary will be crossed by the protobuf serialized to prototxt, so it will be human-readable plain text. This ensures there are no ABI issues or problems due to small differences between the protobuf library used in the plugin and the one used in TensorFlow. A plain struct cannot do this easily unless we restrict the data types and arguments and define how the data is serialized and deserialized in an ABI-compatible way. Does this alleviate some of your concerns about protobuf?
How about

```c
typedef struct key_val {
  char *key;  // null terminated
  int version;
  int type_tag;
  union {
    int64 int_val;
    double real_val;
    struct {
      char *buf;
      int buf_length;
    } buffer_val;
  } value;
} key_val;
```

And then we can send an array of `key_val` elements.
It seems the option values can be either integers, reals, or some buffers (always null terminated?). In that case, we can also have this API:

```c
int64 get_int_option(const char* key);
double get_double_option(const char* key);
const char* get_char_option(const char* key);
void set_int_option(const char* key, int64 val);
void set_double_option(const char* key, double val);
void set_char_option(const char* key, const char* val);
int get_plugin_options(char** keys);
```

(modulo status codes). The second API proposal has the downside that the user needs to first ask the plugin what options are available and then iterate over the returned array and set them. However, it is much more extensible (both API- and ABI-compatible by default -- already handled by the filesystem ABI/API compatibility layer), whereas the struct-union variant from the beginning of this comment allows getting all the config options at once but requires adding a new version parameter.
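The discover-then-set flow implied by the second proposal can be sketched in Python against a toy plugin. All names below (`ToyPlugin`, `get_plugin_options`, `get_option`, `set_option`, the option keys) are hypothetical stand-ins that only mirror the shape of the proposed C functions, not part of any TensorFlow API:

```python
# Toy stand-in for a filesystem plugin's option store.
class ToyPlugin:
    def __init__(self):
        self._options = {"num_threads": 8, "cache_size": 1 << 20}

    def get_plugin_options(self):
        # Mirrors int get_plugin_options(char** keys): report available keys.
        return sorted(self._options)

    def get_option(self, key):
        return self._options[key]

    def set_option(self, key, value):
        # Unknown keys are rejected, as a status code would signal in C.
        if key not in self._options:
            raise KeyError(key)
        self._options[key] = value

plugin = ToyPlugin()
desired = {"num_threads": 2, "unsupported_knob": True}
# The caller first asks which options exist, then sets only those it knows.
for key in plugin.get_plugin_options():
    if key in desired:
        plugin.set_option(key, desired[key])
```

The extra round trip (query keys, then set) is the overhead discussed above; in exchange, unknown options never cross the ABI boundary.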
Actually, the second API allows setting options per path by adding a path argument to the methods
Bools can be passed in via int options, lists can be passed via multiple calls or by adding 3 more API entries
We can have the function API proposed above between plugins and core TF (the C layer). Then, inside TF, we can use a class/struct to hold them together and present the info to the user in a class wrapper too (Python).
The C++/Python classes don't need to be backwards compatible from the point of view of the plugin, so we can change them as needed. But for the plugin interface, we need compatibility (both API and ABI).
The only people seeing the functional API above are plugin implementers and people who contribute to the filesystem layer in core TF.
Using protos forces users to use protobuf, which has been a source of issues (it's not exactly a light dependency). Passing only serialized protos removes the ABI concerns, but using protos for configuration isn't something we love to do (maybe surprisingly, since it's all over our APIs, but that's exactly why we decided it might not be the best idea).
I agree that a set/get API is a bit of an overhead, but a struct would be fairly straightforward, I believe?
@mihaimaruseac Yes, I agree. Though your get_* API needs an extension, since having the name of a key doesn't tell you its type. With the struct approach, memory management might require some work, but it is doable. I just believe some existing infrastructure like protobuf, or alternatively a JSON/YAML/XML serialized format, would simplify things and reduce the amount of code needed when developing plugins.
@martinwicke I just believe using a library and human-readable serialized data would be best. It doesn't have to be protobuf; the only reason protobuf was suggested is that it is already used in TF. I am perfectly happy with other standard formats such as JSON/YAML/XML. The only advantage of protobuf over these is support for binary data and type information, which could easily be implemented in the other formats as well.
I agree we can also use a struct, but we need to address:

- Memory ownership. Since there needs to be a query mechanism for existing values, we will get the structs from plugins and need to manage the memory. There are methods in the existing plugin API (although they are not documented in the plugin RFC, I believe) that we can use to manage memory and define lifetimes and ownership clearly.
- Multiple entries. We need to support passing a vector/list of values. It is possible to encode maps and other structures with lists.
- Documentation. We all know that developers hardly write documentation. Having a key and its value type is usually not sufficient to figure out what it means. We need a description field in the struct that documents the key.
- It would have been nice if everybody didn't have to implement struct generation/parsing code from scratch, but I can't see a way around it right now.
We can use the following in the plugin interface:

```c
typedef struct TF_Filesystem_Option_Value {
  int type_tag;
  int num_values;
  union {
    int64 int_val;
    double real_val;
    struct {
      char* buf;
      int buf_length;
    } buffer_val;
  } *values;  // owned
} TF_Filesystem_Option_Value;

typedef struct TF_Filesystem_Option {
  char* name;         // null terminated, owned
  char* description;  // null terminated, owned
  int per_file;       // bool actually, but bool is not a C type
  TF_Filesystem_Option_Value *value;  // owned
} TF_Filesystem_Option;
```

This handles both documentation (provided by the plugin) and list values (I'm not convinced we need maps here, but it would be possible to extend the union to add them). Plus, since the above is in the filesystem interface, it is used by core TF, and we can provide functions in the filesystem interface to manage these structs. This way, plugins only need to know the layout of the structs and read/write them directly.
Memory allocation will have to come from the plugin's side, using the same routines plugins currently use for the other filesystem operations. It's the only way this can work on Windows, so we cannot choose anything else.
We probably need a new ABI number, but adding it to the filesystem registration function would be an ABI breakage. We can handle this by increasing the existing ABI numbers from 0 to 1, or we can use the options ABI number only in the function that core TF uses to get the filesystem's default options.
Plugins will need to fill in these structs with all the options they support, change them (if the plugin allows changing them), and read from them when implementing operations that depend on these options. Core TF only needs to propagate these options up to the C++/Python layers and display them to users.
I would expect that a user trying a new filesystem plugin will do something like:

```python
options = tf.io.get_filesystem_options_for_scheme("new_uri_scheme://")
tf.io.display_filesystem_options(options)  # will print current options and the help text too
tf.io.set_option(options, key, value)
...
```
We have a lot of pointer chasing in this structure, but since this is I/O, I doubt it will result in significant performance degradation.
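For intuition, the struct layout above can be mirrored in Python. This is purely illustrative; the real interface would be the C structs, the tag constants are assumptions, and the option name below is made up:

```python
from dataclasses import dataclass
from typing import List, Union

# Tags mirroring the proposed type_tag field; the numeric values are assumed.
TAG_INT, TAG_REAL, TAG_BUFFER = 0, 1, 2

@dataclass
class FilesystemOptionValue:
    # Mirrors TF_Filesystem_Option_Value: a tag plus a list of values.
    type_tag: int
    values: List[Union[int, float, bytes]]

@dataclass
class FilesystemOption:
    # Mirrors TF_Filesystem_Option: self-describing name, help text,
    # and a flag saying whether the option can be set per file.
    name: str
    description: str
    per_file: bool
    value: FilesystemOptionValue

# A plugin would fill one of these per supported option, e.g.:
opt = FilesystemOption(
    name="num_download_threads",  # hypothetical option name
    description="Number of threads used for parallel reads.",
    per_file=False,
    value=FilesystemOptionValue(type_tag=TAG_INT, values=[8]),
)
```

The `description` field is what makes the options self-documenting when displayed to users, and the list-valued `values` covers the "multiple entries" requirement discussed earlier.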
### Alternatives Considered
An alternative to this proposal is to use a side channel such as an environment variable to modify the internal parameters. However, this is cumbersome, error prone, and may not be possible to use at all under certain circumstances.
There's another alternative I can imagine: some of these parameters might be useful to set per file (e.g., caching policies). Should this be possible? Or is a global per filesystem switch enough? What are the use cases?
We leave this to the filesystem implementations. This proposal is about exposing such information. If a filesystem can support per-file configuration, it can expose it and then the user can make use of it. For example, for networked filesystems I was thinking of exposing file size and thread count thresholds, such that below the file size threshold the number of parallel download requests is different than above the threshold. A specific example could be a single thread if <1MB and 10 threads if >1GB. How does this sound?
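The threshold idea in that example could be sketched as a plugin-internal policy. The function name, thresholds, and the linear ramp between them are all illustrative choices, not a proposed API:

```python
# Pick a download thread count from the file size, using two configurable
# thresholds as in the example above: 1 thread below ~1 MB, 10 threads
# above ~1 GB, and a simple linear ramp in between (one possible policy).
def threads_for_file(size_bytes, small=1 << 20, large=1 << 30):
    if size_bytes < small:
        return 1
    if size_bytes > large:
        return 10
    frac = (size_bytes - small) / (large - small)
    return 1 + round(9 * frac)
```

Exposing `small` and `large` as filesystem options is exactly the kind of tunable this RFC is about: the defaults work generically, but a user on a fast link could lower the thresholds.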
I'm thinking of a user-facing API of something like:

```python
f1 = tf.io.gfile.Open("crazyfs://rw_file", {'cache_policy': 'LRU', 'cache_size': 4096})
f2 = tf.io.gfile.Open("crazyfs://append_only_file", {'cache_policy': 'off'})
```

How would you implement something like this using only the endpoints proposed here?
It is possible, but having uniform behavior would be tough unless all filesystems agree on some convention. Assuming all of these are non-issues, one way to implement it, in pseudocode, would be:

```python
def Open(fname, mode, conf_dict):
    FS = env.GetFileSystemForFile(fname)
    fsconfig = FS.GetConfiguration()
    newconfig = FileConfiguration()
    newconfig.CopyFrom(fsconfig)
    for o, v in conf_dict.items():
        if o in newconfig.options:
            newconfig.options[o] = v
    FS.SetConfiguration(newconfig)
    if mode == read_write:
        return FS.NewWritableFile(fname)
    elif mode == appendable_file:
        return FS.NewAppendOnlyFile(fname)
```

I didn't check the actual method names and signatures, but this could be a way to implement per-file arguments if the filesystem implementation supports it.
## Motivation
There are many FileSystem implementations in TensorFlow that enable interaction with various storage solutions. Most of these implementations have internal parameters that are suitable for the generic use case but not necessarily optimal for all cases. For example, accessing remote filesystems through multiple threads can improve throughput if there is a high-bandwidth connection to the remote, so increasing the number of connections might be beneficial. On the other hand, if the connection is slow, a higher number of threads will just waste resources and may even reduce throughput. Depending on the resources available during execution, users should be able to alter some of the parameters of the filesystems to improve the performance of their execution. This can be especially useful for cases where the execution is data-I/O bound.
What's the relation between this and SIG IO's filesystem extensions? @mihaimaruseac @samikama @tensorflow/sig-io-maintainers
@terrytangyuan Are you talking about the plugin-based filesystems or something else? Could you please be more specific?
I mean the existing file system extensions in TF IO kernels, e.g. gstpu, oss, azure, etc. For example, we can use Azure like the following (full tutorial):

```python
pathname = 'az://{}/aztest'.format(account_name)
tf.io.gfile.mkdir(pathname)
```
Hi @terrytangyuan,
Sorry if the document was not clear. This proposal does not change existing behavior. It extends the existing API so that you can do things like @martinwicke mentioned above.
Got it. Perhaps mentioning the relationship with TF IO would avoid some confusion. Thanks.
That implicitly assumes that all these options only affect new files and do not represent global state that would change the behavior of already-open files. I guess that does answer my question: if these config options only affect newly opened files, this pattern is just fine.
I suppose I'd like it if that was obvious (e.g., by having two configs, one for true globals, and one for flags to the subsequent opens), but I can see that this API can fulfill both.
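The "two configs" idea can be sketched like this, keeping truly global state separate from flags that only apply to subsequent opens. All names here are hypothetical, and the dicts stand in for whatever option representation is adopted:

```python
# Hypothetical split between filesystem-global settings (which may affect
# already-open files) and per-open defaults (which only affect new opens).
global_config = {"connection_pool_size": 16}
per_open_defaults = {"cache_policy": "LRU", "cache_size": 4096}

def open_file(path, overrides=None):
    # Each open snapshots the per-open defaults, so later changes to
    # per_open_defaults never affect files that are already open.
    opts = dict(per_open_defaults)
    opts.update(overrides or {})
    return {"path": path, "options": opts}  # stand-in for a file handle

f1 = open_file("crazyfs://rw_file", {"cache_policy": "off"})
per_open_defaults["cache_size"] = 8192  # does not change f1
```

The snapshot-on-open behavior is what makes it obvious which options can retroactively change an open file (only those in `global_config`).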
Notes from the design meeting: there are 4 options to pass options:
The meta filesystem approach is very unconstrained, though very powerful. Adam (@ajbouh) will write a short RFC to document the meta filesystem approach, since that has a lot of value. But we will go with option 3 for this RFC. Another idea: this API could be used for performance counters too, so maybe we should update the API names.
Sorry for not getting back to this earlier, but is there anything else needed for this to be approved/merged? @ewilderj @ematejska @theadactyl
It should be good to merge.
LGTM!
Thanks. I will start opening PRs shortly.
PiperOrigin-RevId: 370587801 Change-Id: I04f1b2e30be7ef010a4445aaf657ed0de70e3af6
PiperOrigin-RevId: 370681107 Change-Id: I06d48c974357b494f875b6f1247dd4032119e9f9
This RFC proposes adding two new methods to the Filesystem API to expose handles that let users modify tuning parameters of persistent storage access, such as in-memory buffer sizes, open handles, and number of threads. These parameters have hardcoded defaults that work for most cases but are not necessarily optimal under all circumstances. With this change, users will be able to modify such parameters that developers choose to expose through the configuration API.
Objective
The aim of this RFC is to extend the filesystem API to enable users to pass configuration parameters that tune the behavior of the implementation to their use cases.
Motivation
There are many FileSystem implementations in TensorFlow that enable interaction with various storage solutions. Most of these implementations have internal parameters that are suitable for the generic use case but not necessarily optimal for all cases. For example, accessing remote filesystems through multiple threads can improve throughput if there is a high-bandwidth connection to the remote, so increasing the number of connections might be beneficial. On the other hand, if the connection is slow, a higher number of threads will just waste resources and may even reduce throughput. Depending on the resources available during execution, users should be able to alter some of the parameters of the filesystems to improve the performance of their execution. This can be especially useful for cases where the execution is data-I/O bound.