Skip to content
This repository was archived by the owner on Jul 10, 2025. It is now read-only.

RFC: Adding configurable Filesystems #277

Merged
merged 2 commits into from
Dec 9, 2020

Conversation

samikama
Copy link
Contributor

@samikama samikama commented Aug 6, 2020

This RFC is a proposal to add two new methods to Filesystem API to expose handles to users for modifying tuning parameters of persistent storage access such as in-memory buffer sizes, open handles, number of threads and such. These parameters have hardcoded defaults for that works for most cases but not necessarily optimal under all circumstances. With this change users will be able to modify such parameters that developers chose to expose through configuration API.

Status Accepted
RFC # NNN (update when you have community PR #)
Author(s) Sami Kama ([email protected])
Sponsor Mihai Maruseac ([email protected])
Updated 2020-08-04

Objective

The aim of this RFC to extend filesystem API to enable users to pass configuration parameters to tune the behavior of implementation to their use cases.

Motivation

There are many FileSystem implementations in Tensorflow that enable interaction with various storage solutions. Most of these implementations have internal parameters that are suitable for generic use case but not necessarily optimal for all cases. For example accessing remote filesystems through multiple threads can improve the throughput if there is a high bandwidth connection to the remote thus increasing number of connections might be beneficial. On the other hand if the connection is slow, a higher number of threads will just waste resources and may even reduce the throughput. Depending on the resources available during the execution, users should be able to alter some of the parameters of the Filesystems to improve the performance of their execution. This can be especially useful for the the cases where the execution is data i/o bound.

@samikama
Copy link
Contributor Author

samikama commented Aug 6, 2020

Tagging @mihaimaruseac

}
```

Since each filesystem will likely to have different set of tunable parameters, a `FilesystemConfig` object can be used to unify the API and allow discovery of existing tunable parameters at runtime. We propose a protobuf object with the following schema
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd strongly prefer not to add protos to the API. Let's go with a plain struct, which is easier to make ABI safe?

Copy link
Contributor Author

@samikama samikama Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Martin, Thanks for the first review. Main use case of the protobuf is to handle serialization-deserialization, structured and simple editing from user side in C++ and python domains, potentially expose the documentation as well as have a free schema. The ABI boundary will be crossed by prototbuf serialized to prototxt so it will be human readable plain text. It is to ensure that there is no ABI issue or problems due to small differences in protobuf library used in plugin or tensorflow. Plain struct will not be able to do easily unless we restrict the data types and arguments and how these data is serialized and deserialized in ABI compatible way. Does this alleviate some of your concerns about protobuf?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

typedef struct key_val {
  char *key; // null terminated
  int version;
  int type_tag;
  union {
    int64 inv_val;
    double real_val;
    struct {
      char* buf;
      int buf_length;
    } buffer_val;
  } value; 
} key_val;

And then we can send an array of key_val elements.

It seems the option values can be either integer, reals or some buffers (always null terminated?). In that case, we can also have this API:

int64 get_int_option(const char* key);
double get_double_option(const char* key);
const char* get_char_option(const char* key);

void get_int_option(const char* key, int64 val);
void get_double_option(const char* key, double val);
void get_char_option(const char* key, const char* val);

int get_plugin_options(char** keys);

(modulo status codes). The second API proposal has the downside that user needs to first ask the plugin what options are available and then iterate over the returned array and set them. However, it is much more extensible (both API and ABI compatible by default -- already handled by the filesystem ABI/API compatibility layer). Whereas the struct-union variant from the beginning of this comment allows getting all the config options at once but requires adding a new version parameter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the second API allows setting options per path by adding a path argument to the methods

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bools can be passed in via int options, lists can be passed via multiple calls or by adding 3 more API entries

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have the function API proposed above between plugins and core TF (the C layer). Then, inside TF we can use a class/struct to hold them together and present the info to user in a class wrapper too (Python).

The C++/Python classes don't need to be backwards compatible from the point of view of the plugin, so we can change them as needed. But for the plugin interface, we need compatibility (both API and ABI).

The only people seeing the functional API above are plugin implementers and people who contribute to the filesystem layer in core TF.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using protos forces users to use protobuf, which has been a source of issues (it's not exactly a light dependency). Passing only serialized protos removes the ABI concerns, but using proto for configuration isn't something we love to do (maybe surprisingly, since it's all over our APIs, but that's just why we decided it might not be the best idea).

I agree that a set/get API is a bit of an overhead, but a struct would be fairly straightforward, I believe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihaimaruseac Yes I agree. Though your get_* api needs extension since having the name of the key doesn't tell you its type. With the struct approach, memory management might require some work but it is doable. I just believe some existing infrastructure like protobuf or alternatively json/yaml/xml serialized format would simplify things and reduce amount of code needed when developing plugins.

Copy link
Contributor Author

@samikama samikama Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinwicke I just believe using a library and human readable serialized data would be the best. It doesn't have to be protobuf. Only reason the protobuf suggested is that it is already used in TF. I am perfectly happy with other standard formats json/yaml/xml etc. Only advantage of protobuf over these is binary data and type information. Which could easily be implemented in other formats as well.

I agree we can also use struct. But we need to address

  • Memory ownership. Since there needs to be a query mechanism for existing values we will get the structs from plugins and need to manage the memory. There are methods in existing plugin API, although they are not documented in plugin RFC I believe, We can use these methods to manage memory and define lifetimes and ownership clearly.

  • Multiple entries. We need to support passing a vector/list of values. It is possible to encode maps and other structures with lists.

  • The documentation. We all know that developers hardly write documentation. Having a key and its value type is usually not sufficient to figure out what it means. We need to have a description field in the struct that documents the key.

  • It would have been nice if everybody didn't have to implement Struct generation/parsing code from scratch but I can't see a way around it right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use the following in the plugin interface

typedef struct TF_Filesystem_Option_Value {
  int type_tag;
  int num_values;
  union {
    int64 inv_val;
    double real_val;
    struct {
      char* buf;
      int buf_length;
    } buffer_val;
  } *values;  // owned
} TF_Filesystem_Option_Value;

typedef struct TF_Filesystem_Option {
  char* name; // null terminated, owned
  char* description; // null terminated, owned
  int per_file;  // bool actually, but bool is not a C type
  TF_Filesystem_Option_Value *value;  // owned
} TF_Filesystem_Option;

This handles both documentation (provided by plugin) and list values (I'm not conviced we need maps here, but it could be possible to extent the union to add them). Plus, since above is in filesystem interface, it is used by core TF and we can provide functions in the filesystem interface to manage these structs. This way, plugins only need to know the layout of the structs and read/write to them directly.

Memory allocation will have to come from the plugin's side, using the same routines plugins currently use for the other filesystem operations. It's the only way this can work on windows so we cannot choose anything else.

We probably need a new ABI number but adding it to the filesystem registration function would be an ABI breakage. We can handle this by increasing existing ABI numbers from 0 to 1 or we can use the options ABI number only in the function that core TF uses to get filesystem's default options.

Plugins will need to fill these structs in with all the options they support, change them (if plugin allows changing them), and read from them when implementing operations that depend on these options. Core TF only needs to propagate these options up to the C++/Python layers and display them to users.

I would expect that a user trying a new filesystem plugin will do something like

options = tf.io.get_filesystem_options_for_scheme("new_uri_scheme://")
tf.io.display_filesystem_options(options)  # will print current options and the help text too
tf.io.set_option(options, key, value)
...

We have a lot of pointer chasing in this structure but since this is IO I doubt it will result in significant performance degradation.


### Alternatives Considered

Alternative to this proposal is to use a side-channel such as an environment variable to modify the internal parameters. However this is cumbersome, error prone and may not be possible to use at all under certain circumstances.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's another alternative I can imagine: some of these parameters might be useful to set per file (e.g., caching policies). Should this be possible? Or is a global per filesystem switch enough? What are the use cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We leave this to the filesystem implementations. This proposal is to expose such information. If a filesystem can support per-file configuration, it can expose it and then user can make use of it. For example I was thinking for networked file systems, exposing file size and thread count thresholds such that below filesize threshold, number or parallel download request is different than above threshold. Some specific example could be single thread if <1MB, 10 threads if >1GB. How does this sound.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of a user-facing API of something like:

f1 = tf.io.gfile.Open("crazyfs://rw_file", {'cache_policy': 'LRU', 'cache_size': 4096})
f2 = tf.io.gfile.Open("crazyfs://append_only_file", {'cache_policy': 'off'})

How would you implement something like this using only the end-points proposed here?

Copy link
Contributor Author

@samikama samikama Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible but having a uniform behavior would be though unless all filesystems agree on some convention. Assume that all these are non-issue one way to implement would be in pseudo code

def Open(fname,mode,conf_dict):
  FS=env.GetFileSystemForFile("fname")
  fsconfig=FS.GetConfiguration() 
  newconfig=FileConfiguration()
  newconfig.CopyFrom(fsconfig)
  for o,v in conf_dict.keys():
    if o in newconfig.options:
      newconfig.options[o]=v
  FS.SetConfiguration(newconfig)
  if mode == read_write:
   return FS.NewWritableFile(fname)
  elif mode == appendable_file:
   return FS.NewAppendOnlyFile(fname)

Didn't check the actual method names and signatures but this could be a way to implement per file arguments if Filesystem implementation supports it.


## Motivation

There are many FileSystem implementations in Tensorflow that enable interaction with various storage solutions. Most of these implementations have internal parameters that are suitable for generic use case but not necessarily optimal for all cases. For example accessing remote filesystems through multiple threads can improve the throughput if there is a high bandwidth connection to the remote thus increasing number of connections might be beneficial. On the other hand if the connection is slow, a higher number of threads will just waste resources and may even reduce the throughput. Depending on the resources available during the execution, users should be able to alter some of the parameters of the Filesystems to improve the performance of their execution. This can be especially useful for the the cases where the execution is data i/o bound.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the relation between this and SIG IO's filesystem extensions? @mihaimaruseac @samikama @tensorflow/sig-io-maintainers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@terrytangyuan Are you talking about plugin based filesystem or something else? Could you please be more specific?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the existing file system extensions in TF IO kernels, e.g. gstpu, oss, azure, etc. For example, we can use Azure like the following (full tutorial):

pathname = 'az://{}/aztest'.format(account_name)
tf.io.gfile.mkdir(pathname)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @terrytangyuan,

Sorry if the document was not clear. This proposal do not change existing behavior. It extends existing api so that you can do things like @martinwicke mentioned above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Perhaps mentioning the relationship with TF IO would avoid some confusion. Thanks.

@martinwicke
Copy link
Member

martinwicke commented Aug 6, 2020 via email

@ematejska ematejska changed the title RFC - Adding configurable Filesystems RFC: Adding configurable Filesystems Aug 21, 2020
@mihaimaruseac
Copy link
Contributor

mihaimaruseac commented Sep 15, 2020

Notes from the design meeting:

There are 4 options to pass options:

  1. as a struct
  2. as key-value
  3. mix between 1 and 2.
  4. using a serialization format (proto, xml, etc.)
  5. use a <fs>-meta:// meta filesystem registered in parallel
    • This will make it impossible to write filesystem agnostic configuration code
    • This will equire some changes to modular filesystem API/ABI interface

Meta filesystem is very unconstrained, though very powerful. Meta filesystems is like /proc filesystem in Linux. The first 4 options require language implementers to write bindings to more APIs. The first 4 options allow documenting most common options.

Adam (@ajbouh) will write a short RFC to document the meta filesystem approach, since that has a lot of value. But we will go with option 3 for this RFC.

Another idea: this API could be used for performance counters too, so maybe we should update the API names. ControlPlane{Key,Value} might be a better name. Or, for this, we could use the meta filesystem approach, since it is more suitable. We will leave profiling API to a different RFC, as TBD

@samikama
Copy link
Contributor Author

samikama commented Dec 9, 2020

Sorry for not getting back to this earlier but is there anything else for this to be approved/merged? @ewilderj @ematejska @theadactyl

@mihaimaruseac
Copy link
Contributor

mihaimaruseac commented Dec 9, 2020

It should be good to merge.

Copy link
Contributor

@theadactyl theadactyl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@theadactyl theadactyl merged commit a799e8d into tensorflow:master Dec 9, 2020
@theadactyl theadactyl added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Dec 9, 2020
@samikama samikama deleted the filesystem_config branch December 9, 2020 20:12
@samikama
Copy link
Contributor Author

samikama commented Dec 9, 2020

Thanks. I will start opening PRs shortly

copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 27, 2021
PiperOrigin-RevId: 370587801
Change-Id: I04f1b2e30be7ef010a4445aaf657ed0de70e3af6
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 27, 2021
PiperOrigin-RevId: 370681107
Change-Id: I06d48c974357b494f875b6f1247dd4032119e9f9
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cla: yes RFC: Accepted RFC Design Document: Accepted by Review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants