# [WIP] caching of transpiled circuit (continue of PR 815) #878
## Conversation
This is implemented by adding a cached_method decorator for BaseExperiment methods to cache the return value of the method. This decorator is currently only applied to the `_transpiled_circuits` method. Changing experiment or transpile options will automatically clear the cache. The cache can be manually cleared by calling the `cache_clear` method.
Fix non-standard use of experiment options in Tphi so they work with circuit caching and clearing.
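For illustration, here is a minimal self-contained sketch of the behavior described above. `ExperimentSketch` and this simplified `cached_method` are hypothetical stand-ins, not the PR's actual implementation (which is attached to `BaseExperiment` and also accounts for the experiment options):

```python
import functools

def cached_method(method):
    """Cache a method's return value on the instance (simplified sketch)."""
    @functools.wraps(method)
    def wrapped(self, *args, **kwargs):
        if not hasattr(self, "_cache"):
            self._cache = {}
        name = method.__name__
        if name not in self._cache:
            self._cache[name] = method(self, *args, **kwargs)
        return self._cache[name]
    return wrapped

class ExperimentSketch:
    """Hypothetical stand-in for BaseExperiment."""
    def cache_clear(self):
        """Manually clear all cached values."""
        self._cache = {}

    def set_experiment_options(self, **options):
        """Setting options invalidates the cached transpiled circuits."""
        self._options = {**getattr(self, "_options", {}), **options}
        self.cache_clear()

    @cached_method
    def _transpiled_circuits(self):
        print("transpiling...")  # runs only on a cache miss
        return ["<transpiled circuit>"]

exp = ExperimentSketch()
exp._transpiled_circuits()                  # transpiles and caches
exp._transpiled_circuits()                  # served from the cache
exp.set_experiment_options(lengths=[1, 2])  # clears the cache automatically
exp._transpiled_circuits()                  # transpiles again
```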
Only now that I've finished reviewing this PR do I see that it's WIP and doesn't pass CI, so perhaps it wasn't ready for review yet.
Regarding the discussion about experiment options that already started in #815: while reviewing the PR I had an idea, though I don't know if it's a good one. Instead of using the options in the cache key, we can keep caching only on the method name (without the options), and store the options in the value. Namely, the cached value will be a tuple of the cached circuits (or whatever is returned by the method, for an arbitrary method which is not necessarily `_transpiled_circuits`), together with all the experiment options. When the method is called, we check if the options are the same. If yes, we return the cached circuits; if not, we calculate the circuits and override the cache entry for this method with the new circuits and options. @ItamarGoldman @nkanazawa1989 What do you think? Then we won't have to handle hashability.
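A rough sketch of that proposal, assuming `_cache` is a plain dict and the options can be snapshotted as a dict (the names here are illustrative, not the PR's actual API):

```python
def circuits_via_cache(experiment):
    """Sketch of the proposal: the cache key is just the method name,
    and the experiment options live in the cache value."""
    current_options = dict(experiment.experiment_options.__dict__)  # assumed snapshot
    entry = experiment._cache.get("_transpiled_circuits")
    if entry is not None:
        circuits, cached_options = entry
        if cached_options == current_options:  # plain equality, no hashing needed
            return circuits
    circuits = experiment._transpiled_circuits()
    experiment._cache["_transpiled_circuits"] = (circuits, current_options)
    return circuits
```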
@chriseclectic Why not handle method arguments as well (either in the key or in the value)?
Remember to add a flag, whose default is False, as discussed in #815.
```
This stores the output of a method in the experiment object instance
in a `_cache` dict attribute. Note that the value is cached only on
the object instance method name, not any values of its arguments.
```
Cached on the method name and the options
```python
def wrapped_method(self, *args, **kwargs):
    name = f"{type(self).__name__}.{method.__name__}"

    # making a tuple from the options value.
```
Suggested change:

```diff
- # making a tuple from the options value.
+ # making a tuple from the option values.
```
```python
        val, Hashable
    ):
        continue
    # if one of the values in option isn't hashable, we raise a warning and we use the name as
```
Since you've already made the effort of walking over the options one-by-one, you can just remove the non-hashable options, while keeping the hashable ones.
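A sketch of that suggestion, assuming the options are available as a name-to-value mapping (function name is illustrative):

```python
from collections.abc import Hashable

def hashable_options(options: dict) -> tuple:
    """Keep only the hashable option values for use in a cache key,
    silently dropping the rest (sketch of the reviewer's suggestion)."""
    return tuple(
        (name, value) for name, value in options.items()
        if isinstance(value, Hashable)
    )
```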
```diff
@@ -210,6 +210,30 @@ def test_return_same_circuit(self):
         self.assertEqual(circs1[1].decompose(), circs2[1].decompose())
         self.assertEqual(circs1[2].decompose(), circs2[2].decompose())

+    def test_experiment_cache(self):
```
Any reason to test RB and IRB, and not any other experiment?

Note that caching will happen for any experiment that does not override `_transpiled_circuits`, which is most of the experiments, and also for RB, which overrides `_transpiled_circuits` and attaches the decorator.

So I don't see a particular reason to test RB over other experiments. If you want to test some experiment and arbitrarily pick RB, that's OK, but then why test both RB and IRB?
```python
self.assertEqual(exp0_transpiled_circ[0].decompose(), exp0_transpiled_cache[0].decompose())
self.assertEqual(exp0_transpiled_circ[1].decompose(), exp0_transpiled_cache[1].decompose())
self.assertEqual(exp0_transpiled_circ[2].decompose(), exp0_transpiled_cache[2].decompose())
```
Use a `for` loop.
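For example, using the names from the quoted assertions:

```python
for i in range(3):
    self.assertEqual(
        exp0_transpiled_circ[i].decompose(), exp0_transpiled_cache[i].decompose()
    )
```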
```python
# Checking that the cache is cleared when setting options
exp0.set_experiment_options(lengths=[10, 20, 30, 40])
self.assertEqual(exp0._cache, {})
```
- Add a check that a new call to `_transpiled_circuits` returns four circuits.
- Verify that setting the options via `exp.experiment_options.length` does not clear the cache, but adds a new entry to it, and also correctly returns four circuits (see the sketch below).
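A sketch of those two checks, using the names from the quoted test (the exact option attribute, values, and cache layout are assumptions):

```python
# A fresh call after the cache was cleared should transpile again and return four circuits
self.assertEqual(len(exp0._transpiled_circuits()), 4)

# Setting the attribute directly bypasses set_experiment_options, so the cache
# should not be cleared; instead a new entry appears for the new options
exp0.experiment_options.lengths = [5, 15, 25, 35]  # hypothetical values
self.assertEqual(len(exp0._transpiled_circuits()), 4)
self.assertNotEqual(exp0._cache, {})
```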
In #878 (review) I wrote:

> Is it simply a trivial check of whether two dictionaries are equal, or is there something more tricky here that I don't foresee?
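For built-in option values it is indeed just dict equality, which compares keys and values recursively and ignores insertion order; the subtlety arises for values that don't implement `__eq__` (see the discussion below):

```python
{"delays": [5, 20, 40], "unit": "s"} == {"unit": "s", "delays": [5, 20, 40]}  # True
```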
Copying here a discussion that started in private on Slack:

@nkanazawa1989: For example,

```python
rb_exp = StandardRB(**rb_settings)
data1 = rb_exp.run(backend)
backend.defaults().instruction_schedule_map.add("x", (0,), schedule=my_calibrated_xgate)
data2 = rb_exp.run(backend)
```

So at least the cache mechanism should be optional and default to False, so that a user who understands the mechanism very well can intentionally enable it for performance. This is a really difficult problem to debug, because most users don't know how to check the executed program.

@chriseclectic: I guess we could also add an option to cache circuits to … I think most of Naoki's issues are basically cases where you just don't want the cache enabled, and if you do enable it manually it's up to you to make sure you aren't breaking things.
#### Optional

The caching will be optional, and will have to be explicitly activated.

#### What's in the cache?

##### Cache structure

**Current structure**

In #815, the cache key is a method name, and the value is the method's output. For example:

```python
{"_transpiled_circuits": <list of transpiled circuits>}
```

In this PR, at least currently, the key becomes a tuple consisting of the method name and the hashable experiment options (non-hashable options are usually dropped, except for the common case of lists, which enjoy special treatment in the form of conversion to (hashable) tuples). For example:

```python
{("_transpiled_circuits", {"delays": (5, 20, 40)}): <list of transpiled circuits>}
```

This change comes to solve the problem of users setting options by directly accessing `experiment_options`.

**Change proposal**

I propose to change this structure, such that the experiment options will live in the cache value. For example:

```python
{"transpiled_circuits": (<list of transpiled circuits>, {"delays": [5, 20, 40]})}
```

Then, when the method (e.g. `_transpiled_circuits`) is called, we check whether the stored options are equal to the current ones: if they are, we return the cached output; if not, we recompute it and override the cache entry.

The benefit of this suggestion is:

- We won't have to handle hashability of the options, since they are no longer part of the cache key.
My first question, please let me know what you think: **do you like this proposal?**

#### Storing method arguments

We can also store method arguments in the cache. For example:

```python
{"run_analysis": (<output experiment data>, {"delays": [5, 20, 40]}, {"experiment_data": <input experiment data>, "replace_results": False})}
```

Or, alternatively, we'd probably better use a dictionary instead of a tuple:

```python
{"run_analysis": {"method_output": <output experiment data>, "experiment_options": {"delays": [5, 20, 40]}, "method_arguments": {"experiment_data": <input experiment data>, "replace_results": False}}}
```

My second question, please let me know what you think: **should we store method arguments in the cache?**

#### Storing class data members

This one, in my opinion, is an exaggeration. It falls back to users having to know what they are doing when they activate caching.

My third question, please let me know what you think: **should we store data members in the cache?** If yes, then, technically, how do we do it? Restrict to …
I always prefer formal RFC markdown as a PR since it's easier to comment and discuss.
If this is aware only of the experiment options, you can still clear the cache with …
Yes, we should. But this means all values should implement proper equality-check logic. I'm not sure if we can compare two experiment instances, i.e. …
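For illustration, objects that don't define `__eq__` fall back to identity comparison, so two equal-looking option values (or experiment instances) would never compare equal:

```python
class Payload:
    def __init__(self, x):
        self.x = x

print(Payload(1) == Payload(1))  # False: default __eq__ compares identity, not contents
```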
This is relevant to my comment above. Technically you can get …
Yes, you should start from transpilation, which is an obvious bottleneck, and you can do a follow-up later with profiling. This makes the code much simpler and more robust. My concern is that this mechanism breaks the current workflow, because of the silent cache invocation. If some protected method is cached randomly, it is usually really hard to debug. In addition, we don't have a mechanism to check the final circuit before execution.
@nkanazawa1989 We're continuing #815, where the caching occurs at the object level. In #815, you raised a concern that, in spite of the cache being cleared in …

Question 1 asks whether to store the experiment options in the cache key or in the value. I tend to store them in the value. Do you agree?

Following your answer to Question 4, we will probably give up the general decorator, in favor of inheritance from …

As for storing data members, I find it too complicated and unnecessary. I'm emphasizing again that caching will have to be explicitly activated by the user. Since it won't happen implicitly, I find it reasonable to expect the user to be aware of the dependence on data members. @chriseclectic what do you think?
I was looking at the CVXPY cache function as an example, which is basically a modification of LRU cache to work better with methods. I tried playing around with this as a starting point, working on what I suggested before, and came up with this as an idea for how one could implement instance caching of custom methods, controlled by users.

General method caching decorator for use in the package:

```python
import functools
from typing import Callable, Dict, Union


def method_cache(
    cache: Union[Dict, str] = "_cache",
    cache_args: bool = True,
    require_hashable: bool = True,
) -> Callable:
    """Decorator for caching class instance methods.

    Args:
        cache: The cache or cache attribute name to use. If a dict it will
            be used directly; if a str, a cache dict will be created under
            that attribute name if one is not already present.
        cache_args: If True, include method arg and kwarg values when
            caching the method. If False, only a single return will
            be cached for the method regardless of any args.
        require_hashable: If True, require all cached args and kwargs to be
            hashable. If False, un-hashable values are allowed
            but will be excluded from the cache key.

    Returns:
        The decorator for caching methods.
    """
    def method_cache_decorator(method: Callable) -> Callable:
        """Decorator for caching a method.

        Args:
            method: A method to cache.

        Returns:
            The wrapped cached method.
        """
        cache_key_fn = _cache_key_function(cache_args, require_hashable)

        @functools.wraps(method)
        def _cached_method(self, *args, **kwargs):
            # Initialize cache if none provided
            if isinstance(cache, str):
                if not hasattr(self, cache):
                    setattr(self, cache, {})
                instance_cache = getattr(self, cache)
            else:
                instance_cache = cache
            # Each method gets its own sub-dict, keyed by the call arguments
            name = method.__name__
            if name not in instance_cache:
                instance_cache[name] = {}
            meth_cache = instance_cache[name]
            key = cache_key_fn(*args, **kwargs)
            if key in meth_cache:
                return meth_cache[key]
            result = method(self, *args, **kwargs)
            meth_cache[key] = result
            return result

        return _cached_method

    return method_cache_decorator


def _cache_key_function(cache_args: bool = True, require_hashable: bool = True) -> Callable:
    """Return a function for generating cache keys.

    Args:
        cache_args: If True, include method arg and kwarg values when
            caching the method. If False, only a single return will
            be cached for the method regardless of any args.
        require_hashable: If True, require all cached args and kwargs to be
            hashable. If False, un-hashable values are allowed
            but will be excluded from the cache key.

    Returns:
        The function for generating cache keys.
    """
    if not cache_args:
        def _cache_key(*args, **kwargs):
            # pylint: disable = unused-argument
            return tuple()
    elif require_hashable:
        def _cache_key(*args, **kwargs):
            return args + tuple(kwargs.items())
    else:
        def _cache_key(*args, **kwargs):
            # Skip un-hashable values so they don't break key construction
            args_key = tuple(arg for arg in args if getattr(arg, "__hash__", None))
            kwargs_key = tuple(
                (key, val) for key, val in kwargs.items()
                if getattr(val, "__hash__", None)
            )
            return args_key + kwargs_key
    return _cache_key
```

Using the decorator to make a class mixin for caching and uncaching methods:

```python
class CacheMethodMixin:
    """Mixin for adding optional caching to methods"""

    def _initialize_cache(self):
        if not hasattr(self, "_cache"):
            self._cache = {}
        if not hasattr(self, "_cache_methods"):
            self._cache_methods = {}

    def cache(self, *methods):
        """Modify instance methods to cache returned values."""
        self._initialize_cache()
        decorator = method_cache(cache=self._cache, require_hashable=False)
        for method in methods:
            if isinstance(method, str):
                method = getattr(self, method)
            # Unwrap to the underlying function so that re-binding with
            # __get__ below passes ``self`` exactly once
            func = getattr(method, "__func__", method)
            name = func.__name__
            self._cache_methods[name] = func
            cached = decorator(func)
            setattr(self, name, cached.__get__(self))

    def uncache(self, *methods):
        """Uncache specified methods.

        If called without args all cached methods will be uncached.
        """
        self._initialize_cache()
        if not methods and self._cache_methods:
            return self.uncache(*self._cache_methods.keys())
        for method in methods:
            if not isinstance(method, str):
                method = method.__name__
            orig_method = self._cache_methods.pop(method)
            setattr(self, method, orig_method.__get__(self))

    def cache_clear(self, *methods):
        """Clear cache values for cached methods.

        If called with no args all cached values will be cleared.
        """
        self._initialize_cache()
        if not methods:
            self._cache.clear()
            return
        for method in methods:
            if not isinstance(method, str):
                method = method.__name__
            self._cache.pop(method, None)
```

Using the mixin in a test class (i.e. BaseExperiment):

```python
class Test(CacheMethodMixin):
    def __init__(self, *args, **kwargs):
        pass

    def foo(self, *args, **kwargs):
        print("CALLED FOO")
        return True
```

In this case you have to explicitly choose which methods to cache (and you can also uncache them if you decide you don't want to cache them anymore); you can also clear cached values (and could add methods to save and load the cache). I'm not sure about caching by args/kwargs or not; you could go either way. I set it up in the above example so it will include any hashable args/kwargs in the cache key but will work for non-hashable ones by skipping them. Another option would be to add a kwarg to …

Using it would look something like this in the above example:

```python
# Instance to cache
a = Test()

# Specify method to cache
a.cache("foo")  # a.cache(a.foo) ok too

# Call the cached method multiple times; only the first call should print, as the rest are cached
a.foo(1, 2, 3)
a.foo(1, 2, 3)

# If we want to uncache the method (note this doesn't clear the cache, but the cache will
# no longer be used; if the method is re-cached, the uncleared cache will be used again)
a.uncache("foo")  # or a.uncache() to uncache all

# If we wanted to clear the cache instead of uncaching, this clears the cache
# without uncaching the method
a.cache_clear("foo")  # or a.cache_clear() to clear all
```
## Summary

Add transpiled circuit cache to base experiment

## Details and comments