Support rechunking to seasonal frequency with SeasonalResampler #10519
# Test error on missing season (should fail with incomplete seasons)
with pytest.raises(ValueError):
    ds.chunk(x=SeasonResampler(["DJF", "MAM", "SON"]))
This doesn't raise any more, now that I moved it here. That's because the Resampler isn't raising an error, but our check in `dataset.chunk` was. Let's raise a nice error message here. The check could be `len("".join(self.seasons)) == 12`.
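A minimal sketch of the check suggested above, written as a standalone function so it can be tried in isolation. The function name `check_seasons_cover_year` and the wording of the error message are hypothetical and not part of xarray. Note that this is only a length check: it would not detect overlapping or repeated seasons whose letters happen to total 12.

```python
def check_seasons_cover_year(seasons):
    """Raise ValueError unless the season strings span 12 months in total.

    Hypothetical helper: it mirrors the reviewer's suggested condition,
    len("".join(seasons)) == 12, and nothing more.
    """
    n_months = len("".join(seasons))
    if n_months != 12:
        raise ValueError(
            "Cannot rechunk with seasons that do not cover the full year: "
            f"{list(seasons)!r} spans {n_months} months, expected 12."
        )


# Complete set of seasons: passes silently.
check_seasons_cover_year(["DJF", "MAM", "JJA", "SON"])

# Missing JJA: raises ValueError.
try:
    check_seasons_cover_year(["DJF", "MAM", "SON"])
except ValueError as err:
    print(err)
```

Raising from the resampler itself, rather than from the caller, would let the existing `pytest.raises(ValueError)` test keep passing after the move.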
rechunked = ds.chunk(x=2, time=SeasonResampler(["DJF", "MAM", "JJA", "SON"]))
# With 2 years of data starting Jan 1, we get 9 seasonal chunks:
# partial DJF (Jan-Feb), MAM, JJA, SON, DJF, MAM, JJA, SON, partial DJF (Dec)
assert len(rechunked.chunksizes["time"]) == 9
Can you please write out the chunks tuple here (and below)?
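One way to derive the explicit tuple is to count days per consecutive season run. The sketch below uses only the standard library and assumes daily data on the standard calendar starting 2001-01-01 and spanning two non-leap years; the frequency, calendar, and start year of the actual test are not shown here, so the numbers are illustrative. The helper names `season_of_month` and `seasonal_chunks` are hypothetical.

```python
from datetime import date, timedelta
from itertools import groupby

MONTH_INITIALS = "JFMAMJJASOND"


def season_of_month(seasons):
    """Map month number (1-12) to its season label, e.g. 12 -> "DJF"."""
    mapping = {}
    for season in seasons:
        # Search the doubled string so seasons that wrap the year (DJF) match.
        start = (MONTH_INITIALS * 2).find(season)
        if start < 0:
            raise ValueError(f"Unrecognised season {season!r}")
        for offset in range(len(season)):
            mapping[(start + offset) % 12 + 1] = season
    return mapping


def seasonal_chunks(days, seasons):
    """Length of each run of consecutive days that share a season."""
    lookup = season_of_month(seasons)
    return tuple(
        len(list(run)) for _, run in groupby(days, key=lambda d: lookup[d.month])
    )


# Assumed input: 730 daily timestamps, 2001-01-01 through 2002-12-31.
days = [date(2001, 1, 1) + timedelta(days=i) for i in range(730)]
expected_chunks = seasonal_chunks(days, ["DJF", "MAM", "JJA", "SON"])
print(expected_chunks)  # (59, 92, 92, 91, 90, 92, 92, 91, 31)
```

Under those assumptions the test could assert the full tuple instead of only its length, which also pins down the sizes of the partial DJF chunks at both ends.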
Almost there! I have a couple of small requests. @keewis does this API look like it can work with the
whats-new.rst
api.rst
Previously, users could not use `SeasonResampler` for chunking operations in xarray, despite it being a natural fit for seasonal data analysis. When attempting `ds.chunk(time=SeasonResampler(["DJF", "MAMJ", "JAS", "ON"]))`, users encountered obscure errors because the chunking logic was hardcoded to work only with `TimeResampler` objects. This limitation prevented efficient seasonal analysis workflows and forced users to fall back on workarounds or manual chunking strategies.

This PR adds a generalized chunking approach: a `resolve_chunks` method on the `Resampler` base class, with the chunking logic updated to work with all `Resampler` objects, not just `TimeResampler`. It also adds a `_for_chunking` method to `SeasonResampler` that ensures `drop_incomplete=False` during chunking operations to prevent silent data loss. The change maintains full backward compatibility with existing `TimeResampler` functionality while enabling seasonal chunking.
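The dispatch pattern described above can be illustrated with a toy model. This is a sketch under stated assumptions, not xarray's implementation: the classes `ToyResampler`, `ToyYearResampler`, `ToySeasonResampler` and the function `toy_chunk` are hypothetical stand-ins, the seasons are fixed to DJF/MAM/JJA/SON, and "dropping incomplete seasons" is crudely modelled by discarding the first and last groups.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta
from itertools import groupby


class ToyResampler:
    """Hypothetical base class: each subclass resolves its own chunk sizes."""

    def _for_chunking(self):
        # Default: the resampler is already safe to use for chunking.
        return self

    def resolve_chunks(self, days):
        raise NotImplementedError


class ToyYearResampler(ToyResampler):
    def resolve_chunks(self, days):
        # One chunk per calendar year.
        return tuple(len(list(g)) for _, g in groupby(days, key=lambda d: d.year))


@dataclass(frozen=True)
class ToySeasonResampler(ToyResampler):
    drop_incomplete: bool = True

    def _for_chunking(self):
        # Chunks must cover every element, so partial seasons are kept.
        return replace(self, drop_incomplete=False)

    def resolve_chunks(self, days):
        # (month % 12) // 3 gives 0 for DJF, 1 for MAM, 2 for JJA, 3 for SON.
        sizes = [
            len(list(g)) for _, g in groupby(days, key=lambda d: (d.month % 12) // 3)
        ]
        if self.drop_incomplete:
            sizes = sizes[1:-1]  # crude stand-in for dropping partial edge seasons
        return tuple(sizes)


def toy_chunk(days, spec):
    """Accept an int or any ToyResampler, without special-casing one subclass."""
    if isinstance(spec, int):
        full, rest = divmod(len(days), spec)
        return (spec,) * full + ((rest,) if rest else ())
    if isinstance(spec, ToyResampler):
        chunks = spec._for_chunking().resolve_chunks(days)
        if sum(chunks) != len(days):
            raise ValueError("resolved chunks do not cover the whole axis")
        return chunks
    raise TypeError(f"unsupported chunk spec: {spec!r}")


days = [date(2001, 1, 1) + timedelta(days=i) for i in range(730)]
print(toy_chunk(days, ToySeasonResampler()))  # (59, 92, 92, 91, 90, 92, 92, 91, 31)
print(toy_chunk(days, ToyYearResampler()))    # (365, 365)
```

In this toy, calling `ToySeasonResampler().resolve_chunks(days)` directly loses the partial seasons at both ends, so the sizes no longer sum to the axis length. Routing through `_for_chunking` first is what avoids that kind of silent data loss.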