fbtl/posix: fix data-sieving calculations -v 5.0 #11933

Merged: 1 commit, Sep 21, 2023

edgargabriel (Member) commented Sep 19, 2023

As part of introducing atomicity support for ompi v5.0, we also tried to improve the robustness of some file I/O routines. Unfortunately, this introduced a bug: the ret_code returned by a function does not necessarily contain the number of bytes read or written, but could contain the last value (e.g. 0). That value was nevertheless used in a subsequent calculation, and we ended up not copying any data out of the temporary buffer used in data sieving.

This commit also simplifies some of the logic in the while loop: there is no need to retry reading past the end of the file multiple times.
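To illustrate the two points above, here is a minimal sketch in C. It is not the fbtl/posix code from this commit. The names read_fully, copy_from_sieve and sieve_demo are hypothetical, and the use of pread() is an assumption made for the example. The sketch keeps the status code separate from the accumulated byte count, stops at the first end-of-file instead of retrying, and sizes the copy out of the temporary buffer from the bytes that were actually read.

```c
#define _POSIX_C_SOURCE 200809L  /* for pread() and mkstemp() */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not an Open MPI function): read up to len bytes at
 * offset into buf. The return value is only a status (0 or -1); the byte
 * count is accumulated separately and reported through *total_read. */
int read_fully(int fd, void *buf, size_t len, off_t offset,
               size_t *total_read)
{
    size_t total = 0;

    assert(buf != NULL && total_read != NULL);
    *total_read = 0;
    while (total < len) {
        ssize_t n = pread(fd, (char *)buf + total, len - total,
                          offset + (off_t)total);
        if (n < 0) {
            if (errno == EINTR) {
                continue;          /* interrupted: retry the same read */
            }
            return -1;             /* real error */
        }
        if (n == 0) {
            break;                 /* end of file: stop, do not retry */
        }
        total += (size_t)n;
    }
    *total_read = total;
    return 0;
}

/* Hypothetical helper: copy up to `want` bytes starting at `pos` in the
 * sieve buffer, limited to the `valid` bytes that were actually read.
 * Returns the number of bytes copied. */
size_t copy_from_sieve(void *dst, const char *sieve, size_t valid,
                       size_t pos, size_t want)
{
    size_t n;

    if (pos >= valid) {
        return 0;                  /* request starts past the valid data */
    }
    n = valid - pos;
    if (n > want) {
        n = want;
    }
    memcpy(dst, sieve + pos, n);
    return n;
}

/* Demo: a 10-byte file read through a 64-byte sieve buffer. Returns the
 * number of bytes copied for a 4-byte request at position 8, or
 * (size_t)-1 if the setup fails. */
size_t sieve_demo(void)
{
    char path[] = "/tmp/sieve_demoXXXXXX";
    char sieve[64], out[4];
    size_t got = 0, copied = (size_t)-1;
    int fd = mkstemp(path);

    if (fd < 0) {
        return copied;
    }
    if (write(fd, "0123456789", 10) == 10 &&
        read_fully(fd, sieve, sizeof(sieve), 0, &got) == 0) {
        copied = copy_from_sieve(out, sieve, got, 8, sizeof(out));
    }
    close(fd);
    unlink(path);
    return copied;
}
```

In the demo, the read stops after the first zero-byte result, and the copy is limited to the two bytes that exist past position 8. Had the copy size been derived from the last return value (0) rather than the accumulated total, nothing would have been copied, which is the kind of failure described above.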

Fixes issue #11917

The code was tested with the reproducer provided as part of the issue, our internal testsuite, and the hdf5-1.14.2 testsuite; all tests pass.

Signed-off-by: Edgar Gabriel [email protected]
(cherry picked from commit fb3b68f)

@github-actions github-actions bot added this to the v5.0.0 milestone Sep 19, 2023
@edgargabriel edgargabriel changed the title fbtl/posix: fix data-sieving calculations fbtl/posix: fix data-sieving calculations -v 5.0 Sep 19, 2023
edgargabriel (Member, Author) commented Sep 19, 2023

hm, how can we free up space on the device? The compile-rocm step failed because it is out of disk space

autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal -I config -I config/oac --force --warnings=all,no-obsolete,no-override -I ./config
/usr/bin/m4: ERROR: copying inserted file: No space left on device
/usr/bin/m4: write error
autom4te: /usr/bin/m4 failed with exit status: 1
aclocal: error: echo failed with exit status: 1
autoreconf: aclocal failed with exit status: 1

jsquyres (Member)

> hm, how can we free up space on the device? The compile-rocm step failed because it is out of disk space

We can't -- we don't maintain the infrastructure on which the Github Action jobs run.

Just run it again (kinda horrible, but true).

@janjust janjust merged commit f01eff2 into open-mpi:v5.0.x Sep 21, 2023
@edgargabriel edgargabriel deleted the pr/data-sieving-fix-v5.0 branch July 12, 2024 12:29