Description
There is a bug with MPI_File_write* operations in Open MPI which truncates files or writes incorrect data to files.
I traced the problem to the POSIX writev call used in ompi/mca/fbtl/posix/fbtl_posix_pwritev.c, which will not write more than 2,147,479,552 bytes to disk in a single call. From the NOTES section of man 2 write:
On Linux, write() (and similar system calls) will transfer at most 0x7ffff000 (2,147,479,552) bytes,
returning the number of bytes actually transferred. (This is true on both 32-bit and 64-bit systems.)
At the moment we do not catch the case where the iov contains elements with iov_len > 2,147,479,552, which causes these issues with large I/O blocks and files.
My suggestions:
Either (1) check that each write completes with the expected number of bytes and issue another write for the remainder, or (2) modify the convertor used in mca_io_ompio_build_io_array/ompi_io_ompio_decode_datatype to build an iov struct whose elements are no larger than 2,147,479,552 bytes when using POSIX I/O on Linux systems.
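To illustrate suggestion (1), here is a minimal sketch of a retry loop around pwrite. It is not the Open MPI code: write_fully, LINUX_MAX_RW_COUNT and the max_chunk parameter are hypothetical names introduced for this example only. The idea is to compare the return value against the requested length and keep writing the remainder until everything is on disk.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Largest transfer a single write()-family call performs on Linux
 * (0x7ffff000 = 2,147,479,552 bytes, per the write(2) NOTES). */
#define LINUX_MAX_RW_COUNT ((size_t)0x7ffff000)

/* Hypothetical helper, not part of Open MPI: write exactly `len` bytes
 * from `buf` at file position `offset`, issuing follow-up pwrite() calls
 * whenever a call transfers fewer bytes than requested.
 * `max_chunk` caps the size of each request; pass LINUX_MAX_RW_COUNT for
 * real I/O, or a small value to exercise the loop in a test.
 * Returns 0 on success, -1 on error with errno set. */
static int write_fully(int fd, const void *buf, size_t len, off_t offset,
                       size_t max_chunk)
{
    const char *p = buf;

    if (max_chunk == 0) {
        errno = EINVAL;
        return -1;
    }

    while (len > 0) {
        size_t want = len < max_chunk ? len : max_chunk;
        ssize_t n = pwrite(fd, p, want, offset);

        if (n < 0) {
            if (errno == EINTR) {
                continue;           /* interrupted before any data: retry */
            }
            return -1;              /* real error, errno set by pwrite() */
        }
        if (n == 0) {
            errno = EIO;            /* no progress: avoid spinning forever */
            return -1;
        }

        /* Short or full write: advance past the bytes actually written. */
        p      += n;
        len    -= (size_t)n;
        offset += n;
    }
    return 0;
}
```

The same check would presumably apply to the vectored call: compare the return value of writev against the sum of the iov_len fields and resubmit the unwritten tail. That variant needs extra bookkeeping to adjust iov_base and iov_len of a partially written element, which is omitted here.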