- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

Lustre short read

From HLRS Platforms

Latest revision as of 15:23, 4 May 2017

Short reads.

If you try to read a certain amount of data (e.g. 1 MB), the POSIX standard allows the read to return with less data than actually requested (e.g. 500 kB). This is called a short read. The read call returns the number of bytes actually read.

It is the programmer's responsibility to check whether the application actually read the amount of data requested. If less data has been read, the program should handle this, e.g. by re-reading the missing data.
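
For the raw POSIX read() call, the check described above can be sketched as follows. This is only a minimal sketch: the helper name read_full and its return convention are assumptions made for this illustration and are not part of POSIX or of any HLRS library.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes from fd into buf, retrying after short reads.
 * Returns the number of bytes read (less than len only at end of file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);
        if (n > 0) {
            done += (size_t)n;      /* possibly a short read: continue with the remainder */
        } else if (n == 0) {
            break;                  /* end of file */
        } else if (errno != EINTR) {
            return -1;              /* real error, errno was set by read() */
        }
        /* n == -1 with errno == EINTR: interrupted before any data arrived, retry */
    }
    return (ssize_t)done;
}
```

A caller would compare the return value with the requested length: a smaller non-negative value means the file ended early, and -1 means a read error.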


The following example ensures that all data requested has been read:

   read:
   if (len > 0) {                                                     //len   - number of bytes still to be read (size_t)
       count = fread(addr, (size_t)1, (size_t)len, (FILE*) data);     //addr  - buffer (char*) where the data will be read to
                                                                      //fread - with an element size of 1, returns the number of bytes read (size_t)
       if (count < len && (count > 0 || (ferror((FILE*) data) && errno == EINTR))) {
           clearerr((FILE*) data);                                    //reset the stream's error and EOF flags before retrying
           addr += count;
           len -= count;
           goto read;
       }
   }


Of course you should also check for reading errors.
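
One possible way to add such error checks is sketched below. The helper name fread_full and its return codes are assumptions chosen for this illustration; only fread, ferror, feof and clearerr are standard C library calls.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: fill buf with len bytes from fp.
 * Returns 0 on success, 1 if end of file came first, -1 on a read error.
 * *got receives the number of bytes actually stored in buf. */
int fread_full(FILE *fp, void *buf, size_t len, size_t *got)
{
    char *p = buf;
    *got = 0;

    while (*got < len) {
        /* element size 1, so the return value is a byte count */
        size_t n = fread(p + *got, 1, len - *got, fp);
        *got += n;
        if (*got == len)
            break;
        if (ferror(fp)) {
            if (errno != EINTR)
                return -1;          /* genuine I/O error */
            clearerr(fp);           /* interrupted: reset the flag and retry */
        } else if (feof(fp)) {
            return 1;               /* file ended before len bytes were read */
        }
    }
    return 0;
}
```

Separating "end of file" from "error" lets the caller decide whether a partially filled buffer is acceptable or whether the run should abort.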

Short reads and other I/O libraries

The following I/O libraries handle short reads in a proper way:

  • MPIIO
  • HDF5 parallel
  • NETCDF 4


Issues occur with the following libraries:

  • C or C++ I/O: the programmer has to handle short reads in the application.
  • NETCDF version 4.3.2 or lower shows unwanted behavior when using files with a NETCDF version 3 header: if a short read occurs, netcdf fills the missing data in the array with zeroes and returns successfully.

How to convert posix IO to MPIIO

Here is a short summary of how to convert from POSIX IO to MPIIO, which is short-read safe, so you do not have to handle short reads in your application.

{| class="wikitable"
! Operation
! C
! Fortran
! MPIIO
! Comment
|-
| Open a file
| fopen
| open
| MPI_File_open(comm, filename, amode, info, fh)
| fh corresponds to the C file pointer and the Fortran unit number; info can be MPI_INFO_NULL
|-
| Write
| fwrite
| write
| MPI_File_write(fh, buf, count, datatype, status)
|
|-
| Read
| fread
| read
| MPI_File_read(fh, buf, count, datatype, status)
|
|-
| Close
| fclose
| close
| MPI_File_close(fh)
| Has to be called before MPI_Finalize
|}

For performance, you should consider using the collective I/O operations (see manual).