- Information in the HLRS Wiki is not legally binding and is provided without warranty -

Difference between revisions of "HPE Hawk News"

From HLRS Platforms

Revision as of 11:46, 27 November 2020

  • 2020/11/19: If your job breaks with
       MPT ERROR: MPI_COMM_WORLD rank <x> has terminated without calling MPI_Finalize()
       aborting job 
       MPT: Received signal 9

chances are good that it ran out of memory (i.e. your job tried to access more memory than is available on one of the nodes). We are currently working towards providing you with a more specific error message in this case.

  • 2020/11/27: If your job breaks with "IB timeout" messages, this might be due to crashed nodes, even though the message itself suggests a problem with the InfiniBand network. Nodes can still crash for various reasons.
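
The two notes above can be turned into a rough check of a job's output. The following is only a minimal sketch: the function name `diagnose` and the hint texts are hypothetical, the rank placeholder `<x>` is assumed to be any non-whitespace token, and the exact wording of the "IB timeout" messages is assumed to contain that phrase literally. It is not an HLRS-provided tool.

```python
import re

# Patterns derived from the messages quoted in the news items above.
# ASSUMPTION: the rank placeholder <x> is a single non-whitespace token,
# and IB timeout messages contain the literal phrase "IB timeout".
PATTERNS = [
    (
        re.compile(
            r"MPT ERROR: MPI_COMM_WORLD rank \S+ has terminated "
            r"without calling MPI_Finalize\(\)"
        ),
        "likely out of memory on one of the nodes",
    ),
    (
        re.compile(r"IB timeout", re.IGNORECASE),
        "possibly crashed nodes rather than an InfiniBand fault",
    ),
]


def diagnose(log_text):
    """Return a list of hints for known error messages found in log_text."""
    hints = []
    for pattern, hint in PATTERNS:
        if pattern.search(log_text):
            hints.append(hint)
    return hints
```

To use it, read the job's output file (whatever name your batch job writes to) into a string and pass it to `diagnose`; an empty list means that neither message was found.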