- Information contained in the HLRS Wiki is not legally binding and HLRS is not responsible for any damages that might result from its use -

HPE Hawk/powertestbed

Latest revision as of 13:47, 18 July 2023

Power Testbed

Jobs for the power testbed are submitted to a dedicated queue, R_powertestbed:

qsub -q R_powertestbed -l select=16:ncpus=128:mpiprocs=32 -l walltime=24:00:00 ./run.job.16N.mid_power.sh

Depending on the mode of operation, the testbed either dynamically steers the power consumption per socket or uses a given, predefined power level per socket.

Irrespective of the mode of operation, two types of files are generated and rsynced to

/lustre/hpe/ws10/logs/powersched

HDF5 raw output, containing

  • RETIRED_INSTRUCTIONS
  • CPU_CLOCKS_UNHALTED
  • RAPL_PKG_ENERGY

as well as a set of corresponding plots.

The power testbed has two modes of operation:

[Figure: gss (File:2176433.combined.amd zen2.png)]

Golden Section Search

Job script for dynamic power steering, which uses a golden section search to find the minimum of a predefined metric.

#!/bin/bash
#PBS -N mr_job_interactive
#PBS -q R_powertestbed
#PBS -l select=16:mpiprocs=32
#PBS -l walltime=10:00:00
#PBS -j oe
#PBS -m abe

cd /lustre/hpe/ws10/ws10.0/ws/hpcmaros-power_bed

#GSS
export POWERSCHED_REDIS_HOST=hawk-monitor2
export JOBID=`echo $PBS_JOBID | cut -d. -f1`
export NODELIST=`uniq $PBS_NODEFILE | cut -d. -f1 | paste -sd,`

/usr/local/bin/powersched-debug start-job --id=$JOBID --nodes=$NODELIST

./run_N32_640k.sh > run_ptb_GSS_16_node

/usr/local/bin/powersched-debug end-job --id=$JOBID
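
For readers unfamiliar with the search technique named in this section, the following is a minimal, generic sketch of a golden section search. It is not the powersched implementation, which is not shown on this page. The function name gss_min, the toy metric with its minimum at 150 and the bounds 100 and 225 are all hypothetical and chosen only for illustration. The search assumes the metric has a single minimum inside the bracket.

```shell
#!/bin/bash
# Generic golden section search sketch (NOT the powersched implementation).
# The metric below is a hypothetical toy function of a power cap p in watts;
# the metric used by the testbed is not described on this page.
gss_min() {
    # $1 = lower bound, $2 = upper bound, $3 = tolerance on the bracket width
    awk -v a="$1" -v b="$2" -v tol="$3" '
        function metric(p) { return (p - 150) * (p - 150) + 10 }
        BEGIN {
            invphi = (sqrt(5) - 1) / 2        # 1/phi, about 0.618
            c = b - invphi * (b - a)          # lower interior point
            d = a + invphi * (b - a)          # upper interior point
            fc = metric(c); fd = metric(d)
            while (b - a > tol) {
                if (fc < fd) {                # minimum lies in [a, d]
                    b = d; d = c; fd = fc
                    c = b - invphi * (b - a); fc = metric(c)
                } else {                      # minimum lies in [c, b]
                    a = c; c = d; fc = fd
                    d = a + invphi * (b - a); fd = metric(d)
                }
            }
            printf "%.1f\n", (a + b) / 2      # midpoint of the final bracket
        }'
}

gss_min 100 225 0.01   # prints 150.0
```

Each iteration shrinks the bracket by a factor of about 0.618 and reuses one of the two interior points, so only one new evaluation of the metric is needed per step. This matters when every evaluation corresponds to a measurement on a running job.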


Predefined Power per Socket

Job script for setting a fixed, predefined power per socket.

#!/bin/bash
#PBS -N HPE_test
#PBS -l walltime=14:00:00
#PBS -l select=8:node_type=rome:mpiprocs=32:ompthreads=4
#PBS -j oe
#PBS -m abe

# ----------------
# go to workspace:
# ----------------
cd $PBS_O_WORKDIR

# job settings:
#--------------
export OMP_NUM_THREADS=8
export OMP_SCHEDULE='STATIC'
export OMP_WAIT_POLICY='ACTIVE'

# load modules:
# -------------
# module load intel
module load hlrs-software-stack/previous
module load intel
module load amd-libm

# execute program:
#-----------------
export POWERSCHED_REDIS_HOST=hawk-monitor2
export JOBID=`echo $PBS_JOBID | cut -d. -f1`
export NODELIST=`uniq $PBS_NODEFILE | cut -d. -f1 | paste -sd,`
export POWER=172W

# powersched prologue
#--------------------
/usr/local/bin/powersched-debug start-job --id=$JOBID --nodes=$NODELIST --static-power amd_zen2=$POWER

mpirun -ppn 16 -np 256 omplace -c 0-:bs=$OMP_NUM_THREADS+st=$OMP_NUM_THREADS ./ns3d_neo.out ns3d.i > logfile.16N.mid_power.out.$PBS_JOBID 2>&1
EXIT_CODE=$?

# powersched epilogue
#--------------------
/usr/local/bin/powersched-debug end-job --id=$JOBID

# clean up:
#----------
rm -rf output/*
rm -rf restart_out/*

# exit:
#------
exit $EXIT_CODE
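
Both job scripts derive JOBID and NODELIST from PBS variables before calling powersched-debug. The sketch below replays those two pipelines outside of PBS to show what they produce. The job ID 123456.pbsserver and the host names node001.example and node002.example are made-up sample values; under PBS, both variables are set by the batch system.

```shell
#!/bin/bash
# Hypothetical sample values standing in for what PBS would provide.
PBS_JOBID="123456.pbsserver"
PBS_NODEFILE=$(mktemp)
# A node file lists each host once per MPI process slot.
printf '%s\n' node001.example node001.example \
              node002.example node002.example > "$PBS_NODEFILE"

# Keep only the part before the first dot: the numeric job ID.
JOBID=$(echo "$PBS_JOBID" | cut -d. -f1)

# Collapse adjacent duplicate hosts, strip the domain, join with commas.
NODELIST=$(uniq "$PBS_NODEFILE" | cut -d. -f1 | paste -sd, -)

echo "$JOBID"      # prints 123456
echo "$NODELIST"   # prints node001,node002
rm -f "$PBS_NODEFILE"
```

Note that uniq only removes adjacent duplicates, so this relies on the node file listing all entries for one host consecutively.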