Re: Load Balancing and KSPSolve
Satish,
Logs attached...hope they help.
Thanks,
Tim.
Satish Balay wrote:
Can you send the -log_summary for your runs [say p=1, p=8]
Satish
On Tue, 20 Nov 2007, Tim Stitt wrote:
Hi all (again),
I finally got some data back from the KSP PETSc code that I put together to
solve this sparse inverse matrix problem I was looking into. Ideally I am
aiming for an O(N) (time complexity) approach to getting the first 'k' columns
of the inverse of a sparse matrix.
To recap the method: I have my solver which uses KSPSolve in a loop that
iterates over the first k columns of an identity matrix B and computes the
corresponding x vector.
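The loop described above can be sketched in a few lines. This is only an illustration of the idea (factor once, then reuse the factors for each unit-vector right-hand side); it uses a small dense LU in pure Python rather than PETSc or MUMPS, and the names lu_factor, lu_solve and first_k_inverse_columns are hypothetical, not taken from the actual solver code.

```python
def lu_factor(a):
    """Dense LU with partial pivoting. Returns (lu, perm) with P*A = L*U."""
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy
    perm = list(range(n))               # row i of lu came from row perm[i] of a
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(lu[r][col]))
        if lu[pivot][col] == 0:
            raise ValueError("matrix is singular")
        if pivot != col:
            lu[col], lu[pivot] = lu[pivot], lu[col]
            perm[col], perm[pivot] = perm[pivot], perm[col]
        for r in range(col + 1, n):
            lu[r][col] /= lu[col][col]  # multiplier, stored in the L part
            for c in range(col + 1, n):
                lu[r][c] -= lu[r][col] * lu[col][c]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b using the factors from lu_factor."""
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]  # apply the row permutation to b
    for i in range(n):                  # forward substitution, unit lower L
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):        # back substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def first_k_inverse_columns(a, k):
    """Columns 0..k-1 of inv(a): one factorisation, k triangular solves."""
    n = len(a)
    lu, perm = lu_factor(a)
    columns = []
    for j in range(k):
        e_j = [1.0 if i == j else 0.0 for i in range(n)]  # column j of identity B
        columns.append(lu_solve(lu, perm, e_j))
    return columns


if __name__ == "__main__":
    a = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    for col in first_k_inverse_columns(a, 2):
        print(col)
```

The point of the sketch is only that the factorisation cost is paid once, so the total time is dominated by the k solves; the dense loops here have nothing like the cost profile of a sparse direct solver.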
I am just a bit curious about some of the timings I am obtaining...which I
hope someone can explain. Here are the timings I obtained for a global sparse
matrix (4704 x 4704) and solving for the first 1176 columns in the identity
using P processes (processors) on our cluster.
(Timings are given in seconds for each process performing work in the loop and
were obtained by encapsulating the loop with the cpu_time() Fortran intrinsic.
The MUMPS package was requested for factorisation/solving, although similar
timings were obtained for both the native solver and SUPERLU.)
P=1 [30.92]
P=2 [15.47, 15.54]
P=4 [4.68, 5.49, 4.67, 5.07]
P=8 [2.36, 4.23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15]
P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, 0.25,
0.43, 1.09, 1.08, 1.1]
Firstly, I notice very good scalability up to 16 processes...is this expected
(by those people who use these solvers regularly)?
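To put a number on "very good scalability", here is a small sketch that turns the timings listed above into speedup and parallel efficiency. It assumes the slowest process at each P is a fair stand-in for the wall-clock time of the loop, and it reads the second P=8 entry as 4.23; speedup_table is a hypothetical helper name.

```python
# Per-process loop timings in seconds, copied from the list above.
TIMINGS = {
    1: [30.92],
    2: [15.47, 15.54],
    4: [4.68, 5.49, 4.67, 5.07],
    8: [2.36, 4.23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15],
    16: [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06,
         0.29, 0.34, 0.73, 0.25, 0.43, 1.09, 1.08, 1.1],
}


def speedup_table(timings):
    """Return rows of (P, slowest time, speedup, efficiency).

    Speedup is measured against the single-process time, using the
    slowest process at each P as the effective run time.
    """
    base = max(timings[1])
    rows = []
    for p in sorted(timings):
        slowest = max(timings[p])
        speedup = base / slowest
        rows.append((p, slowest, speedup, speedup / p))
    return rows


if __name__ == "__main__":
    for p, slowest, speedup, eff in speedup_table(TIMINGS):
        print(f"P={p:2d}  slowest={slowest:6.2f}s  "
              f"speedup={speedup:6.2f}  efficiency={eff:5.2f}")
```

By this measure the efficiency comes out above 1 at P=4 and P=16 and slightly below 1 at P=2 and P=8. Efficiency above 1 can come from cache effects, but it can also mean the amount of work per process is not the same from run to run, so it may be worth checking before reading it as pure parallel gain.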
Also I notice that the timings per process vary as we scale up. Is this a
load-balancing problem related to more non-zero values being on a given
processor than others? Once again is this expected?
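The spread between processes can be summarised with two simple ratios. This is a sketch only: imbalance is a hypothetical helper, and max/min and max/mean are just two common ways of expressing load imbalance, not something PETSc reports for these loop timings.

```python
def imbalance(times):
    """Summarise how unevenly a list of per-process times is spread."""
    slowest = max(times)
    fastest = min(times)
    mean = sum(times) / len(times)
    return {
        "max_over_min": slowest / fastest,  # 1.0 means perfectly even
        "max_over_mean": slowest / mean,    # time lost waiting on the slowest rank
    }


# Per-process timings in seconds from the P=4, P=8 and P=16 runs above
# (the second P=8 entry is read as 4.23).
P4 = [4.68, 5.49, 4.67, 5.07]
P8 = [2.36, 4.23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15]
P16 = [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06,
       0.29, 0.34, 0.73, 0.25, 0.43, 1.09, 1.08, 1.1]

if __name__ == "__main__":
    for label, times in (("P=4", P4), ("P=8", P8), ("P=16", P16)):
        stats = imbalance(times)
        print(label, {k: round(v, 2) for k, v in stats.items()})
```

On these numbers the max/min ratio grows from about 1.2 at P=4 to about 3 at P=8 and above 4 at P=16, which is the variation being asked about.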
Please excuse my ignorance of matters relating to these solvers and their
operation...as it really isn't my field of expertise.
Regards,
Tim.
--
Dr. Timothy Stitt <timothy_dot_stitt_at_ichec.ie>
HPC Application Consultant - ICHEC (www.ichec.ie)
Dublin Institute for Advanced Studies
5 Merrion Square - Dublin 2 - Ireland
+353-1-6621333 (tel) / +353-1-6621477 (fax)
Creating /localscratch/pbstmp.159913.l2cu33.ichec.ie
Working directory is /ichec/work/staff/tstitt/SolverCode
Running with 1 processes
Matrix has order 4704 rows by 4704 columns
Number of RHS is: 4704
Master Solve Time is: 110.194252
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./solver on a pathscale named h2au40 with 1 processor, by tstitt Tue Nov 20 20:32:25 2007
Using Petsc Release Version 2.3.3, Patch 7, Fri Oct 26 14:21:35 CDT 2007 HG revision: 2e223033ba960114833e1f9713ab393ec78c056f
Max Max/Min Avg Total
Time (sec): 1.173e+02 1.00000 1.173e+02
Objects: 1.100e+01 1.00000 1.100e+01
Flops: 9.561e+10 1.00000 9.561e+10 9.561e+10
Flops/sec: 8.152e+08 1.00000 8.152e+08 8.152e+08
MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Reductions: 5.000e+00 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 1.1728e+02 100.0% 9.5606e+10 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 5.000e+00 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops/sec: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
##########################################################
# #
# WARNING!!! #
# #
# The code for various complex numbers numerical #
# kernels uses C++, which generally is not well #
# optimized. For performance that is about 4-5 times #
# faster, specify --with-fortran-kernels=generic #
# when running config/configure.py. #
# #
##########################################################
##########################################################
# #
# WARNING!!! #
# #
# This code was run without the PreLoadBegin() #
# macros. To get timing results we always recommend #
# preloading. otherwise timing numbers may be #
# meaningless. #
##########################################################
Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatSolve 4704 1.0 1.0548e+02 1.0 8.63e+08 1.0 0.0e+00 0.0e+00 0.0e+00 90 95 0 0 0 90 95 0 0 0 863
MatLUFactorNum 1 1.0 4.4833e+00 1.0 1.01e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 5 0 0 0 4 5 0 0 0 1011
MatILUFactorSym 1 1.0 5.4705e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 20 0 0 0 0 20 0
MatAssemblyBegin 1 1.0 5.9605e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 1 1.0 5.0177e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 1.0 1.0262e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 7.6232e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 80 0 0 0 0 80 0
VecSet 4704 1.0 9.1822e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyBegin 4704 1.0 7.7252e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 4704 1.0 9.2847e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetup 1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 4704 1.0 1.1025e+02 1.0 8.67e+08 1.0 0.0e+00 0.0e+00 5.0e+00 94100 0 0100 94100 0 0100 867
PCSetUp 1 1.0 4.6143e+00 1.0 9.83e+08 1.0 0.0e+00 0.0e+00 5.0e+00 4 5 0 0100 4 5 0 0100 983
PCApply 4704 1.0 1.0551e+02 1.0 8.63e+08 1.0 0.0e+00 0.0e+00 0.0e+00 90 95 0 0 0 90 95 0 0 0 863
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
--- Event Stage 0: Main Stage
Matrix 2 2 155965032 0
Index Set 5 5 86120 0
Vec 2 2 151872 0
Krylov Solver 1 1 0 0
Preconditioner 1 1 168 0
========================================================================================================================
Average time to get PetscTime(): 1.50204e-06
OptionTable: -log_summary
OptionTable: -mat_type aijmumps
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 16
Configure run at: Thu Nov 15 23:52:44 2007
Configure options: --with-cxx=mpiCC --with-cc=mpicc --with-mpi-dir=/usr/local/mpich2/path3.0/ --with-blas-lib=/opt/packages/path-compat/acml/pathscale64/lib/libacml.a --with-lapack-lib=/opt/packages/path-compat/acml/pathscale64/lib/libacml.a --with-timer=mpi --with-fc=mpif77 --download-mumps=1 --download-scalapack=1 --download-superlu_dist=1 --download-superlu=1 --with-shared=0 --CXXOPTFLAGS=-fast --FOPTFLAGS=-fast --COPTFLAGS=-fast --download-blacs=1 --with-scalar-type=complex --with-debugging=0 --download-spooles=1
-----------------------------------------
Libraries compiled on Thu Nov 15 23:52:50 GMT 2007 on l2cu28
Machine characteristics: Linux l2cu28 2.6.5-7.287.3-smp_perfctr #3 SMP Wed Oct 17 21:27:48 BST 2007 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /ichec/work/staff/tstitt/petsc-2.3.3-p7
Using PETSc arch: pathscale_O3
-----------------------------------------
Using C compiler: mpicc -fPIC
Using Fortran compiler: mpif77 -fPIC
-----------------------------------------
Using include paths: -I/ichec/work/staff/tstitt/petsc-2.3.3-p7 -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/bmake/pathscale_O3 -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/include -I/usr/X11R6/include -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/MUMPS_4.7.3/pathscale_O3/include -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SCALAPACK/pathscale_O3/include -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/blacs-dev/pathscale_O3/include -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_DIST_2.0-Jan_5_2006/pathscale_O3/SRC -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/spooles-2.2/pathscale_O3/ -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_3.0-Jan_5_2006/pathscale_O3/SRC -I/usr/local/mpich2/path3.0/include
------------------------------------------
Using C linker: mpicc -fPIC
Using Fortran linker: mpif77 -fPIC
Using libraries: -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/lib/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/lib/pathscale_O3 -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -L/usr/X11R6/lib64 -lX11 -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/MUMPS_4.7.3/pathscale_O3/lib -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/MUMPS_4.7.3/pathscale_O3/lib -lcmumps -ldmumps -lsmumps -lzmumps -lpord -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SCALAPACK/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SCALAPACK/pathscale_O3 -lscalapack -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/blacs-dev/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/blacs-dev/pathscale_O3 -lblacs -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_DIST_2.0-Jan_5_2006/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_DIST_2.0-Jan_5_2006/pathscale_O3 -lsuperlu_dist_2.0 /ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/spooles-2.2/pathscale_O3/MPI/src/spoolesMPI.a /ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/spooles-2.2/pathscale_O3/spooles.a -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_3.0-Jan_5_2006/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_3.0-Jan_5_2006/pathscale_O3 -lsuperlu_3.0 -Wl,-rpath,/opt/packages/path-compat/acml/pathscale64/lib -L/opt/packages/path-compat/acml/pathscale64/lib -lacml -Wl,-rpath,/opt/packages/path-compat/acml/pathscale64/lib -L/opt/packages/path-compat/acml/pathscale64/lib -lacml -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -L/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -L/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 
-Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -L/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -L/usr/lib/../lib64 -ldl -lpmpich -lmpich -lpthread -lrt -lpscrt -lgcc_eh -lpathfstart -lpathfortran -lmv -lmpath -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -L/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -L/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. 
-Wl,-rpath,/lib/../lib64 -L/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -L/usr/lib/../lib64 -ldl -lpmpich -lmpich -lpthread -lrt -lpscrt -lgcc_eh -lpathfstart -lpathfortran -lmv -lmpath -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -lm -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -L/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -L/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -L/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -L/usr/lib/../lib64 -ldl -lpmpich -lmpich -lpthread -lrt -lpscrt -lgcc_eh -lpathfstart -lpathfortran -lmv -lmpath -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. 
-Wl,-rpath,/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -L/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -L/opt/packages/pathscale3.0/lib/3.0 -ldl -lpmpich -lmpich -lpthread -lrt -lpscrt -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -L/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -L/usr/lib/../lib64 -lgcc_eh -ldl
------------------------------------------
Deleting /localscratch/pbstmp.159913.l2cu33.ichec.ie
Creating /localscratch/pbstmp.159912.l2cu33.ichec.ie
Working directory is /ichec/work/staff/tstitt/SolverCode
Running with 8 processes
Matrix has order 4704 rows by 4704 columns
Number of RHS is: 4704
Worker Solve Time is: 6.86795616
Master Solve Time is: 8.66668129
Worker Solve Time is: 8.85465431
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./solver on a pathscale named h3cu06 with 8 processors, by tstitt Tue Nov 20 20:30:38 2007
Using Petsc Release Version 2.3.3, Patch 7, Fri Oct 26 14:21:35 CDT 2007 HG revision: 2e223033ba960114833e1f9713ab393ec78c056f
Worker Solve Time is: 10.0444736
Worker Solve Time is: 13.47995
Worker Solve Time is: 12.8490467
Worker Solve Time is: 16.845438
Worker Solve Time is: 11.9151878
Max Max/Min Avg Total
Time (sec): 2.533e+01 1.00016 2.533e+01
Objects: 2.400e+01 1.00000 2.400e+01
Flops: 7.214e+09 1.89167 5.406e+09 4.325e+10
Flops/sec: 2.847e+08 1.89142 2.134e+08 1.707e+09
MPI Messages: 4.000e+00 1.33333 3.500e+00 2.800e+01
MPI Message Lengths: 6.336e+03 1.20824 1.676e+03 4.693e+04
MPI Reductions: 1.766e+03 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.5333e+01 100.0% 4.3245e+10 100.0% 2.800e+01 100.0% 1.676e+03 100.0% 1.413e+04 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops/sec: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
##########################################################
# #
# WARNING!!! #
# #
# The code for various complex numbers numerical #
# kernels uses C++, which generally is not well #
# optimized. For performance that is about 4-5 times #
# faster, specify --with-fortran-kernels=generic #
# when running config/configure.py. #
# #
##########################################################
##########################################################
# #
# WARNING!!! #
# #
# This code was run without the PreLoadBegin() #
# macros. To get timing results we always recommend #
# preloading. otherwise timing numbers may be #
# meaningless. #
##########################################################
Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatSolve 4704 1.0 1.6348e+01 2.6 6.21e+08 1.8 0.0e+00 0.0e+00 0.0e+00 42 98 0 0 0 42 98 0 0 0 2590
MatLUFactorNum 1 1.0 1.6377e-01 4.7 1.46e+09 1.2 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 5479
MatILUFactorSym 1 1.0 4.3809e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 1 1.0 4.8079e-02101.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 1 1.0 4.6219e-02 1.2 0.00e+00 0.0 2.8e+01 1.7e+03 7.0e+00 0 0100100 0 0 0100100 0 0
MatGetRowIJ 1 1.0 5.0068e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 1.0180e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 9408 1.0 3.2616e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyBegin 4704 1.0 1.2538e+01 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+04 32 0 0 0100 32 0 0 0100 0
VecAssemblyEnd 4704 1.0 1.8426e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetup 2 1.0 4.7684e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 4704 1.0 1.6674e+01 2.5 6.19e+08 1.8 0.0e+00 0.0e+00 5.0e+00 43100 0 0 0 43100 0 0 0 2594
PCSetUp 2 1.0 1.6838e-01 4.5 1.36e+09 1.2 0.0e+00 0.0e+00 5.0e+00 0 2 0 0 0 0 2 0 0 0 5329
PCSetUpOnBlocks 4704 1.0 1.8029e-01 3.7 1.14e+09 1.1 0.0e+00 0.0e+00 5.0e+00 0 2 0 0 0 0 2 0 0 0 4977
PCApply 4704 1.0 1.6482e+01 2.6 6.15e+08 1.8 0.0e+00 0.0e+00 0.0e+00 42 98 0 0 0 42 98 0 0 0 2569
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
--- Event Stage 0: Main Stage
Matrix 6 4 29267076 0
Index Set 7 7 17312 0
Vec 6 6 41352 0
Vec Scatter 1 1 0 0
Krylov Solver 2 2 0 0
Preconditioner 2 2 256 0
========================================================================================================================
Average time to get PetscTime(): 2.40803e-06
Average time for MPI_Barrier(): 0.00820498
Average time for zero size MPI_Send(): 0.000125885
OptionTable: -log_summary
OptionTable: -mat_type aijmumps
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 16
Configure run at: Thu Nov 15 23:52:44 2007
Configure options: --with-cxx=mpiCC --with-cc=mpicc --with-mpi-dir=/usr/local/mpich2/path3.0/ --with-blas-lib=/opt/packages/path-compat/acml/pathscale64/lib/libacml.a --with-lapack-lib=/opt/packages/path-compat/acml/pathscale64/lib/libacml.a --with-timer=mpi --with-fc=mpif77 --download-mumps=1 --download-scalapack=1 --download-superlu_dist=1 --download-superlu=1 --with-shared=0 --CXXOPTFLAGS=-fast --FOPTFLAGS=-fast --COPTFLAGS=-fast --download-blacs=1 --with-scalar-type=complex --with-debugging=0 --download-spooles=1
-----------------------------------------
Libraries compiled on Thu Nov 15 23:52:50 GMT 2007 on l2cu28
Machine characteristics: Linux l2cu28 2.6.5-7.287.3-smp_perfctr #3 SMP Wed Oct 17 21:27:48 BST 2007 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /ichec/work/staff/tstitt/petsc-2.3.3-p7
Using PETSc arch: pathscale_O3
-----------------------------------------
Using C compiler: mpicc -fPIC
Using Fortran compiler: mpif77 -fPIC
-----------------------------------------
Using include paths: -I/ichec/work/staff/tstitt/petsc-2.3.3-p7 -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/bmake/pathscale_O3 -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/include -I/usr/X11R6/include -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/MUMPS_4.7.3/pathscale_O3/include -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SCALAPACK/pathscale_O3/include -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/blacs-dev/pathscale_O3/include -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_DIST_2.0-Jan_5_2006/pathscale_O3/SRC -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/spooles-2.2/pathscale_O3/ -I/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_3.0-Jan_5_2006/pathscale_O3/SRC -I/usr/local/mpich2/path3.0/include
------------------------------------------
Using C linker: mpicc -fPIC
Using Fortran linker: mpif77 -fPIC
Using libraries: -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/lib/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/lib/pathscale_O3 -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -L/usr/X11R6/lib64 -lX11 -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/MUMPS_4.7.3/pathscale_O3/lib -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/MUMPS_4.7.3/pathscale_O3/lib -lcmumps -ldmumps -lsmumps -lzmumps -lpord -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SCALAPACK/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SCALAPACK/pathscale_O3 -lscalapack -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/blacs-dev/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/blacs-dev/pathscale_O3 -lblacs -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_DIST_2.0-Jan_5_2006/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_DIST_2.0-Jan_5_2006/pathscale_O3 -lsuperlu_dist_2.0 /ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/spooles-2.2/pathscale_O3/MPI/src/spoolesMPI.a /ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/spooles-2.2/pathscale_O3/spooles.a -Wl,-rpath,/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_3.0-Jan_5_2006/pathscale_O3 -L/ichec/work/staff/tstitt/petsc-2.3.3-p7/externalpackages/SuperLU_3.0-Jan_5_2006/pathscale_O3 -lsuperlu_3.0 -Wl,-rpath,/opt/packages/path-compat/acml/pathscale64/lib -L/opt/packages/path-compat/acml/pathscale64/lib -lacml -Wl,-rpath,/opt/packages/path-compat/acml/pathscale64/lib -L/opt/packages/path-compat/acml/pathscale64/lib -lacml -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -L/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -L/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 
-Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -L/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -L/usr/lib/../lib64 -ldl -lpmpich -lmpich -lpthread -lrt -lpscrt -lgcc_eh -lpathfstart -lpathfortran -lmv -lmpath -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -L/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -L/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. 
-Wl,-rpath,/lib/../lib64 -L/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -L/usr/lib/../lib64 -ldl -lpmpich -lmpich -lpthread -lrt -lpscrt -lgcc_eh -lpathfstart -lpathfortran -lmv -lmpath -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -lm -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -L/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -L/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -L/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -L/usr/lib/../lib64 -ldl -lpmpich -lmpich -lpthread -lrt -lpscrt -lgcc_eh -lpathfstart -lpathfortran -lmv -lmpath -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. 
-Wl,-rpath,/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -lm -lm -Wl,-rpath,/usr/local/mpich2/path3.0/lib64 -L/usr/local/mpich2/path3.0/lib64 -Wl,-rpath,/opt/packages/pathscale3.0/lib/3.0 -L/opt/packages/pathscale3.0/lib/3.0 -ldl -lpmpich -lmpich -lpthread -lrt -lpscrt -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../x86_64-suse-linux/lib -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../../../lib64 -Wl,-rpath,/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -L/usr/lib64/gcc-lib/x86_64-suse-linux/3.3.3/../../.. -Wl,-rpath,/lib/../lib64 -L/lib/../lib64 -Wl,-rpath,/usr/lib/../lib64 -L/usr/lib/../lib64 -lgcc_eh -ldl
------------------------------------------
Deleting /localscratch/pbstmp.159912.l2cu33.ichec.ie
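For anyone comparing the two summaries above, the following sketch pulls the maximum time per event out of -log_summary rows so the 1-process and 8-process runs can be set side by side. The rows below are copied from the two event tables but truncated after the first few fields; event_times and the ROW pattern are hypothetical helpers, written against the 2.3.3 layout shown here and not guaranteed to fit other PETSc versions.

```python
import re

# Leading fields of an event row: name, count, count ratio, max time.
# The max time is printed with a two-digit exponent, so the pattern still
# works when the following ratio is fused onto it ("4.8079e-02101.2").
ROW = re.compile(r"^(\w+)\s+(\d+)\s+[\d.]+\s+(\d\.\d+e[+-]\d{2})")


def event_times(log_text):
    """Map event name -> max time in seconds for every row that parses."""
    times = {}
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            times[match.group(1)] = float(match.group(3))
    return times


P1_ROWS = """\
MatSolve 4704 1.0 1.0548e+02 1.0 8.63e+08 1.0
VecAssemblyBegin 4704 1.0 7.7252e-03 1.0 0.00e+00 0.0
KSPSolve 4704 1.0 1.1025e+02 1.0 8.67e+08 1.0
"""

P8_ROWS = """\
MatSolve 4704 1.0 1.6348e+01 2.6 6.21e+08 1.8
MatAssemblyBegin 1 1.0 4.8079e-02101.2 0.00e+00 0.0
VecAssemblyBegin 4704 1.0 1.2538e+01 5.2 0.00e+00 0.0
KSPSolve 4704 1.0 1.6674e+01 2.5 6.19e+08 1.8
"""

if __name__ == "__main__":
    p1 = event_times(P1_ROWS)
    p8 = event_times(P8_ROWS)
    for name in sorted(p1.keys() & p8.keys()):
        print(f"{name:18s} P=1 {p1[name]:10.4f}s   P=8 {p8[name]:10.4f}s   "
              f"ratio {p1[name] / p8[name]:8.3f}")
```

Two things stand out in the logs themselves: VecAssemblyBegin accounts for 32% of the time in the 8-process run (with 1.4e+04 reductions) while it is negligible on one process, and PCSetUpOnBlocks appears only in the 8-process table. The second point may indicate that the two runs are not applying the same preconditioner, although the logs alone do not settle that.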