So is MatAssemblyBegin's time included in KSPSolve? If not, does it
mean that there's something wrong about my MatAssemblyBegin?
Thank you
For 1 processor:
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -
r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance
Summary: ----------------------------------------------
./a.out on a atlas3 named atlas3-c28 with 1 processor, by g0306332
Fri Apr 18 08:46:11 2008
Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST
2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b
Max Max/Min Avg Total
Time (sec): 1.322e+02 1.00000 1.322e+02
Objects: 2.200e+01 1.00000 2.200e+01
Flops: 2.242e+08 1.00000 2.242e+08 2.242e+08
Flops/sec: 1.696e+06 1.00000 1.696e+06 1.696e+06
MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Reductions: 2.100e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type
(multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of
length N --> 2N flops
and VecAXPY() for complex vectors of
length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- ---
Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts
%Total Avg %Total counts %Total
0: Main Stage: 1.3217e+02 100.0% 2.2415e+08 100.0% 0.000e
+00 0.0% 0.000e+00 0.0% 2.100e+01 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on
interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops/sec: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all
processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with
PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in
this phase
%M - percent messages in this phase %L - percent message
lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max
time over all processors)
------------------------------------------------------------------------------------------------------------------------
##########################################################
# #
# WARNING!!! #
# #
# This code was run without the PreLoadBegin() #
# macros. To get timing results we always recommend #
# preloading. otherwise timing numbers may be #
# meaningless. #
##########################################################
Event Count Time (sec) Flops/
sec --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg
len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 6 1.0 1.8572e-01 1.0 3.77e+08 1.0 0.0e+00 0.0e
+00 0.0e+00 0 31 0 0 0 0 31 0 0 0 377
MatConvert 1 1.0 1.1636e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatAssemblyBegin 1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 1 1.0 8.8531e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRow 1296000 1.0 2.6576e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 1.0 4.0531e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatZeroEntries 1 1.0 4.4700e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 6 1.0 2.1104e-01 1.0 5.16e+08 1.0 0.0e+00 0.0e
+00 6.0e+00 0 49 0 0 29 0 49 0 0 29 516
KSPSetup 1 1.0 6.5601e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 1.2883e+01 1.0 1.74e+07 1.0 0.0e+00 0.0e
+00 1.5e+01 10100 0 0 71 10100 0 0 71 17
PCSetUp 1 1.0 4.4342e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 2.0e+00 3 0 0 0 10 3 0 0 0 10 0
PCApply 7 1.0 7.7337e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 6 0 0 0 0 6 0 0 0 0 0
VecMDot 6 1.0 9.8586e-02 1.0 5.52e+08 1.0 0.0e+00 0.0e
+00 6.0e+00 0 24 0 0 29 0 24 0 0 29 552
VecNorm 7 1.0 6.9757e-02 1.0 2.60e+08 1.0 0.0e+00 0.0e
+00 7.0e+00 0 8 0 0 33 0 8 0 0 33 260
VecScale 7 1.0 2.9803e-02 1.0 3.04e+08 1.0 0.0e+00 0.0e
+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 304
VecCopy 1 1.0 6.1009e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 9 1.0 3.1438e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 1 1.0 7.5161e-03 1.0 3.45e+08 1.0 0.0e+00 0.0e
+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 345
VecMAXPY 7 1.0 1.4444e-01 1.0 4.85e+08 1.0 0.0e+00 0.0e
+00 0.0e+00 0 31 0 0 0 0 31 0 0 0 485
VecAssemblyBegin 2 1.0 4.2915e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 6.0e+00 0 0 0 0 29 0 0 0 0 29 0
VecAssemblyEnd 2 1.0 6.9141e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 7 1.0 9.9603e-02 1.0 2.73e+08 1.0 0.0e+00 0.0e
+00 7.0e+00 0 12 0 0 33 0 12 0 0 33 273
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants'
Mem.
--- Event Stage 0: Main Stage
Matrix 1 1 98496004 0
Krylov Solver 1 1 17216 0
Preconditioner 1 1 272 0
Vec 19 19 186638392 0
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
======================================================================
Average time to get PetscTime(): 9.53674e-08
OptionTable: -log_summary
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
sizeof(PetscScalar) 8
Configure run at: Wed Jan 9 14:33:02 2008
Configure options: --with-cc=icc --with-fc=ifort --with-x=0 --with-
blas-lapack-dir=/opt/intel/cmkl/8.1.1/lib/em64t --with-shared --with-
mpi-dir=/lsftmp/g0306332/mpich2/ --with-debugging=0 --with-hypre-
dir=/home/enduser/g0306332/lib/hypre_shared
-----------------------------------------
Libraries compiled on Wed Jan 9 14:33:36 SGT 2008 on atlas3-c01
Machine characteristics: Linux atlas3-c01 2.6.9-42.ELsmp #1 SMP Wed
Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /home/enduser/g0306332/petsc-2.3.3-p8
Using PETSc arch: atlas3
-----------------------------------------
Using C compiler: icc -fPIC -O
for 4 processors
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -
r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance
Summary: ----------------------------------------------
./a.out on a atlas3-mp named atlas3-c23 with 4 processors, by
g0306332 Fri Apr 18 08:22:11 2008
Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST
2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b
Max Max/Min Avg Total
0.000000000000000E+000 58.1071298622710
0.000000000000000E+000 58.1071298622710
0.000000000000000E+000 58.1071298622710
0.000000000000000E+000 58.1071298622710
Time (sec): 3.308e+02 1.00177 3.305e+02
Objects: 2.900e+01 1.00000 2.900e+01
Flops: 5.605e+07 1.00026 5.604e+07 2.242e+08
Flops/sec: 1.697e+05 1.00201 1.695e+05 6.782e+05
MPI Messages: 1.400e+01 2.00000 1.050e+01 4.200e+01
MPI Message Lengths: 1.248e+05 2.00000 8.914e+03 3.744e+05
MPI Reductions: 7.500e+00 1.00000
Flop counting convention: 1 flop = 1 real number operation of type
(multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of
length N --> 2N flops
and VecAXPY() for complex vectors of
length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- ---
Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts
%Total Avg %Total counts %Total
0: Main Stage: 3.3051e+02 100.0% 2.2415e+08 100.0% 4.200e+01
100.0% 8.914e+03 100.0% 3.000e+01 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on
interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops/sec: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all
processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with
PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in
this phase
%M - percent messages in this phase %L - percent message
lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max
time over all processors)
------------------------------------------------------------------------------------------------------------------------
##########################################################
# #
# WARNING!!! #
# #
# This code was run without the PreLoadBegin() #
# macros. To get timing results we always recommend #
# preloading. otherwise timing numbers may be #
# meaningless. #
# preloading. otherwise timing numbers may be #
# meaningless. #
##########################################################
Event Count Time (sec) Flops/
sec --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg
len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 6 1.0 8.2640e-02 1.6 3.37e+08 1.6 3.6e+01 9.6e
+03 0.0e+00 0 31 86 92 0 0 31 86 92 0 846
MatConvert 1 1.0 2.1472e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 1 1.0 1.5158e+022254.7 0.00e+00 0.0 0.0e+00
0.0e+00 2.0e+00 22 0 0 0 7 22 0 0 0 7 0
MatAssemblyEnd 1 1.0 1.5766e-01 1.1 0.00e+00 0.0 6.0e+00 4.8e
+03 7.0e+00 0 0 14 8 23 0 0 14 8 23 0
MatGetRow 324000 1.0 8.9608e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 2 1.0 5.9605e-06 2.8 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatZeroEntries 1 1.0 5.8902e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 6 1.0 1.1247e-01 1.7 4.11e+08 1.7 0.0e+00 0.0e
+00 6.0e+00 0 49 0 0 20 0 49 0 0 20 968
KSPSetup 1 1.0 1.5483e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 4.7041e+00 1.0 1.19e+07 1.0 3.6e+01 9.6e
+03 1.5e+01 1100 86 92 50 1100 86 92 50 48
PCSetUp 1 1.0 1.5953e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 2.0e+00 0 0 0 0 7 0 0 0 0 7 0
PCApply 7 1.0 2.6580e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
VecMDot 6 1.0 7.3443e-02 2.2 4.13e+08 2.2 0.0e+00 0.0e
+00 6.0e+00 0 24 0 0 20 0 24 0 0 20 741
VecNorm 7 1.0 2.5193e-01 1.1 1.94e+07 1.1 0.0e+00 0.0e
+00 7.0e+00 0 8 0 0 23 0 8 0 0 23 72
VecScale 7 1.0 6.6319e-03 2.8 9.64e+08 2.8 0.0e+00 0.0e
+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 1368
VecCopy 1 1.0 2.3100e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 9 1.0 1.4173e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 1 1.0 2.9502e-03 1.7 3.72e+08 1.7 0.0e+00 0.0e
+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 879
VecMAXPY 7 1.0 4.9046e-02 1.4 5.09e+08 1.4 0.0e+00 0.0e
+00 0.0e+00 0 31 0 0 0 0 31 0 0 0 1427
VecAssemblyBegin 2 1.0 4.3297e-04 3.1 0.00e+00 0.0 0.0e+00 0.0e
+00 6.0e+00 0 0 0 0 20 0 0 0 0 20 0
VecAssemblyEnd 2 1.0 5.2452e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e
+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 6 1.0 6.9666e-04 6.3 0.00e+00 0.0 3.6e+01 9.6e
+03 0.0e+00 0 0 86 92 0 0 0 86 92 0 0
VecScatterEnd 6 1.0 1.4806e-02102.6 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 7 1.0 2.5431e-01 1.1 2.86e+07 1.1 0.0e+00 0.0e
+00 7.0e+00 0 12 0 0 23 0 12 0 0 23 107
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants'
Mem.
--- Event Stage 0: Main Stage
Matrix 3 3 49252812 0
Krylov Solver 1 1 17216 0
Preconditioner 1 1 272 0
Index Set 2 2 5488 0
Vec 21 21 49273624 0
Vec Scatter 1 1 0 0
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
=
======================================================================
Average time to get PetscTime(): 1.90735e-07
Average time for MPI_Barrier(): 5.62668e-06
Average time for zero size MPI_Send(): 6.73532e-06
OptionTable: -log_summary
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
sizeof(PetscScalar) 8
Configure run at: Tue Jan 8 22:22:08 2008