gpu_hpl_sample.out 8.01 KB
================================================================================
HPLinpack 2.2  --  High-Performance Linpack benchmark  --   February 24, 2016
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :    8192 
NB     :     512 
PMAP   : Column-major process mapping
P      :       4 
Q      :       1 
PFACT  :   Right 
NBMIN  :      32 
NDIV   :       2 
RFACT  :   Right 
BCAST  :  1ringM 
DEPTH  :       1 
SWAP   : Spread-roll (long)
L1     : transposed form
U      : transposed form
EQUIL  : no
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC11R2R32        8192   512     4     1               0.71              5.167e+02
HPL_pdgesv() start time Wed Apr 22 00:00:00 2026

HPL_pdgesv() end time   Wed Apr 22 00:00:00 2026

--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
+ Max aggregated wall time rfact . . :               0.32
+ + Max aggregated wall time pfact . :               0.30
+ + Max aggregated wall time mxswp . :               0.19
Max aggregated wall time laswp . . . :               0.37
Max aggregated wall time update  . . :               0.00
Max aggregated wall time up tr sv  . :               0.00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0002689 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC11R2R32        8192   512     4     1               0.69              5.338e+02
HPL_pdgesv() start time Wed Apr 22 00:00:00 2026

HPL_pdgesv() end time   Wed Apr 22 00:00:00 2026

--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
+ Max aggregated wall time rfact . . :               0.30
+ + Max aggregated wall time pfact . :               0.30
+ + Max aggregated wall time mxswp . :               0.18
Max aggregated wall time laswp . . . :               0.36
Max aggregated wall time update  . . :               0.00
Max aggregated wall time up tr sv  . :               0.00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0002689 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC11R2R32        8192   512     4     1               0.67              5.437e+02
HPL_pdgesv() start time Wed Apr 22 00:00:00 2026

HPL_pdgesv() end time   Wed Apr 22 00:00:00 2026

--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
+ Max aggregated wall time rfact . . :               0.30
+ + Max aggregated wall time pfact . :               0.29
+ + Max aggregated wall time mxswp . :               0.18
Max aggregated wall time laswp . . . :               0.36
Max aggregated wall time update  . . :               0.00
Max aggregated wall time up tr sv  . :               0.00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0002689 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC11R2R32        8192   512     4     1               0.67              5.450e+02
HPL_pdgesv() start time Wed Apr 22 00:00:00 2026

HPL_pdgesv() end time   Wed Apr 22 00:00:00 2026

--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
+ Max aggregated wall time rfact . . :               0.30
+ + Max aggregated wall time pfact . :               0.29
+ + Max aggregated wall time mxswp . :               0.18
Max aggregated wall time laswp . . . :               0.36
Max aggregated wall time update  . . :               0.00
Max aggregated wall time up tr sv  . :               0.00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0002689 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC11R2R32        8192   512     4     1               0.68              5.391e+02
HPL_pdgesv() start time Wed Apr 22 00:00:00 2026

HPL_pdgesv() end time   Wed Apr 22 00:00:00 2026

--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
+ Max aggregated wall time rfact . . :               0.30
+ + Max aggregated wall time pfact . :               0.29
+ + Max aggregated wall time mxswp . :               0.18
Max aggregated wall time laswp . . . :               0.36
Max aggregated wall time update  . . :               0.00
Max aggregated wall time up tr sv  . :               0.00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0002689 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC11R2R32        8192   512     4     1               0.69              5.330e+02
HPL_pdgesv() start time Wed Apr 22 00:00:00 2026

HPL_pdgesv() end time   Wed Apr 22 00:00:00 2026

--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV--VVV-
+ Max aggregated wall time rfact . . :               0.31
+ + Max aggregated wall time pfact . :               0.30
+ + Max aggregated wall time mxswp . :               0.19
Max aggregated wall time laswp . . . :               0.36
Max aggregated wall time update  . . :               0.00
Max aggregated wall time up tr sv  . :               0.00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0002689 ...... PASSED
================================================================================

Finished      6 tests with the following results:
              6 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================