Title: C++ Design Techniques for High Performance
1. C++ Design Techniques for High Performance
- Todd Veldhuizen
- Presented by Ganesh Bikshandi
2. C++ Goals
- Library development
- Data Abstraction
- vector<int>, Stack, Queue, etc.
- Object-oriented Programming
- Virtual functions, inheritance
- Safety
- Strict type conformance
- Parameterized types
- Unbounded set of 'related' types
- List<int>, List<double>
- Class Templates
- Function Templates
    template <class T>
    class Array { /* ... */ };

    template <>
    class Array<double> { /* ... */ };   // specialization for double

    int main() {
        Array<int> a;
        Array<double> b;
    }
    template <class T, int rank>
    class Array { /* ... */ };

    template <>
    class Array<double, 2> { /* ... */ };   // specialization for rank-2 double arrays

    int main() {
        Array<int, 3> a;
        Array<double, 2> b;
    }
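As a rough illustration of why a non-type parameter such as `rank` is useful, the sketch below stores one extent per dimension in a fixed-size member, so the rank is known to the compiler and needs no heap storage. The members `dimensions`, `setExtent`, `size` and `extents_` are hypothetical and not taken from the slides.

```cpp
#include <cstddef>

// Hypothetical sketch: the element type T and the rank are both template
// parameters, so Array<int, 3> and Array<double, 2> are distinct types.
template <class T, int rank>
class Array {
public:
    enum { dimensions = rank };            // known at compile time

    Array() {
        for (int d = 0; d < rank; ++d) extents_[d] = 0;
    }

    void setExtent(int d, std::size_t n) { extents_[d] = n; }

    // Total number of elements: the product of the extents.
    std::size_t size() const {
        std::size_t n = 1;
        for (int d = 0; d < rank; ++d) n *= extents_[d];
        return n;
    }

private:
    std::size_t extents_[rank];            // one extent per dimension
};
```

Because `rank` is a compile-time constant, the loops over dimensions have fixed trip counts that a compiler may unroll.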
6. Virtual Functions
    struct base {
        virtual void vf1();
    };

    class derived : public base {
    public:
        void vf1();
    };

    void g(base* bp) { bp->vf1(); }

    int main() {
        derived d;
        base b;
        g(&d);   // calls derived::vf1
        g(&b);   // calls base::vf1
    }
7. Virtual Function mechanics (single inheritance)
[Diagram: two object layouts, each containing a vptr field]
8. Virtual Functions
    struct base {
        virtual void vf1();
    };

    class derived : public base {
    public:
        void vf1();
    };

    void g(base* bp) {
        bp->vf1();   // compiled as: (*(bp->vptr[0]))(bp)
    }

    int main() {
        derived d;
        base b;
        g(&d);   // calls derived::vf1
        g(&b);   // calls base::vf1
    }
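The dispatch expression `(*(bp->vptr[0]))(bp)` can be imitated by hand to make the mechanism concrete. The sketch below is only an emulation under simplifying assumptions (single inheritance, one virtual function). Real compilers generate the table and the hidden pointer themselves, and every name here (`VFunc`, `base_vtable`, `derived_vtable`) is invented for illustration.

```cpp
#include <string>

// Hand-written emulation of virtual dispatch; not what a compiler literally emits.
struct Base;
typedef std::string (*VFunc)(Base*);   // slot type: a function taking 'this'

struct Base {
    const VFunc* vptr;                 // one hidden pointer per object
    Base();
};

std::string base_vf1(Base*)    { return "base::vf1"; }
std::string derived_vf1(Base*) { return "derived::vf1"; }

// One table per class, shared by all objects of that class.
const VFunc base_vtable[]    = { base_vf1 };
const VFunc derived_vtable[] = { derived_vf1 };

Base::Base() : vptr(base_vtable) {}

struct Derived : Base {
    Derived() { vptr = derived_vtable; }   // the constructor repoints the vptr
};

// Equivalent of bp->vf1(): load the vptr, index slot 0, call through the pointer.
std::string g(Base* bp) {
    return (*(bp->vptr[0]))(bp);
}
```

Calling `g` with the address of a `Derived` object reaches `derived_vf1`, and with a `Base` object reaches `base_vf1`, at the price of two extra memory loads before the call.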
9. Space Cost
- Space
- One vptr per object
- One vtable per class (usually).
- Growth factor = (sizeof(object) + 1) / sizeof(object), with sizes in words (the extra word is the vptr)
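The per-object cost can be observed directly with `sizeof`. The following sketch compares two hypothetical structs that differ only in having a virtual function. The exact sizes depend on the platform and compiler. On a typical 64-bit implementation they are commonly 16 and 24 bytes, but that is an assumption, not a guarantee.

```cpp
#include <cstddef>

struct PlainPoint {              // no virtual functions: just the data
    double x, y;
};

struct VirtualPoint {            // one virtual function adds a hidden vptr
    double x, y;
    virtual ~VirtualPoint() {}
};

// Growth factor measured in bytes, as the ratio of the two object sizes.
inline double growthFactor() {
    return static_cast<double>(sizeof(VirtualPoint)) / sizeof(PlainPoint);
}
```

The smaller the object, the larger the relative growth, which is why the cost matters most for small value-like classes.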
10. Time Cost
- Direct cost
- Extra memory references/call
- Indirect cost
- Restricts inlining
- Restricts loop invariant removal
- Restricts some more optimizations
- Pipeline Stalls due to branch misprediction
    class Matrix {   // abstract class
    public:
        virtual double operator()(int i, int j) = 0;
    };

    class SymmetricMatrix : public Matrix {   // concrete class
        double operator()(int i, int j) { ... }
    };

    class UpperTriMatrix : public Matrix {   // concrete class
        double operator()(int i, int j) { ... }
    };

    double sum(Matrix& A) { ... }

    SymmetricMatrix A;
    sum(A);
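A compilable version of this design might look like the sketch below. The storage scheme (packed lower triangle), the `set` and `size` members, and the `const` qualifiers are assumptions added to make the example complete. The slide only gives the interface.

```cpp
#include <cstddef>
#include <vector>

class Matrix {                          // abstract interface
public:
    virtual ~Matrix() {}
    virtual double operator()(int i, int j) const = 0;
    virtual int size() const = 0;       // hypothetical helper: n for an n x n matrix
};

// Stores only the lower triangle; (i, j) and (j, i) share one entry.
class SymmetricMatrix : public Matrix {
public:
    explicit SymmetricMatrix(int n) : n_(n), data_(n * (n + 1) / 2, 0.0) {}
    void set(int i, int j, double v) { data_[index(i, j)] = v; }
    double operator()(int i, int j) const { return data_[index(i, j)]; }
    int size() const { return n_; }
private:
    static std::size_t index(int i, int j) {
        if (i < j) { int t = i; i = j; j = t; }   // fold onto the lower triangle
        return static_cast<std::size_t>(i) * (i + 1) / 2 + j;
    }
    int n_;
    std::vector<double> data_;
};

// Every A(i, j) below is an indirect call through the vtable.
double sum(const Matrix& A) {
    double s = 0.0;
    for (int i = 0; i < A.size(); ++i)
        for (int j = 0; j < A.size(); ++j)
            s += A(i, j);
    return s;
}
```

Since `sum` only sees a `Matrix` reference, the compiler generally cannot inline the element access inside the double loop.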
- sizeof(f) is small but frequency(f) is high: the virtual function f does little work yet is called very often
- This is not a rare occurrence
    for (int i = 0; i < 100000; i++)
        for (int j = 0; j < 100000; j++)
            sum += a(i, j);
13. Static polymorphism
    template <class T_leaftype>
    class Matrix {
    public:
        double operator()(int i, int j) { return leaf(i, j); }
    private:
        T_leaftype leaf;
    };

    class SymmetricMatrix {
    public:
        double operator()(int i, int j) { ... }
    };

    class UpperTriMatrix {
    public:
        double operator()(int i, int j) { .... }
    };

    template <class T_leaftype>
    double sum(Matrix<T_leaftype>& a) { ... }

    Matrix<SymmetricMatrix> A;
    sum(A);
14. Static polymorphism
    template <class T_leaftype>
    class Matrix {
    public:
        T_leaftype& asLeaf() { return static_cast<T_leaftype&>(*this); }
        double operator()(int i, int j) {
            return asLeaf()(i, j);   // delegate to leaf
        }
    };

    class SymmetricMatrix : public Matrix<SymmetricMatrix> { /* ... */ };
    class UpperTriMatrix : public Matrix<UpperTriMatrix> { /* ... */ };

    SymmetricMatrix A;
    sum(A);
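The pattern on this slide, where the base class is parameterized by its own derived class, can be exercised with a small self-contained sketch. The leaf below is deliberately trivial (entry (i, j) is simply i + j), and the member names `at` and `order` are hypothetical. They are used instead of a second `operator()` only to keep the delegation easy to follow.

```cpp
// Base class template: knows the leaf type at compile time.
template <class T_leaftype>
class Matrix {
public:
    const T_leaftype& asLeaf() const {
        return static_cast<const T_leaftype&>(*this);
    }
    // Non-virtual: resolved statically, so the leaf call can be inlined.
    double operator()(int i, int j) const { return asLeaf().at(i, j); }
    int size() const { return asLeaf().order(); }
};

// Hypothetical leaf: an n x n symmetric matrix whose (i, j) entry is i + j.
class SymmetricMatrix : public Matrix<SymmetricMatrix> {
public:
    explicit SymmetricMatrix(int n) : n_(n) {}
    double at(int i, int j) const { return i + j; }
    int order() const { return n_; }
private:
    int n_;
};

// One copy of sum is instantiated per leaf type; no vtable is involved.
template <class T_leaftype>
double sum(const Matrix<T_leaftype>& A) {
    double s = 0.0;
    for (int i = 0; i < A.size(); ++i)
        for (int j = 0; j < A.size(); ++j)
            s += A(i, j);
    return s;
}
```

The trade-off is that `sum` is now a template, so each leaf type produces its own instantiation, which relates to the code growth listed among the drawbacks.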
15. Usage in HTALib
    class HTA<L> : public HTA<L-1> {
        operator()(int i, int j) { return wrapped_(i, j); }
    private:
        HTAImpl<L> wrapped_;
    };

    class HTAImpl<L>   // HTA
    class HTAImpl<0>   // Leaf
16. Operator Overloading
- Enables clean syntax
- Array a = c + d
- String s = a + b
- Known restrictions
- Only valid C++ operators
- Operator can take only one argument
17. Cost of operator overloading
    Z = A + B + C
    .....
    tmp1 = A.clone()
    for (int i = 0; i < size; i++)
        tmp1.data_[i] = A.data_[i] + B.data_[i]
    .....
    tmp2 = tmp1.clone()
    for (int i = 0; i < size; i++)
        tmp2.data_[i] = tmp1.data_[i] + C.data_[i]
    ....
    for (int i = 0; i < size; i++)
        Z.data_[i] = tmp2.data_[i]
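The temporaries in this expansion can be made visible with a conventional overloaded `operator+`. The sketch below is an assumption-laden stand-in for the slide's array class: it uses `std::vector` for storage and a hypothetical static counter `allocations` to record every buffer that gets created.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical array with a global counter so the temporaries become visible.
struct Array {
    static int allocations;            // number of buffers created so far
    std::vector<double> data_;

    explicit Array(std::size_t n, double v = 0.0) : data_(n, v) { ++allocations; }
    Array(const Array& o) : data_(o.data_) { ++allocations; }
    Array& operator=(const Array& o) { data_ = o.data_; return *this; }
};
int Array::allocations = 0;

// Pairwise evaluation: each + allocates a result and runs its own loop.
Array operator+(const Array& a, const Array& b) {
    Array tmp(a.data_.size());
    for (std::size_t i = 0; i < tmp.data_.size(); ++i)
        tmp.data_[i] = a.data_[i] + b.data_[i];
    return tmp;
}
```

Evaluating `Z = A + B + C` with this class creates at least two temporary arrays and runs two addition loops plus a copy, where a hand-written loop would need none of the temporaries.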
18. Cost of operator overloading
- new overhead
- loop overhead
- 2N/M times more memory traffic (N operators, M distinct array operands)
19. Stencil Computations
    B = A[I, J] + A[I-1, J] + A[I+1, J] + A[I, J-1] + A[I, J+1]
      + A[I-1, J-1] + A[I+1, J-1] + A[I-1, J+1] + A[I+1, J+1]
(a factor of 16 slower)
20. Expression Templates
- Idea
- Delay the evaluation of the expression
- Construct a parse tree of the expression
- Evaluate it on assignment (use)
21. Expression Templates
    Array A, B, C, D;
    D = A + B + C;
    D = X<X<Array,plus,Array>,plus,Array>()
22Expression Templates
struct plus // Represents addition class
Array // some array class templateltclass
Left, class Op, class Rightgt class X
templateltclass Leftgt XltT, plus, Arraygt
operator(Left A, Array B) return XltLeft,
plus, Arraygt()
23. Expression Templates
    Array A, B, C, D;
    D = A + B + C
      = X<Array,plus,Array>() + C
      = X<X<Array,plus,Array>,plus,Array>()
24. Expression Templates
    struct Array {
        ....
        template<class Left, class Op, class Right>
        void operator=(X<Left,Op,Right> expression) {
            for (int i = 0; i < N_; i++)
                data_[i] = expression[i];
        }

        double operator[](int i) { return data_[i]; }
    };
25. Expression Templates
    template<class Left, class Op, class Right>
    struct X {
        Left leftNode_;
        Right rightNode_;

        X(Left t1, Right t2) : leftNode_(t1), rightNode_(t2) { }

        double operator[](int i) {
            return Op::apply(leftNode_[i], rightNode_[i]);
        }
    };

    struct plus {
        static double apply(double a, double b) { return a + b; }
    };
26. Expression Templates
    for (int i = 0; i < D.N_; i++)
        D.data_[i] = A.data_[i] + B.data_[i] + C.data_[i];
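Putting the pieces from the preceding slides together gives a small working example. This is a sketch rather than the presenter's exact code. The names `X`, `Array`, `data_`, `leftNode_`, `rightNode_` and `apply` follow the slides, while three choices are assumptions: storage uses `std::vector`, nodes hold references instead of copies so that building the tree copies no array data, and the addition tag is spelled `Plus` to avoid any clash with `std::plus`.

```cpp
#include <cstddef>
#include <vector>

struct Plus {                                   // addition applied element-wise
    static double apply(double a, double b) { return a + b; }
};

// Parse-tree node: refers to its operands and computes nothing until indexed.
template <class Left, class Op, class Right>
struct X {
    const Left& leftNode_;
    const Right& rightNode_;
    X(const Left& l, const Right& r) : leftNode_(l), rightNode_(r) {}
    double operator[](std::size_t i) const {
        return Op::apply(leftNode_[i], rightNode_[i]);
    }
};

struct Array {
    std::vector<double> data_;
    explicit Array(std::size_t n, double v = 0.0) : data_(n, v) {}
    double operator[](std::size_t i) const { return data_[i]; }

    // Assignment from an expression: the single loop where the work happens.
    template <class Left, class Op, class Right>
    Array& operator=(const X<Left, Op, Right>& expression) {
        for (std::size_t i = 0; i < data_.size(); ++i)
            data_[i] = expression[i];
        return *this;
    }
};

// Building the tree: Left is either an Array or a nested X<...>.
template <class Left>
X<Left, Plus, Array> operator+(const Left& a, const Array& b) {
    return X<Left, Plus, Array>(a, b);
}
```

With this in place, `D = A + B + C` builds an `X<X<Array,Plus,Array>,Plus,Array>` object and evaluates it in one pass over `D`, without temporary arrays. The reference members are only safe because the tree is consumed within the same full expression that created it.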
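27. Template Metaprograms
- Compile-time programs
- More sophisticated than macros
- Compile-time specialization of algorithms
- Partial evaluation of programs
- P(S, V) = P_S(V): a program P with static input S and dynamic input V is specialized into P_S, which takes only V
- Turing complete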
e.g. $\sum_j f_j \, e^{2\pi i k j}$
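Partial evaluation can be illustrated with integer powers. In the sketch below the exponent plays the role of the static input S and the base the role of the dynamic input V. The names `power` and `Power` are hypothetical and chosen only for this example.

```cpp
// General program P(S, V): exponent and base both arrive at run time.
inline double power(double x, int n) {
    double r = 1.0;
    for (int i = 0; i < n; ++i) r *= x;
    return r;
}

// Specialized program P_S(V): the exponent N is fixed at compile time, so the
// recursion is expanded by the compiler into N multiplications.
template <int N>
struct Power {
    static double apply(double x) { return x * Power<N - 1>::apply(x); }
};

// Base case: x to the power 0.
template <>
struct Power<0> {
    static double apply(double) { return 1.0; }
};
```

`Power<3>::apply(x)` computes the same value as `power(x, 3)`, but the loop and its counter have been evaluated away during compilation.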
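28. Template Metaprogramming
    #include <iostream>
    using namespace std;

    template <int N>
    struct fact {
        static const int val = N * fact<N-1>::val;   // recursive case
    };

    template <>
    struct fact<1> {                                  // base case ends the recursion
        static const int val = 1;
    };

    int main(int argc, char* argv[]) {
        cout << fact<5>::val << endl;   // 5! is computed by the compiler
    }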
29. Template Metaprograms
- Traditional loop optimizations
    template <typename T, int DIM>
    class VectorOps {
    public:
        static inline int dotProduct(const T& x, const T& y) {
            return x[DIM-1] * y[DIM-1]
                 + VectorOps<T, DIM-1>::dotProduct(x, y);
        }
    };
    // a specialization for DIM == 0 (not shown on the slide) must end the recursion

    VectorOps<int[DIM], DIM>::dotProduct(x, y)
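For the recursion above to terminate, a base case is needed. The sketch below adds one as a partial specialization for `DIM == 0`. That specialization is an assumption, since the slide does not show how the recursion ends, and `struct` is used instead of `class` only so that the member is public by default.

```cpp
// T is an array type such as int[3]; DIM counts the elements still to process.
template <typename T, int DIM>
struct VectorOps {
    static inline int dotProduct(const T& x, const T& y) {
        return x[DIM - 1] * y[DIM - 1]
             + VectorOps<T, DIM - 1>::dotProduct(x, y);
    }
};

// Base case (assumed): no elements left, so the partial sum is zero.
template <typename T>
struct VectorOps<T, 0> {
    static inline int dotProduct(const T&, const T&) { return 0; }
};
```

For `int x[3]` and `int y[3]`, the call `VectorOps<int[3], 3>::dotProduct(x, y)` expands at compile time into `x[2]*y[2] + x[1]*y[1] + x[0]*y[0] + 0`, which is the fully unrolled loop.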
30. Template Metaprograms
- Avoiding costly if-else switches
    template <int L, int M>
    HTA<(L > M) ? L : M> operator+(HTA<L> lhs, HTA<M> rhs) {
        return add_<M, (L > M)>::compute(lhs, rhs);
    }

    template <int L, bool flag>
    struct add_ { .... };

    template <int L>
    struct add_<L, false> { ... };
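The idea of replacing a run-time `if` with a compile-time flag can be shown with a simplified stand-in for the HTA types. Everything below is hypothetical: `HTA<L>` is reduced to a single `value` field, and `add_` takes both levels plus the flag as parameters, which differs from the two-parameter form on the slide.

```cpp
// Hypothetical stand-in for the HTA class: L is the nesting level.
template <int L>
struct HTA {
    double value;
};

// Primary template: chosen when the flag is true (left level is deeper).
template <int L, int M, bool leftDeeper>
struct add_ {
    static HTA<L> compute(const HTA<L>& lhs, const HTA<M>& rhs) {
        HTA<L> r = { lhs.value + rhs.value };
        return r;
    }
};

// Partial specialization for flag == false: the result has the right level.
template <int L, int M>
struct add_<L, M, false> {
    static HTA<M> compute(const HTA<L>& lhs, const HTA<M>& rhs) {
        HTA<M> r = { lhs.value + rhs.value };
        return r;
    }
};

// The comparison L > M is evaluated by the compiler, so no branch is emitted.
template <int L, int M>
HTA<((L > M) ? L : M)> operator+(const HTA<L>& lhs, const HTA<M>& rhs) {
    return add_<L, M, (L > M)>::compute(lhs, rhs);
}
```

Adding an `HTA<2>` to an `HTA<1>` in either order yields an `HTA<2>`, and the choice between the two `compute` bodies is made entirely during template instantiation.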
31. Other Patterns
- Productivity
- Traits
- Type promotion
- Packages
- Automated generation of expression templates (ETs)
- Large-scale libraries incur numerous overheads from C++ constructs
- Virtual functions, operator overloading
- The side effects of templates were accidental discoveries, but they are effective
- They close the performance gap between C++ and FORTRAN
- Drawbacks
- Complex design, code growth
- Todd Veldhuizen
- (osl.iu.edu/tveldhui)