1
EFFICIENT ANALYTICAL MODELING OF DATA CACHE BEHAVIOUR
  • By Japinder Singh Chawla and Anil Kumar Gadgotra

2
Input-Output
  • INPUT: Any benchmark application and the cache
    parameters, e.g. line size and cache size.
  • OUTPUT: Memory performance estimates for different
    cache parameter values.
  • Memory performance estimates include
    quantification of reuse in the program, cache
    hits, and cache misses.

3
Modeling of the Program
  • Any memory reference
  •   A[f1(i1,i2,..)][f2(i1,i2,..)]..[fn(i1,i2,..)]
  • for any stride values s1, s2, .., sn and loop
    variable limits
  •   (l1,h1), (l2,h2), .., (ln,hn) can be expressed as
  •   A[a1*i1 + a2*i2 + .. + an*in + a]
  • with all stride values equal to 1 and loop variable
    limits
  •   (1,N1), (1,N2), .., (1,Nn)
  • We generate a data structure corresponding to
    each cache line (or cache set if the associativity
    K > 1)
  • The data structure contains information about the
    memory accesses that map to that cache line
  • The following example generates the data structure
    for the cache line L = 3
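
The normalization stated at the top of this slide can be illustrated for a single loop variable. The sketch below is not from the presentation: it assumes an affine subscript c*i + d and a loop running from l to h with a positive step s, and the helper name normalize_loop is hypothetical.

```python
def normalize_loop(c, d, l, h, s):
    """Rewrite `for i = l to h step s: A[c*i + d]` as
    `for j = 1 to N step 1: A[a1*j + a0]`.

    Substituting i = l + s*(j - 1) gives a1 = c*s and a0 = c*(l - s) + d.
    Returns (a1, a0, N).
    """
    if s <= 0 or h < l:
        raise ValueError("expects a positive step and a non-empty range")
    n_iters = (h - l) // s + 1          # number of iterations, N
    a1 = c * s                          # coefficient of the new variable j
    a0 = c * (l - s) + d                # constant term
    return a1, a0, n_iters


# Original loop: for i = 3 to 11 step 4 (i = 3, 7, 11), subscript 2*i + 1
original = [2 * i + 1 for i in range(3, 11 + 1, 4)]
a1, a0, n = normalize_loop(c=2, d=1, l=3, h=11, s=4)
normalized = [a1 * j + a0 for j in range(1, n + 1)]
```

Both lists contain the same subscripts in the same order, so the normalized loop touches the same elements with stride 1 and limits (1,N).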

4
Modeling of the Program
  • L1 = 1: for i1 = 1 to N1 step 1
  •   L2 = 1: for i2 = 1 to N2 step 1
  •     aR = 1: if (i2 >= 2)
  •       Read b[i1][i2-1]
  •     aR = 2: if ((i1 >= 2) and (i1 + i2 >= 10))
  •       Read a[i1-1][i2]
  •   L2 = 2: for i2 = 1 to N2 step 1
  •     aR = 1: Read b[i1][i2]
  • L1 = 2: for i1 = 1 to N1 step 1
  •   L2 = 1: for i2 = 1 to N2 step 1
  •     aR = 1: Read a[i1][i2]
  • Any memory access can be uniquely modeled by the
    access vector
  •   ( L1, i1, L2, i2, .., Ln, in, aR )
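
A small sketch of why the access vector is a convenient key: when the vectors are compared lexicographically, their order is the execution order, so the first and last accesses are simply the minimum and maximum vectors. The code is an illustration only; the generator name access_vectors is hypothetical, and the two guards are assumptions, because the comparison operators did not survive in the slide text.

```python
def access_vectors(n1, n2):
    """Yield (L1, i1, L2, i2, aR) for every read, in execution order."""
    for i1 in range(1, n1 + 1):                 # L1 = 1
        for i2 in range(1, n2 + 1):             # L2 = 1
            if i2 >= 2:                         # assumed guard
                yield (1, i1, 1, i2, 1)         # Read b[i1][i2-1]
            if i1 >= 2 and i1 + i2 >= 10:       # assumed guard
                yield (1, i1, 1, i2, 2)         # Read a[i1-1][i2]
        for i2 in range(1, n2 + 1):             # L2 = 2
            yield (1, i1, 2, i2, 1)             # Read b[i1][i2]
    for i1 in range(1, n1 + 1):                 # L1 = 2
        for i2 in range(1, n2 + 1):             # L2 = 1
            yield (2, i1, 1, i2, 1)             # Read a[i1][i2]


vectors = list(access_vectors(n1=4, n2=8))
in_execution_order = vectors == sorted(vectors)      # tuples compare lexicographically
all_unique = len(set(vectors)) == len(vectors)
```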

5
Data structure for cache line L = 3
  • L1 = 1
  •   L2 = 1
  •     b[i1][i2-1] (aR = 1)
  •     a[i1-1][i2] (aR = 2)
  •   L2 = 2
  •     b[i1][i2] (aR = 1)
  • L1 = 2
  •   L2 = 1
  •     a[i1][i2] (aR = 1)
6
Approach
  • The cache equations are

7
Solving the Equation
  • L1 = 1: for i1 = 1 to N1 step 1
  •   L2 = 1: for i2 = 1 to N2 step 1
  •     aR = 1: if (i2 >= 2) Read
          b[i1][i2-1]
  •     aR = 2: if ((i1 >= 2) and (i1 + i2 >= 10))
          Read a[i1-1][i2]
  •   L2 = 2: for i2 = 1 to N2 step 1
  •     aR = 1: Read b[i1][i2]
  • L1 = 2: for i1 = 1 to N1 step 1
  •   L2 = 1: for i2 = 1 to N2 step 1
  •     aR = 1: Read a[i1][i2]
  • In this program, let N1 = N2 = 20, C = 32, L = 4. For
    cache line 3, memory reference b[i1][i2-1]
    and @b = 4, the equation is
  •   8 <= (20*i1 mod 32) + (i2 mod 32) - 18 < 12
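
The equation can be checked by brute force. The sketch below enumerates every iteration of the read of b with the slide's values (N1 = N2 = 20, C = 32, L = 4, @b = 4) and keeps those that fall on cache line 3. The address layout (row-major, indices starting at 1, sizes counted in array elements, cache lines numbered from 1) is an assumption chosen because it yields the constant -18 of the equation; the guard i2 >= 2 is likewise assumed.

```python
N1 = N2 = 20      # loop bounds from the slide
C, L = 32, 4      # cache size and line size, in array elements
BASE_B = 4        # @b, start address of array b
LINE = 3          # cache line of interest (lines numbered from 1)


def address_b(i1, i2):
    """Address of b[i1][i2-1], assuming row-major storage and 1-based indices."""
    return BASE_B + N2 * (i1 - 1) + (i2 - 1 - 1)


lo, hi = (LINE - 1) * L, LINE * L       # offsets 8..11 belong to cache line 3
hits = [
    (i1, i2)
    for i1 in range(1, N1 + 1)
    for i2 in range(2, N2 + 1)          # the read is guarded by i2 >= 2 (assumed)
    if lo <= address_b(i1, i2) % C < hi
]

first, last = min(hits), max(hits)      # lexicographic MIN and MAX
rows = sorted({i1 for i1, _ in hits})   # values of i1 that reach line 3
```

Under these assumptions, first and last are the MIN and MAX iteration vectors of this reference for cache line 3, and rows lists the i1 values for which a solution exists.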

8
b[i1][i2-1] (aR = 1)
β1 = 20: i1 = 1, 9, 17
β1 = 8: i1 = 2, 10, 18
β1 = 16: i1 = 4, 12, 20
β1 = 12: i1 = 7, 15
β1 = 24: i1 = 6, 14
β2 = (6,9): i2 = (6,9)
β2 = (18,21): i2 = (18,21)
β2 = (10,13): i2 = (10,13)
β2 = (2,5): i2 = (2,5)
β2 = (14,17): i2 = (14,17)

MIN = (1,6)
MAX = (20,13)
9
Inter-loop Reuse for Direct Mapped Cache
  • Conditions for inter-loop reuse between two loop
    nests L1 and L2 for a cache line l:
  • There is no memory access to cache line l between
    the two loop nests.
  • The memory access vectors of the last memory
    access of L1, a, and the first memory access of
    L2, b, access the same array and lie on the same
    memory line,
  • i.e. mlR1(a) = mlR2(b).
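
A minimal sketch of this check for one cache line of a direct-mapped cache. The type Access, the helper names, and the addresses in the usage lines are hypothetical; the sketch assumes that the caller has already found the last access of the first loop nest and the first access of the second loop nest that map to the cache line, which makes the first condition hold by construction.

```python
from typing import NamedTuple, Optional


class Access(NamedTuple):
    array: str      # name of the array being read
    address: int    # element address of the access


def memory_line(address: int, line_size: int) -> int:
    """Index of the memory line that holds `address`."""
    return address // line_size


def inter_loop_reuse(last_a: Optional[Access], first_b: Optional[Access],
                     line_size: int) -> bool:
    """True if `first_b` finds the memory line left in the cache by `last_a`."""
    if last_a is None or first_b is None:
        return False                    # one nest never touches the cache line
    return (last_a.array == first_b.array and
            memory_line(last_a.address, line_size)
            == memory_line(first_b.address, line_size))


# Hypothetical addresses, line size 4
same_line = inter_loop_reuse(Access("b", 394), Access("b", 393), line_size=4)
other_line = inter_loop_reuse(Access("b", 394), Access("b", 426), line_size=4)
other_array = inter_loop_reuse(Access("b", 394), Access("a", 426), line_size=4)
```

In the second and third calls the addresses differ by the cache size of the running example (32), so they map to the same cache line but to different memory lines, and no reuse is reported.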

10
Inter-loop Reuse for Direct Mapped Cache
11
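Data structure for cache line L = 3, with the MIN and MAX iteration vectors of each reference
  • L1 = 1
  •   L2 = 1
  •     b[i1][i2-1] (aR = 1): MIN (1,6), MAX (20,13)
  •     a[i1-1][i2] (aR = 2): MIN (3,9), MAX (19,12)
  •   L2 = 2
  •     b[i1][i2] (aR = 1): MIN (1,5), MAX (20,12)
  • L1 = 2
  •   L2 = 1
  •     a[i1][i2] (aR = 1): MIN (2,9), MAX (20,4)

MIN B(1,6)
MAX B(20,13)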
12
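Data structure for cache line L = 3, with the MIN and MAX iteration vectors of each reference
  • L1 = 1
  •   L2 = 1
  •     b[i1][i2-1] (aR = 1): MIN (1,6), MAX (20,13)
  •     a[i1-1][i2] (aR = 2): MIN (3,9), MAX (19,12)
  •   L2 = 2
  •     b[i1][i2] (aR = 1): MIN (1,5), MAX (20,12)
  • L1 = 2
  •   L2 = 1
  •     a[i1][i2] (aR = 1): MIN (2,9), MAX (20,4)

MIN A(2,9)
MAX B(20,12)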
13
Inter-loop Reuse for K-way Set Associative Cache
  • Conditions for inter-loop reuse of the memory
    access vector aI, 1 <= I <= K, between two loop
    nests L1 and L2 for a cache set s:
  • There are no more than I-1 memory accesses to the
    cache set between the two loop nests that
    access different memory lines, each also
    different from mlR(aI).
  • Let the number of such accesses be J. Then memory
    access vector aI is reused iff there exists a
    vector b among the first I-J minimum memory access
    vectors of L2 that accesses the same array as aI
    and lies on the same memory line, i.e. mlR2(b) = mlR1(aI).
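
The two conditions can be written down almost literally. The sketch below is one reading of the slide, not the authors' implementation: the function name, the argument layout, and the sample memory-line numbers are hypothetical, and accesses are represented as (array, memory line) pairs.

```python
def k_way_inter_loop_reuse(position, intervening_lines, target, first_of_l2):
    """Literal reading of the condition for a_I in a K-way cache set.

    position          -- I, with 1 <= I <= K
    intervening_lines -- memory lines touched in the set between the loop nests
    target            -- (array, memory_line) of a_I
    first_of_l2       -- (array, memory_line) of the accesses of L2 to the set,
                         in increasing access-vector order
    """
    # J: distinct intervening memory lines other than the line of a_I.
    j = len(set(intervening_lines) - {target[1]})
    if j > position - 1:
        return False                      # more than I-1 other lines in between
    # a_I must be matched by one of the first I - J accesses of L2.
    return target in first_of_l2[:position - j]


l2 = [("a", 50), ("b", 98)]               # hypothetical first accesses of L2
kept = k_way_inter_loop_reuse(2, [], ("b", 98), l2)
too_late = k_way_inter_loop_reuse(1, [], ("b", 98), l2)
displaced = k_way_inter_loop_reuse(2, [70, 71], ("b", 98), l2)
```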

14
b[i1][i2-1] (aR = 1)
β1 = 20: i1 = 1, 9, 17
β1 = 28: i1 = 3, 11, 19
β1 = 16: i1 = 4, 12, 20
β1 = 0: i1 = 8, 16
β1 = 24: i1 = 6, 14
β1 = 4: i1 = 5, 13
β2 = (14,20): i2 = (14,20)
β2 = (6,13): i2 = (6,13)
β2 = (18,20): i2 = (18,20)
β2 = (2,6): i2 = (2,6)
β2 = (10,17): i2 = (10,17)
β2 = (2,9): i2 = (2,9)

MIN = (1,14)
MAX = (20,20)

Memory line 100: the 2nd MAX should not lie on
memory line 100
Memory line 3: the 2nd MIN should not lie on
memory line 3
15
b[i1][i2-1] (aR = 1)
β1 = 20: i1 = 1, 9, 17
β1 = 28: i1 = 3, 11, 19
β1 = 16: i1 = 4, 12, 20
β1 = 0: i1 = 8, 16
β1 = 24: i1 = 6, 14
β1 = 4: i1 = 5, 13
β2 = (14,20): i2 = (14,20)
β2 = (6,13): i2 = (6,13)
β2 = (18,20): i2 = (18,20)
β2 = (2,6): i2 = (2,6)
β2 = (10,17): i2 = (10,17)
β2 = (2,9): i2 = (2,9)

2nd MIN = (1,18)
2nd MAX = (19,13)
16
Inter-loop Reuse for K-way Set Associative Cache
17
L = 3
Ref A: v1 = (i11, i12)
A[f1(i11)][f2(i12)]
18
L = 3
Ref A: v2 = (i21, i22)
Ref A: v1 = (i11, i12)
A[f1(i21)][f2(i22)]
19
L = 3
Ref B: v3 = (i31, i32)
Ref A: v1 = (i11, i12)
Ref A: v2 = (i21, i22)
B[f1(i31)][f2(i32)]
20
L = 3
Ref B: v3 = (i31, i32)
Ref A: v1 = (i11, i12)
Ref A: v2 = (i21, i22)
Ref B: v4 = (i41, i42)
B[f1(i41)][f2(i42)]
21
L = 3
Ref B: v3 = (i31, i32)
Ref A: v1 = (i11, i12)
Ref A: v2 = (i21, i22)
Ref B: v4 = (i41, i42)
Ref A: v5 = (i51, i52)
A[f1(i51)][f2(i52)]
22
L = 3
Ref B: v3 = (i31, i32)
Ref A: v1 = (i11, i12)
Ref A: v2 = (i21, i22)
Ref B: v4 = (i41, i42)
Ref A: v5 = (i51, i52)
(v1, v2), (v3, v4), (v5, v6), ..   A, B, A
A[f1(i51)][f2(i52)]
23
Intra-loop Reuse for Direct Mapped Cache
24
Self-Spatial Reuse
  • Self-spatial reuse occurs when a memory reference
    accesses the same cache line in different
    iterations.
  • Let the number of iteration vectors in an interval
    be A. The self-spatial reuse within that interval
    is Rint
  • (Rint = A - Nmiss), where Nmiss is the number of
    different memory lines brought into the cache
    line.
  • Nmiss Nr_miss L/an if group spatial reuse
    with the preceding interval is zero otherwise
    Nmiss Nr_miss
  • Nr_miss = A / (L/an) is the number of
    replacement misses
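
The relation between A, the number of distinct memory lines, and the reuse can be illustrated by direct enumeration. This sketch is an illustration only: it reads the reuse as accesses minus distinct memory lines (the operator did not survive in the slide text), it ignores group-spatial reuse with the preceding interval, and the function name and sample values are hypothetical.

```python
def self_spatial_reuse(base, count, stride, line_size):
    """Enumerate `count` consecutive accesses base, base+stride, ... and
    return (distinct memory lines touched, accesses minus distinct lines)."""
    lines = {(base + stride * k) // line_size for k in range(count)}
    return len(lines), count - len(lines)


# A = 8 iterations, L = 4, starting on a line boundary
unit_stride = self_spatial_reuse(base=8, count=8, stride=1, line_size=4)   # a_n = 1
double_stride = self_spatial_reuse(base=8, count=8, stride=2, line_size=4)  # a_n = 2
```

For these aligned cases the number of distinct lines equals A / (L/an): 8 / (4/1) = 2 and 8 / (4/2) = 4.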

25
Intra-loop Reuse for K-Way Set Associative Cache
26
Group-Spatial Reuse
  • Conditions for group-spatial reuse of the memory
    access vector aI, 1 <= I <= K, between two
    intervals I1 and I2 for a cache set s:
  • The memory references corresponding to the two
    intervals access the same array.
  • The memory access vector aI is reused iff there
    exists a vector b among the first I minimum memory
    access vectors of I2 that lies on the same
    memory line, i.e. mlR2(b) = mlR1(aI).

27
Self-Spatial Reuse
  • The self-spatial reuse within an interval is Rint
  • (Rint = A - Nmiss), where Nmiss is the number of
    different memory lines brought into the cache
    set.
  • Nmiss Nr_miss (K-m)L/an where group spatial
    reuse with the preceding interval is m.
  • Nr_miss = A / (L/an) is the number of
    replacement misses
  • The number of iteration vectors in the interval
    (A) will increase because of associativity.

28
Space and Time Complexity
  • The approach includes the following steps:
  • Building the data structure in
    O(Σ_l^C Σ_refs^MAX_REFS ?i1N)
  • Computing inter-loop reuse in
    O(K Σ_l^C Σ_f^MAX_FORS Σ_refs^MAX_REFS N)
  • Computing intra-loop reuse in
    O(K Σ_l^C Σ_f^MAX_FORS Σ_s^(C/(L×K)) N)
  • So the time complexity of the approach is
    O(K Σ_l^C Σ_f^MAX_FORS Σ_s^(C/(L×K)) N)
  • The space complexity of the approach is
    O(Σ_refs^MAX_REFS N)

29
  • THANKS