Title: The Biquad revisited
ECE6580 Lecture 13
Rules of Engagement
- Develop and test algorithm in C
- Register allocation
- Write non-parallel assembly code
- Create dependency graph
- Create Scheduling Chart
- Write parallel assembly code
Biquad in C
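#include "ctypes.h"

float BiquadC(float x, float *states, float pm *coefs, U32 sections)
{
    float y;
    U32 i, j;
    float xx = x;

    /* i steps through the 5 coefs of each section (b0, b1, b2, a1, a2),
       j through its 4 states (x(n-1), x(n-2), y(n-1), y(n-2)) */
    for (i = 0, j = 0; i < 5*sections; i += 5, j += 4) {
        y = coefs[i]*xx
          + coefs[i+1]*states[j]
          + coefs[i+2]*states[j+1]
          - coefs[i+3]*states[j+2]
          - coefs[i+4]*states[j+3];
        states[j+3] = states[j+2];   /* y(n-2) = y(n-1) */
        states[j+2] = y;             /* y(n-1) = y */
        states[j+1] = states[j];     /* x(n-2) = x(n-1) */
        states[j]   = xx;            /* x(n-1) = x */
        xx = y;                      /* output feeds the next section */
    }
    return(y);
}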
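To check the arithmetic on a host compiler before moving to the DSP, a minimal sketch along the following lines may help. It is not the lecture's code: the name biquad_host is hypothetical, plain pointers stand in for the SHARC pm qualifier, and it assumes 5 coefficients per section ordered b0, b1, b2, a1, a2 and 4 states per section ordered x(n-1), x(n-2), y(n-1), y(n-2).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side port of a cascaded biquad (not the lecture's code).
   Assumed layout: coefs = {b0, b1, b2, a1, a2} per section,
                   states = {x(n-1), x(n-2), y(n-1), y(n-2)} per section. */
static float biquad_host(float x, float *states, const float *coefs,
                         unsigned sections)
{
    float y = x;   /* with zero sections the input passes through */
    unsigned s;

    assert(states != NULL && coefs != NULL);
    for (s = 0; s < sections; s++) {
        const float *c = coefs + 5u * s;
        float *st = states + 4u * s;

        y = c[0] * x + c[1] * st[0] + c[2] * st[1]
            - c[3] * st[2] - c[4] * st[3];

        st[1] = st[0];   /* x(n-2) = x(n-1) */
        st[0] = x;       /* x(n-1) = x */
        st[3] = st[2];   /* y(n-2) = y(n-1) */
        st[2] = y;       /* y(n-1) = y */
        x = y;           /* output feeds the next section */
    }
    return y;
}
```

With one section, b0 = 1 and a1 = -0.5 (all other coefficients zero), the difference equation reduces to y(n) = x(n) + 0.5*y(n-1), so a unit impulse should produce 1, 0.5, 0.25, and so on. That gives a quick reference to compare the assembly versions against.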
Register Allocation
- f0 states and return value
- f2 states
- f4 x and coefs
- f8 product
- f12 accumulator
- i4 pointer to states
- i12 pointer to coefs
Non-parallel Assembly
#include <asm_sprt.h>
#include <def21161.h>

// float BiquadAsm(float x,          // input, f4
//                 float *states,    // pointer to states, r8
//                 float pm *coefs,  // pointer to coefs, r12
//                 U32 sections)     // number of biquad sections

.segment /pm seg_pmco;
.global _BiquadLinear;
_BiquadLinear:
    entry;
    b4 = r8;                    // pointer to states
    b12 = r12;                  // pointer to coefs
    r1 = reads(1);              // read number of sections
    r12 = r12 xor r12;          // clear the accumulator
    f2 = f4;                    // save f4 in f2
    lcntr = r1, do BiquadLinearEnd until lce;
        f4 = pm(i12,m14);       // fetch b0
        f12 = f2*f4;            // b0*x
        f4 = pm(i12,m14);       // fetch b1
        f0 = dm(i4,m5);         // fetch x(n-1)
        dm(i4,m6) = f2;         // x(n-1) = x
        f8 = f0*f4;             // b1*x(n-1)
        f12 = f8+f12;           // b0*x + b1*x(n-1)
        f4 = pm(i12,m14);       // fetch b2
        f1 = dm(i4,m5);         // fetch x(n-2)
        dm(i4,m6) = f0;         // x(n-2) = x(n-1)
        f8 = f4*f1;             // b2*x(n-2)
        f12 = f8+f12;           // b0*x + b1*x(n-1) + b2*x(n-2)
        f4 = pm(i12,m14);       // fetch a1
        f0 = dm(i4,m6);         // fetch y(n-1)
        f8 = f0*f4;             // a1*y(n-1)
        f12 = f12-f8;           // b0*x + b1*x(n-1) + b2*x(n-2) - a1*y(n-1)
        f4 = pm(i12,m14);       // fetch a2
        f1 = dm(i4,m5);         // fetch y(n-2)
        dm(i4,m7) = f0;         // y(n-2) = y(n-1)
        f8 = f4*f1;             // a2*y(n-2)
        f2 = f12-f8;            // y = b0*x + b1*x(n-1) + b2*x(n-2) - a1*y(n-1) - a2*y(n-2)
        r12 = r12 xor r12;      // clear the accumulator
BiquadLinearEnd:
        dm(i4,2) = f2;          // y(n-1) = y
    f0 = f2;                    // return y in f0
    exit;
.endseg;
[Figure: dependency graph for the non-parallel assembly, from the first node f2 = f4 to the last node end]
Rules for Creating Scheduling Chart
- All parent nodes become the first parallel instruction. If there are not enough resources, use 2 instructions.
- Identify the next instructions that can be executed. Create a parallel instruction with them.
- Repeat until done.
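The rules above amount to a greedy list scheduler over the dependency graph. The sketch below is one possible way to express them in C, not the procedure used in the lecture. All names (struct op, schedule, example_cycles) are hypothetical. The resource model is a deliberate simplification: each parallel instruction is assumed to hold at most one multiply, one ALU operation, one DM access and one PM access, and the real register constraints of the processor are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed resource classes: one of each per parallel instruction. */
enum unit { UNIT_MUL, UNIT_ALU, UNIT_DM, UNIT_PM, UNIT_COUNT };

#define MAX_PREDS 2

struct op {
    enum unit unit;        /* resource the operation occupies */
    int preds[MAX_PREDS];  /* indices of parent nodes, -1 = none */
    int cycle;             /* assigned parallel instruction, -1 = unscheduled */
};

/* Greedy scheduling: in each cycle, place every operation whose parents were
   issued in an earlier cycle, as long as its resource is still free.
   Returns the number of parallel instructions, or -1 if the graph has a loop. */
static int schedule(struct op *ops, size_t n)
{
    size_t done = 0, i;
    int cycle = 0;

    assert(ops != NULL || n == 0);
    for (i = 0; i < n; i++)
        ops[i].cycle = -1;

    while (done < n) {
        int busy[UNIT_COUNT] = { 0 };
        size_t placed = 0;

        for (i = 0; i < n; i++) {
            int k, ready = 1;

            if (ops[i].cycle >= 0 || busy[ops[i].unit])
                continue;   /* already placed, or resource taken this cycle */
            for (k = 0; k < MAX_PREDS; k++) {
                int p = ops[i].preds[k];
                if (p >= 0 && (ops[p].cycle < 0 || ops[p].cycle >= cycle))
                    ready = 0;   /* parent not yet issued in an earlier cycle */
            }
            if (!ready)
                continue;
            ops[i].cycle = cycle;
            busy[ops[i].unit] = 1;
            placed++;
        }
        if (placed == 0)
            return -1;   /* nothing ready: circular dependency */
        done += placed;
        cycle++;
    }
    return cycle;
}

/* First two taps of the biquad as a six-node graph. */
static int example_cycles(void)
{
    struct op ops[6] = {
        { UNIT_PM,  { -1, -1 }, -1 },  /* 0: fetch b0 */
        { UNIT_MUL, {  0, -1 }, -1 },  /* 1: b0*x */
        { UNIT_PM,  { -1, -1 }, -1 },  /* 2: fetch b1 */
        { UNIT_DM,  { -1, -1 }, -1 },  /* 3: fetch x(n-1) */
        { UNIT_MUL, {  2,  3 }, -1 },  /* 4: b1*x(n-1) */
        { UNIT_ALU, {  1,  4 }, -1 }   /* 5: b0*x + b1*x(n-1) */
    };
    return schedule(ops, 6);
}
```

In this example the two coefficient fetches are both parent nodes but compete for the single PM access, so they are split over two instructions, which is the "use 2 instructions" case in the first rule. Under the assumed resource model the six operations fit in four parallel instructions.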
Scheduling Table
[Table: scheduling table for the parallel code]