EASTERN MEDITERRANEAN UNIVERSITY - PowerPoint PPT Presentation

About This Presentation

Title:

EASTERN MEDITERRANEAN UNIVERSITY

Slides: 57

Provided by: analizs

Transcript and Presenter's Notes

Title: EASTERN MEDITERRANEAN UNIVERSITY

1
EASTERN MEDITERRANEAN UNIVERSITY
DEPARTMENT OF COMPUTER ENGINEERING
Evgeny Dukhnich
PRESENTATION
FAST HARDWARE-ORIENTED ALGORITHMS for DSP
2
1. Introduction

The rapid growth of modern VLSI complexity makes it possible to realize
an ever-growing share of mathematical methods in hardware, which
substantially raises computer performance. However, familiar
computational algorithms for signal processing are not
hardware-oriented. The exceptions are the well-known CORDIC algorithms
and some FFT algorithms.

3
MODERN SIGNAL PROCESSING

Applications: image processing, computer vision, speech, radar, and so on
Algorithms: matrix transforms, convolution, filtering, positioning, etc.
Main problems: linear systems, eigenvalues, singular values, least squares, and so on
Candidate solutions: QR decomposition by Givens rotations or Householder reflections

4
AIMS OF THE RESEARCH

Increase the effectiveness of DSP hardware
Design new hardware-oriented algorithms for DSP
Implement the algorithms by designing new VLSI chips

5
Software (a) and hardware (b) realization of the
algorithm $f = ab/c + d$
6
VLSI-technology requirements

- algorithms must have guaranteed accuracy and must converge after a
fixed number of steps
- every step of the algorithm must use a limited set of simple
operations (add, shift, etc.) with the same short execution time
- algorithms must be decomposable into equal parts of a limited number
of types
- algorithms must implement the highest-level typical computing
procedures that are frequently found in signal processing methods.

7
What is the CORDIC Algorithm?

CORDIC: COordinate Rotation DIgital Computer
Introduced by J. Volder in 1959 [1].
Aim: to build a special-purpose digital computer for airborne
navigation, capable of
performing rotations,
computing sine, cosine and arctangent,
multiplying or dividing numbers
using only shift-and-add elementary steps.
In 1971, J. Walther [2] generalized the algorithm to compute
logarithms, exponentials and square roots.

8
What is the CORDIC Algorithm?

Used in
HP35 (pocket calculator), June 1972.
Intel 8087 (arithmetic coprocessor), 1983.
Solving Linear Systems, 1982.
Signal Processing Applications, July 1995.
Filtering, 1984.
Singular Value Decomposition (SVD), June 1993.
Complex SVD, June 2000

9
CORDIC Algorithm: Key Ideas
If we have a computationally efficient way of rotating a vector, we can
evaluate the cos, sin, and $\tan^{-1}$ functions.
Rotation by an arbitrary angle is difficult, so we perform
pseudorotations.
Use special angles to synthesize a desired angle $z$:
$z = \alpha^{(1)} + \alpha^{(2)} + \dots + \alpha^{(m)}$
10
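2-D CORDIC Algorithm
Vector R at its initial angle $\varphi$ is rotated by an angle $\theta$:
$$\begin{pmatrix} X' \\ Y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}$$
This requires multiplication.
[Figure: vector V(X, Y) of length R rotated to V'(X', Y') in the X-Y plane]
Using a series of smaller rotation angles $\alpha_i$, we can avoid the
multiplication:
$\theta = \sum_i \sigma_i \alpha_i$, where $i = 0, \dots, n$, $\sigma_i \in \{-1, 1\}$, $\alpha_i = \operatorname{atan}(2^{-i})$
An elementary rotation matrix $P_i$ can be derived whose elements are
zeros, ones, or integer powers of two:
$$P_i = \begin{pmatrix} 1 & -\sigma_i 2^{-i} \\ \sigma_i 2^{-i} & 1 \end{pmatrix}$$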
11
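CORDIC Algorithm: Key Ideas
Rotate the vector $OE^{(i)}$ with end point at $(x^{(i)}, y^{(i)})$ by $\alpha^{(i)}$:
$x^{(i+1)} = x^{(i)} \cos\alpha^{(i)} - y^{(i)} \sin\alpha^{(i)} = (x^{(i)} - y^{(i)} \tan\alpha^{(i)}) / (1 + \tan^2\alpha^{(i)})^{1/2}$
$y^{(i+1)} = y^{(i)} \cos\alpha^{(i)} + x^{(i)} \sin\alpha^{(i)} = (y^{(i)} + x^{(i)} \tan\alpha^{(i)}) / (1 + \tan^2\alpha^{(i)})^{1/2}$
$z^{(i+1)} = z^{(i)} - \alpha^{(i)}$
Goal: eliminate the divisions by $(1 + \tan^2\alpha^{(i)})^{1/2}$ and choose
$\alpha^{(i)}$ so that $\tan\alpha^{(i)}$ is a power of 2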
12
Basic CORDIC Iterations
Pick $\alpha^{(i)}$ such that $\tan\alpha^{(i)} = d_i 2^{-i}$, $d_i \in \{-1, 1\}$ (CORDIC iteration):
$x^{(i+1)} = x^{(i)} - d_i\, y^{(i)} 2^{-i}$
$y^{(i+1)} = y^{(i)} + d_i\, x^{(i)} 2^{-i}$
$z^{(i+1)} = z^{(i)} - d_i \tan^{-1} 2^{-i}$
If we always pseudorotate by the same set of angles (with + or - signs),
then the expansion factor $K$ is a constant that can be precomputed.
Example: pseudorotation for 30 degrees
$30.0 \cong 45.0 - 26.6 + 14.0 - 7.1 + 3.6 + 1.8 - 0.9 + 0.4 - 0.2 + 0.1 = 30.1$
$e^{(i)} = \tan^{-1} 2^{-i}$
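The basic iterations can be sketched in a few lines of Python. This is only an illustration in floating point, not a hardware description: the function name `cordic_rotate`, the default of 32 iterations, and the choice to fold the expansion factor K into the starting vector are assumptions made for the example. Multiplication by 2.0 ** -i stands in for a right shift.

```python
import math

def cordic_rotate(z, n=32):
    """Rotation mode sketch: drive z to zero and return approximately (cos z, sin z).

    Assumes |z| is inside the CORDIC convergence range (about 1.74 rad).
    """
    # Angle table e(i) = atan(2^-i) and expansion factor K, both precomputable.
    angles = [math.atan(2.0 ** -i) for i in range(n)]
    K = 1.0
    for i in range(n):
        K *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y = 1.0 / K, 0.0          # pre-scaling by 1/K removes the expansion
    for i in range(n):
        d = 1.0 if z >= 0 else -1.0
        # One pseudorotation: only a shift (2^-i) and add/subtract per coordinate.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y

c, s = cordic_rotate(math.radians(30.0))
```

With the sign rule d_i = sign(z), the first ten directions for 30 degrees follow the same +, -, +, -, +, +, -, +, -, + pattern as the example above.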
13
Basic CORDIC Iteration
14
CORDIC Rotation Mode
15
CORDIC Vectoring Mode
16
Generalized CORDIC
17
Rotation Modes
18
CORDIC Rotation/Vector Modes

Rotation Mode

Vector Mode

19
CORDIC processor
20
Iterative CORDIC Structure
Taken from "A Survey of CORDIC Algorithms for
FPGA Based Computers", R. Andraka, FPGA '98
21
Publications

1. E. Doukhnitch, "Highly parallel multidimensional CORDIC-like
algorithms", Artificial Intelligence, No. 3, pp. 284-293, Ukraine, 2001.
2. E. Doukhnitch, "One way to execute digital linear transform",
Kibernetica (Cybernetics and Systems Analysis, ISSN 1060-0396), No. 5,
pp. 96-98, Kiev, May 1982.
3. E. Doukhnitch, "Multidimensional Cordic-like Algorithms for DSP", in
Proc. of the Sixteenth Intern. Symp. on Computer and Information
Sciences, ISCIS XVI, pp. 368-375, Antalya, Turkey, Nov. 2001.
4. E. Doukhnitch, "Octonion CORDIC Algorithms for DSP", in Proc. of the
6th Symp. on Signal Processing, DSPCS2002, pp. 158-163, Sydney,
Australia, Jan. 2002.
5. E. Doukhnitch, "Hardware-oriented Algorithms for Fast Householder
Transform", in Proc. of the First Intern. Conference on Signal
Processing and Applications DSPA-98, v.2e, pp. 129-132, Moscow, July 1998.

22
Transforming any given matrix M into triangular form:
Upper triangular matrix = Orthogonal matrix × M
All elements of M
23
WHAT IS GIVENS ROTATION?

A Givens transformation is an orthogonal matrix used for zeroing a
selected entry of a matrix. That is, Givens rotations introduce zeros
one at a time.
The Givens rotation is realized by the well-known hardware-oriented
CORDIC algorithm.
24
GIVENS ROTATION
25
2-D CORDIC Algorithm

$kV' = \left(\prod_{i=0}^{N} P_i\right) V$
N is the number of iterations (approximately the number of bits in the
result representation).
To zero the element Y, the rotation direction (+1 or -1) can be chosen
at each step from the signs of the current coordinates. So the elements
of matrix P do not need to be computed, and the same rotation can be
applied in parallel to many vectors.
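A minimal sketch of this idea, assuming the leading vector lies in the right half-plane (x > 0): one pass zeroes Y and records the rotation directions, and the recorded directions are then replayed on any other vector, which is what allows many vectors to be rotated by the same angle in parallel. The function names `cordic_vectoring` and `apply_directions` are hypothetical, and floating point is used in place of fixed-point shifts.

```python
import math

def cordic_vectoring(x, y, n=32):
    """Drive y to zero; return (radius, list of rotation directions)."""
    K = 1.0
    dirs = []
    for i in range(n):
        d = -1.0 if y >= 0 else 1.0          # rotate towards the X axis
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        dirs.append(d)
        K *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return x / K, dirs                        # divide out the expansion factor

def apply_directions(x, y, dirs):
    """Rotate another vector by the same angle, reusing the recorded directions."""
    K = 1.0
    for i, d in enumerate(dirs):
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        K *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return x / K, y / K

r, dirs = cordic_vectoring(3.0, 4.0)          # first column pair (3, 4)
x2, y2 = apply_directions(1.0, 2.0, dirs)     # second column pair, same rotation
```

In a Givens step on a matrix, the first call plays the role of the cell that zeroes the selected entry and every other column pair only needs the direction bits.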

26
CORDIC modules array for 4X4 matrix
triangularization
27
CORDIC ALGORITHM of 2-D PLANE ROTATION
4-D QUATERNION CORDIC ALGORITHM

The 4-D Euclidean algorithm is an extension of the original 2-D CORDIC
algorithm

28
Publications
1. E. Doukhnitch, "Synthesis for the Discrete Quaternion Transform
Algorithms Class", in Proc. of Intern. Conf. on Intelligent
Multiprocessor Systems, pp. 44-48, Taganrog, Russia, 1999.
2. E. Doukhnitch, O. Strelnikov, A. Andreev, "Application of Kronecker
Matrix Product for the Synthesis of Hardware-oriented DSP Algorithms",
in Proc. of Intern. Conf. on Signal Proc. DSPA-99, pp. 78-83, Moscow,
Sept. 1999.
29
CONTROL SIGNS are either 1 or -1

The rotation parameters tk

f(k) is a set of non-decreasing positive integers
30
Architecture of a 4-D quaternion processor
31
HOUSEHOLDER TRANSFORMATION

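A Householder matrix $P$ is a symmetric orthogonal matrix of the form
$P = I - ww^T$ with unit matrix $I$ and real vector $w$ ($w^T w = 2$).
If $a_1$ is the first column of matrix $A$, then $w = \lambda u$, where
$$u = \begin{pmatrix} a_{11} - s \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix}, \quad s = \pm (a_1^T a_1)^{1/2}, \quad \lambda = (s^2 - a_{11}s)^{-1/2}$$
Then the result of the Householder transformation is
$$PA = \begin{pmatrix} s & a_{12} & \cdots & a_{1m} \\ 0 & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ 0 & a_{m2} & \cdots & a_{mm} \end{pmatrix}$$

Repeating these macro-operations with vectors $w_j$ ($j = 2, 3, \dots, m$)
produces an upper triangular matrix A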
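The construction of w and one transformation step can be checked numerically with a small sketch. It uses the convention above (w^T w = 2, P = I - w w^T) and takes s with a positive sign, which is one of the two allowed choices; the helper names and the sample 3x3 matrix are assumptions made for the example. The sketch fails with a division by zero if the column is already a positive multiple of the first unit vector.

```python
import math

def householder_vector(a1):
    """Return (s, w) with w^T w = 2 such that (I - w w^T) a1 = (s, 0, ..., 0)."""
    s = math.sqrt(sum(v * v for v in a1))      # s = +(a1^T a1)^(1/2)
    lam = (s * s - a1[0] * s) ** -0.5          # lambda = (s^2 - a11 s)^(-1/2)
    u = [a1[0] - s] + list(a1[1:])             # u = (a11 - s, a21, ..., am1)
    return s, [lam * v for v in u]

def apply_householder(w, A):
    """Compute (I - w w^T) A column by column; A is a list of rows."""
    m, ncols = len(A), len(A[0])
    out = [row[:] for row in A]
    for j in range(ncols):
        dot = sum(w[r] * A[r][j] for r in range(m))   # w^T a_j
        for r in range(m):
            out[r][j] -= w[r] * dot
    return out

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]                          # hypothetical test matrix
s, w = householder_vector([row[0] for row in A])
PA = apply_householder(w, A)                    # first column becomes (s, 0, 0)
```

Applying P as a rank-one update avoids ever forming the m x m matrix explicitly.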
32
HOUSEHOLDER TRANSFORMATION Example
From $a^T = (1, 4, 7)$ construct the auxiliary vector
$u^T = (-7.124, 4, 7)$. Normalize to
$w^T = (-0.662, 0.372, 0.651)$ (here $w$ has unit length, so $P_1 = I - 2ww^T$).
Treat the lower 2x2 submatrix of $P_1 A$. From
$a^T = (4.602, -0.696)$ construct a 2x2 Householder
matrix, and add a trivial first row and column to
promote it to a 3x3 matrix
33
FAST HOUSEHOLDER TRANSFORM (1985)

Method with simplified operations: shifts and adds
Suitable for VLSI design

Factorization of matrix P: $P \approx \prod_{i=0}^{n} P_i$
Iterative process for the PA transformation:
$A_{i+1} = P_i A_i$ (1), $i = 0, 1, 2, \dots, n$; $A_0 = A$.
The result is $A_n \to PA$ as $n \to \infty$.
The process is represented as a sequence of elementary reflections for
$m = 2$ with vector $w_i^T = (-2^{-i} c_1(i),\ c_2(i))$, where
$c_1(i)$, $c_2(i)$ give the direction for every coordinate
34
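FAST HOUSEHOLDER TRANSFORM (Continued)

In this case $w_i^T w_i = k_i \neq 2$, therefore
$P_i = k_i^{-1}(k_i I - 2 w_i w_i^T) = k_i^{-1} T_i$
where
$$T_i = \begin{pmatrix} 1 - 2^{-2i} & 2^{-i+1} c_1(i) c_2(i) \\ 2^{-i+1} c_1(i) c_2(i) & 2^{-2i} - 1 \end{pmatrix}$$
FHT algorithm for $m = 2$:
$A_{i+1} = k_i^{-1} T_i A_i$,
$c_1(i) = \operatorname{sgn} a_{11}(i)$,
$c_2(i) = \operatorname{sgn} a_{21}(i)$,
$i = 0, 1, \dots, n$; $A_0 = A$.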
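The m = 2 iteration can be tried out with a short floating-point sketch. It assumes T_i = k_i I - 2 w_i w_i^T with w_i = (-2^-i c1, c2) and k_i = 2^-2i + 1, treats sgn(0) as +1, and divides by k_i at every step for clarity; a hardware design would more likely defer the scaling to a single constant. The function name `fht2` and the iteration count are assumptions.

```python
def sgn(v):
    """Sign with the convention sgn(0) = +1 (an assumption for this sketch)."""
    return 1.0 if v >= 0 else -1.0

def fht2(A, n=24):
    """Apply n elementary reflections to a 2 x ncols matrix given as two rows.

    After the loop the entry a21 is close to zero and |a11| is close to the
    Euclidean norm of the original first column.
    """
    top, bot = A[0][:], A[1][:]
    for i in range(n):
        t = 2.0 ** -i                       # shift amount for this step
        k = t * t + 1.0                     # k_i = w_i^T w_i
        c = sgn(top[0]) * sgn(bot[0])       # c1(i) * c2(i)
        t11, t12, t22 = 1.0 - t * t, 2.0 * t * c, t * t - 1.0
        new_top = [(t11 * p + t12 * q) / k for p, q in zip(top, bot)]
        new_bot = [(t12 * p + t22 * q) / k for p, q in zip(top, bot)]
        top, bot = new_top, new_bot
    return [top, bot]

R = fht2([[3.0, 1.0],
          [4.0, 2.0]])                      # first column (3, 4) has norm 5
```

Because every P_i is an exact reflection, the norm of each column is preserved while the first column is steered onto the first coordinate axis.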

35
ITERATIVE REFLECTIONS
[Figure: reflection vectors w0, w1, w2 for steps i = 0, 1, 2, with
components -2^-i and sign a21(i), drawn against the axes a11 and a21]
After n steps (i = n)
36
FHT ALGORITHM FOR m3
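$w_i^T = (-2^{-i} c_1(i),\ c_2(i),\ c_3(i))$, $k_i = 2^{-2i} + 1 + 1$.
$P_i = k_i^{-1}(k_i I - 2 w_i w_i^T) = k_i^{-1} T_i$, where
$$T_i = \begin{pmatrix} -2^{-2i} + 2 & 2^{-i+1} c_1(i) c_2(i) & 2^{-i+1} c_1(i) c_3(i) \\ 2^{-i+1} c_1(i) c_2(i) & 2^{-2i} & -2 c_2(i) c_3(i) \\ 2^{-i+1} c_1(i) c_3(i) & -2 c_2(i) c_3(i) & 2^{-2i} \end{pmatrix}$$
$A_{i+1} = k_i^{-1} T_i A_i$, $c_1(i) = \operatorname{sign} a_{11}(i)$,
$c_2(i) = \operatorname{sign} a_{21}(i)$, $c_3(i) = \operatorname{sign} a_{31}(i)$,
$i = 0, 1, \dots, n$; $A_0 = A$.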
37
DESIGN OF CHIP
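Let us represent matrix $T_i$ as
$$T_i = 2 \begin{pmatrix} -2^{-(2i+1)} + 1 & 2^{-i} c_1(i) c_2(i) & 2^{-i} c_1(i) c_3(i) \\ 2^{-i} c_1(i) c_2(i) & 2^{-(2i+1)} & -c_2(i) c_3(i) \\ 2^{-i} c_1(i) c_3(i) & -c_2(i) c_3(i) & 2^{-(2i+1)} \end{pmatrix}$$
$a_{i+1} = T_i a_i$
For $n = 8$, the coefficient will be
$K = \prod_{i=0}^{7} (k_i/2)$, where $k_i/2 = 2^{-(2i+1)} + 1$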
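As a consistency check, the following sketch builds T_i for m = 3 in two ways, from the definition k_i I - 2 w_i w_i^T and from the shift-only form above, and computes the scale coefficient K for n = 8. The function names are hypothetical and the code is a numerical illustration rather than a chip model.

```python
def t_matrix(i, c1, c2, c3):
    """T_i = k_i I - 2 w_i w_i^T for w_i = (-2^-i c1, c2, c3); returns (k_i, T_i)."""
    t = 2.0 ** -i
    w = [-t * c1, c2, c3]
    k = sum(v * v for v in w)                   # k_i = 2^(-2i) + 2
    T = [[(k if r == c else 0.0) - 2.0 * w[r] * w[c] for c in range(3)]
         for r in range(3)]
    return k, T

def half_t_matrix(i, c1, c2, c3):
    """T_i / 2 written with powers of two and sign products only."""
    t, h = 2.0 ** -i, 2.0 ** -(2 * i + 1)
    return [[1.0 - h,     t * c1 * c2, t * c1 * c3],
            [t * c1 * c2, h,           -c2 * c3],
            [t * c1 * c3, -c2 * c3,    h]]

def scale_factor(n=8):
    """K = product over i = 0 .. n-1 of k_i / 2 = 2^-(2i+1) + 1."""
    K = 1.0
    for i in range(n):
        K *= 2.0 ** -(2 * i + 1) + 1.0
    return K

k, T = t_matrix(2, 1.0, -1.0, 1.0)
H = half_t_matrix(2, 1.0, -1.0, 1.0)            # equals T / 2 entry by entry
K = scale_factor(8)
```

Since the signs c_j only take the values +1 and -1, every entry of the halved matrix is a shift, a sign change, or the constant 1, which is what makes the form convenient for a shift-and-add datapath.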
38
DESIGN OF CHIP
$a_1$ is the first column of matrix $A$, and
$a_{11}(i) = X(i)$, $a_{21}(i) = Y(i)$, $a_{31}(i) = Z(i)$

39
UNITS FOR STEP i
Each unit performs the FHT algorithm for m = 3
[Figure: three units u1(i), u2(i), u3(i), each with inputs c1, c2, c3
and a1, a2, a3. The elements a11(i), a12(i), a13(i), a21(i), a22(i),
a23(i), a31(i), a32(i), a33(i) enter the units and the corresponding
elements with index i+1 leave them.]

$$PA = \begin{pmatrix} s & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & a_{32} & a_{33} \end{pmatrix}$$
40
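m-D Householder CORDIC Algorithm

The rotational matrix $R_{m,i}$ is
$$R_{m,i} = \frac{1}{1 + (m-1)t_i^2} \begin{pmatrix} 1 - (m-1)t_i^2 & 2t_i & 2t_i & \cdots & 2t_i \\ -2t_i & 1 + (m-3)t_i^2 & -2t_i^2 & \cdots & -2t_i^2 \\ -2t_i & -2t_i^2 & 1 + (m-3)t_i^2 & \cdots & -2t_i^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -2t_i & -2t_i^2 & -2t_i^2 & \cdots & 1 + (m-3)t_i^2 \end{pmatrix}$$
(1 Si)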
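A quick numerical check that this matrix, including its scale factor, is orthogonal can be sketched as follows. The function name is hypothetical, all entries are taken with the signs exactly as written above, and the value t = 2^-2 is just an example.

```python
def householder_cordic_matrix(m, t):
    """Build the m x m matrix R_{m,i} for t = t_i, including the 1/(1+(m-1)t^2) factor."""
    scale = 1.0 + (m - 1) * t * t
    R = [[0.0] * m for _ in range(m)]
    R[0][0] = 1.0 - (m - 1) * t * t
    for j in range(1, m):
        R[0][j] = 2.0 * t                   # first row
        R[j][0] = -2.0 * t                  # first column
        for k in range(1, m):
            # Lower-right block: 1 + (m-3) t^2 on the diagonal, -2 t^2 elsewhere.
            R[j][k] = (1.0 + (m - 3) * t * t) if j == k else -2.0 * t * t
    return [[v / scale for v in row] for row in R]

R = householder_cordic_matrix(4, 2.0 ** -2)
# Rows of R should be orthonormal: sum_j R[r][j] * R[c][j] is 1 if r == c, else 0.
gram = [[sum(R[r][j] * R[c][j] for j in range(4)) for c in range(4)]
        for r in range(4)]
```

For m = 2 the same formula reduces to an ordinary plane rotation with cosine (1 - t^2)/(1 + t^2) and sine 2t/(1 + t^2).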

41
Octonion CORDIC Algorithm (2002)

R8,i 1/cos?fi?8S?8 (E differs by e11-1 from a
unit matrix)
R8,i =
    1     ?iti   ?iti   ?iti   ?iti   ?iti   ?iti   ?iti
  -?iti    1    -?iti   ?iti   ?iti  -?iti   ?iti  -?iti
  -?iti   ?iti    1    -?iti   ?iti  -?iti  -?iti   ?iti
  -?iti  -?iti   ?iti    1     ?iti   ?iti  -?iti  -?iti
  -?iti  -?iti  -?iti  -?iti    1     ?iti   ?iti   ?iti
  -?iti   ?iti   ?iti  -?iti  -?iti    1     ?iti  -?iti
  -?iti  -?iti   ?iti   ?iti  -?iti  -?iti    1     ?iti
  -?iti   ?iti  -?iti   ?iti  -?iti   ?iti  -?iti    1
Control signs are
?i = fi·sign(y2,i), ?i = fi·sign(y3,i), ?i = fi·sign(y4,i),
?i = fi·sign(y5,i), ?i = fi·sign(y6,i), ?i = fi·sign(y7,i),
?i = fi·sign(y8,i), fi = sign(y1,i)
ti = 2^-f(i), where f(i) is the shift sequence
f(i) = 0, 1, 2, 3, 4, 5, 6, 7, 8, ..., 13, 14, ..., 32, ...

Scaling Factor
42
Matrix Triangularization by Octonion CORDIC
Algorithm
8x8 matrix triangularization: the triangular matrix is formed in 8
steps, and the results of each step are saved.
43
Some anticipated questions

Why not higher dimensions, like 16-D?
Because the Cayley numbers (octonions) are the end point of a very
interesting sequence of algebras. A dimension greater than 8 is not
possible.
Since Householder exists in m-D, why do we need Octonion CORDIC?
There is no shift sequence for which convergence is proved for the 8-D
Householder algorithm.
Octonion CORDIC generalizes the Quaternion CORDIC, of which the
original CORDIC algorithm is a particular case.
The hardware complexity of Octonion CORDIC will be less than that of
Householder, and it will hopefully work faster than Householder.

44
Some patents

E. Doukhnitch, O. Strelnikov, "Special-purpose processor for
eigenvalue decomposition", Patent 2168760, published in Russian Patent
Bulletin 16, 2001.
E. Doukhnitch, S. Derevenskov, "Matrix processor", Patent 2079879,
published in Russian Patent Bulletin 14, 1997.
E. Doukhnitch, "Unit for m-D coordinate transformation", Patent
2029356, published in Russian Patent Bulletin 5, 1995.

45
Fast Hardware-Oriented Algorithm for Cellular
Mobiles Positioning (2004)

Finding the location of a cellular mobile phone is one of the important
features of 3G wireless communication systems. Many valuable
location-based services can be enabled by this new feature. All
location determination techniques based on cellular system signals and
the global positioning system (GPS) use standard complex trigonometric
computation methods that are usually implemented in software

46
Publications
1. E. Doukhnitch, Muhammed Salamah, Deniz Devrim, "A fast
hardware-oriented algorithm for cellular mobiles positioning", LNCS,
Vol. 3280, Springer-Verlag (2004), pp. 267-277, ISSN 0302-9743.
2. E. Doukhnitch, Muhammed Salamah, "Fast hardware-oriented algorithm
for 2-D Positioning", Artificial Intelligence, No. 4, pp. 69-78,
Ukraine, 2004, ISSN 1561-5359.
47
Time of arrival (TOA) position determination
method
48
The traditional algorithm
49
Positions of Base Stations
50
The SDD algorithm
51
Idea of Positioning
52
Parallel rotations of vectors
While (xi - xi+1 > d)
    Rotate d1
    While (yi > yi+1)
        Rotate d2
    End while
End while
53
FAST ROTATIONS

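The rotation matrix M is as follows:
$$M = \begin{pmatrix} \cos A & \sin A \\ -\sin A & \cos A \end{pmatrix}$$
We take the sine function as
$\sin A = 2^{-k}$
and we approximate the cosine function as
$\cos A \approx 1 - 2^{-(2k+1)}$
Therefore, BS or vector coordinates are recursively rotated as follows:
$x_{i+1} = x_i - x_i a + y_i b$
$y_{i+1} = y_i - y_i a - x_i b$
where $a = 2^{-(2k+1)} \approx 1 - \cos A$ and $b = 2^{-k} = \sin A$.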
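One step of this recursion can be sketched as below. The sketch assumes that the two constants in the recursion are the powers of two that follow from the approximations above, namely 2^-(2k+1) for the cosine correction and 2^-k for the sine, so that each update needs only shifts and adds. The function name `fast_rotate` is hypothetical and floating point stands in for fixed-point shifts.

```python
def fast_rotate(x, y, k):
    """Rotate (x, y) clockwise by the angle A with sin A = 2^-k.

    cos A is approximated by 1 - 2^-(2k+1), so both products below are
    plain right shifts in a fixed-point implementation.
    """
    a = 2.0 ** -(2 * k + 1)     # 1 - cos A (approximation)
    b = 2.0 ** -k               # sin A
    return x - x * a + y * b, y - y * a - x * b

# One step with k = 4 applied to the unit vector (1, 0).
x1, y1 = fast_rotate(1.0, 0.0, 4)
growth = x1 * x1 + y1 * y1      # squared length after one step
```

With these constants the squared length grows by the factor 1 + 2^-(4k+2) per step, so the approximation error in the vector length is small for moderate k but does accumulate over many steps.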

54
Weights of the Operations
55
Number of Operations Versus sin(s) for the
Traditional and Our Algorithms
56
Conclusion
A fast hardware-oriented algorithm for locating a mobile in a cellular
network with high accuracy has been presented. The main benefit of the
algorithm is that it avoids the calculation of trigonometric functions.
The calculations are based on simple add, shift, and compare
operations, and therefore it can be implemented easily in hardware. In
addition, it is shown that the number of operations involved is lower
than that of the traditional algorithm, which implies less computation
time.
