Enumera CPU - PowerPoint PPT Presentation

About This Presentation
Title:

Enumera CPU

Description:

History of the CPU Core. 1968 Chuck Moore invents the FORTH Language ... 1994 Wells Fargo I worked on architecture to use 600 HPUX boxes located in the ... – PowerPoint PPT presentation

Slides: 27
Provided by: johns77

Transcript and Presenter's Notes

Title: Enumera CPU


1
Enumera CPU
  • Low cost super computing solutions

2
History of the CPU Core
  • 1968 Chuck Moore invents the FORTH Language
  • 1980 Developed by Chuck Moore at Forth Inc.
  • 1983 Broke from Forth Inc to become Novix
  • 1985 Harris buys Novix; the chip becomes the Harris RTX
    family
  • 1990 Started development of Sh-boom CPU
  • 1995 Patriot Scientific acquires it.
  • 1994 MPU 21 Developed for Offete Enterprises.
  • 1993 to 98 F21
  • 1996 to 1998 ITV I21 CPU
  • 2000 Chuck Moore joins Enumera Project

3
Patents Held by Chuck Moore
Harris
  • US05070451 12/03/1991 Forth specific language microprocessor
  • US05319757 06/07/1994 FORTH specific language microprocessor
Nanotronics
  • US05440749 08/08/1995 High performance, low cost microprocessor
    architecture
  • US05530890 06/25/1996 High performance, low cost microprocessor
  • US05604915 02/18/1997 Data processing system having load
    dependent bus timing
Patriot Scientific
  • US05659703 08/19/1997 Microprocessor system with hierarchical
    stack and method of operation
  • US05784584 07/21/1998 High performance microprocessor using
    instructions that operate within instruction groups
  • US05809336 09/15/1998 High performance microprocessor having
    variable speed system clock
4
The I21 CPU: 0.8 micron, 500 MHz, 1997
6
Stack Computers
  • These chips are stack-based processors.
  • There are no registers in the normal sense; they use a stack
    instead.
  • They are often referred to as FORTH chips, but the language they
    use is very different from FORTH.
  • The CPUs will support more common languages such as C.
  • The book whose cover is shown at left is by Philip Koopman.
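To make the "stack instead of registers" idea concrete, here is a minimal sketch of a stack machine in Python. It is an illustration only: the token names (`dup`, `swap`) and the `run` function are assumptions made for this example and are not the instruction set of the F21 or any other chip in this presentation.

```python
def run(program):
    """Evaluate a list of postfix tokens on a data stack and return the stack."""
    stack = []
    binary_ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    for token in program:
        if token in binary_ops:
            # Operands are not named: they are simply the top two stack items.
            b = stack.pop()
            a = stack.pop()
            stack.append(binary_ops[token](a, b))
        elif token == "dup":
            stack.append(stack[-1])          # copy the top item
        elif token == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            stack.append(int(token))         # literal: push it
    return stack


# (2 + 3) * 4 written in postfix order
print(run("2 3 + 4 *".split()))
```

No instruction names a source or destination register, which is why the operands are always immediately available at the top of the stack.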
7
Advantages of Stack architecture
  • Context switching for interrupts takes essentially zero time.
  • Does not need to be pipelined for ALU and operands, because the
    operands are immediately available at the top of the stack.
  • Code is smaller than CISC programs by a factor of 2.5 to 8.
Comments from Phil Koopman's book
  • Size and weight
  • Power and cooling
  • Operating environment
  • Cost
  • High-performance parallel processing
8
Quick look at the F21
1996 versions of Chuck Moore's designs
  • Process: 0.8 um (current state of the art: 0.18 um)
  • Volts: 5.0
  • Mips: 500 (asynchronous design, no crystal, voltage dependent)
  • mA: 20
  • mW: 100
The F21 comes with
  • A network co-processor for parallel processing
  • Video co-processor (NTSC out in the I21 design)
  • Serial co-processor
  • Analog processor (6 bit, 40 MHz sample rate)
  • 1 instruction per clock cycle
  • A total of 16,000 transistors!!
  • Very predictable behavior.
  • Full code simulator for Windows.
  • Designed with OKAD (pronounced "Oh CAD"), the VLSI design
    environment.
9
A Closer look at OKAD
  • Custom software written by Chuck
  • Will allow for full electrical simulation
  • Will allow for macro and micro thermal simulation (this allows
    for better optimization of design performance)
17
Enumera's Background
  • John Sokol started Enumera to be able to expand
    research into parallel computing.
  • This began as an extension from using clusters
    for video compression and high performance web
    and video serving.
  • The first project was 100 Pentium CPUs in a 6"
    rack, with Ethernet multicast boot on diskless
    motherboards. Designs for over 300 CPUs in a 6"
    (single depth) rack were drafted but dropped when
    we found the Chuck Moore CPUs.

18
My background in Parallel computing
  • 1980 Experimented with 8x 6809 CPUs with shared memory (wire
    wrapped); developed a RAM memory bank passing system.
  • 1984 Built a cluster of 20 Commodore 64s.
  • 1988-92 Had the opportunity to work on a Cray Y-MP, then ported
    finite element code for Maxwell's equations to an SGI 4-CPU SMP
    box.
  • 1993 At Sun Microsystems I used 384 globally distributed servers
    to deliver live and recorded video to 13,000 viewers.
  • 1994 At Wells Fargo I worked on an architecture to use 600 HP-UX
    boxes located in the bank branches to reduce load on the
    mainframes.
  • 1995 to present Have been using clusters of loosely coupled
    servers for video compression, serving, neural networks and
    genetic programming.
19
Current version of Chip being developed
These numbers are proprietary.
  process (um)   V     Mips   mA   mW
  .8             5.0    500   20   100   F21 (tested)
  .35            3.0   1100   15    45
  .35            3.3   1200   16    54
  .18            1.5   2000   12    18
  .18            1.8   2400   12    21
Numbers are guesstimates pending simulation. Speed can be throttled
by a factor of 1024, reducing power to uW.
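As a rough sanity check on the figures above, the sketch below recomputes power as volts times milliamps for each row and estimates the throttled power. Two things here are assumptions and not statements from the slides: that the mW column is simply V x mA, and that power falls in direct proportion to clock speed when throttled by 1024. The function names are made up for this example.

```python
# (process_um, volts, mips, milliamps, milliwatts_as_listed_on_the_slide)
ROWS = [
    (0.80, 5.0, 500, 20, 100),
    (0.35, 3.0, 1100, 15, 45),
    (0.35, 3.3, 1200, 16, 54),
    (0.18, 1.5, 2000, 12, 18),
    (0.18, 1.8, 2400, 12, 21),
]


def computed_mw(volts, milliamps):
    """Power in milliwatts from P = V * I (volts times milliamps)."""
    return volts * milliamps


def throttled_uw(milliwatts, factor=1024):
    """Power in microwatts after throttling, assuming linear scaling."""
    return milliwatts * 1000.0 / factor


for process, volts, mips, ma, listed_mw in ROWS:
    print(
        f"{process:.2f} um  {volts:.1f} V  {mips:5d} Mips  "
        f"listed {listed_mw:3d} mW  V*I = {computed_mw(volts, ma):6.1f} mW  "
        f"throttled = {throttled_uw(listed_mw):5.1f} uW"
    )
```

Under these assumptions the listed mW values agree with V x mA to within about 1.5 mW, and throttling by 1024 brings every row below 100 uW.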
20
Two major directions in which this can be used:
  • 1.) As low cost, low power devices, using a
    single core and co-processors.
  • 2.) As an ultra high performance cluster on a
    chip, still low power and cost.

21
As a stand alone processor the F21 already uses
lower power than the DragonBall, ARM processors
or MIPS cores such as the NEC VR4181. The F21 is
also much lower cost, and runs over 100x faster
than a typical Motorola DragonBall (2.7
Mips). Enumera plans to release a version of
this without the video and with some internal DRAM
as a single chip MP3 player. Co-processors dedicated
to audio or I/O processing can be easily added.
22
As a single chip computing cluster (multiple cores
on one chip) there are many advantages. The cores
are asynchronous, so this will average out the
draw on the power supply. Each core will have 4
to 6 bi-directional serial interfaces, each running
at 1 Gigabit per second, to its neighboring CPU
cores. Each Gbps serial link will be addressable as
a simple register read and write. A patent is being
filed for a cooling and power delivery system to
allow for enormous clusters and RAM on very large
chips. In theory it would be possible to build a
single device running at 10 TeraFlops!
23
The minimum size MOSIS (a fab) will charge for is
7 cm2. On a chip this size, a 7x7 array (49 CPUs)
with RAM could be built. Co-processors could also
be added. Each CPU would be operating at 2400
MIPS, x 49 for a total of 117 billion operations
per second. The power consumption would be 1
watt (1.8 volts at 500 mA). With this level of
computing power, new applications that were
unthinkable before now become possible.
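The arithmetic on this slide can be reproduced with a short sketch. The constants come from the slide; the function names are made up for this example, and the `mesh_neighbors` helper assumes a plain 2D grid where each core talks only to its north, south, east and west neighbors. The slides do not specify the actual link topology, so treat that part as a guess.

```python
GRID_SIDE = 7            # 7x7 array of cores, from the slide
MIPS_PER_CORE = 2400     # per-core figure, from the slide
SUPPLY_VOLTS = 1.8
SUPPLY_MILLIAMPS = 500


def total_mips(side, mips_per_core):
    """Aggregate MIPS for a side x side array of identical cores."""
    return side * side * mips_per_core


def supply_watts(volts, milliamps):
    """Power in watts from P = V * I."""
    return volts * milliamps / 1000.0


def mesh_neighbors(row, col, side):
    """Neighbors of a core in an assumed 4-connected 2D mesh."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < side and 0 <= c < side]


print(total_mips(GRID_SIDE, MIPS_PER_CORE), "MIPS in total")
print(supply_watts(SUPPLY_VOLTS, SUPPLY_MILLIAMPS), "W at the supply")
print(len(mesh_neighbors(3, 3, GRID_SIDE)), "links used by an interior core")
```

The product is 117,600 MIPS, which matches the slide's "117 billion operations per second", and 1.8 V at 500 mA works out to 0.9 W, which the slide rounds to 1 watt.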
24
The 0.18 um core is expected to be able to do A/D
conversion at 8 bits at over 1 GHz! A new tracking
A/D design is also expected to increase performance
further. An ultra high speed PWM output design will
reduce the size and complexity of power drivers
for motors and actuators.
25
  • Disposable computing. (MP3 players and PDAs, like
    calculators and watches are today)
  • Portable real time ray traced images.
  • Software radio decoders / receivers, phased
    arrays
  • More advanced error correction and trellis coding
    schemes.
  • More complex servo algorithms.
  • Video and audio codecs in low cost appliances.
  • Encryption
  • Real time embedded systems. Different tasks could
    be assigned to several processors.

26
Applications for the Enumera processor design in hard drives
  • Improved servos and more advanced coding systems than Viterbi
    (increase storage capacity and reduce cost).
  • Reduction of component count. Possibly eliminate external RAM.
  • Addition of video and audio compression.
  • Addition of TCP/IP servers on the drive itself.
  • Addition of RAID on the drive.