Title: An introduction to Apache Storm
1 Apache Storm
- What is it?
- Architecture
- Storm vs Hadoop
- History
- Terms
www.semtech-solutions.co.nz info_at_semtech-solutions.co.nz
2 Apache Storm: What is it?
- A real-time big data processing system
- Stream based
- Fault tolerant and distributed
- Non persistent
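- Entered the Apache Incubator in 2013; now a top-level Apache project
- Written in Clojure and Java
- Originally released under the Eclipse Public License; Apache releases use the Apache License 2.0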
3 Apache Storm: Storm vs Hadoop
- Hadoop
  - Distributed, fault tolerant
  - Batch / file based
  - Master/slave plus ZooKeeper
  - Persistent, uses HDFS
  - Big data analysis
- Storm
  - Distributed, fault tolerant
  - Real-time / stream based
  - Master/slave plus ZooKeeper
  - Non persistent
  - Big data analysis
4 Apache Storm: Storm vs Hadoop
- Hadoop versus Storm
  - They are complementary technologies
  - They might both be used in a single system
    - Storm to process real-time streams of data
    - Hadoop and MapReduce to process batched data on HDFS
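
The batch versus stream distinction can be illustrated with a small sketch. This is plain Python, not Hadoop or Storm code; the function names `batch_count` and `stream_count` are hypothetical and chosen only for illustration. The batch version needs the whole data set before it starts, while the stream version yields an updated result after every item.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def batch_count(words: List[str]) -> Counter:
    """Batch style: the whole data set is available before processing starts."""
    return Counter(words)


def stream_count(words: Iterable[str]) -> Iterator[Counter]:
    """Stream style: update a running result as each item arrives."""
    running: Counter = Counter()
    for word in words:
        running[word] += 1
        yield Counter(running)  # snapshot of the result so far


events = ["storm", "hadoop", "storm"]

batch_result = batch_count(events)                    # one answer at the end
stream_snapshots = list(stream_count(iter(events)))   # one answer per event
```

Both approaches reach the same final counts; the streaming version simply makes intermediate results available while data is still arriving.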
5 Apache Storm: Architecture
- Storm architecture at a high level
6 Apache Storm: Architecture
- Composed of streams of tuples, joined together by bolts
- Streams are sourced via spouts
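
As a rough illustration of the spout and bolt idea, here is a minimal sketch in plain Python using generators. It does not use the Storm API; `sentence_spout`, `split_bolt` and `length_filter_bolt` are hypothetical names, and the hard-coded sentences stand in for a real, unbounded source.

```python
from typing import Iterable, Iterator, Tuple


def sentence_spout() -> Iterator[Tuple[str]]:
    """Spout: the source of a stream. Here it emits two fixed one-field tuples."""
    for sentence in ["storm processes streams", "bolts process tuples"]:
        yield (sentence,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def length_filter_bolt(
    stream: Iterable[Tuple[str]], min_length: int = 6
) -> Iterator[Tuple[str, int]]:
    """Bolt: keeps only words of at least min_length and adds their length."""
    for (word,) in stream:
        if len(word) >= min_length:
            yield (word, len(word))


# Chain the spout into the two bolts and collect what comes out the far end.
result = list(length_filter_bolt(split_bolt(sentence_spout())))
# result == [("processes", 9), ("streams", 7), ("process", 7), ("tuples", 6)]
```

Each stage only sees tuples, so bolts can be rearranged or replaced without changing the spout.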
7 Apache Storm: Architecture
- From these components we form topologies
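
One way to picture a topology is as a graph whose nodes are components and whose edges are streams. The sketch below is a toy, single-process model in plain Python, not the Storm API; the `Topology` class and its `add_bolt` and `run` methods are assumptions made for illustration. It shows one stream fanning out to two downstream bolts.

```python
from collections import defaultdict, deque


class Topology:
    """Toy topology: named bolts wired together into a directed graph."""

    def __init__(self):
        self.bolts = {}                    # bolt name -> function(tuple) -> list of tuples
        self.edges = defaultdict(list)     # component name -> downstream bolt names
        self.outputs = defaultdict(list)   # bolt name -> every tuple it emitted

    def add_bolt(self, name, func, source):
        """Register a bolt and subscribe it to the stream emitted by `source`."""
        self.bolts[name] = func
        self.edges[source].append(name)

    def run(self, spout_name, spout):
        """Push each spout tuple through every bolt reachable from the spout."""
        for tup in spout:
            queue = deque((bolt, tup) for bolt in self.edges[spout_name])
            while queue:
                name, item = queue.popleft()
                for out in self.bolts[name](item):
                    self.outputs[name].append(out)
                    queue.extend((down, out) for down in self.edges[name])
        return self.outputs


topology = Topology()
topology.add_bolt("split", lambda t: [(w,) for w in t[0].split()], source="lines")
topology.add_bolt("upper", lambda t: [(t[0].upper(),)], source="split")
topology.add_bolt("length", lambda t: [(t[0], len(t[0]))], source="split")

# A short list stands in for the spout named "lines".
outputs = topology.run("lines", [("real time",), ("stream",)])
```

In this sketch the "split" bolt feeds both "upper" and "length", so every word tuple is processed twice, once per downstream bolt.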
8 Apache Storm: History
- What is Apache Storm's history?
- Developed by BackType
- Acquired by Twitter
- Open sourced by Twitter in Sept 2011
- Added to Apache Incubator in 2013
9 Apache Storm: Terms
- Tuple: an ordered list of elements
- Stream: an unbounded feed of tuples
- Spout: like a tap or faucet, a source of streams
- Bolt: functions, filters etc. to process streams
- Topologies: ETL-like architectures built from spouts, streams and bolts
- Nimbus: master node, like the Hadoop job tracker
- Supervisor: controls worker processes
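
The first two terms can be made concrete with a short sketch. This is plain Python rather than Storm code, and the `Reading` tuple, its field names and the `reading_stream` generator are hypothetical. It models a tuple as an ordered list of named values and a stream as a feed that never ends on its own.

```python
from collections import namedtuple
from itertools import count, islice

# A tuple: an ordered list of elements, here with a name for each position.
Reading = namedtuple("Reading", ["sensor_id", "sequence", "value"])


def reading_stream(sensor_id):
    """A stream: an unbounded feed of tuples (this generator never finishes)."""
    for sequence in count():
        yield Reading(sensor_id, sequence, sequence * 0.5)


# A consumer takes as many tuples as it wants; the stream itself has no end.
first_three = list(islice(reading_stream("s1"), 3))
```

Because the stream is unbounded, consumers process tuples as they arrive instead of waiting for a complete data set.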
10 Contact Us
- Feel free to contact us at
- www.semtech-solutions.co.nz
- info_at_semtech-solutions.co.nz
- We offer IT project consultancy
- We are happy to hear about your problems
- You pay only for the hours you need to solve your problems