Transcript and Presenter's Notes

Title: Foundation of High-Dimensional Data Visualization


1
Foundation of High-Dimensional Data Visualization
  • (Clustering, Classification, and their
    Applications)
  • Chaur-Chin Chen
  • Institute of Information Systems Applications
  • (Department of Computer Science)
  • National Tsing Hua University
  • Hsinchu (新竹), Taiwan (台灣)
  • cchen@cs.nthu.edu.tw
  • October 16, 2013

2
Outline
  • Motivation by Examples
  • Data Description and Representation
  • 8OX and iris Data Sets
  • Supervised vs. Unsupervised Learning
  • Dendrograms of Hierarchical Clustering
  • PCA vs. LDA
  • A Comparison of PCA and LDA
  • Distribution of Volumes of Unit Spheres

3
Apple, Pineapple, Sugar Apple, Waxapple
4
Distinguish Starfruits (carambolas) from
Bellfruits (waxapples)
  • 1. Features (characteristics)
  •    Colors
  •    Shapes
  •    Sizes
  •    Tree leaves
  •    Other quantitative measurements
  • 2. Decision rules / Classifiers
  • 3. Performance Evaluation
  • 4. Classification / Clustering

5
(No Transcript)
6
IRIS Setosa, Virginica, Versicolor
7
Data Description
  • 1. 8OX data set
  • The 8OX data set is derived from Munson's handprinted character
    set. Included are 15 patterns from each of the characters 8, O, X;
    each pattern consists of 8 feature measurements. Sample patterns:
  • 8: 11, 3, 2, 3, 10, 3, 2, 4
  • O: 4, 5, 2, 3, 4, 6, 3, 6
  • X: 11, 2, 10, 3, 11, 4, 11, 3
  • 2. IRIS data set
  • The IRIS data set contains the measurements of three species of
    iris flowers. It consists of 50 patterns from each species on 4
    features (sepal length, sepal width, petal length, petal width).
    Sample patterns (one per species; a loading sketch follows):
  • Setosa: 5.1, 3.5, 1.4, 0.2
  • Versicolor: 7.0, 3.2, 4.7, 1.4
  • Virginica: 6.3, 3.3, 6.0, 2.5
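A minimal loading sketch, assuming MATLAB's Statistics Toolbox, whose
fisheriris data set holds the same 150-by-4 iris measurements:

  load fisheriris      % provides meas (150x4) and species (150x1 cell array)
  size(meas)           % 150 patterns, 4 features
  unique(species)      % 'setosa', 'versicolor', 'virginica'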

8
Supervised and Unsupervised Learning Problems
  • The problem of supervised learning can be defined as designing a
    function which takes the training data xi(k), i = 1, 2, ..., nk,
    k = 1, 2, ..., C, as input vectors, with the output as either a
    single category or a regression curve.
  • Unsupervised learning (Cluster Analysis) is similar to supervised
    learning (Pattern Recognition) except that the categories are
    unknown in the training data. (The two settings are contrasted in
    the sketch below.)
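A minimal contrast of the two settings on the iris data, assuming the
Statistics Toolbox functions classify and kmeans:

  load fisheriris
  % Supervised: class labels (species) accompany the training vectors.
  predicted = classify(meas, meas, species);   % linear discriminant rule
  % Unsupervised: only the vectors are given; 3 groups must be discovered.
  idx = kmeans(meas, 3);                       % cluster labels 1..3, unnamed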

9
Dendrograms of 8OX (30 patterns) and IRIS (30 patterns)
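This slide shows hierarchical-clustering dendrograms of 30-pattern
subsets. A minimal sketch of how such a tree can be produced, assuming
the Statistics Toolbox functions pdist, linkage, and dendrogram (the
subset and linkage method here are illustrative, not necessarily the
author's choices):

  load fisheriris
  X = meas(1:5:end,:);               % 30 of the 150 iris patterns
  Z = linkage(pdist(X), 'average');  % agglomerative merges, average linkage
  dendrogram(Z)                      % draw the merge tree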
10
Problem Statement for PCA
  • Let X be an m-dimensional random vector with the covariance matrix
    C. The problem is to consecutively find the unit vectors a1, a2,
    ..., am such that yi = x^t ai, with Yi = X^t ai, satisfies
  • 1. var(Y1) is the maximum.
  • 2. var(Y2) is the maximum subject to cov(Y2, Y1) = 0.
  • 3. var(Yk) is the maximum subject to cov(Yk, Yi) = 0,
    where k = 3, 4, ..., m and k > i.
  • Yi is called the i-th principal component.
  • Feature extraction by PCA is called PCP (principal component
    projection).

11
The Solutions
  • Let (λi, ui) be the pairs of eigenvalues and eigenvectors of the
    covariance matrix C such that
  • λ1 ≥ λ2 ≥ ... ≥ λm (≥ 0)
  • and
  • ||ui||^2 = 1, for all 1 ≤ i ≤ m.
  • Then
  • ai = ui and var(Yi) = λi for 1 ≤ i ≤ m. (A numeric check follows.)
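A minimal numeric check of this statement: projecting centered data
onto the eigenvectors of its sample covariance matrix reproduces the
eigenvalues as the component variances (random data, purely
illustrative):

  X = randn(200,4);               % any data set will do
  C = cov(X);
  [U,D] = eig(C);
  Xc = X - ones(200,1)*mean(X);   % center the data
  Y = Xc*U;                       % Y(:,i) is the projection onto ui
  disp([var(Y)' diag(D)])         % var(Yi) equals lambda_i, column by column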

12
First and Second PCP for data8OX
13
First and Second PCP for datairis
14
Fundamentals of LDA
  • Given the training patterns x1, x2, ..., xn (m-dimensional column
    vectors) from K categories, where n1 + n2 + ... + nK = n, let the
    between-class scatter matrix B, the within-class scatter matrix W,
    and the total scatter matrix T be defined below.
  • 1. The sample mean vector: u = (x1 + x2 + ... + xn)/n
  • 2. The mean vector of category i is denoted as ui
  • 3. The between-class scatter matrix: B = Σ_{i=1}^{K} ni (ui − u)(ui − u)^t
  • 4. The within-class scatter matrix: W = Σ_{i=1}^{K} Σ_{x ∈ ωi} (x − ui)(x − ui)^t
  • 5. The total scatter matrix: T = Σ_{i=1}^{n} (xi − u)(xi − u)^t
  • Then T = B + W. (A verification sketch follows this list.)
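A minimal sketch that computes B, W, and T for the iris data and checks
T = B + W, assuming the Statistics Toolbox (load fisheriris, grp2idx):

  load fisheriris
  [G, names] = grp2idx(species);       % class indices 1..K
  [n, m] = size(meas);  K = max(G);
  u = mean(meas)';                     % overall sample mean
  B = zeros(m);  W = zeros(m);
  for i = 1:K
      Xi = meas(G == i, :);  ni = size(Xi,1);
      ui = mean(Xi)';                  % class mean
      B = B + ni*(ui - u)*(ui - u)';   % between-class scatter
      Xc = Xi - ones(ni,1)*ui';
      W = W + Xc'*Xc;                  % within-class scatter
  end
  Xc = meas - ones(n,1)*u';
  T = Xc'*Xc;                          % total scatter
  disp(norm(T - (B + W)))              % essentially zero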

15
Fishers Discriminant Ratio
  • Linear discriminant analysis for a dichotomous problem attempts to
    find an optimal direction w for projection which maximizes Fisher's
    discriminant ratio
  • J(w) = (w^t B w) / (w^t W w)
  • By letting n = n1 + n2, the optimization problem is reduced to
    solving the generalized eigenvalue/eigenvector problem Bw = λWw.
  • Similarly, for multiclass (more than 2 classes) problems, the
    objective is to find the first few vectors for discriminating
    points in different categories, which is also based on optimizing
    J(w), i.e., solving Bw = λWw for the eigenvectors associated with
    the few largest eigenvalues. (A sketch follows.)
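A minimal sketch of the multiclass case, reusing the B and W computed
in the previous sketch; MATLAB's eig(B, W) solves the generalized
problem directly:

  [V, D] = eig(B, W);                  % generalized eigenvectors of Bw = lambda*Ww
  [~, order] = sort(diag(D), 'descend');
  Wlda = V(:, order(1:2));             % two most discriminative directions
  Ylda = meas * Wlda;                  % 2-D LDA projection of the iris data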

16
LDA and PCA on data8OX
  • LDA on data8OX
  • PCA on data8OX

17
LDA and PCA on datairis
  • LDA on datairis
  • PCA on datairis

18
Projection of First 3 Principal Components for
data8OX
19
pca8OX.m
  fin = fopen('data8OX.txt','r');
  d = 8+1; N = 45;                    % d = features + 1, N patterns
  fgetl(fin); fgetl(fin); fgetl(fin); % skip 3 header lines
  A = fscanf(fin,'%f',[d N]); A = A'; % read data
  X = A(:,1:d-1);                     % remove the last column
  k = 3; Y = PCA(X,k);                % better Matlab code
  X1 = Y(1:15,1);  Y1 = Y(1:15,2);  Z1 = Y(1:15,3);
  X2 = Y(16:30,1); Y2 = Y(16:30,2); Z2 = Y(16:30,3);
  X3 = Y(31:45,1); Y3 = Y(31:45,2); Z3 = Y(31:45,3);
  plot3(X1,Y1,Z1,'d',X2,Y2,Z2,'O',X3,Y3,Z3,'X','markersize',12); grid
  axis([4 24 -2 18 -10 25])
  legend('8','O','X')
  title('First Three Principal Component Projection for 8OX Data')

20
PCA.m
  % Script file PCA.m
  % Find the first K principal components of data X
  % X contains n pattern vectors with d features
  function Y = PCA(X,K)
  [n,d] = size(X);
  C = cov(X);                      % d-by-d sample covariance matrix
  [U,D] = eig(C);                  % eigenvectors U, eigenvalues on diag(D)
  L = diag(D);
  [sorted,index] = sort(L,'descend');
  Xproj = zeros(d,K);              % initiate a projection matrix
  for j = 1:K
      Xproj(:,j) = U(:,index(j));  % eigenvector of the j-th largest eigenvalue
  end
  Y = X*Xproj;                     % first K principal components
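As a usage note, Y = PCA(X,3) with the 45-by-8 matrix X from pca8OX.m
yields the 45-by-3 projection plotted on slide 18. Newer MATLAB
releases with the Statistics Toolbox also provide a built-in
equivalent; unlike PCA.m, it centers the data before projecting, so
its scores differ from PCA.m's output by a constant shift:

  [coeff, score] = pca(X);     % built-in PCA (centers X internally)
  Y = score(:,1:3);            % first three principal components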