Transcript and Presenter's Notes

Title: Research Overview of ICAMS Laboratory


1
Research Overview of ICAMS Laboratory
  • Presented to Dean Robinson from GE Global Research
    and Jim Dolle from GE Aircraft Engines

2
Overview
Samuel H. Huang, Associate Professor of Industrial
Engineering
3
Intelligent CAM Systems Laboratory
  • Established in September 1998 at the University
    of Toledo, relocated to the University of
    Cincinnati in September 2001
  • Current team members:
  • 1 Post-Doc research associate
  • 2 Ph.D. students (1 at the University of Toledo)
  • 8 M.S. students
  • 1 undergraduate student
  • Alumni: 1 Post-Doc, 17 graduate students (2 Ph.D.
    dissertations, 6 M.S. theses, 9 M.S. projects),
    and 1 undergraduate student

4
Research Overview
5
Knowledge-based Engineering
  • John J. Shi
  • Post-Doc Fellow

6
KBE Framework Project Progress
7
Raw data
  • Determine parameters to collect
  • Raw data processing:
  • Normalization
  • Transformation
  • Discretization (equal-width binning is sketched
    after this list):
  • Equal width
  • Greedy Chi-merge
  • Data split
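
As a concrete illustration of the first discretization option above, here is a minimal equal-width binning sketch in Python (numpy only); the bin count `n_bins` is an assumed illustration parameter, not something the slide specifies:

```python
import numpy as np

def equal_width_discretize(values, n_bins=5):
    """Map continuous values to integer bin labels 0..n_bins-1
    using equal-width intervals (n_bins is an assumed choice)."""
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Digitizing against the interior edges yields labels 0..n_bins-1,
    # so the minimum and maximum both land in valid bins.
    return np.digitize(values, edges[1:-1])

# Example: discretize a noisy sensor column into 5 equal-width bins.
x = np.random.default_rng(0).normal(size=100)
print(equal_width_discretize(x))
```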

8
Data Cleansing
  • Noise/Error in data
  • Filtering and correcting
  • Missing data manipulation:
  • General methods (imputation): mean, closest-fit,
    and regression (mean and closest-fit are sketched
    after this list)
  • A combined ML (Maximum Likelihood)/EM (Expectation
    Maximization) approach
  • Reversed neural computing technique
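
A minimal sketch of the two simplest imputation methods named above, assuming a numeric array with NaN marking missing entries (the function names are illustrative):

```python
import numpy as np

def mean_impute(X):
    """Replace each NaN with the mean of the observed values in its column."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def closest_fit_impute(X):
    """Fill NaNs in a row from the most similar complete row, where
    similarity is Euclidean distance over the columns observed in both."""
    X = X.copy()
    complete = X[~np.isnan(X).any(axis=1)]
    for i in np.where(np.isnan(X).any(axis=1))[0]:
        obs = ~np.isnan(X[i])
        d = np.linalg.norm(complete[:, obs] - X[i, obs], axis=1)
        X[i, ~obs] = complete[d.argmin(), ~obs]
    return X
```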

Niharika
9
Dimensionality Reduction
  • PCA with clustering technique
  • Rank input parameters using PCA (a ranking sketch
    follows this list)
  • Stop/reduce criteria:
  • Chi-square testing of the independence of
    categorical data
  • Criterion: inconsistency rate
  • Neural network pruning
  • Discretization
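
A minimal sketch of PCA-based parameter ranking; scoring each parameter by its loading magnitudes weighted by explained variance is one common heuristic, assumed here since the slide does not fix the scoring rule:

```python
import numpy as np

def pca_rank_parameters(X):
    """Rank columns of X (samples x parameters) by PCA importance.

    Score = sum over components of |loading|, weighted by each
    component's explained-variance ratio (an assumed heuristic).
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    weights = eigvals / eigvals.sum()
    scores = (np.abs(eigvecs) * weights).sum(axis=1)
    return np.argsort(scores)[::-1]   # most important parameter first
```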

Saurabh
10
Rule Extraction
  • Decision tree
  • ID3, C4.5, etc.
  • Chi2 statistic test (a merge-test sketch follows
    this list)
  • Clustering
  • Subtractive Clustering
  • Two linguistic terms that are not statistically
    different can be merged.
  • Neural networks
  • Classify continuous-valued inputs into linguistic
    term sets
  • Represent sets using a binary scheme
  • Dynamic Depth-first Searching
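
A minimal sketch of the chi-square merge test implied above, assuming each linguistic term is summarized by its class-frequency counts; it uses scipy.stats.chi2_contingency, and `alpha` is an assumed significance level:

```python
import numpy as np
from scipy.stats import chi2_contingency

def can_merge(term_a_counts, term_b_counts, alpha=0.05):
    """True if the class distributions of two linguistic terms are
    not statistically different, so the terms may be merged."""
    table = np.array([term_a_counts, term_b_counts])
    _, p, _, _ = chi2_contingency(table)
    return p > alpha

# Terms with similar class mixes merge; very different ones do not.
print(can_merge([30, 10], [28, 12]))   # True: similar distributions
print(can_merge([30, 10], [5, 35]))    # False: clearly different
```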

11
Rule Refinement
  • Transfer compound rules
  • Remove redundant rules
  • Remove overlapping rules:
  • Similar rules
  • Mergeable similar rule groups
  • Combine rules (a merge sketch follows this list):
  • Accuracy of prediction
  • Possibility of merging
  • Rule pruning according to weights
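
A minimal sketch of one rule-combination step; representing a rule as per-input intervals plus a consequent is an assumption made for illustration, not the laboratory's actual rule format:

```python
def try_merge(rule_a, rule_b):
    """Merge two rules if they share a consequent and their antecedent
    intervals overlap on every input; otherwise return None.

    A rule is ({input_name: (low, high), ...}, consequent).
    """
    ants_a, out_a = rule_a
    ants_b, out_b = rule_b
    if out_a != out_b or ants_a.keys() != ants_b.keys():
        return None
    merged = {}
    for name in ants_a:
        (lo1, hi1), (lo2, hi2) = ants_a[name], ants_b[name]
        if hi1 < lo2 or hi2 < lo1:   # disjoint intervals: cannot merge
            return None
        merged[name] = (min(lo1, lo2), max(hi1, hi2))
    return merged, out_a

r1 = ({"temp": (0.0, 0.5)}, "normal")
r2 = ({"temp": (0.4, 0.8)}, "normal")
print(try_merge(r1, r2))   # ({'temp': (0.0, 0.8)}, 'normal')
```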

12
AMFM rule tuning/adaptation
  • Construction
  • Tuning
  • Adaptation

Ranga
13
Model validation
  • Validation criteria:
  • Traditional statistical criteria: MSE/RMSE, R2,
    F-test, etc.
  • PRESS (Prediction Sum of Squares)
  • Akaike Information Criterion (AIC) (a computation
    sketch follows this list)
  • Preliminary conclusions:
  • Some criteria cannot be applied to evaluate
    soft-computing techniques.
  • AIC is a good criterion.
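
A minimal sketch of AIC in its least-squares form, AIC = n·ln(SSE/n) + 2k, with n residuals and k free model parameters:

```python
import numpy as np

def aic(y_true, y_pred, n_params):
    """Akaike Information Criterion, least-squares form:
    AIC = n * ln(SSE / n) + 2k. Lower values indicate a better
    trade-off between fit quality and model complexity."""
    resid = np.asarray(y_true) - np.asarray(y_pred)
    n = resid.size
    sse = float(resid @ resid)
    return n * np.log(sse / n) + 2 * n_params
```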

14
Applications: When to use KBE?
Drop Hammer Forming
Atomizer Performance
Thermal Paint Calibration
  • KBE can be used to:
  • Predict,
  • Simulate,
  • Analyze,
  • Report, etc.

Seamless tubing process
15
Data Cleansing: Dealing with Missing Data
  • Niharika

16
Introduction
  • Incomplete data can arise in a number of cases:
  • insufficient samples of data
  • incorrect data collection
  • sensor failure in time-series data
  • samples of data that are impossible to obtain
    when modeling exploratory data
  • calibration transfer (from master to slave
    instruments)

17
Types of Missing Data
  • Missing data has been characterized into 3 main
    types based on the patterns of missing values
    that occur in data sets (each mechanism is
    simulated in the sketch after this list)
  • MCAR - Missing Completely At Random
  • the probability that an element is missing is
    independent of both observed and missing values
  • MAR - Missing At Random
  • the probability that an element is missing
    depends only on observed values
  • Non-ignorable (MNAR)
  • the probability that an element is missing
    depends on the missing values themselves
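
A minimal sketch contrasting the three mechanisms by simulating each on a two-column dataset; the thresholds and the 20% missingness probability are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)   # always observed
y = rng.normal(size=500)   # subject to missingness

# MCAR: y is missing with a fixed probability, independent of x and y.
y_mcar = np.where(rng.random(500) < 0.2, np.nan, y)

# MAR: y is missing whenever the *observed* x is large.
y_mar = np.where(x > 1.0, np.nan, y)

# Non-ignorable (MNAR): y is missing when y *itself* is large,
# so the mechanism depends on the value that goes unobserved.
y_mnar = np.where(y > 1.0, np.nan, y)
```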

18
Characteristics of Methods to Deal with Missing
Data
  • Missing Data Algorithms have been proposed based
    on the assumption that data is missing at random
  • These methods have incorporated different
    techniques of imputation and multivariate data
    analysis
  • A single efficient algorithm that can be
    universally applied is yet to be proposed
  • Combinations of the existing methods are being
    evaluated to get better convergence of predicted
    and actual values in real data sets

19
Methods Proposed To Deal With Missing Data
20
Methods Proposed To Deal With Missing Data
  • Current methods of data analysis are designed to
    preserve the partial knowledge that was ignored
    by earlier methods
  • Data variability is taken into consideration
  • They perform better than the previously mentioned
    methods because they are designed to reproduce
    the data within experimental error
  • However, these methods are built on a set of
    assumptions which may not hold in real data sets

21
Current Methods
  • Multivariate Analysis:
  • Principal Components Analysis
  • Statistical Methods:
  • Partial Least Squares
  • Principal Component Regression
  • Combination Methods:
  • Expectation Maximization PCA
  • Maximum Likelihood PCA
  • Multiple Imputation
  • Neural Networks
  • Clustering-based Techniques

22
Comparison of Current Methods based on Performance
  • The comparison of the current methods was done on
    the basis of two factors:
  • Similarity Factor
  • This factor is used to judge the convergence of
    the predicted values with the actual values of
    the reference data set:
    Similarity = 100 - average(Sdiff), where Sdiff
    denotes the percentage differences between the
    actual and predicted values (computed in the
    sketch after this list)
  • Number of Iterations
  • This factor determines the length of the run and
    thus the complexity of the algorithm
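
A minimal sketch of the similarity factor, reading "average of Sdiff" as the mean absolute percentage difference between actual and predicted values (that reading is an assumption):

```python
import numpy as np

def similarity(actual, predicted):
    """Similarity factor: 100 minus the mean absolute percentage
    difference between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    pct_diff = 100.0 * np.abs(actual - predicted) / np.abs(actual)
    return 100.0 - pct_diff.mean()

# Predictions within a few percent give a similarity near 100.
print(similarity([10.0, 20.0, 40.0], [11.0, 19.0, 42.0]))  # ~93.33
```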

23
Comparison Statistics of Current Methods
24
Scope of Current Research
  • Based on the comparison statistics of the methods
    tested, EM PCA has shown the best results so far,
    both in terms of similarity and number of
    iterations
  • Algorithms incorporating cluster analysis and
    combinations of the previously tested methods are
    being analyzed to get better convergence in less
    time
  • A solution that not only fills in missing values
    but also handles outliers and noise in data sets
    is under development

25
Dimensionality Reduction
  • Saurabh Dwivedi

26
Introduction
  • What is Dimensionality Reduction?
  • The procedure of selecting a subset of process
    parameters that are necessary and sufficient to
    represent the system under consideration (without
    significantly affecting system accuracy) is
    referred to as 'Dimensionality Reduction'.
  • Why Dimensionality Reduction?
  • to select sufficient parameters to represent the
    system
  • to discard redundant information
  • to reduce the time and cost of any further data
    collection and analysis for system monitoring
    and control

27
Basic Steps in Dimensionality Reduction
  • A Generation Procedure: generates a subset of
    features for evaluation.
  • An Evaluation Function: measures the goodness of
    the subset produced by the generation procedure.
  • A Stopping Criterion: a criterion to avoid an
    exhaustive run of the dimensionality reduction
    procedure.
  • Validation: tests the validity of the selected
    subset of parameters through tests on artificial
    and real-world datasets.

28
Dimensionality Reduction Procedure
[Flowchart: Original Data → Generation → Subset →
Evaluation → Goodness of Subset → Stopping Criterion
(No: back to Generation; Yes: Validation)]
29
Characteristics of the Developed Technique for
Dimensionality Reduction
  • The technique addresses two problems:
  • Classification problem: the output of interest
    (response variable) is discrete.
  • Function approximation problem: the output of
    interest takes continuous values.
  • The technique can deal with parameters (both
    independent and dependent) that are discrete,
    continuous, or both.
  • The subset of features (parameters) is generated
    using a guided sequential backward search
    mechanism (sketched after this list).
  • Principal Components Analysis is used for the
    ranking (guided search) of features.
  • Clustering is used to measure the goodness of any
    subset.
  • For the classification problem, classification
    error is used as the evaluation function; for the
    approximation problem, it is the ratio of
    inter-cluster variance to intra-cluster variance.
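
A minimal sketch of the guided sequential backward search: parameters are dropped in reverse PCA-rank order while an evaluation function stays acceptable; `evaluate` (e.g., classification accuracy or the variance ratio above) and `tol` are assumed placeholders:

```python
import numpy as np

def backward_search(X, rank, evaluate, tol=0.05):
    """Guided sequential backward feature elimination.

    X: samples x parameters; rank: column indices ordered most
    important first (e.g., from PCA); evaluate: callable scoring a
    column subset, higher is better; tol: assumed tolerance on the
    allowed degradation from the full-set baseline score.
    """
    kept = list(rank)
    baseline = evaluate(X[:, kept])
    for col in list(rank)[::-1]:      # try least important first
        trial = [c for c in kept if c != col]
        if trial and evaluate(X[:, trial]) >= baseline - tol:
            kept = trial              # drop the parameter for good
    return kept
```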

30
Flowchart for the Developed Technique
[Flowchart: Complete Dataset → Normalization of the
Inputs → Clustering → Calculation of the Evaluation
Function → Ranking of Parameters using PCA →
Decision Making → Reduced Dataset]
31
Advantages of the Developed Technique Over
Existing Methods
  • The developed technique can be used with datasets
    comprising both discrete and continuous
    parameters.
  • It can be used for both classification and
    approximation problems.

32
Industrial Case Study Lorain Pipe Mills
  • Lorain Pipe Mills, a division of United States
    Steel (USS), is located west of Cleveland, Ohio,
    in the city of Lorain.
  • Its rotary rolling process can produce seamless
    pipes with lengths exceeding 40 feet and
    diameters up to 26 inches.
  • Lorain Pipe Mills was experiencing a low-yield
    problem; to address it, process data was acquired
    and analyzed.

33
Dimensionality Reduction for Lorain Data
  • Process data was collected on 8 input parameters
    and a prediction model (using a feed-forward
    neural network) was developed.
  • The developed algorithm for dimensionality
    reduction was run on the acquired data; the
    following table summarizes the results.
  • After running the algorithm, only 2 parameters
    were left in the model; the remaining 6 were
    discarded.

34
Adaptive Mamdani Fuzzy Model
  • Ranganath Kothamasu

35
Adaptive Mamdani Fuzzy Model (AMFM)
  • AMFM is an adaptive template that can create
    real-time or offline models across domains
  • AMFM is a combination of neural networks and
    fuzzy inference systems
  • It can be used to create data-driven models
    (solutions)
  • It can also use high-level heuristic knowledge in
    the modeling process (a plain Mamdani inference
    sketch follows this list)
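
Since AMFM builds on Mamdani inference, here is a minimal sketch of plain single-input Mamdani inference with Gaussian membership functions and centroid defuzzification; the two rules and all parameters are illustrative assumptions, not the AMFM architecture itself:

```python
import numpy as np

def gauss(x, center, sigma):
    """Gaussian membership function."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def mamdani(x, rules, y_grid):
    """Single-input Mamdani inference with centroid defuzzification.

    rules: list of ((in_center, in_sigma), (out_center, out_sigma)).
    Each rule's firing strength clips its output set (min), the
    clipped sets are max-combined, then the centroid is taken.
    """
    agg = np.zeros_like(y_grid)
    for (ic, isig), (oc, osig) in rules:
        strength = gauss(x, ic, isig)
        agg = np.maximum(agg, np.minimum(strength, gauss(y_grid, oc, osig)))
    return (y_grid * agg).sum() / agg.sum()

# Two illustrative rules: "low input -> low output", "high -> high".
rules = [((0.0, 1.0), (0.0, 1.0)), ((5.0, 1.0), (10.0, 1.0))]
print(mamdani(2.5, rules, np.linspace(-5, 15, 401)))
```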

36
AMFM Architecture
37
AMFM Modeling Process
  • Prepare data into patterns with inputs and
    desired (observed) outputs
  • Split the data into training and validation
    datasets
  • Acquire and formalize a priori domain knowledge
  • Extract and validate knowledge from training data
  • Set up the AMFM architecture from the integrated
    knowledge
  • Initialize model parameters based on the training
    data
  • Train the architecture to the desired accuracy
  • Validate the developed model

38
AMFM, HyFIS and ANFIS
39
Intelligent Condition Based Maintenance (ICBM)
  • Objectives of ICBM:
  • Failure prediction
  • Failure diagnosis
  • Necessity for ICBM:
  • Eliminate breakdowns
  • Assist in production scheduling
  • Synchronize the JIT components
  • Elucidate the process and machine-component
    interactions
  • ICBM = AMFM + CBM

40
ICBM Architecture
41
ICBM Model Development Cycle
42
Engine diagnosis case study
  • The objective was to create an ICBM model that can
    continuously monitor/diagnose the state of an
    engine.
  • Possible failure modes are turbine deterioration
    and compressor leak.
  • Data (11-dimensional) includes state variables
    such as inter-turbine temperature, fuel flow,
    shaft speed, and vibration.
  • Two sets of features (time-series and diagnostic
    based) were extracted from three parameters:
  • Kurtosis, spike, and trend (sketched after this
    list)
  • Knowledge was extracted using subtractive
    clustering.
  • AMFM was used to create the diagnosis model:
  • Extracted knowledge was used to set up the
    architecture
  • Extracted feature data was split into development
    and validation data
  • Model parameters were initialized from the
    clusters
  • The model was recursively developed until the
    desired accuracy was reached
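
A minimal sketch of the three time-series features named above for one window of samples; the spike and trend definitions here (max deviation in standard deviations, least-squares slope) are assumed illustrations of what those feature names typically mean:

```python
import numpy as np
from scipy.stats import kurtosis

def window_features(signal):
    """Kurtosis, spike, and trend features for one window of samples."""
    signal = np.asarray(signal, dtype=float)
    kurt = kurtosis(signal)                      # tailedness of the window
    spike = np.max(np.abs(signal - signal.mean())) / signal.std()
    t = np.arange(signal.size)
    trend = np.polyfit(t, signal, 1)[0]          # least-squares slope
    return kurt, spike, trend

# Example: features for a 50-sample window of inter-turbine temperature.
print(window_features(np.random.default_rng(0).normal(600.0, 5.0, 50)))
```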

43
Model summary
  • A total of seven inputs (features) were used.
  • The output is the probability of occurrence of
    each failure mode.
  • Seven rules were extracted using the clustering
    technique.
  • The network was trained for 1000 epochs.
  • The model is 95% accurate.
  • There were no false alarms.
  • Signal-based features were extracted using
    wavelet decomposition and were found to be
    equally effective.

44
Conclusions
  • ICBM provides a generic algorithmic approach to
    developing maintenance solutions
  • It defines a schema spanning data acquisition,
    model development, and the accumulation of
    maintenance knowledge
  • AMFM is capable of creating adaptive and precise
    models and is a suitable tool for ICBM
  • The models are real-time, noise tolerant, and
    modifiable