Transcript and Presenter's Notes

Title: Research Overview of ICAMS Laboratory


1
Research Overview of ICAMS Laboratory
  • Presented to Dean Robinson from GE Global Research
    and Jim Dolle from GE Aircraft Engines

2
Overview
Samuel H. Huang, Associate Professor of Industrial
Engineering
3
Intelligent CAM Systems Laboratory
  • Established in September 1998 at the University
    of Toledo, relocated to the University of
    Cincinnati in September 2001
  • Current team members:
  • 1 Post-Doc research associate
  • 2 Ph.D. students (1 at the University of Toledo)
  • 8 M.S. students
  • 1 undergraduate student
  • Alumni: 1 Post-Doc, 17 graduate students (2 Ph.D.
    dissertations, 6 M.S. theses, 9 M.S. projects),
    and 1 undergraduate student

4
Research Overview
5
Knowledge-based Engineering
  • John J. Shi
  • Post-Doc Fellow

6
KBE Framework Project Progress
7
Raw data
  • Determine parameters to collect
  • Raw data processing:
  • Normalization
  • Transformation
  • Discretization (equal-width binning is sketched
    after this list):
  • Equal width
  • Greedy Chi-merge
  • Data split
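
As a concrete illustration of the first discretization option above, here is a minimal equal-width binning sketch in Python (numpy only); the bin count `n_bins` is an assumed illustration parameter, not something the slide specifies:

```python
import numpy as np

def equal_width_discretize(values, n_bins=5):
    """Map continuous values to integer bin labels 0..n_bins-1
    using equal-width intervals (n_bins is an assumed choice)."""
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Digitizing against the interior edges yields labels 0..n_bins-1,
    # so the minimum and maximum both land in valid bins.
    return np.digitize(values, edges[1:-1])

# Example: discretize a noisy sensor column into 5 equal-width bins.
x = np.random.default_rng(0).normal(size=100)
print(equal_width_discretize(x))
```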

8
Data Cleansing
  • Noise/Error in data
  • Filtering and correcting
  • Missing data manipulation:
  • General methods (imputation): mean, closest-fit,
    and regression (mean and closest-fit are sketched
    after this list)
  • A combined ML (Maximum Likelihood)/EM (Expectation
    Maximization) approach
  • Reversed neural computing technique
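
A minimal sketch of the two simplest imputation methods named above, assuming a numeric array with NaN marking missing entries (the function names are illustrative):

```python
import numpy as np

def mean_impute(X):
    """Replace each NaN with the mean of the observed values in its column."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def closest_fit_impute(X):
    """Fill NaNs in a row from the most similar complete row, where
    similarity is Euclidean distance over the columns observed in both."""
    X = X.copy()
    complete = X[~np.isnan(X).any(axis=1)]
    for i in np.where(np.isnan(X).any(axis=1))[0]:
        obs = ~np.isnan(X[i])
        d = np.linalg.norm(complete[:, obs] - X[i, obs], axis=1)
        X[i, ~obs] = complete[d.argmin(), ~obs]
    return X
```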

Niharika
9
Dimensionality Reduction
  • PCA with clustering technique
  • Rank input parameters using PCA (a ranking sketch
    follows this list)
  • Stop/reduce criteria:
  • Chi-square testing of the independence of
    categorical data
  • Criterion: inconsistency rate
  • Neural network pruning
  • Discretization
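
A minimal sketch of PCA-based parameter ranking; scoring each parameter by its loading magnitudes weighted by explained variance is one common heuristic, assumed here since the slide does not fix the scoring rule:

```python
import numpy as np

def pca_rank_parameters(X):
    """Rank columns of X (samples x parameters) by PCA importance.

    Score = sum over components of |loading|, weighted by each
    component's explained-variance ratio (an assumed heuristic).
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    weights = eigvals / eigvals.sum()
    scores = (np.abs(eigvecs) * weights).sum(axis=1)
    return np.argsort(scores)[::-1]   # most important parameter first
```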

Saurabh
10
Rule Extraction
  • Decision tree
  • ID3, C4.5, etc.
  • Chi2 statistic test (a merge-test sketch follows
    this list)
  • Clustering
  • Subtractive Clustering
  • Two linguistic terms that are not statistically
    different can be merged.
  • Neural networks
  • Classify continuous-valued inputs into linguistic
    term sets
  • Represent sets using a binary scheme
  • Dynamic Depth-first Searching
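
A minimal sketch of the chi-square merge test implied above, assuming each linguistic term is summarized by its class-frequency counts; it uses scipy.stats.chi2_contingency, and `alpha` is an assumed significance level:

```python
import numpy as np
from scipy.stats import chi2_contingency

def can_merge(term_a_counts, term_b_counts, alpha=0.05):
    """True if the class distributions of two linguistic terms are
    not statistically different, so the terms may be merged."""
    table = np.array([term_a_counts, term_b_counts])
    _, p, _, _ = chi2_contingency(table)
    return p > alpha

# Terms with similar class mixes merge; very different ones do not.
print(can_merge([30, 10], [28, 12]))   # True: similar distributions
print(can_merge([30, 10], [5, 35]))    # False: clearly different
```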

11
Rule Refinement
  • Transfer compound rules
  • Remove redundant rules
  • Remove overlapping rules:
  • Similar rules
  • Mergeable similar rule groups
  • Combine rules (a merge sketch follows this list):
  • Accuracy of prediction
  • Possibility of merging
  • Rule pruning according to weights
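
A minimal sketch of one rule-combination step; representing a rule as per-input intervals plus a consequent is an assumption made for illustration, not the laboratory's actual rule format:

```python
def try_merge(rule_a, rule_b):
    """Merge two rules if they share a consequent and their antecedent
    intervals overlap on every input; otherwise return None.

    A rule is ({input_name: (low, high), ...}, consequent).
    """
    ants_a, out_a = rule_a
    ants_b, out_b = rule_b
    if out_a != out_b or ants_a.keys() != ants_b.keys():
        return None
    merged = {}
    for name in ants_a:
        (lo1, hi1), (lo2, hi2) = ants_a[name], ants_b[name]
        if hi1 < lo2 or hi2 < lo1:   # disjoint intervals: cannot merge
            return None
        merged[name] = (min(lo1, lo2), max(hi1, hi2))
    return merged, out_a

r1 = ({"temp": (0.0, 0.5)}, "normal")
r2 = ({"temp": (0.4, 0.8)}, "normal")
print(try_merge(r1, r2))   # ({'temp': (0.0, 0.8)}, 'normal')
```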

12
AMFM rule tuning/adaptation
  • Construction
  • Tuning
  • Adaptation

Ranga
13
Model validation
  • Validation criteria:
  • Traditional statistical criteria: MSE/RMSE, R2,
    F-test, etc.
  • PRESS (Prediction Sum of Squares)
  • Akaike Information Criterion (AIC) (a computation
    sketch follows this list)
  • Preliminary conclusions:
  • Some criteria cannot be applied to evaluate
    soft-computing techniques.
  • AIC is a good criterion.
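
A minimal sketch of AIC in its least-squares form, AIC = n·ln(SSE/n) + 2k, with n residuals and k free model parameters:

```python
import numpy as np

def aic(y_true, y_pred, n_params):
    """Akaike Information Criterion, least-squares form:
    AIC = n * ln(SSE / n) + 2k. Lower values indicate a better
    trade-off between fit quality and model complexity."""
    resid = np.asarray(y_true) - np.asarray(y_pred)
    n = resid.size
    sse = float(resid @ resid)
    return n * np.log(sse / n) + 2 * n_params
```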

14
Applications: When to use KBE?
Drop Hammer Forming
Atomizer Performance
Thermal Paint Calibration
  • KBE can be used to:
  • Predict,
  • Simulate,
  • Analyze,
  • Report, etc.

Seamless tubing process
15
Data Cleansing: Dealing with Missing Data
  • Niharika

16
Introduction
  • Incomplete data can arise in a number of cases:
  • insufficient samples of data
  • incorrect data collection
  • sensor failure in time-series data
  • samples of data that are impossible to obtain
    when modeling exploratory data
  • calibration transfer (from master to slave
    instruments)

17
Types of Missing Data
  • Missing data has been characterized into 3 main
    types based on the patterns of missing values
    that occur in data sets (each mechanism is
    simulated in the sketch after this list)
  • MCAR - Missing Completely At Random
  • the probability that an element is missing is
    independent of both observed and missing values
  • MAR - Missing At Random
  • the probability that an element is missing
    depends only on observed values
  • Non-ignorable (MNAR)
  • the probability that an element is missing
    depends on the missing values themselves
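
A minimal sketch contrasting the three mechanisms by simulating each on a two-column dataset; the thresholds and the 20% missingness probability are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)   # always observed
y = rng.normal(size=500)   # subject to missingness

# MCAR: y is missing with a fixed probability, independent of x and y.
y_mcar = np.where(rng.random(500) < 0.2, np.nan, y)

# MAR: y is missing whenever the *observed* x is large.
y_mar = np.where(x > 1.0, np.nan, y)

# Non-ignorable (MNAR): y is missing when y *itself* is large,
# so the mechanism depends on the value that goes unobserved.
y_mnar = np.where(y > 1.0, np.nan, y)
```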

18
Characteristics of Methods to Deal with Missing
Data
  • Missing Data Algorithms have been proposed based
    on the assumption that data is missing at random
  • These methods have incorporated different
    techniques of imputation and multivariate data
    analysis
  • A single efficient algorithm that can be
    universally applied is yet to be proposed
  • Combinations of the existing methods are being
    evaluated to get better convergence of predicted
    and actual values in real data sets

19
Methods Proposed To Deal With Missing Data
20
Methods Proposed To Deal With Missing Data
  • Current methods of data analysis are designed to
    preserve the partial knowledge that was ignored
    by earlier methods
  • Data variability is taken into consideration
  • They perform better than the previously mentioned
    methods because they are designed to reproduce
    the data within experimental error
  • However, these methods are built on a set of
    assumptions which may not hold in real data sets

21
Current Methods
  • Multivariate Analysis:
  • Principal Components Analysis
  • Statistical Methods:
  • Partial Least Squares
  • Principal Component Regression
  • Combination Methods:
  • Expectation Maximization PCA
  • Maximum Likelihood PCA
  • Multiple Imputation
  • Neural Networks
  • Clustering-based Techniques

22
Comparison of Current Methods based on Performance
  • The comparison of the current methods was done on
    the basis of two factors:
  • Similarity Factor
  • This factor is used to judge the convergence of
    the predicted values with the actual values of
    the reference data set:
    Similarity = 100 - average(Sdiff), where Sdiff
    denotes the percentage differences between the
    actual and predicted values (computed in the
    sketch after this list)
  • Number of Iterations
  • This factor determines the length of the run and
    thus the complexity of the algorithm
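
A minimal sketch of the similarity factor, reading "average of Sdiff" as the mean absolute percentage difference between actual and predicted values (that reading is an assumption):

```python
import numpy as np

def similarity(actual, predicted):
    """Similarity factor: 100 minus the mean absolute percentage
    difference between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    pct_diff = 100.0 * np.abs(actual - predicted) / np.abs(actual)
    return 100.0 - pct_diff.mean()

# Predictions within a few percent give a similarity near 100.
print(similarity([10.0, 20.0, 40.0], [11.0, 19.0, 42.0]))  # ~93.33
```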

23
Comparison Statistics of Current Methods
24
Scope of Current Research
  • Based on the comparison statistics of the methods
    tested, EM PCA has shown the best results so far,
    both in terms of similarity and number of
    iterations
  • Algorithms incorporating cluster analysis and
    combinations of the previously tested methods are
    being analyzed to get better convergence in less
    time
  • A solution that not only fills in missing values
    but also handles outliers and noise in data sets
    is under development

25
Dimensionality Reduction
  • Saurabh Dwivedi

26
Introduction
  • What is Dimensionality Reduction?
  • The procedure of selecting a subset of process
    parameters that are necessary and sufficient to
    represent the system under consideration (without
    significantly affecting system accuracy) is
    referred to as 'Dimensionality Reduction'.
  • Why Dimensionality Reduction?
  • to select sufficient parameters to represent the
    system
  • to discard redundant information
  • to reduce the time and cost of any further data
    collection and analysis for system monitoring
    and control

27
Basic Steps in Dimensionality Reduction
  • A Generation Procedure: generates a subset of
    features for evaluation.
  • An Evaluation Function: measures the goodness of
    the subset produced by the generation procedure.
  • A Stopping Criterion: a criterion to avoid an
    exhaustive run of the dimensionality reduction
    procedure.
  • Validation: tests the validity of the selected
    subset of parameters through tests on artificial
    and real-world datasets.

28
Dimensionality Reduction Procedure
[Flowchart: Original Data → Generation → Subset →
Evaluation → Goodness of Subset → Stopping Criterion
(No: back to Generation; Yes: Validation)]
29
Characteristics of the Developed Technique for
Dimensionality Reduction
  • The technique addresses two problems:
  • Classification problem: the output of interest
    (response variable) is discrete.
  • Function approximation problem: the output of
    interest takes continuous values.
  • The technique can deal with parameters (both
    independent and dependent) that are discrete,
    continuous, or both.
  • The subset of features (parameters) is generated
    using a guided sequential backward search
    mechanism (sketched after this list).
  • Principal Components Analysis is used for the
    ranking (guided search) of features.
  • Clustering is used to measure the goodness of any
    subset.
  • For the classification problem, classification
    error is used as the evaluation function; for the
    approximation problem, it is the ratio of
    inter-cluster variance to intra-cluster variance.
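
A minimal sketch of the guided sequential backward search: parameters are dropped in reverse PCA-rank order while an evaluation function stays acceptable; `evaluate` (e.g., classification accuracy or the variance ratio above) and `tol` are assumed placeholders:

```python
import numpy as np

def backward_search(X, rank, evaluate, tol=0.05):
    """Guided sequential backward feature elimination.

    X: samples x parameters; rank: column indices ordered most
    important first (e.g., from PCA); evaluate: callable scoring a
    column subset, higher is better; tol: assumed tolerance on the
    allowed degradation from the full-set baseline score.
    """
    kept = list(rank)
    baseline = evaluate(X[:, kept])
    for col in list(rank)[::-1]:      # try least important first
        trial = [c for c in kept if c != col]
        if trial and evaluate(X[:, trial]) >= baseline - tol:
            kept = trial              # drop the parameter for good
    return kept
```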

30
Flowchart for the Developed Technique
[Flowchart: Complete Dataset → Normalization of the
Inputs → Clustering → Calculation of the Evaluation
Function → Ranking of Parameters using PCA →
Decision Making → Reduced Dataset]
31
Advantages of the Developed Technique Over
Existing Methods
  • The developed technique can be used with datasets
    comprising both discrete and continuous
    parameters.
  • It can be used for both classification and
    approximation problems.

32
Industrial Case Study Lorain Pipe Mills
  • Lorain Pipe Mills, a division of United States
    Steel (USS), is located west of Cleveland, Ohio,
    in the city of Lorain.
  • Its rotary rolling process can produce seamless
    pipes with lengths exceeding 40 feet and
    diameters up to 26 inches.
  • Lorain Pipe Mills was experiencing a low-yield
    problem; to address it, process data was acquired
    and analyzed.

33
Dimensionality Reduction for Lorain Data
  • Process data was collected on 8 input parameters
    and a prediction model (using a feed-forward
    neural network) was developed.
  • The developed algorithm for dimensionality
    reduction was run on the acquired data; the
    following table summarizes the results.
  • After running the algorithm, only 2 parameters
    were left in the model; the remaining 6 were
    discarded.

34
Adaptive Mamdani Fuzzy Model
  • Ranganath Kothamasu

35
Adaptive Mamdani Fuzzy Model (AMFM)
  • AMFM is an adaptive template that can create
    real-time or offline models across domains
  • AMFM is a combination of neural networks and
    fuzzy inference systems
  • It can be used to create data-driven models
    (solutions)
  • It can also use high-level heuristic knowledge in
    the modeling process (a plain Mamdani inference
    sketch follows this list)
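
Since AMFM builds on Mamdani inference, here is a minimal sketch of plain single-input Mamdani inference with Gaussian membership functions and centroid defuzzification; the two rules and all parameters are illustrative assumptions, not the AMFM architecture itself:

```python
import numpy as np

def gauss(x, center, sigma):
    """Gaussian membership function."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def mamdani(x, rules, y_grid):
    """Single-input Mamdani inference with centroid defuzzification.

    rules: list of ((in_center, in_sigma), (out_center, out_sigma)).
    Each rule's firing strength clips its output set (min), the
    clipped sets are max-combined, then the centroid is taken.
    """
    agg = np.zeros_like(y_grid)
    for (ic, isig), (oc, osig) in rules:
        strength = gauss(x, ic, isig)
        agg = np.maximum(agg, np.minimum(strength, gauss(y_grid, oc, osig)))
    return (y_grid * agg).sum() / agg.sum()

# Two illustrative rules: "low input -> low output", "high -> high".
rules = [((0.0, 1.0), (0.0, 1.0)), ((5.0, 1.0), (10.0, 1.0))]
print(mamdani(2.5, rules, np.linspace(-5, 15, 401)))
```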

36
AMFM Architecture
37
AMFM Modeling Process
  • Prepare data into patterns with inputs and
    desired (observed) outputs
  • Split the data into training and validation
    datasets
  • Acquire and formalize a priori domain knowledge
  • Extract and validate knowledge from training data
  • Set up the AMFM architecture from the integrated
    knowledge
  • Initialize model parameters based on the training
    data
  • Train the architecture to the desired accuracy
  • Validate the developed model

38
AMFM, HyFIS and ANFIS
39
Intelligent Condition Based Maintenance (ICBM)
  • Objectives of ICBM:
  • Failure prediction
  • Failure diagnosis
  • Necessity for ICBM:
  • Eliminate breakdowns
  • Assist in production scheduling
  • Synchronize the JIT components
  • Elucidate the process and machine-component
    interactions
  • ICBM = AMFM + CBM

40
ICBM Architecture
41
ICBM Model Development Cycle
42
Engine diagnosis case study
  • The objective was to create an ICBM model that can
    continuously monitor/diagnose the state of an
    engine.
  • Possible failure modes are turbine deterioration
    and compressor leak.
  • Data (11-dimensional) includes state variables
    such as inter-turbine temperature, fuel flow,
    shaft speed, and vibration.
  • Two sets of features (time-series and diagnostic
    based) were extracted from three parameters:
  • Kurtosis, spike, and trend (sketched after this
    list)
  • Knowledge was extracted using subtractive
    clustering.
  • AMFM was used to create the diagnosis model:
  • Extracted knowledge was used to set up the
    architecture
  • Extracted feature data was split into development
    and validation data
  • Model parameters were initialized from the
    clusters
  • The model was recursively developed until the
    desired accuracy was reached
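
A minimal sketch of the three time-series features named above for one window of samples; the spike and trend definitions here (max deviation in standard deviations, least-squares slope) are assumed illustrations of what those feature names typically mean:

```python
import numpy as np
from scipy.stats import kurtosis

def window_features(signal):
    """Kurtosis, spike, and trend features for one window of samples."""
    signal = np.asarray(signal, dtype=float)
    kurt = kurtosis(signal)                      # tailedness of the window
    spike = np.max(np.abs(signal - signal.mean())) / signal.std()
    t = np.arange(signal.size)
    trend = np.polyfit(t, signal, 1)[0]          # least-squares slope
    return kurt, spike, trend

# Example: features for a 50-sample window of inter-turbine temperature.
print(window_features(np.random.default_rng(0).normal(600.0, 5.0, 50)))
```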

43
Model summary
  • A total of seven inputs (features) were used.
  • The output is the probability of occurrence of
    each failure mode.
  • Seven rules were extracted using the clustering
    technique.
  • The network was trained for 1000 epochs.
  • The model is 95% accurate.
  • There were no false alarms.
  • Signal-based features were extracted using
    wavelet decomposition and were found to be
    equally effective.

44
Conclusions
  • ICBM provides a generic algorithmic approach to
    developing maintenance solutions
  • It defines a schema spanning data acquisition,
    model development, and the accumulation of
    maintenance knowledge
  • AMFM is capable of creating adaptive and precise
    models and is a suitable tool for ICBM
  • The models are real-time, noise tolerant, and
    modifiable