Title: The prediction of survival of hepatitis patients
1 Hepatitis: The prediction of survival for the
hepatitis patient
Group No. 7
No. Reg. no. Name Course
1 T/UDOM/2020/
2 T/UDOM/2020/
3 T/UDOM/2020/
4 T/UDOM/2020/
2 Insight on hepatitis
- Hepatitis means inflammation of the liver. The
liver is a vital organ that processes nutrients,
filters the blood, and fights infections.
- According to the Centers for Disease Control and
Prevention (CDC), approximately 4.4 million
Americans are currently living with chronic
hepatitis B and C. Many people do not even know
that they have hepatitis.
3 Problem statement
- It is hard to predict the chances of survival of a
patient with hepatitis.
- Objective: to use machine learning algorithms to
determine the chance of survival of a patient
with hepatitis.
4 Methodology
- Using machine learning algorithms with the Python
programming language
- Using the hepatitis dataset from the UCI Machine
Learning Repository
5 Importing libraries
- Import pandas, numpy, seaborn and matplotlib.pyplot
6 Loading data from a CSV file
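A minimal sketch of the loading step, assuming pandas is used as on the import slide. The file name is not given on the slide, so an inline sample stands in for the CSV file; the rows, and the column names other than ALBUMIN, ALK PHOSPHATE, BILIRUBIN and SGOT, are illustrative assumptions.

```python
import io

import pandas as pd

# Tiny inline stand-in for the CSV file (illustrative rows, not real data).
# Missing values are written as "?", as described on the cleaning slide.
csv_text = """Class,AGE,SEX,BILIRUBIN,ALK PHOSPHATE,SGOT,ALBUMIN
2,30,2,1.0,85,18,4.0
2,50,1,0.9,135,42,3.5
1,78,1,0.7,96,32,4.0
2,31,1,0.7,46,52,4.0
1,34,1,1.0,?,200,?
"""

# With a real file this would be pd.read_csv("<path to the hepatitis CSV>").
hep = pd.read_csv(io.StringIO(csv_text))

print(hep.shape)
print(hep.dtypes)  # columns containing "?" are not read as numbers
```

Columns that contain a "?" marker are read as text rather than numbers, which is why a cleaning step is needed before any modelling.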
7 Data description for the complete instances
8 Data set cleaning
- As we can see, our data set is not clean: some
values are missing, denoted by ?
- Also, our data are of object data type, and some
attributes have imbalanced data
- Data is not normalized
- So what do we do?
- Replace the missing values with the mean for
numerical attributes and with the mode for
categorical attributes
- Change the whole data frame to float data type
and use SMOTE, standard scaler and normalization
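The replacement steps above can be sketched as follows. This is a sketch under stated assumptions, not the group's actual code: the sample values and the split into numerical and categorical columns are made up for illustration, and the name hep_replace is borrowed from the normalization slide.

```python
import pandas as pd

# Illustrative frame read as text, with "?" marking missing values.
hep = pd.DataFrame({
    "Class":   ["2", "2", "1", "2", "1"],
    "STEROID": ["1", "2", "?", "2", "2"],          # categorical, coded 1/2 (assumed)
    "ALBUMIN": ["4.0", "3.5", "?", "4.0", "2.5"],  # numerical
})

numerical_cols = ["ALBUMIN"]    # assumed column roles
categorical_cols = ["STEROID"]

# 1. Convert every column to float; any non-numeric entry,
#    including "?", becomes NaN.
hep_replace = hep.apply(pd.to_numeric, errors="coerce").astype(float)

# 2. Numerical attributes: fill missing values with the column mean.
for col in numerical_cols:
    hep_replace[col] = hep_replace[col].fillna(hep_replace[col].mean())

# 3. Categorical attributes: fill missing values with the mode.
for col in categorical_cols:
    hep_replace[col] = hep_replace[col].fillna(hep_replace[col].mode()[0])

print(hep_replace)
```

The mode is used for the coded categorical attributes because a mean such as 1.75 would not correspond to any valid category.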
9 Mean calculation and replacements
10 Data description after replacing the missing
values and after changing data to float data type
11 Data visualization for some attributes to show
skewness
Our data are not normalized
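Besides plotting, skewness can also be checked numerically. The sketch below assumes a pandas data frame and uses made-up values chosen to show one right-skewed and one symmetric column; it is not the group's data.

```python
import pandas as pd

# Illustrative values only: BILIRUBIN has a few large readings (long right
# tail), ALBUMIN is spread symmetrically around 4.0.
hep_replace = pd.DataFrame({
    "BILIRUBIN": [0.7, 0.9, 1.0, 1.0, 1.2, 1.3, 4.6, 8.0],
    "ALBUMIN":   [3.0, 3.5, 3.8, 4.0, 4.0, 4.2, 4.5, 5.0],
})

# Sample skewness per column: values near 0 mean a roughly symmetric
# distribution, clearly positive values mean a long right tail.
skewness = hep_replace.skew()
print(skewness)
```

A column with a clearly positive skewness is a typical candidate for a log transform.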
12 Normalization using np.log
- hep_replace[['ALBUMIN', 'ALK PHOSPHATE', 'BILIRUBIN', 'SGOT']] = hep_replace[['ALBUMIN', 'ALK PHOSPHATE', 'BILIRUBIN', 'SGOT']].applymap(np.log)
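A self-contained sketch of the same log transform on made-up values. It calls np.log on the selected columns directly, which gives the same element-wise result as applymap(np.log); recent pandas releases deprecate applymap, so this form may be the safer choice. The sample numbers are assumptions.

```python
import numpy as np
import pandas as pd

cols = ["ALBUMIN", "ALK PHOSPHATE", "BILIRUBIN", "SGOT"]

# Illustrative, strictly positive values (np.log is undefined at 0 and below).
hep_replace = pd.DataFrame({
    "ALBUMIN": [4.0, 3.5, 2.5],
    "ALK PHOSPHATE": [85.0, 135.0, 96.0],
    "BILIRUBIN": [1.0, 0.9, 4.6],
    "SGOT": [18.0, 42.0, 200.0],
})

# NumPy functions apply element-wise to a DataFrame and return a DataFrame.
hep_replace[cols] = np.log(hep_replace[cols])

print(hep_replace)
```

The transform compresses large values more than small ones, which reduces the right skew of attributes such as SGOT.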
13 Feature selection
- We selected all attributes for training, testing
and prediction; our target is the class attribute
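Splitting the frame into features and target might look like this. The column name "Class" and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Illustrative cleaned frame; "Class" is the assumed name of the target column.
hep_replace = pd.DataFrame({
    "Class": [2.0, 2.0, 1.0],
    "AGE": [30.0, 50.0, 78.0],
    "ALBUMIN": [4.0, 3.5, 4.0],
})

X = hep_replace.drop(columns="Class")  # every attribute except the target
y = hep_replace["Class"]               # the target

print(X.columns.tolist(), y.name)
```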
14 Checking for imbalanced data
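One simple way to check for imbalance is to count the rows per class. The labels below are made up, and the 1/2 coding of the class attribute is an assumption.

```python
import pandas as pd

# Illustrative labels only (assumed coding: two classes, 1 and 2).
y = pd.Series([2, 2, 2, 2, 2, 2, 2, 2, 1, 1], name="Class")

counts = y.value_counts()
ratio = counts.max() / counts.min()

print(counts)
print(f"majority/minority ratio: {ratio:.1f}")
```

A ratio well above 1 means a classifier can reach a high accuracy by mostly predicting the majority class, which is the reason for rebalancing.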
15 Using SMOTE to balance our data set
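To show the idea behind SMOTE, here is a simplified NumPy sketch: each synthetic sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. This is an illustration of the technique, not the code used in the project; in practice a library implementation such as imblearn.over_sampling.SMOTE with fit_resample is a common choice. The function name smote_sketch, its parameters and the sample points are hypothetical.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        base = rng.integers(len(X_min))            # random minority sample
        neigh = neighbours[base, rng.integers(k)]  # one of its k neighbours
        gap = rng.random()                         # position on the segment
        synthetic[i] = X_min[base] + gap * (X_min[neigh] - X_min[base])
    return synthetic


# Illustrative minority-class points (k must be smaller than their count).
X_min = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]])
X_new = smote_sketch(X_min, n_new=6)
print(X_new)
```

Because the new points are interpolated, they stay inside the region already covered by the minority class instead of being plain copies of existing rows.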
16 Standard scaler
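A minimal sketch of standard scaling with scikit-learn's StandardScaler, assuming the scaler is fitted on the training part only and then reused for the test part. The numbers are made up.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative feature values (two attributes on very different scales).
X_train = np.array([[30.0, 1.0], [50.0, 0.9], [78.0, 0.7], [34.0, 4.6]])
X_test = np.array([[45.0, 1.2]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std here
X_test_scaled = scaler.transform(X_test)        # reuse the same statistics

print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
```

After scaling, each training column has mean 0 and standard deviation 1, so no attribute dominates only because of its unit.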
17 Training, testing and predictions
- Using LogisticRegression, GaussianNB and
DecisionTreeClassifier for training, testing and
prediction on our dataset
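The three classifiers named above can be trained and scored in one loop. The sketch uses synthetic data from make_classification as a stand-in for the prepared hepatitis data; the split size, random seeds and max_iter value are assumptions, and the printed scores say nothing about the real data set.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared features and class labels.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "GaussianNB": GaussianNB(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on the test split
    print(f"{name}: {scores[name]:.2f}")
```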
18 Logistic regression predictions and score
19 Logistic regression classification report and its
confusion matrix
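A classification report and confusion matrix for a logistic regression model can be produced with scikit-learn's metrics functions. The data below is synthetic, so the numbers it prints are not the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared hepatitis data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are the true classes, columns are the predicted classes.
cm = confusion_matrix(y_test, y_pred)
# Precision, recall and F1-score per class, as plain text.
report = classification_report(y_test, y_pred)

print(cm)
print(report)
```

The diagonal of the confusion matrix counts correct predictions; the off-diagonal cells show which class is being mistaken for which.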
20 Logistic regression
21 Tuning algorithm performance by ensembles
- Using RandomForestClassifier
22 Predictions by RandomForestClassifier and its
confusion matrix
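A sketch of the random forest step on synthetic stand-in data. The number of trees and the random seeds are assumptions, and the accuracy printed here is not the figure reported for the hepatitis data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared hepatitis data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# An ensemble of decision trees, each trained on a bootstrap sample.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)

acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print(f"accuracy: {acc:.2f}")
print(cm)
```

Accuracy equals the sum of the confusion matrix diagonal divided by the number of test samples, so the two outputs can be cross-checked against each other.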
23 Confusion matrix for the .
24 Conclusion
- With RandomForestClassifier we got an accuracy of
92%
25 More Thanks