Chapter 31 Naive Bayes Classification

February 14, 2020


Hi guys, this is Jennifer, your personal guide for this program. If you have any queries, you
can ask me [inaudible] as soon as possible. Now, listen to this program carefully, as I will be
asking you questions often. So, in this session we are going to discuss naive Bayes in detail.
Okay, naive Bayes is a classification technique, which is mainly used to classify records with
respect to dependent and independent variables. Naive Bayes is one of the machine learning
algorithms, and it comes under the supervised learning techniques.
Supervised learning means we have both dependent and independent variables. Whenever we have
dependent and independent variables, we can go for supervised learning. Within supervised
learning we have prediction techniques and classification techniques. Linear regression and
logistic regression are the prediction techniques. On the classification side we have naive
Bayes, KNN and SVM; these are all classification techniques. First, naive Bayes is a
classification technique that classifies records with the help of probability. With the help of
probability alone we are going to classify the records. And in naive Bayes as we use it here,
both the dependent and the independent variables are categorical in nature; only then can we
apply it. Next, KNN is nothing but k-nearest neighbors, a classification technique that
classifies records with the help of Euclidean distance. In KNN the dependent variable is
categorical and the independent variables are continuous; only then can we go for KNN. Next we
have SVM. SVM is nothing but support vector machine, which is also a classification technique,
and it classifies records with the help of a hyperplane. So these are the different
classification techniques. First we'll discuss naive Bayes in detail. Naive Bayes is a
classification technique that classifies records with the help of probability. By using
probability alone we can classify the data.
When we go for naive Bayes, both the dependent and the independent variables are categorical in
nature. Naive Bayes is applicable when both kinds of variables are categorical, okay. We can
also apply naive Bayes when we have a large volume of data. So, whenever probability is
involved, we can go for naive Bayes.
By using probability we can classify the records. So what does probability mean? The
possibility of a particular event occurring is called probability. We have prior probability,
conditional probability and joint probability. Prior probability is the probability you find
for an event on its own. Conditional probability means finding the probability after applying
some condition. Okay, so these are the types of probability. Next, in naive Bayes we classify
records with the help of probability. Now I will take one example. My outcome variable, that is
my target variable, is whether a company is fraudulent or truthful. The independent variables
are whether the company has been legally charged or not, and the size of the firm. Based on
these variables alone we are going to find whether the company is fraudulent or truthful. Okay.
So first I'll take 10 records. In these 10 records the dependent variable is whether the
company is truthful or fraudulent, and the independent variables are whether charges are yes or
no, and the size of the firm. With respect to that I am going to classify the records. So,
first, I am going to find: what is the probability that a company is fraudulent when charges
are yes and it is a smaller company?
The first condition is: what is the probability that the company is fraudulent when charges are
yes and it is a smaller company? In total there are 10 companies. Out of these, how many are
fraudulent? Four companies are fraudulent, so that is 4/10. In those four companies, how many
have charges yes? Three, so 3/4. And how many are smaller companies? One, so 1/4. So
3/4 × 1/4 × 4/10 gives the value 0.075. That is with respect to fraudulent. Next, the truthful
companies. Out of 10 companies, six are truthful, so 6/10. In those six companies, how many
have charges yes? Only one, so that is 1/6. And how many of the six are smaller companies?
Four, so 4/6. So 1/6 × 4/6 × 6/10.
So we get the value 0.067. This is with respect to truthful, and with respect to fraudulent we
have 0.075. Now we want to find the probability that the company is fraudulent when charges are
yes and it is a smaller company. For fraudulent we can say 0.075 divided by (0.075 plus 0.067),
and we get 0.53. For truthful we can say 0.067 divided by (0.075 plus 0.067), and we get 0.47.
Of these two, the maximum value is 0.53. So, compared to truthful, there is a higher
possibility that this company is fraudulent.
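The arithmetic of this first example can be checked with a few lines of Python. This is only a sketch of the hand calculation: the counts are the ones read out in the talk, and the variable names are made up for illustration.

```python
# Counts read off the 10-company example in the talk.
total = 10
fraud_total, truthful_total = 4, 6
fraud_charges_yes, fraud_small = 3, 1        # among the 4 fraudulent companies
truthful_charges_yes, truthful_small = 1, 4  # among the 6 truthful companies

# Naive Bayes score = P(class) * P(charges = yes | class) * P(size = small | class)
score_fraud = (fraud_total / total) * (fraud_charges_yes / fraud_total) * (fraud_small / fraud_total)
score_truthful = (truthful_total / total) * (truthful_charges_yes / truthful_total) * (truthful_small / truthful_total)

# Normalise the two scores so that they add up to 1.
evidence = score_fraud + score_truthful
p_fraud = score_fraud / evidence
p_truthful = score_truthful / evidence

print(round(score_fraud, 3), round(score_truthful, 3))  # 0.075 0.067
print(round(p_fraud, 2), round(p_truthful, 2))          # 0.53 0.47
```

The larger of the two normalised values decides the class, which here is fraudulent.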
So we can conclude that if the company is a smaller company and charges are yes, then there is
a higher possibility that the company is fraudulent rather than truthful; the maximum
probability is for fraudulent. So that is naive Bayes: we predict by using probability, and we
assign the record to the class with the maximum probability. Okay, next we will look into
another example, about playing the badminton game. I am taking 14 days in total. In these 14
days my dependent variable, the target variable, is whether they are going to play the
badminton game, yes or no. The independent variables are outlook, temperature, humidity and
wind; these are all input variables. In this data set I have the records of the past 14 days.
Now I want to predict for a particular day where the outlook is sunny, the temperature is cool,
the humidity is high and the wind is strong. If this is the condition, will they play the
badminton game on that day, yes or no? Okay. So that is the probability we want to find now.
Okay, out of these 14 records, on how many days is playing the badminton game yes? Out of 14
days, nine days are yes and the remaining five days are no. So 9/14 is yes and 5/14 is no. The
independent variables are outlook, temperature, humidity and wind, and with respect to these
variables we are going to predict the outcome. So now we have the probabilities: playing
badminton is 9/14, and not playing is 5/14. Then the conditional probabilities. If play
badminton is yes, on how many of the nine days is the outlook sunny? Two days, so 2/9. On how
many of the nine days is the outlook overcast? Four. And rain? Three days. Okay, this is with
respect to outlook. Then we have the same with respect to temperature, humidity and wind. These
are all conditional probabilities. Conditional probability means finding the probability after
applying some condition. Now, our question is: outlook is sunny, temperature is cool, humidity
is high, wind is strong. If this is the condition, will they play the badminton game, yes or
no? Okay. Outlook is sunny: for yes it is 2/9, and for no it is 3/5. Temperature is cool: 3/9
for yes, and 1/5 for no. Humidity is high: out of nine yes records, three have high humidity,
so 3/9, and for no it is 4/5. Wind is strong: 3/9 for yes and 3/5 for no. And out of 14, nine
were yes and five were no. So if we multiply these directly, for yes we get about 0.0053 and
for no we get 0.0206. So the maximum value is for no. Under this condition, where outlook is
sunny, temperature is cool, humidity is high and wind is strong, the maximum probability is
that they will not play the badminton game. Compared to yes, no has the higher value, okay. So
that is naive Bayes: it is mainly used to find the outcome with the help of probability. As for
the advantages of naive Bayes and where it is applicable: we can use it only for categorical
variables.
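The badminton calculation can be reproduced in the same way. The sketch below only multiplies the prior by the four conditional probabilities quoted in the talk for a sunny, cool, high-humidity, strong-wind day; the dictionary layout and names are illustrative assumptions.

```python
from math import prod

# Prior probabilities from the 14 recorded days.
prior = {"yes": 9 / 14, "no": 5 / 14}

# Conditional probabilities quoted for the query day.
likelihood = {
    "yes": {"outlook=sunny": 2 / 9, "temperature=cool": 3 / 9,
            "humidity=high": 3 / 9, "wind=strong": 3 / 9},
    "no": {"outlook=sunny": 3 / 5, "temperature=cool": 1 / 5,
           "humidity=high": 4 / 5, "wind=strong": 3 / 5},
}

# Naive Bayes score per class: prior times the product of the conditionals.
scores = {label: prior[label] * prod(likelihood[label].values()) for label in prior}
prediction = max(scores, key=scores.get)

print({label: round(value, 4) for label, value in scores.items()})  # {'yes': 0.0053, 'no': 0.0206}
print(prediction)  # no
```

The class with the larger score is the prediction, so for this day the answer is that they will not play.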
So it is not applicable for continuous variables; it is applicable only for categorical ones.
Naive Bayes also supports large data sets. The shortcoming is that it always requires a large
data set, okay. It is mainly used to classify records by using probability. With the help of
naive Bayes we can separate email into spam or not spam, and we can predict whether a patient
has a disease or not. Likewise, whenever we want to classify records, we can go for naive
Bayes. It is mainly used for classification, classifying records by using probability when
both the dependent and independent variables are categorical in nature. And with naive Bayes we
also have computational challenges, okay. So that is all about naive Bayes: it is a
classification technique, and we classify records by using probability. That is all about this
session. In the next session we'll look into the practical aspects, the hands-on way, and we
can discuss it in detail.
Quiz time, guys. You get a few minutes to think. So, in this session we are going to go through
the naive Bayes quiz questions, and then we can discuss them in detail. Okay.
So, let me list out the questions. The first question is: naive Bayes is... The options are
conditional independence, conditional dependence, both option A and B, and none of the above.
Next question: naive Bayes requires... The options are categorical values, numerical values,
either option A or B, and both option A and B.
Next question: a probabilistic model of data within each class is... The options are
discriminative classification, generative classification, probabilistic classification, and the
last option, both B and C.
Next question: the classification rule is said to be... The options are discriminative
classification, generative classification, probabilistic classification, and both option A and
C.
Next question: the Bayesian rule is... The options are P(C|X) = P(X|C) × P(X) / P(X); next
option, P(C|X) = P(X|C) × P(C) / P(X); next option, P(C|X) = P(X|C) × P(C) / P(C); and the
last option, none of the above.
Next question: spam classification is an example for... The options are naive Bayes,
probabilistic condition, random forest, and all of the above.
Next question: the time complexity of the naive Bayes classifier for n-feature, L-class data
is... The options are O(n × L), O(n + L), [inaudible], and O(n / L).
Next question: naive Bayes pays attention to complex interactions and... The options are local
structure, statistical model, and both option A and B.
Next question: you have a list of symptoms; predict whether a patient has a disease or not. The
options are medical diagnosis, [inaudible] diagnosis, spam diagnosis, and all of the above.
So you can take two to three minutes, and then we'll discuss the answers.
Time's up. Now evaluate yourself and keep listening. So let's discuss the first question: naive
Bayes is... The answer is that it is a conditional independence technique. It treats the
independent variables as conditionally independent given the class, and with the help of
conditional probability alone we classify the records.
Next question: naive Bayes requires categorical variables. In naive Bayes both the dependent
and the independent variables are categorical in nature; only then can we go for naive Bayes.
Next: a probabilistic model of data within each class is... The answer is both option B and C,
that is, both generative classification and probabilistic classification. The classification
rule is said to be... The answer is both option A and C, that is, discriminative classification
and probabilistic classification.
Next question: the Bayesian rule is... The answer is B: P(C|X) = P(X|C) × P(C) / P(X).
Next question: spam classification is an example for naive Bayes, because in naive Bayes we
classify records where both the dependent and independent variables are categorical in nature.
Next question: the time complexity of the naive Bayes classifier for n-feature, L-class data
is... The answer is O(n × L).
Next: naive Bayes pays attention to complex interactions and local structure.
Next: you have a list of symptoms; predict whether a patient has a disease or not. The answer
is medical diagnosis.
Okay, so that is all about this session. In this session we are going to look at naive Bayes in
a practical way, in detail. Okay, so first of all, to perform naive Bayes I need to import the
data set, and for that I first need the pandas package: import pandas as pd. After that, to
load the data set, I need to define the directory. Here in the right corner, if you click the
folder icon, it opens a pop-up window where I can browse to the folder on the desktop. There we
can select the folder named Python; I kept my file in that folder. Once I select the path it is
reflected here, and all the files in that folder are visible here. Okay, so in the first step I
imported the pandas package. Second, I want to load the data set: dataset = pd.read_excel(file
name). The file name goes here.
Okay, here it is actually a CSV file, so I need to load it as a CSV file, with pd.read_csv. If
I want to see the first five records, I can use the head function: dataset.head() prints the
first five records. After loading the data set we can do the preprocessing. In this data set we
have different columns: passenger class, gender, embarked and so on.
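A minimal sketch of the loading step is shown below. The real file name and column names are not recoverable from the recording, so the example reads a small made-up CSV from memory; with a file on disk the call would take the path instead, for example pd.read_csv("your_file.csv").

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV used in the session; the column names
# and values here are invented for illustration only.
csv_text = """Survived,Pclass,Gender,Embarked
0,3,male,S
1,1,female,C
1,3,female,S
0,1,male,S
0,3,male,Q
1,2,female,S
"""

# read_csv accepts a path or any file-like object.
dataset = pd.read_csv(io.StringIO(csv_text))

# head() returns the first five rows by default.
print(dataset.head())
```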
So if I want to convert any text variable into a numerical variable, I can do that with: from
sklearn import preprocessing. sklearn is nothing but scikit-learn, a package which is mainly
used for machine learning algorithms. Whenever I want to perform any machine learning, I can go
for sklearn. Preprocessing here is nothing but converting text into numerical format. After
that I want to do the train-test split. The data set has around nine columns, and I want to
divide the entire data set into two sets: one is the training data set and the other is the
test data set. For that I need the train_test_split function: from sklearn.cross_validation
import train_test_split (in current scikit-learn versions this function is in
sklearn.model_selection). Then, from sklearn.naive_bayes import GaussianNB. GaussianNB is the
naive Bayes classifier we use. Now, for every model we have an accuracy score. Likewise for
naive Bayes we want to find the accuracy score: from sklearn.metrics import accuracy_score.
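Collected in one place, the imports for the session might look like the sketch below. It assumes a current scikit-learn release, where train_test_split is imported from sklearn.model_selection rather than from the older sklearn.cross_validation module.

```python
# Text-to-number encoding helpers (LabelEncoder lives here).
from sklearn import preprocessing

# Train/test splitting; older releases exposed this under sklearn.cross_validation.
from sklearn.model_selection import train_test_split

# Gaussian naive Bayes classifier and the accuracy metric.
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Creating the two objects used later only checks that the imports work.
le = preprocessing.LabelEncoder()
model = GaussianNB()
print(type(le).__name__, type(model).__name__)  # LabelEncoder GaussianNB
```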
So, these are all the different imports I want. The first one is sklearn preprocessing, which
is mainly used to convert text into numerical format. Then I imported train_test_split from
cross_validation (sklearn.model_selection in current versions), which is used to split the
records into training and test records. Then for naive Bayes I use GaussianNB, and for finding
the accuracy level I use the accuracy score: from sklearn.metrics import accuracy_score. Now
that I have loaded the data set, wherever there are categorical text variables I am going to
convert them into numerical format by using the label encoder: le =
preprocessing.LabelEncoder(). Okay, I created the label encoder. First I want to process
Survived. Survived is a variable, so le.fit(dataset["Survived"]) fits the encoder on the values
of Survived in the data set. Then we can look at le.classes_. For Survived we have 0 and 1: 0
means not survived and 1 means survived. So we have two classes, survived and not survived.
Then we can do the transform on dataset["Survived"]. Actually, I loaded the data set, but here
something was mismatching, so I need to correct it here. All right, now we don't get the error.
Now if you look into dataset["Survived"], whatever values are there in Survived have been
converted into numerical form. This is the way we need to convert every variable: whichever
variables are categorical, we need to convert them from text into numerical format. So we
converted Survived. Next we can go for passenger class.
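The fit, classes_ and transform steps described for the label encoder can be sketched as follows. The tiny data frame and its Gender column are hypothetical; a text column is used because it makes the text-to-number mapping easier to see.

```python
import pandas as pd
from sklearn import preprocessing

# Hypothetical data; the real data set and its column names are not shown in the talk.
dataset = pd.DataFrame({"Gender": ["male", "female", "female", "male"]})

le = preprocessing.LabelEncoder()

# fit() learns the distinct values; classes_ lists them in sorted order.
le.fit(dataset["Gender"])
print(list(le.classes_))  # ['female', 'male']

# transform() replaces each value by the index of its class.
dataset["Gender"] = le.transform(dataset["Gender"])
print(dataset["Gender"].tolist())  # [1, 0, 0, 1]
```

The same three calls would be repeated for every categorical column that needs encoding.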
First I fit the encoder on passenger class, and then I print the classes: print(le.classes_).
I get the passenger classes 1, 2, 3. We have three different classes, and we transform the
column into numerical form. Likewise we need to convert each and every variable we are taking.
Next is the sibling variable. There we have different values: 0, 1, 2, 3, 4, 5 and 8. The
sibling variable also needs to be transformed into numerical form. Next I go for parents and
children. First we fit and then we transform; there the values go from 0 to 6. So we have
different variables, Survived, passenger class, sibling, parents and children, and for all of
them we need to convert the text into numerical format. After that, we need to define the
dependent and independent variables. Here my dependent variable is Survived, and I take all the
remaining variables as independent variables. So let me define Y = dataset["Survived"];
Survived is the dependent variable, okay. The remaining variables are independent, so x is the
data set with Survived dropped. See here, x.head(): all the variables except Survived are the
independent variables. So I defined the dependent and independent variables. Now, the total
number of records in the data set is 889. I want to divide them into two parts: one is the
training data set and the other is the test data set. We split by using the train_test_split
function: x_train, x_test, y_train, y_test = train_test_split(x, Y, test_size = 0.3). Here it
is capital Y. Python is a case-sensitive language, and I stored the dependent variable Survived
in capital Y, so I have to use capital Y here. By using the train_test_split function I divided
the data into training and test data sets. Out of these 889 records, 30 percent of the records
come under the test data and the remaining are the training data set. If you want to verify,
you can use y_train.count(): 622 records are in the training data set. For the test data set,
y_test.count() gives 267. So out of 889 records, 267 records are the test data set, that is 30
percent of the records, and the remaining 70 percent are the training data set. That is how we
split into the training and test data sets.
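The split sizes quoted above can be checked with a synthetic table of the same length. This is a sketch under assumptions: the column names and random values are invented, and random_state is added only to make the example repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same number of rows as the data set in the talk.
rng = np.random.default_rng(0)
n_rows = 889
dataset = pd.DataFrame({
    "Survived": rng.integers(0, 2, n_rows),
    "Pclass": rng.integers(1, 4, n_rows),
    "Gender": rng.integers(0, 2, n_rows),
})

# Dependent variable in capital Y, everything else as independent variables.
Y = dataset["Survived"]
x = dataset.drop(columns=["Survived"])

# Hold out 30 percent of the rows as the test set.
x_train, x_test, y_train, y_test = train_test_split(x, Y, test_size=0.3, random_state=0)

print(y_train.count(), y_test.count())  # 622 267
```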
After that, we defined the dependent and independent variables, and then we split. Now we need
to apply naive Bayes, okay. Naive Bayes is a machine learning algorithm which comes under the
sklearn package. From this we are going to find the outcome. Now I am going to predict the
values and then find the accuracy score: accuracy_score(y_test, y_pred, normalize = True).
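The recording skips over the exact fit and predict commands, so the following is a minimal sketch of how those steps usually look with GaussianNB. The data is synthetic and the names x, Y and y_pred are assumptions, so the printed accuracy will not match the figures quoted in the talk.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic, already label-encoded features; the label follows the first
# feature with about 10 percent noise.
rng = np.random.default_rng(0)
n_rows = 300
x = rng.integers(0, 3, size=(n_rows, 3))
Y = ((x[:, 0] >= 1) ^ (rng.random(n_rows) < 0.1)).astype(int)

x_train, x_test, y_train, y_test = train_test_split(x, Y, test_size=0.3, random_state=0)

# Fit on the training rows, then predict the held-out rows.
model = GaussianNB()
model.fit(x_train, y_train)
y_pred = model.predict(x_test)

# normalize=True returns the fraction of correct predictions,
# normalize=False would return the count instead.
score = accuracy_score(y_test, y_pred, normalize=True)
print(round(score, 3))
```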
So, how much are we getting? The model says 77.9 percent. This model is 77.9 percent accurate,
which means that, based on the training data set, around 77.9 percent of the test records are
classified correctly. Here the dependent variable is Survived and all the remaining are
independent variables. With this combination, around 77.9 percent is classified correctly. If
you want to see these things in a confusion matrix, we can get to know which combinations are
classified correctly. So, import confusion_matrix, and from the test values and the predictions
we get the result. So this is the confusion matrix; by drawing the confusion matrix we got this
result. Here my dependent variable is Survived, and the accuracy score is around 77.9 percent.
If we want to draw these things, we can draw it like this.
So the labels are 0 and 1 on both sides, and we have values such as 130, 27 and 78. This is the
classification table of the records. If you look into it, only the diagonal entries are
classified correctly. What is the total number of records? If we sum all the values, the total
is 267. Of that, the correctly classified count is the sum of the diagonal: 208 records are
correctly classified, and the others are incorrectly classified. If you express this as a
percentage, you get 77.9 percent; the remaining 22 percent are not classified correctly. That
is the first scenario. For the second scenario, we can change the dependent variable to
passenger class. What is the accuracy score if passenger class is the dependent variable? Same
thing: if I press the up arrow I get the previous command, and here I change passenger class to
be the dependent variable, with all the remaining as independent variables, okay. So I defined
the dependent and independent variables, and if I call head I can see that all the columns
except passenger class remain as independent variables. Now again I am going to do the
train_test_split function.
So we do the train-test split and divide the entire data into training and test sets, and then
we can do the analysis in the same way. Now, if I want to find the accuracy score, the score is
56.17 percent. So if passenger class is the dependent variable, around 56.17 percent of the
records are classified correctly, okay.
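Repeating the experiment with a different dependent variable is easier with a small helper. The function below is a hypothetical convenience wrapper, not something shown in the session, and it runs on synthetic data, so its scores are not the 77.9 or 56.17 percent figures from the talk.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB


def accuracy_for_target(dataset, target, test_size=0.3, random_state=0):
    """Hypothetical helper: use `target` as dependent variable, all other columns as inputs."""
    Y = dataset[target]
    x = dataset.drop(columns=[target])
    x_train, x_test, y_train, y_test = train_test_split(
        x, Y, test_size=test_size, random_state=random_state
    )
    y_pred = GaussianNB().fit(x_train, y_train).predict(x_test)
    return accuracy_score(y_test, y_pred)


# Synthetic, label-encoded stand-in for the data set (column names are assumptions).
rng = np.random.default_rng(1)
n_rows = 400
dataset = pd.DataFrame({
    "Survived": rng.integers(0, 2, n_rows),
    "Pclass": rng.integers(1, 4, n_rows),
    "Gender": rng.integers(0, 2, n_rows),
})

# One accuracy score per choice of dependent variable.
scores = {target: accuracy_for_target(dataset, target) for target in dataset.columns}
for target, value in scores.items():
    print(target, round(value, 3))
```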
Then, if I want to draw the confusion matrix, I need to add one more column, because there
are three columns for the passenger class. We have three columns, so that's why I add one
more. The passenger class has three categories: 1, 2, 3 on one side and 1, 2, 3 on the
other. The accuracy here also is 56 percent. If I draw the confusion matrix, this is how I
get it. See, this is the confusion matrix; when you look into it we have 27 1195 then 24 and
then 15 then 4. So, this is the confusion matrix. In this, only the diagonals are classified
accurately; all the remaining ones are not correctly classified. The total number of
records, if you look into it, is 267. The correctly classified records are the total of this
amount, this amount and this one, the diagonal. The incorrect ones are this one, this and
this. So 117 records are not classified correctly, okay, and 56 percent are classified
accurately; the remaining 43 percent are not classified correctly, okay.

Likewise, we need to perform this for the other variables, the way we performed it for
passenger class and survived. The next dependent variable is gender. Gender is the dependent
variable, so we need to find the accuracy value for gender. Again, we execute it in the same
way. So, now I change the dependent variable: gender is the dependent variable and all the
remaining variables are independent variables. Now, I want to look at the head, so I print
it. If you see here, all the remaining variables are there except gender. Now, again I do
the train and test split function, then I predict, then I find the accuracy score. The
accuracy score of this model is 74.15. So, when gender is the dependent variable, the model
accuracy is 74.15. If I draw the confusion matrix here, this is gender, and within gender we
have male and female. Here also we have male and female. If you look into it, the values are
53 and 45, then 145, then 24, okay.
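The gender counts can be checked with a few lines of Python. This is a minimal sketch that reads the spoken counts as 53, 45, 145 and 24 and assumes 53 and 145 sit on the diagonal, which is the arrangement that matches the 198 correct and 69 incorrect records quoted in this session. The function name is hypothetical.

```python
def accuracy_from_confusion(matrix):
    """Return (correct, incorrect, accuracy) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct, total - correct, correct / total

# Assumed layout: the diagonal holds the correctly classified records.
gender_matrix = [[53, 45],
                 [24, 145]]

correct, incorrect, acc = accuracy_from_confusion(gender_matrix)
print(correct, incorrect)      # 198 69
print(round(acc * 100, 2))     # 74.16
```

The ratio 198 / 267 is 0.74157, so it prints as 74.16 when rounded and reads as 74.15 when the digits are simply cut off.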
So, this is the way the records are classified; when we want to classify, this is the way we
can classify the records. If you look into it, the correctly classified records are these
combinations. The total, if we sum these, is 267. These ones are correctly classified, and
the incorrect records are this one and this one. So, 69 records are incorrectly classified
and 198 records are correctly classified, okay, which means 74.15 percent are classified
accurately. Gender is a categorical variable, and when gender is the dependent variable,
74.15 percent of the records are classified accurately.

Next, we need to look at which variable comes next. We did gender, survived and passenger
class. Age will not be there because it is a continuous variable, and fare also we can't do.
Siblings, parents/children and embarked: these three variables we can do. Next, we can
perform it with embarked. In that case, embarked is the dependent variable and all the
remaining variables are independent variables. Okay, embarked is the dependent variable;
except embarked, all are independent variables. Next we do the train test split function: 30
percent is the test data set and the remaining 70 percent is the training data set, and we
work on that. Then we predict with this, and then we find the accuracy score. The accuracy
score is 73 percent if embarked is the dependent variable. If we want to draw the confusion
matrix, we can draw that. Embarked has three different categories: 1, 2, 3. Here also we can
draw that. The dependent variable here is embarked, and the accuracy score is 73.7. In the
matrix we have these combinations, and in that the diagonals are classified accurately.
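How such a matrix is tabulated for three categories can be sketched in plain Python. The labels below are made up for illustration and are not the session's data; the function is a hand-written stand-in, not the library call used in the recording.

```python
def confusion_matrix(actual, predicted, labels):
    """Count records by (actual, predicted): rows are actual classes, columns predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

# Made-up labels for three categories 1, 2, 3.
actual    = [1, 1, 2, 2, 3, 3, 3, 3]
predicted = [1, 2, 2, 3, 3, 3, 1, 3]

matrix = confusion_matrix(actual, predicted, labels=[1, 2, 3])
for row in matrix:
    print(row)
# [1, 1, 0]
# [0, 1, 1]
# [1, 0, 3]

# Only the diagonal cells hold records whose predicted class equals the actual class.
diagonal = sum(matrix[i][i] for i in range(len(matrix)))
print(diagonal, "of", len(actual), "classified correctly")  # 5 of 8 classified correctly
```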
So the total number of records is 267. The correctly classified ones are 197. The remaining
70 records are not classified correctly: this number and this one. So we can say 73.78
percent are classified accurately. And this is with respect to embarked.
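The idea of classifying a record by probability, with categorical inputs and a categorical dependent variable, can be sketched without any library. The session fits a library classifier; the block below swaps in a small hand-written categorical naive Bayes with add-one smoothing so that it runs on the standard library alone. The class name, the smoothing choice and the toy records are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[(feature index, class)] -> Counter of feature values
        self.value_counts = defaultdict(Counter)
        # Distinct values seen per feature, needed for the smoothing term.
        self.values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.value_counts[(j, label)][value] += 1
                self.values[j].add(value)
        return self

    def posteriors(self, row):
        """Return P(class | row) for every class, normalised to sum to 1."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = count / total  # prior P(class)
            for j, value in enumerate(row):
                seen = self.value_counts[(j, label)][value]
                # Smoothed P(value | class), treated as independent per feature.
                score *= (seen + 1) / (count + len(self.values[j]))
            scores[label] = score
        norm = sum(scores.values())
        return {label: s / norm for label, s in scores.items()}

    def predict(self, row):
        post = self.posteriors(row)
        return max(post, key=post.get)

# Made-up records: (gender, passenger class) -> survived (1) or not (0).
rows = [("female", 1), ("female", 2), ("female", 3), ("male", 1),
        ("male", 3), ("male", 3), ("male", 2), ("female", 1)]
labels = [1, 1, 0, 1, 0, 0, 0, 1]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("female", 1)))  # 1
print(model.predict(("male", 3)))    # 0
```

Changing which column plays the role of `labels` is all it takes to repeat the exercise with a different dependent variable, which is the loop this session walks through by hand.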
Next we have siblings and parents/children. Take siblings first: the dependent variable is
siblings, and the independent variables include parents/children. So, except siblings, all
the remaining are independent variables. Now we do the train test split function, then we
predict, and then we find the accuracy score. With respect to siblings, 68.9 percent are
classified accurately. So, the dependent variable is siblings, and the accuracy score is
68.9 percent. If we want to draw the confusion matrix, we can draw that. Okay, siblings has
different categories. How many categories are there? We have the categories 0, 1, 2, 3, 4,
5, 8. This is the corresponding classification for the training data and the test data.
Then the next dependent variable is parents/children. Except parents/children, all the
remaining are independent variables; then we train. Now with this we can find the accuracy
score, which comes to about 71 percent, with the corresponding confusion matrix. So, if it
is parents/children, about 71 percent are classified accurately. So, with respect to all the
variables: if the dependent variable is survived, the accuracy score is 77.9; gender is 74;
embarked is 73; parents/children is 71; and earlier we had passenger class at 56 and
siblings at 68.9. These are all the different variables we looked into.
So, out of these, the maximum value is this one: survived. Survived is the most accurately
classified, okay. So, the naive Bayes theorem is mainly used to classify the records more
accurately; that's why we go for naive Bayes. In the naive Bayes theorem, both the dependent
and independent variables are categorical in nature. So, when we want to perform naive
Bayes, we first need to pre-process the data: we convert the text into numerical format.
Then we do the train test split function, dividing the data into training and testing sets.
Then we go for Gaussian NB to perform naive Bayes, and then we find the accuracy score.

In this session we used the label encoder, with survived as the dependent variable first. We
converted the text in all the variables into numerical values and passed them as the input
variables. Then we extracted the dependent variable, survived; everything other than
survived is independent. Next we did the train test split function, then we applied the
naive Bayes function, then we predicted and found the accuracy score. Then the confusion
matrix tells us in a clear-cut way which records are correctly classified and which are not.
Likewise, we need to execute this with all the variables. Finally, we need to identify which
is the right variable to classify the records. Okay, that is the naive Bayes theorem. Naive
Bayes is a classification technique where both the dependent and independent variables are
categorical in nature, and we classify the records by using probability. Okay, now it's quiz
time, guys. You've got a few minutes to think.
So, in this session, we discussed the naive Bayes theorem in detail. Now come the quiz
questions, and then we'll discuss the answers, okay. The first question: in naive Bayes,
numerical variables must be binned and converted to what? The options are: categorical
values, numerical values, either option A or B, both option A and B.
Next: exact Bayes classification is limited to the two forms matching with their what? The
options are: diagnosis value, probabilistic condition, characteristics, none of the above.
Next: a probabilistic model of data within each class is an example of what? The options
are: discriminative classification, generative classification, probabilistic
classification, both option A and C.
Next: in naive Bayes, what is the relationship between the probability of fraudulence and
the probability of truthfulness? The options are: greater than, less than, equal to, none of
the above.
Next question: how many methods are there to establish a classifier? The options are: 1, 2,
3, none of these.
Next question: how many forms are there in exact Bayes classification? The options are: 1,
2, 3, none of these. So, you can take two to three minutes, and then we'll discuss the
answers.
Time's up. Now evaluate yourself and keep listening. The first question: in naive Bayes,
numerical variables must be binned and converted to categorical values. When we go for the
naive Bayes theorem, both the dependent and the independent variables should be categorical
in nature; both kinds of variable should be categorical.
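Binning a numerical variable into categories, as in the first quiz answer, can be sketched like this. The cut points and labels are hypothetical choices for an age column; the session itself does not bin age, it leaves it out as a continuous variable.

```python
from bisect import bisect_right

def bin_value(value, edges, labels):
    """Map a number to the label of the interval it falls in.

    edges are the sorted cut points; labels has len(edges) + 1 entries.
    """
    return labels[bisect_right(edges, value)]

# Hypothetical cut points and labels for an age column.
AGE_EDGES = [13, 20, 60]
AGE_LABELS = ["child", "teen", "adult", "senior"]

ages = [4, 13, 35, 71]
binned = [bin_value(a, AGE_EDGES, AGE_LABELS) for a in ages]
print(binned)  # ['child', 'teen', 'adult', 'senior']
```

A value equal to a cut point falls into the interval that starts at that cut point, so 13 is labelled "teen" here.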
Next: exact Bayes classification is limited to the two forms matching with their what? The
answer is characteristics.
Next: a probabilistic model of data within each class is an example of generative
classification.
In naive Bayes, for the relationship between the probability of fraudulence and the
probability of truthfulness, the answer is greater than.
Next: how many methods are there to establish a classifier? The answer is three.
Next: how many forms are there in exact Bayes classification? The answer is two. So, that's
all about the quiz, which is related to naive Bayes.
Thank you. I hope you like the class. I'll be waiting
for your queries at the brain to map. See you tomorrow.
