Q1 Kernel Method

The kernel method for separating linearly non-separable data maps the data into a higher-dimensional vector space where it becomes separable by linear hyperplanes. Explain in 2-3 sentences why projecting the data to a higher dimension allows one to create a clearer separation between the projected data points.
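
The idea the question refers to can be sketched in a few lines of Python (the data, labels, and threshold here are hypothetical illustrations, not part of the question):

```python
# Hypothetical 1-D data: class 1 at the extremes, class 0 in the middle.
# No single threshold on x separates the classes in one dimension.
points = [-2.0, -1.0, 0.0, 1.0, 2.0]
labels = [1, 0, 0, 0, 1]

# Map each point x into 2-D via x -> (x, x^2). In the mapped space the
# classes are separable by a line, because the new coordinate x^2 puts
# both extremes on the same side of a single linear boundary.
mapped = [(x, x * x) for x in points]
predicted = [1 if x2 > 2.5 else 0 for (_, x2) in mapped]
print(predicted == labels)  # True: a linear boundary now separates the classes
```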

Q2 Kernel Method 2

Suppose you have training data whose feature vectors have n dimensions (x1, x2, ..., xn), and suppose the data can be classified into two classes, class 0 and class 1. In the training set, you observe that the class 0 data points are clustered around the origin, i.e. the point (0, 0, ..., 0), and the class 1 data points are away from the origin. Find a mapping of the n-dimensional data points into n+1 dimensions where you can separate them with a hyperplane. Your answer should be the additional dimension expressed in terms of x1, x2, ..., xn.
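
A small sketch of the data layout the question describes (the dimensionality, counts, and radii below are hypothetical, chosen only for illustration; the sketch does not give the mapping the question asks for):

```python
import math
import random

random.seed(0)
n = 3  # hypothetical dimensionality for illustration

# Class 0: points clustered near the origin (each coordinate small).
class0 = [[random.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(5)]
# Class 1: points away from the origin, surrounding it on all sides.
class1 = [[random.uniform(2.0, 3.0) * random.choice([-1, 1]) for _ in range(n)]
          for _ in range(5)]

# No single hyperplane in the original n dimensions can separate a ball
# around the origin from points on all sides of it, yet the two classes
# clearly differ in their distance from the origin:
dist0 = [math.sqrt(sum(x * x for x in p)) for p in class0]
dist1 = [math.sqrt(sum(x * x for x in p)) for p in class1]
print(max(dist0) < min(dist1))  # True: distances, not coordinates, separate them
```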

Q3 RBF kernel

In class, it was explained that the RBF kernel expresses a similarity measure between two data points (or two vectors): points that belong to the same class have a high similarity measure under the RBF kernel, and points that belong to different classes have a low one. Please explain in 2-3 sentences how the RBF kernel indeed gives you such a measure.
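
For reference, the RBF kernel is K(x, y) = exp(-gamma * ||x - y||^2); a minimal sketch (the points and gamma value are hypothetical) shows how it behaves as a similarity score:

```python
import math

def rbf(x, y, gamma=0.5):
    """RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Identical points score exactly 1; nearby points score close to 1;
# distant points score close to 0.
print(rbf([1.0, 1.0], [1.0, 1.0]))  # exactly 1.0
print(rbf([1.0, 1.0], [1.1, 0.9]))  # high similarity (close to 1)
print(rbf([1.0, 1.0], [5.0, 5.0]))  # low similarity (close to 0)
```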

Q4 Decision Tree

Explain in your own words, for a set of training data points (vectors), what is meant by a measure of impurity. Please use no more than 2-3 sentences.

Q5 Decision Tree 2

Suppose you have a training set of 1000 malware and 1000 benignware feature vectors. You consider a feature f and split the set of 2000 feature vectors into two sets: one where f = 1 and another where f = 0. The resulting two sets are as follows: the left set has 900 malware and 200 benignware, and the right set has 100 malware and 800 benignware. Calculate the information gain if you split based on feature f. Please explain your steps in calculating the impurity measures using the Gini measure.
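
The mechanics of a Gini-based information-gain calculation can be sketched as follows; the toy counts used at the bottom are hypothetical and deliberately not the ones in the question:

```python
def gini(pos, neg):
    """Gini impurity of a two-class set: 1 - p_pos^2 - p_neg^2."""
    total = pos + neg
    p = pos / total
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def information_gain(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children.
    Each argument is a (pos, neg) tuple of class counts."""
    n = sum(parent)
    weighted = (sum(left) / n) * gini(*left) + (sum(right) / n) * gini(*right)
    return gini(*parent) - weighted

# Toy split (not the question's numbers): a 10/10 parent split into
# a 9/1 left child and a 1/9 right child.
print(round(information_gain((10, 10), (9, 1), (1, 9)), 3))  # prints 0.32
```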

Q6 Random Forest

Explain in your own words, using no more than 2-3 sentences, why Random Forest reduces the chance of overfitting and often provides better accuracy than a single decision tree. (Note that Random Forest does NOT always give better accuracy than a decision tree, but it very often does.)
