
SIT384 Cyber security analytics
Pass Task 8.1P: PCA dimensionality reduction
Task description:
PCA (Principal Component Analysis) is a dimensionality reduction technique that projects data into a lower-dimensional space. It can reduce high-dimensional data to 2 or 3 dimensions so that we can visualize the data and, hopefully, understand it better.
In this task, you use PCA to reduce the dimensionality of a given dataset and visualize the data.
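To make the projection idea concrete, here is a minimal NumPy sketch of PCA computed through the singular value decomposition. The toy data, the variable names and the random seed are illustrative assumptions and are not part of the task, which uses scikit-learn's PCA instead.

```python
import numpy as np

# Hypothetical toy data: 100 samples with 5 features, two of them strongly correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * X[:, 1]

# PCA works on centered data, so subtract the per-feature mean.
X_centered = X - X.mean(axis=0)

# Rows of Vt are the principal directions, ordered by decreasing singular value.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
components = Vt[:2]

# Project every sample onto the first two principal directions.
X_2d = X_centered @ components.T

# Variance captured along each principal direction.
explained_variance = S**2 / (X.shape[0] - 1)

print(X.shape, "->", X_2d.shape)
print(explained_variance[:2])
```

The first projected coordinate carries the largest variance, the second the next largest, which is why a 2D scatter of these two columns often preserves most of the visible structure.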
You are given:
• Breast cancer dataset, which can be retrieved with:
from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()
Detailed info is available at: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html
• PCA(n_components=2)
• 3D plot settings (please refer to prac7 for 3D plot examples):
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure(figsize=(10, 8))
cmap = plt.cm.get_cmap('Spectral')
ax = Axes3D(fig, rect=[0, 0, .95, 1], elev=10, azim=10)
ax.scatter(x, y, z, c=cancer.target, cmap=cmap)
• Other settings of your choice
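As a hedged illustration of the 3D plot settings above, the sketch below draws the first three scaled features using `fig.add_subplot(projection="3d")` and `view_init`, an alternative way of creating 3D axes that is swapped in here for the `Axes3D(fig, ...)` call. The `Agg` backend, the title text and the output file name are assumptions made for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove to open a window instead
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

cancer = load_breast_cancer()
X_scaled = StandardScaler().fit_transform(cancer.data)

fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(projection="3d")  # creates the 3D axes and attaches them to fig
ax.view_init(elev=10, azim=10)         # same viewing angles as in the given settings

# First three scaled features as x, y and z, colored by class label.
ax.scatter(X_scaled[:, 0], X_scaled[:, 1], X_scaled[:, 2],
           c=cancer.target, cmap=plt.get_cmap("Spectral"))
ax.set_title("First 3 features of scaled cancer data")  # hypothetical title

fig.savefig("first_three_features_3d.png")  # hypothetical file name
```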
You are asked to:
• first use StandardScaler() to fit and transform cancer.data
• apply PCA(n_components=2) to fit and transform the scaled cancer.data set
• print the scaled dataset shape and the PCA-transformed dataset shape for comparison
• create a 2D plot with the first principal component as the x axis and the second principal component as the y axis
• set a proper xlabel and ylabel for the 2D plot
• print the shape and values of the PCA components
• create a 3D plot with the first 3 features (as x, y and z) of the scaled cancer.data set
• create a 3D plot with the first principal component as the x axis and the second principal component as the y axis, with no value for the z axis
• set a proper title for each of the two 3D plots
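The scaling, PCA and 2D plotting steps listed above could be sketched roughly as follows. This is one possible arrangement, not a model answer: the variable names, axis label text, `Agg` backend and output file name are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove to open a window instead
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

cancer = load_breast_cancer()

# Standardize each feature to zero mean and unit variance before PCA.
X_scaled = StandardScaler().fit_transform(cancer.data)

# Keep the two directions of largest variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print("Scaled shape:", X_scaled.shape)
print("PCA shape:", X_pca.shape)
print("Components shape:", pca.components_.shape)
print("Components:\n", pca.components_)

# 2D scatter of the two principal components, colored by class label.
fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(X_pca[:, 0], X_pca[:, 1], c=cancer.target, cmap=plt.get_cmap("Spectral"))
ax.set_xlabel("First principal component")
ax.set_ylabel("Second principal component")
fig.savefig("pca_2d.png")  # hypothetical file name
```

Each row of `pca.components_` holds one weight per original feature, so inspecting the values shows which features contribute most to each principal component.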
The sample output shown in the following figures is for demonstration purposes only. Yours may differ.
Submission:
Submit the following files to OnTrack:
1. Your program source code (e.g. task8_1.py)
2. A screenshot of your program running
Check the following things before submitting:
1. Add proper comments to your code
SIT384 Cyber security analytics
Pass Task 7.1P: K-Means and Hierarchical Clustering
Task description:
In machine learning, clustering is used to analyze and group data that has no pre-labelled classes, or no class attribute at all. K-Means clustering and hierarchical clustering are both unsupervised learning algorithms.
K-Means partitions objects into clusters. A cluster is a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters. Each object is in exactly one cluster, not several.
In hierarchical clustering, clusters have a tree-like structure, or a parent-child relationship. In the agglomerative form, the two most similar clusters are merged, and merging continues until all objects are in the same cluster.
In this task, you use the K-Means and Agglomerative Hierarchical algorithms to cluster a synthetic dataset and compare their results.
You are given:
• np.random.seed(0)
• make_blobs function with inputs:
o n_samples: 200
o centers: [3, 2], [6, 4], [10, 5]
o cluster_std: 0.9
• KMeans() with settings: init = 'k-means++', n_clusters = 3, n_init = 12
• AgglomerativeClustering() with settings: n_clusters = 3, linkage = 'average'
• Other settings of your choice
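As a rough illustration, the settings above could be combined as in the sketch below, which creates the dataset and fits both models. The variable names and the printed summaries are assumptions, and the plots are left out.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs

np.random.seed(0)

# Synthetic dataset: 200 points drawn around three given centers.
X, y = make_blobs(n_samples=200, centers=[[3, 2], [6, 4], [10, 5]], cluster_std=0.9)

# K-Means with k-means++ initialization, restarted 12 times.
kmeans = KMeans(init="k-means++", n_clusters=3, n_init=12)
kmeans_labels = kmeans.fit_predict(X)

# Agglomerative clustering with average linkage, cut at 3 clusters.
agglom = AgglomerativeClustering(n_clusters=3, linkage="average")
agglom_labels = agglom.fit_predict(X)

print("Data shape:", X.shape)
print("K-Means centers:\n", kmeans.cluster_centers_)
print("K-Means cluster sizes:", np.bincount(kmeans_labels))
print("Agglomerative cluster sizes:", np.bincount(agglom_labels))
```

Cluster label numbers are arbitrary in both models, so cluster 0 from K-Means need not correspond to cluster 0 from the agglomerative model when comparing them.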
You are asked to:
• plot your created dataset
• plot the two clustering models for your created dataset
• set the K-Means plot with title “KMeans”
• set the Agglomerative Hierarchical plot with title “Agglomerative Hierarchical”
• calculate the distance matrix for Agglomerative Clustering using the input feature matrix (linkage = complete)
• display the dendrogram
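The distance matrix and dendrogram steps might look like the following sketch, which assumes SciPy's `pdist`, `squareform`, `linkage` and `dendrogram` and a Euclidean metric; the task does not name a library or metric for this step. `no_plot=True` is used only so that the example runs without a display.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist, squareform
from sklearn.datasets import make_blobs

np.random.seed(0)
X, _ = make_blobs(n_samples=200, centers=[[3, 2], [6, 4], [10, 5]], cluster_std=0.9)

# Pairwise Euclidean distances in condensed (1D) form, the form linkage() expects.
condensed = pdist(X, metric="euclidean")

# Full symmetric 200 x 200 distance matrix, convenient for printing or inspection.
dist_matrix = squareform(condensed)

# Complete linkage: cluster distance is the largest pairwise distance between members.
Z = linkage(condensed, method="complete")

# Compute the dendrogram layout; drop no_plot=True to draw it with matplotlib.
tree = dendrogram(Z, no_plot=True)

print("Distance matrix shape:", dist_matrix.shape)
print("Linkage matrix shape:", Z.shape)
```

Each row of `Z` records one merge, so 200 points produce 199 rows.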
The sample output shown in the following figure is for demonstration purposes only. Yours may differ.
Submission:
Submit the following files to OnTrack:
1. Your program source code (e.g. task7_1.py)
2. A screenshot of your program running
Check the following things before submitting:
1. Add proper comments to your code
