ITECH 2201 Cloud Computing
School of Science, Information Technology & Engineering
Workbook for Week 6
Part A (4 Marks)
Exercise 1: Data Science (1 mark)
Read the article at http://datascience.berkeley.edu/about/what-is-data-science/ and answer the following:
What is Data Science?

According to IBM's estimate, what percentage of the data in the world today has been created in the past two years?
What is the value of petabyte storage?
For each course, both foundation and advanced, listed at http://datascience.berkeley.edu/academics/curriculum/ briefly state (in 2 to 3 lines) what it offers, based on the given course description as well as the video.
Exercise 2: Characteristics of Big Data (2 marks)
Read the following research paper from the IEEE Xplore Digital Library:
Ali-ud-din Khan, M.; Uddin, M.F.; Gupta, N., "Seven V's of Big Data understanding Big Data to extract value," 2014 Zone 1 Conference of the American Society for Engineering Education (ASEE Zone 1), pp. 1-5, 3-5 April 2014
and answer the following questions:
Summarise the authors’ motivation (in one paragraph).
What are the 7 V’s mentioned in the paper? Briefly describe each V in one paragraph.
Explore the authors’ future work by using reference [4] in the research paper. In 300 words, summarise your understanding of how Big Data can improve the healthcare sector.
Exercise 3: Big Data Platform (1 mark)
In order to build a big data platform one has to acquire, organize and analyse the big data. Go through the following links and answer the questions that follow the links:
- http://www.infochimps.com/infochimps-cloud/how-it-works/
- http://www.youtube.com/watch?v=TfuhuAuaho
- http://www.youtube.com/watch?v=IC6jVRO2Hq4
- http://www.youtube.com/watch?v=2yfjrBhz5w
Please note: You are encouraged to watch all the videos in that series from Oracle.
How can enterprises acquire big data, and how can it be used?
How can big data be organized and handled?
What analyses can be performed using big data?
Part B (4 Marks)
Part B answers should be based on well-cited articles and videos; name the references used in your answer. For more information, read the guidelines given in Assignment 1.
Exercise 4: Big Data Products (1 mark)
Google is a master at creating data products. Below are a few examples from Google. Describe each product and explain how large-scale data is used effectively in it.
a. Google’s PageRank
b. Google’s Spell Checker
c. Google’s Flu Trends
d. Google’s Trends
Like Google, Facebook and LinkedIn also use large-scale data effectively. How?
Exercise 5: Big Data Tools (2 marks)
Briefly explain why a traditional relational database management system (RDBMS) cannot store big data effectively.
What is a NoSQL database?
Name and briefly describe at least 5 NoSQL databases.
What is MapReduce and how does it work?
Briefly describe some notable MapReduce products (at least 5).
Amazon’s S3 service lets you store large chunks of data online. List 5 features of Amazon’s S3 service.
Extracting concise, valuable information from a sea of data can be challenging. We need statistical analysis tools to deal with Big Data. Name and describe at least 3 statistical analysis tools.
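As a starting point for the MapReduce question in this exercise, the sketch below imitates the map, shuffle and reduce phases on a single machine using a word count. It is only an illustration of the programming model, not a distributed implementation; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions made for this example and do not come from any particular MapReduce product.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data tools", "Big Data products"]))
```

In a real cluster the map and reduce calls run in parallel on many nodes and the shuffle moves data across the network; here everything runs in one process so the data flow is easy to follow.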
Exercise 6: Big Data Application (1 mark)
Name 3 industries that should use Big Data. Justify your claim in 250 words for each industry, using proper references.
ITECH 2201 Cloud Computing
School of Science, Information Technology & Engineering
Workbook for Week 7
Part A (3 Marks)
Exercise 1: Storage Methods (1 mark)
Based on your lecture and the video linked below:
https://www.youtube.com/watch?v=sXkTSiAe-A
Write a paragraph about memory virtualization.

Watch the YouTube video linked below:
https://www.youtube.com/watch?v=wTcxRObq738
Based on the video, answer the following questions:
What is RAID 0?

Describe Striping, Mirroring and Parity.
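As a rough aid for the three terms above, the following sketch models disks as Python byte strings. It is a toy model under stated assumptions, not how a RAID controller is actually built: the function names (`stripe`, `mirror`, `xor_parity`) and the 4-byte strip size are chosen purely for illustration.

```python
from functools import reduce


def stripe(data: bytes, num_disks: int, strip_size: int = 4) -> list[bytes]:
    """Striping: place consecutive strips of data round-robin across disks."""
    disks = [bytearray() for _ in range(num_disks)]
    for offset in range(0, len(data), strip_size):
        disk_index = (offset // strip_size) % num_disks
        disks[disk_index] += data[offset:offset + strip_size]
    return [bytes(disk) for disk in disks]


def mirror(data: bytes, copies: int = 2) -> list[bytes]:
    """Mirroring: every disk holds a complete copy of the data."""
    return [data for _ in range(copies)]


def xor_parity(strips: list[bytes]) -> bytes:
    """Parity: byte-wise XOR of equal-length strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


if __name__ == "__main__":
    strips = [b"ABCD", b"EFGH", b"IJKL"]
    parity = xor_parity(strips)
    # Simulate losing the second strip and rebuilding it from the
    # surviving strips plus the parity strip.
    rebuilt = xor_parity([strips[0], strips[2], parity])
    print(rebuilt == strips[1])
```

XOR is used for parity because XOR-ing the surviving strips with the parity strip reproduces the missing one.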

Exercise 2: Storage Design (2 marks)
Summarize storage repository design based on the following video link:
https://www.youtube.com/watch?v=eVQH7C3nulY

The YouTube video linked below describes the Intelligent Storage System (ISS):
http://www.youtube.com/watch?v=1wENn4PDqDE
Based on the video, answer the following questions:
What is ISS?

What are the 3 main components of the ISS?

How does the cache work in an ISS?

Storage Area Network (SAN) and Network Attached Storage (NAS) are widely used concepts in data storage. The following YouTube videos give a detailed description of these concepts:
- http://www.youtube.com/watch?v=csdJFazj3h0
- http://www.youtube.com/watch?v=vdf6CvGQZrk
- http://www.youtube.com/watch?v=MKZU8zOMiqE
Based on the videos, answer the following questions:
Briefly describe NAS and SAN using diagrams.
What are the advantages of SAN over NAS?
What are two common NAS file sharing protocols? How do they differ from each other?

Part B (3 Marks)
Exercise 3: Storage Design (1 Mark)
Design Storage Solution for New Application
Scenario
An organization is deploying a new business application in its environment. The new application requires 1 TB of storage space for business and application data. During peak workload, the application is expected to generate 4900 IOPS (I/Os per second) with a typical I/O data block size of 4 KB.
The disk drive option available from the vendor is a 15,000 rpm drive with 100 GB capacity. Other specifications of the drive are:
Average seek time = 5 milliseconds and data transfer rate = 40 MB/s.
Calculate the number of disk drives required to meet both the capacity and the performance requirements of the application.
Hint: To calculate the IOPS from the average seek time, data transfer rate, disk rpm and data block size, refer to slide 15 of the week 7 lecture slides. Once you have the IOPS, refer to slide 16 of the week 7 lecture slides to calculate the required number of disks.
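The calculation described in the hint can be sketched in Python as follows. The lecture slides are not reproduced here, so this sketch assumes the commonly used model in which disk service time is the sum of average seek time, average rotational latency (half a revolution) and data transfer time, and IOPS is the reciprocal of that service time. The function names, the `utilization` parameter and the decimal unit conversions (1 TB = 1000 GB, 1 MB = 1000 KB) are assumptions for illustration; check them against slides 15 and 16 before relying on the result.

```python
import math


def disk_iops(seek_ms, rpm, block_kb, transfer_mb_per_s):
    """Theoretical IOPS of one disk, taken as 1 / disk service time."""
    rotational_latency_ms = 0.5 * 60_000 / rpm   # half a revolution, in ms
    transfer_ms = block_kb / transfer_mb_per_s   # assumes 1 MB = 1000 KB
    service_time_ms = seek_ms + rotational_latency_ms + transfer_ms
    return 1000.0 / service_time_ms


def disks_required(capacity_gb, disk_gb, peak_iops, iops_per_disk, utilization=1.0):
    """Number of disks that satisfies both capacity and performance.

    utilization < 1.0 models running each disk below its theoretical
    maximum (an assumption; use the value given in the lecture, if any).
    """
    for_capacity = math.ceil(capacity_gb / disk_gb)
    for_performance = math.ceil(peak_iops / (iops_per_disk * utilization))
    return max(for_capacity, for_performance)


if __name__ == "__main__":
    iops = disk_iops(seek_ms=5, rpm=15_000, block_kb=4, transfer_mb_per_s=40)
    print(f"IOPS per disk: {iops:.1f}")
    print("Disks required:", disks_required(1000, 100, 4900, iops))
```

Taking the larger of the two counts reflects that the configuration must meet both requirements at once; whichever of capacity or performance needs more disks determines the answer.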
Exercise 4: Storage Evolution (2 Marks)
Watch the following videos on Fibre Channel over Ethernet (FCoE) and answer the questions that follow:
- http://www.youtube.com/watch?v=hSFyf-rmjA8
- http://www.youtube.com/watch?v=iCfJCzfNLrw
What is FCoE and why do we need it?
In your opinion, how is FCoE more cost-effective than a traditional connection? Give a brief explanation.
You have read about and answered questions on SAN in Part A. Based on your understanding and some research, answer the following questions:
What is a Virtual SAN?
What are IP SAN protocols and Fibre Channel over IP (FCIP)?
Watch the video below, an introduction to object-based and unified storage:
http://www.youtube.com/watch?v=1SkUt7q8Dm8
Choose the correct answer for each of the following questions:

What is an advantage of a flat address space over a hierarchical address space?
a. Highly scalable with minimal impact on performance
b. Provides access to data, based on retention policies
c. Provides access to block, file, and object with same interface
d. Consumes less bandwidth on network while accessing data

What is the role of the metadata service in an OSD node?
a. Responsible for storing data in the form of objects
b. Stores unique IDs generated for objects
c. Stores both objects and object IDs
d. Controls functioning of storage devices

What is used to generate an object ID in a CAS system?
a. File metadata
b. Source and destination address
c. Binary representation of data
d. File system type and ownership

What accurately describes block I/O access in unified storage?
a. I/O traverses the NAS head and storage controller to disk
b. I/O traverses the OSD node and storage controller to disk
c. I/O traverses the storage controller to disk
d. I/O is sent directly to the disk

What accurately describes unified storage?
a. Provides block, file, and object-based access within one platform
b. Provides block and file storage access using objects
c. Supports block and file access using flat address space
d. Specialized storage device purposely built for archiving

ITECH 2201 Cloud Computing
School of Science, Information Technology & Engineering
Workbook for Week 8
Part A (3 Marks)
Exercise 1: Green Computing (0.5 Marks)
The questions in this exercise can be answered by doing an internet search and/or from the YouTube videos. The answer to each question should be one paragraph in your own words.

What is the greenhouse effect?
We are legally, ethically, and socially required to green our IT products, applications, services, and practices. Is this statement true? Why?

What is Green IT and what are the benefits of greening IT?

Exercise 2: Environmental Sustainability (0.5 Marks)
Read the article at the link below and answer the questions that follow:
http://www.computer.org/csdl/mags/it/2010/02/mit2010020004.html

According to the article, how do you build a greener environment?

Summarize the article in 150 words.
Exercise 3: Environmentally Sound Practices (1 Mark)
The questions in this exercise can be answered by doing an internet search.
Briefly explain the following terms (one paragraph per term):
- Power usage effectiveness (PUE) and its reciprocal
- Data center efficiency (DCE)
- Data center infrastructure efficiency (DCiE)
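As a numerical aid for the terms above, the sketch below computes PUE as total facility power divided by IT equipment power, and DCiE as the reciprocal of PUE expressed as a percentage. The function names and the sample power figures are hypothetical and chosen only for illustration; confirm the definitions against your own sources when writing your answer.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def dcie_percent(total_facility_kw, it_equipment_kw):
    """Data center infrastructure efficiency: reciprocal of PUE, as a percentage."""
    if total_facility_kw <= 0:
        raise ValueError("Total facility power must be positive")
    return 100.0 * it_equipment_kw / total_facility_kw


if __name__ == "__main__":
    # Hypothetical data center: 1500 kW drawn in total, 1000 kW reaching IT equipment.
    total_kw, it_kw = 1500.0, 1000.0
    print(f"PUE  = {pue(total_kw, it_kw):.2f}")
    print(f"DCiE = {dcie_percent(total_kw, it_kw):.1f}%")
```

A lower PUE (closer to 1) and a higher DCiE (closer to 100%) both indicate that less power is spent on overheads such as cooling and power distribution.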
List 5 universities that offer a Green Computing course. For each, give the name of the university, the course name and a brief description of the course.
Exercise 4: Major Cloud APIs (1 Mark)
The following companies are the major cloud service providers: Amazon, GoGrid, Google, and Microsoft.

List and briefly describe (2 lines) the APIs provided by the above major vendors.
Part B (3 Marks)

Exercise 1: Greening IT Standards and Regulations (0.5 Marks)
The following standards and regulations are mainly used to design green computers and other IT hardware: EPEAT (www.epeat.net), the Energy Star 4.0 standard, and the Restriction of Hazardous Substances Directive (www.rhos.gov.uk). Using the links provided and some internet searching, summarize each standard or regulation in 150 words.
Exercise 2: Green cloud computing (0.5 Marks)
Xiong, N.; Han, W.; Vandenberg, A., "Green cloud computing schemes based on networks: a survey," Communications, IET, vol. 6, no. 18, pp. 3294-3300, Dec. 18 2012
Most of the power consumption in data centers comes from computation processing, disk storage, network and cooling systems. New technologies and methods have been proposed to reduce energy costs in data centers. From the above paper, summarize (in 300 words) the recent work done in these fields.
Exercise 3: Cloud API Functionalities (2 Marks)
List the functionalities that can be achieved by using the APIs mentioned in the following link:
https://code.google.com/p/sainsburys-nectar-api/
What API is used in the following link and how it is used?
https://pypi.python.org/pypi/python-novaclient

OpenStack is an open-source collaborative software project that meets many cloud needs. The links below give extensive information about OpenStack.
- https://support.rc.nectar.org.au/docs/openstack
- http://docs.openstack.org/api/quick-start/content/
Write a report (2 pages) about OpenStack features and functionalities.