301116 Social Media Intelligence
Due Date: Friday of Week 10
The project requires us to analyse social media data using the knowledge obtained from this unit, with assistance from a computer-based statistical package. For this project, we will focus on Twitter.
To complete this project:
1. Read through this specification
2. Complete the data analysis required by the specification
3. Write up your analysis using your favourite word processing/typesetting program, making sure that all of the working is shown and that it is presented well.
4. Include the student declaration text on the front page of your report. Please make sure that your name and student number are clearly displayed on the front page.
5. Submit the report as a PDF by the due date.
Due date and Submission
The project report is due by 11:59 p.m. on the Friday of Week 10. The report must be submitted as a PDF file using the assignment submission facilities in the Project section of 301116 in vUWS.
Once the required analysis is performed, write up the analysis as a report. Remember that the assessor will only see the report and will be marking the analysis based on your report. Therefore the report should contain a clear and concise description of the procedures carried out, the analysis of results, and any conclusions reached from the analysis.
The required analysis in this specification covers material presented in lectures and labs. Students should use the computer software R to carry out the required analysis and then present the results from the analysis in the report.
This project is worth 30% of your final grade, and so the project will be marked out of 30. The project consists of six parts where each part contributes equally to your final mark.
Each of the six parts will be marked out of 5 using the following criteria:
Marks Criteria Satisfied
0 The method does not lead to insightful analysis.
1 The method is flawed, but the analysis would have provided insight had the method been correct.
2 The correct method leads to partially correct results and analysis.
3 The correct method leads to correct results and analysis.
4 The correct method leads to correct results and analysis, with an insightful aim and conclusion.
5 The correct method leads to correct results and analysis, with an insightful aim and conclusion. Limitations of the analysis are identified and suggestions for further analysis are provided.
If a report is submitted late, the maximum mark it can achieve will be reduced by 10% (3 marks) per day. E.g., if a report is submitted five days late, it can receive at most 15 marks.
The following declaration must be included in a clearly visible and readable place on the first page of the report.
By including this statement, we, the authors of this work, verify that:
• We hold a copy of this assignment that we can produce if the original is lost or damaged.
• We hereby certify that no part of this assignment/product has been copied from any other student’s work or from any other source except where due acknowledgement is made in the assignment.
• No part of this assignment/product has been written/produced for us by another person except where such collaboration has been authorised by the subject lecturer/tutor concerned.
• We are aware that this work may be reproduced and submitted to plagiarism detection software programs for the purpose of detecting possible plagiarism (which may retain a copy on its database for future plagiarism checking).
• We hereby certify that we have read and understand what the School of Computing and Mathematics defines as minor and substantial breaches of misconduct as outlined in the learning guide for this unit.
Note: An examiner or lecturer/tutor has the right not to mark this project report if the above declaration has not been added to the cover of the report.
The “Friends of Trump” (FoT) group has hired you as a consultant to examine the impact of Donald Trump’s recent statements on his supporters. Trump is highly active on Twitter, so the FoT group wants you to examine the relationships of his Twitter friends and followers. They have provided the following instructions on what they want from your analysis. The analysis should be completed in the R programming language using the rtweet and igraph libraries.
Use the rtweet documentation to find functions that will assist your analysis:
1. Friends of Trump
Find the 20 friends of Trump (accounts that Trump follows) that have the most followers. Use only people, not the Twitter handles of Trump’s companies. Examine the Twitter accounts and summarise the types of people.
2. Followers of Trump
Find the 20 people who follow Trump and have the most followers, and examine whether they have a positive or negative relationship with Trump based on their tweets. To obtain this set of followers, download a large number of followers and select the 20 that have the greatest number of followers. Examine the Twitter accounts and summarise the types of people.
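The selection step described above can be sketched in base R as follows. This is only an illustration on a made-up table: the data frame users, its values and the helper top_n_by_followers are hypothetical. With rtweet, a table of this kind would typically come from passing the IDs returned by get_followers() to lookup_users(); the column names used here (screen_name, followers_count) are assumptions that may differ between rtweet versions.

```r
# Hypothetical stand-in for downloaded user metadata (values are invented).
users <- data.frame(
  screen_name     = c("user_a", "user_b", "user_c", "user_d", "user_e"),
  followers_count = c(1200, 56000, 310, 980000, 4500),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n accounts with the largest follower counts.
top_n_by_followers <- function(users, n = 20) {
  ranked <- users[order(users$followers_count, decreasing = TRUE), ]
  head(ranked, n)
}

# Use n = 3 on the toy table; the project itself asks for n = 20.
top3 <- top_n_by_followers(users, n = 3)
print(top3$screen_name)
```

Company handles and other non-person accounts would still need to be removed by hand before taking the final 20.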
3. Bypassing Trump
Plot the graph containing Trump’s 20 friends and 20 followers. Identify whether any of the friends or followers found are friends with each other and add these edges to the graph. Then determine whether any of the friends and followers should be friends, based on their background, and add those edges to the graph.
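One possible way to assemble such a graph with igraph is sketched below. All account names and the extra edges are placeholders, and treating every relationship as an undirected edge is a simplifying assumption; a directed graph may suit the analysis better.

```r
library(igraph)

# Placeholder names; the project uses the 20 friends and 20 followers found earlier.
friends   <- c("friend_1", "friend_2", "friend_3")
followers <- c("follower_1", "follower_2", "follower_3")

# Edges joining Trump to each friend and follower.
ego_edges <- data.frame(from = "Trump", to = c(friends, followers),
                        stringsAsFactors = FALSE)

# Hypothetical friendships found among the accounts themselves.
extra_edges <- data.frame(from = c("friend_1", "friend_2"),
                          to   = c("follower_1", "friend_3"),
                          stringsAsFactors = FALSE)

g <- graph_from_data_frame(rbind(ego_edges, extra_edges), directed = FALSE)

# Record each node's role so it can be used for colouring.
V(g)$role <- ifelse(V(g)$name == "Trump", "ego",
                    ifelse(V(g)$name %in% friends, "friend", "follower"))
role_colours <- c(ego = "tomato", friend = "lightblue", follower = "lightgreen")

# Draw only in an interactive session so the sketch also runs in batch mode.
if (interactive()) {
  plot(g, vertex.color = unname(role_colours[V(g)$role]), vertex.label.cex = 0.8)
}
```

Edges added on the basis of background (accounts that "should" be friends) could be kept in a separate data frame so that they can be drawn in a different style and discussed separately.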
4. Graph Statistics
Compute the diameter and density of the graph and the neighbourhood overlap of each edge, and determine which nodes have the greatest social capital. State whether the results are obvious from the graph structure and why.
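A minimal sketch of these statistics on a small toy graph is given below. The helper neighbourhood_overlap is hypothetical (not an igraph function) and assumes the usual definition: the number of neighbours shared by both endpoints of an edge divided by the number of nodes that neighbour at least one endpoint, excluding the endpoints themselves. Using betweenness as a proxy for social capital is also an assumption; the unit may intend a different measure.

```r
library(igraph)

# Toy graph: Trump is connected to everyone, plus two edges among the others.
edges <- data.frame(from = c("Trump", "Trump", "Trump", "Trump", "A", "B"),
                    to   = c("A", "B", "C", "D", "B", "C"),
                    stringsAsFactors = FALSE)
g <- graph_from_data_frame(edges, directed = FALSE)

graph_diameter <- diameter(g)      # longest shortest path
graph_density  <- edge_density(g)  # edges present / edges possible

# Hypothetical helper: neighbourhood overlap of every edge in a simple graph.
neighbourhood_overlap <- function(g) {
  adj <- as_adjacency_matrix(g, sparse = FALSE) > 0
  el  <- ends(g, E(g), names = FALSE)
  overlap <- vapply(seq_len(nrow(el)), function(i) {
    a <- el[i, 1]
    b <- el[i, 2]
    shared <- sum(adj[a, ] & adj[b, ])      # neighbours of both endpoints
    either <- sum(adj[a, ] | adj[b, ]) - 2  # neighbours of either, minus a and b
    if (either == 0) 0 else shared / either
  }, numeric(1))
  # Label each value by its (sorted) endpoint names.
  names(overlap) <- apply(ends(g, E(g)), 1, function(x) paste(sort(x), collapse = "-"))
  overlap
}

overlap <- neighbourhood_overlap(g)
print(overlap)

# One possible proxy for bridging social capital.
print(betweenness(g))
```

Edges with an overlap of zero are local bridges, so the nodes at their ends are natural candidates when discussing social capital.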
5. Graph Homophily
Determine whether there is homophily in the graph. To do this, label each node as either a supporter or nonsupporter of Trump using the information gathered in parts 1, 2 and 3. Write out the hypotheses, the test statistic and the conclusion of the test. Use a significance level of α = 0.05.
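One way such a test could be set up is sketched below on an invented graph. It assumes the common approach of comparing the observed proportion of cross-group edges with the proportion 2pq expected if labels were assigned at random, where p is the fraction of supporters and q = 1 - p. The hypotheses are then H0: the cross-group proportion equals 2pq, against H1: it is less than 2pq. The use of an exact binomial test is an assumption; the unit may prescribe a different test statistic, and the test treats edges as independent, which is a limitation worth noting.

```r
library(igraph)

# Invented graph: S* nodes are supporters, N* nodes are nonsupporters.
edges <- data.frame(
  from = c("S1", "S1", "S2", "S3", "S2", "N1", "N2", "N3", "N1", "N1", "S4"),
  to   = c("S2", "S3", "S3", "S4", "S4", "N2", "N3", "N4", "N3", "N4", "N1"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = FALSE)
V(g)$supporter <- startsWith(V(g)$name, "S")

# Expected proportion of cross-group edges under random labelling: 2pq.
p <- mean(V(g)$supporter)
expected_cross <- 2 * p * (1 - p)

# Observed number of edges that join a supporter to a nonsupporter.
el <- ends(g, E(g), names = FALSE)
is_cross <- V(g)$supporter[el[, 1]] != V(g)$supporter[el[, 2]]
n_cross <- sum(is_cross)

# One-sided exact binomial test at the 0.05 level.
test <- binom.test(n_cross, ecount(g), p = expected_cross, alternative = "less")
print(test$p.value)
reject_h0 <- test$p.value < 0.05
```

Here a small p-value means there are fewer cross-group edges than random mixing would produce, which is evidence of homophily.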
6. Structural Balance
Finally, determine whether the signed network is weakly balanced (using hierarchical clustering) and identify any signed relationships within or between clusters that are not as expected. To perform this analysis, first label all existing edges as either positive or negative, based on their association to Trump.
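A rough sketch of this procedure in base R follows. The signed edge list is invented, and the dissimilarity coding (0 for a positive edge, 1 for a negative edge, 0.5 for a missing edge), the average linkage and the choice of two clusters are all assumptions made for illustration; the method taught in the unit may differ. A network is weakly balanced with respect to a clustering when every positive edge lies within a cluster and every negative edge lies between clusters.

```r
# Invented signed edges: +1 = positive relationship, -1 = negative relationship.
nodes <- c("A", "B", "C", "D", "E", "F")
signed_edges <- data.frame(
  from = c("A", "A", "B", "D", "E", "D", "A", "B", "C"),
  to   = c("B", "C", "C", "E", "F", "F", "D", "E", "F"),
  sign = c(1, 1, 1, 1, 1, -1, -1, -1, -1),
  stringsAsFactors = FALSE
)

# Dissimilarity matrix: 0 for positive, 1 for negative, 0.5 where no edge exists.
d <- matrix(0.5, length(nodes), length(nodes), dimnames = list(nodes, nodes))
diag(d) <- 0
for (i in seq_len(nrow(signed_edges))) {
  a <- signed_edges$from[i]
  b <- signed_edges$to[i]
  d[a, b] <- d[b, a] <- if (signed_edges$sign[i] > 0) 0 else 1
}

# Hierarchical clustering, cut into two groups (k is a judgement call).
hc <- hclust(as.dist(d), method = "average")
cluster <- cutree(hc, k = 2)

# An edge is unexpected if it is positive between clusters or negative within one.
same_cluster <- cluster[signed_edges$from] == cluster[signed_edges$to]
unexpected <- signed_edges[(signed_edges$sign > 0) != same_cluster, ]
weakly_balanced <- nrow(unexpected) == 0

print(cluster)
print(unexpected)
```

In this toy example the negative edge between D and F falls inside a cluster, so it is reported as unexpected and the network is not weakly balanced under this partition. Inspecting the dendrogram with plot(hc) can help justify the number of clusters.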
Write up a report containing your code and analysis of the data with each section clearly labelled. Clearly annotate your code and make sure to state any conclusions you make from each piece of analysis. The report is being marked using the marking criteria, so make sure that each piece of analysis covers all of the criteria. Remember that you are examining the relationship of Twitter users to Trump, so make sure that the conclusion of each section refers back to this.