Software Practice Advancement Conference

SPA Conference session: Analysing GitHub commits with R

One-line description: Using the GitHub commits dataset to forecast trends with R
Session format: Tutorial
Abstract: R has made the world of data analysis more approachable and has become an important tool for computational statistics and data visualization. With Microsoft acquiring Revolution Analytics, a commercial provider of R software and related tooling, R is also now part of the Azure family.

In this tutorial, Barbara will demonstrate how you can use R to turn application data into business insights and actionable statistics. Using a year's worth of GitHub commit events as an example, Barbara will ask vital questions such as which languages showed interesting trends over 2014 and who the influencers within each category are. The records also reveal interesting conclusions about future trends and why particular repositories are more popular than others. With the help of Azure, the discussion will then turn to scaling your solution and reducing the time required to find these valuable insights.

By developing intuition about data sets and investigating them with more formal statistical methods, you can use data to make your business more profitable. By the end, attendees will understand the power of the R language and how to apply the approaches used for analysing GitHub commits to their own applications, turning unstructured data into information that could potentially be monetised.
Audience background:
- experience in using Git
- experience in programming
- basic knowledge of mathematics and statistics
Benefits of participating: After participating, attendees will:
- gain knowledge of the R language and its basic usage in data analysis
- gain intuition about data, using the GitHub commits dataset as an example
- learn how to approach the data analysis process
Materials provided:
- presentation on R and the data science topics needed to understand how datasets are analysed and conclusions are drawn
- links to resources such as web pages and books on the relevant data science areas
- links to other projects using GitHub commits for data analysis
Process: Depending on the part of the session, there will be:
- formal presentation
- discussions
- exercises
Detailed timetable:
Outputs: Mail to participants containing:
- presentation
- materials used during the tutorial
- conclusions reached during the session
1. Barbara Fusinska
Trainline International