Our upcoming meeting will be held on Wednesday 16th of May, at 5.30pm, in room G.04, 50 George Square (see map below). As usual, the meeting will be followed by drinks and chat in the Potting Shed. Our meetings are open for all to attend, and newcomers / beginners are very welcome!
Our first speaker is Euan Gardner, Senior Information Analyst/Statistician and Member of the Data Science Team within the NHS. In this talk, he will be telling us about:
Practical Data Science with Machine Learning and big data in R (R code here)
There’s been a great deal of interest in data science and machine learning in both the analytics community and the wider media. This has translated into many excellent tutorials and resources online for R, but most use a standardised dataset, such as the excellent Modified National Institute of Standards and Technology (MNIST) dataset. While this is great for general teaching, few examples cover importing your own data and dealing with the problems of large amounts of data in the context of a larger analytical pipeline. This talk is, therefore, based on my own (limited) knowledge and experience of using R to work with large amounts of complex data, applying practical machine learning techniques, and deploying the knowledge and findings to a wider audience. The talk will cover:
- When to use R in projects
- Strategies for small, medium, big, and Google-sized data
- Data cleaning/dimensionality problems
- Simple practical machine learning demo (SVM)
- Deploying information (apps, web formats, etc.)
- Linking to people way smarter than me for excellent resources for further reading on the topic
The second speaker is… me: Caterina Constantinescu. I work as a Data Scientist at The Data Lab, and will be talking about:
Using Shiny with interactive maps and network plots
This will be a code demo based on some work I’ve done recently to display transport data interactively. Starting from a dataset including a set of origins and destinations, I’ll show you how you can trace connecting journeys with GraphHopper via the stplanr package. I will also show you how you can then map these journeys using leaflet in R, and then how to embed the maps within a Shiny app. Since trip origins and destinations could equally be thought of as a network, I’ll also demo a network plot created with visNetwork, as part of the same Shiny app.
For any newcomers, here’s a map of where we’ll be:
See you there!