
Hello everyone!


This month I am delighted to announce that we will have a special guest from overseas: David Robinson. Our upcoming meeting will be held on Friday the 16th of February, from 5.00pm, in the ICMS Lecture Theatre (above The Data Lab - details & map below). Please note, this is a one-off change from our usual third-Wednesday-of-the-month pattern.

The meeting will be followed by a small reception sponsored by Jumping Rivers, where you will have a chance to meet and chat to David. All welcome, but please grab an Eventbrite ticket if you want to join.

David Robinson is the Chief Data Scientist at DataCamp, where he works on analysis and research to help teach the next generation of data scientists. He is the co-author with Julia Silge of the tidytext package and the O’Reilly book Text Mining with R. He is also the author of the broom, gganimate, and fuzzyjoin packages and of the e-book Introduction to Empirical Bayes. He writes about R, statistics and education on his blog Variance Explained, as well as on Twitter as @drob. In this talk, he will be discussing:

Tidy Text Mining with R

Text data is increasingly important in many domains, but it can be challenging to manipulate and visualize within typical R analysis workflows. In this talk, I will introduce the tidytext package and show how tidy data principles and tools can make text mining easier and more effective, by structuring text as one-token-per-row. You’ll learn how to manipulate, summarize, and visualize text’s characteristics using R packages from the tidy ecosystem such as dplyr, ggplot2, and tidyr. You’ll see case studies of sentiment analysis, tf-idf, and topic modeling applied to examples from literature and Twitter, and gain the tools to draw conclusions from your own text datasets.

Some of the R code demoed at this meeting has kindly been reproduced by Jumping Rivers on their blog here.



This is where we’ll be:

ICMS Lecture Theatre
University of Edinburgh
15 South College Street
Edinburgh
EH8 9AA


So spread the word and sign up on Eventbrite. See you there!

Caterina Constantinescu



Dr. Caterina Constantinescu

Data scientist @ The Data Lab, University of Edinburgh.

If you would like to sponsor EdinbR, please get in touch!

• Content CC-BY 4.0 licensed.

EdinbR: The Edinburgh R User Group

