Introduction to R and Reproducible Research
Thursday, 23 September to Friday, 24 September 2021
This course focuses on reproducible research, which means making sure that all of the steps in analysing your data are recorded and could be run again automatically: by you, if you discover an error in your data file or in a step of your data processing; by a colleague, to do a similar analysis on their own data; or by someone else, to verify your results. The course covers the following topics:
- the basics of R and RStudio;
- using R Markdown to tie together your R code, output and analytical decisions;
- the benefits of a reproducible approach to data analysis;
- concepts relating to types of data and how to best organise the data you collect;
- importing data from commonly used file formats including Excel and CSV;
- practical data-cleaning tasks to get your original data ready for analysis;
- methods for summarising and describing data;
- producing high-quality graphics with the ‘ggplot2’ package;
- presenting results from statistical analyses in tables and graphs.
The course focuses on specific aspects of the R statistical package, methods for reproducible research and ways to work effectively with data arising from real-world research. The Statistical Consulting Centre also offers a general, introductory statistics course, “Statistics for Research Workers using R”, which focuses on statistical concepts and methods. These courses are designed to have relatively little overlap and may be taken in either order.
Who should take this course?
The course is suitable for researchers wanting efficient and effective strategies for working with quantitative data in a reproducible manner. This course is about what happens before and after traditional statistical analysis: getting your data into a form ready for analysis and presenting the results of statistical analyses. While designed for those who have not used R before, it may also be of interest to participants with some familiarity with R but not with R Markdown or the ‘tidyverse’ family of packages.
The presenter is Cameron Patrick, a consultant for the Statistical Consulting Centre in the School of Mathematics & Statistics. Cameron also consults to industry and government.
Cameron has over a decade of experience using R, both in academic research and in a commercial setting, and worked as a software developer in a data-intensive industry prior to becoming a statistician.
This is a two-day workshop. Each day will consist of two approximately equal-length sessions; the first session of the day will commence at 9:00 a.m. and the final session will end at approximately 5:00 p.m. The sessions will mix lecture presentations with practical work; tutorial help will be readily available.
The statistical package R will be used in the course, along with companion software including RStudio, R Markdown, ggplot2 and the ‘tidyverse’ collection of packages. Participants are welcome to use R and RStudio on their personal computer (Windows, Mac or Linux) during the course, but a PC will also be available for use in the Wilson Computer Lab. Installation instructions will be sent out via email prior to the course commencing.
No prior experience with R is necessary.
Most of this course assumes little statistical knowledge. However, some statistical concepts will be employed and may be introduced with less detail or rigour than in a statistics-focused course. Background knowledge equivalent to an introductory statistics course will be beneficial but not necessary.
Feedback 2020 (course presented via Zoom):
“Very useful and I wish I had had the opportunity at the start of my PhD. I tried a few ways of learning R early on but this was more complete and organised and provides a good launching pad to continue.”
“Useful in understanding how to operate R to perform analysis which is helpful in applying the concepts taught in my research.”
“Exceeded my expectations, a great introduction to R, in fact”
“Well done Sandy and Cam! Thanks so much for your hard work and good humour. It's dry stuff, but you managed to make it both interesting and entertaining.”
“Clearly very knowledgeable and enthusiastic presenters, well-prepared material, professional and responsive presentation. excellent course.”
Cost and enrolment details:
This course is offered to the public, University of Melbourne staff and Graduate Researchers.
External to the University: $660 (including GST).
University staff: $440 (including $40 GST).
Graduate researchers: $220 (including $20 GST).
Note that GST does not apply if you are paying through your School or Department.
Cancellation fee $30.
The fee includes a comprehensive set of notes.
*Sponsored by The Melbourne Statistical Consulting Platform
Wilson Computer Lab
Peter Hall Building
School of Mathematics and Statistics
T: +61 3 8344 6995