This is a hands-on two-day workshop that gives participants the tools and skills needed to produce interactive graphs from large datasets and embed them in websites and dynamic reports. Participants will learn to (i) improve their statistical workflow with large databases, (ii) produce interactive graphs and maps with a range of visualisation packages, using R as the common interface, and (iii) write static and dynamic reports.
The ggplot2 package is a plotting system for R, based on the grammar of graphics, which tries to take the good parts of base and lattice graphics and none of the bad parts.
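As a minimal sketch of the grammar-of-graphics idea, the following builds a scatter plot in layers: data, aesthetic mappings, a geometry and labels. It uses R's built-in `mtcars` dataset; the dataset and the choice of variables are assumptions made for illustration and are not taken from the workshop material.

```r
library(ggplot2)

# Map car weight to x, fuel efficiency to y and cylinder count to colour,
# then add a point layer and axis labels on top.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 3) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    colour = "Cylinders",
    title = "Fuel efficiency by weight"
  )

# Printing the object draws the plot on the active graphics device.
print(p)
```

Because each component is added with `+`, swapping `geom_point()` for another geometry changes the chart type without touching the data or the mappings.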
Examples of plotting region-level data on country maps using the ggplot2 package and shapefiles from gadm.org.
Examples of static and interactive population pyramids using the packages ggplot2 and rCharts and population data from census.gov.
The googleVis package provides an interface to the Google Charts API, allowing users to create interactive charts from R data frames.
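A minimal sketch of that data-frame-to-chart step might look as follows. The data frame, its column names and its values are made up purely for illustration; only the googleVis functions themselves come from the package.

```r
library(googleVis)

# Hypothetical placeholder data: region labels and values are invented
# for illustration and do not represent real statistics.
df <- data.frame(
  region = c("Region A", "Region B", "Region C"),
  value  = c(10, 25, 15)
)

# Build an interactive column chart from the data frame.
chart <- gvisColumnChart(df, xvar = "region", yvar = "value")

# In an interactive session, plot(chart) opens the chart in a web browser.
# print(chart, tag = "chart") writes the HTML/JavaScript snippet that can be
# embedded in a web page or dynamic report.
```

The returned object holds the generated HTML, which is what makes it straightforward to embed the same chart in a website or a report.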
Preparation and prerequisites
In preparation for the training, invited participants are encouraged to:
· Complete a short online survey on their data visualisation practices, with links to publicly available data to be used in the training and the types of data that require specific visualisations.
· Familiarise themselves with the data sources proposed in their survey responses and think about how best to visualise them using the charts from the gallery to be covered in the training.
· Install the free and open-source R software, the RStudio development environment for R and the required R packages. No prior knowledge of R is required for the training. However, participants should feel comfortable with basic programming.
· Participate in the MOOC on R Programming that is part of the Coursera Data Science Specialization.
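The package installation step can be sketched as below. The list of required packages is an assumption based on the packages named in this description (ggplot2 and googleVis); the actual list is provided by the organisers. To the best of our knowledge rCharts is distributed through its GitHub repository rather than CRAN, so it is left out of this sketch.

```r
# Assumed package list, taken from the packages named in this description.
required <- c("ggplot2", "googleVis")

# Find which of the required packages are not yet installed.
missing_pkgs <- setdiff(required, rownames(installed.packages()))

# Install only the missing ones from CRAN.
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```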
Workshop outline
1. Examples and Overview of Tools for Data Visualisations
· Good examples of data visualisations on stats office websites
· Tools for data visualisation
· The R software: An open source interface to the most popular data visualisation packages
2. Best Practices for Data Visualisations
· Properties of good graphs
· Statistical workflow
· A brief introduction to R
3. Workflow of Statistical Data Analysis
· Setting up your project structure
· Data manipulation and preparation
· Importing your data from any format into R
· Interactive maps with the googleVis package
4. Hands-on Breakout Sessions – Producing Interactive Data Visualisations
· Formation of groups of 2-4 participants
· Data preparation and generation of visuals with the R packages googleVis and rCharts
· Visuals to be covered: Data tables, bar and pie charts; Population pyramids; Tree maps; Motion charts; Calendar charts
5. Interactive Reports
· Report writing
· Websites and interactive graphs
· Application to tree maps and HIES data
6. Data Visualisation: The Broader Picture
· Your Webmaster’s View
· Your Manager’s View
· Your Communication Team’s View
7. Next Steps: Finding Help and Resources
R Statistical Software
NSO and line ministry staff: Analysts, Statisticians, IT staff (webmasters)
Academics: Faculty and graduate/post-graduate students with technical skills
Participants should have a good level of IT and statistical knowledge and feel comfortable with basic programming. They are encouraged to complete a pre-course survey (which helps us tailor the content to individual training requirements), to install the R software and to take the MOOC on R Programming that is part of the Coursera Data Science Specialization.
Overview of tools; Best practices for data visualisation; Workflow of Statistical Data Analysis; Interactive reports; Finding help and resources
Participants will be able to produce interactive graphics and maps using open-source tools, will have mastered the workflow for embedding them in websites and reports, and will be equipped to keep building their skills independently.
2 regional workshops, in the Pacific (Fiji) and Eastern Europe (Albania)
1 national workshop in Ghana