Instructor: Andrew Ba Tran


This resource page features course content from the Knight Center for Journalism in the Americas' massive open online course (MOOC), "Introduction to R for Journalists: How to Find Great Stories in Data." The five-week course took place from July 23 to August 26, 2018. We are now making the content freely available to students who took the course and to anyone else interested in learning how to use R, the statistical computing and graphics language, to enhance their data analysis and reporting.


The course, which was supported by the Knight Foundation, was taught by Andrew Ba Tran. He created and curated the content for the course, which includes video classes and tutorials, readings, exercises, and more.


The course materials are broken up into five modules:

  • Module 1: Offers an introduction to RStudio and how to start a new analysis project. You will learn the basics of how to import and explore data with R.
  • Module 2: Covers how to transform and analyze data the tidy way using the dplyr package.
  • Module 3: Covers the grammar of graphics and how to use the ggplot2 package to make quick exploratory data visualizations.
  • Module 4: Covers how to visualize geographical data and look for neighborhood racial profiling disparities using Census data and traffic stop data from Connecticut.
  • Module 5: Provides tutorials about RMarkdown. You will learn how to use it to present your analysis in a narrative format. You’ll also learn how to log changes to your project with version-control software and publish your analysis on the Internet. In addition, Hadley Wickham, Chief Data Scientist at RStudio, joined the course for a Google Hangout. Wickham is the creator of several notable and widely used data analysis packages collectively known as the "tidyverse." A link to the video is listed under this module.

As you review this resource page, we encourage you to watch the videos, read the readings, and complete the exercises as time allows. The course materials build on one another, but the videos and readings also work as standalone resources that you can return to over time.


We hope you enjoy the materials and share them with others who are interested in learning how to use R to enhance their data analysis and reporting. If you have any questions, please contact us at knightcenter@austin.utexas.edu.



About the Instructor



Andrew Ba Tran is a data reporter for The Washington Post’s rapid response investigative team.


He previously was a data editor at The Connecticut Mirror's TrendCT.org, a nonprofit news site that helped the public find and understand data and its potential impact on the community.


Prior to that, Andrew was a data producer at The Boston Globe, and he has also worked in newsrooms at The Virginian-Pilot and the Sun-Sentinel. He has contributed to investigative projects and breaking news coverage that were awarded the Pulitzer Prize.


He’s a Metpro Fellow, a Chips Quinn Scholar, and a graduate of the University of Texas.


Andrew has taught data journalism as a Koeppel Fellow at Wesleyan University and at American University.


He’s from Dallas, Texas.






Programming in R


In this module you will be introduced to RStudio and learn how to start a new analysis project. You will learn the basics of how to import and explore data with R.


This module will cover:

  • A tour of the RStudio IDE
  • Syntax for coding in R
  • Creating R scripts
  • Importing packages
  • Good workflow and documentation habits
  • How to import data from formats such as CSV files, Excel spreadsheets, and XML
  • Exploring the data’s structure

Video Class



Readings




Optional Materials





Wrangling Data


In this module you will learn how to transform and analyze data the tidy way using the dplyr package.


This module will cover:

  • Filtering, selecting, arranging, mutating, and summarizing data
  • How to join two data sets for more insight
  • Chaining analysis functions with pipes for efficiency and readability

Facebook Live with Andrew Tran


Video Class


Readings


Optional Materials





Visualizing Data


In this module, you’ll learn about the grammar of graphics and how to use the ggplot2 package to make quick exploratory data visualizations.


This module will cover:

  • The aesthetics of data visualizations
  • How to create different chart types, such as bar charts, box plots, line charts, and scatterplots
  • Grouping for charts
  • How to create facets or small multiples with the data
  • Labels and titles for visualizations

Video Class


ggplot2 Resources


ggplot2 Examples





Spatial Analysis


In this module, you will learn how to visualize geographical data and look for neighborhood racial profiling disparities using Census data and traffic stop data from Connecticut.


This module will cover:

  • Creating interactive maps with the R Leaflet package
  • How to geolocate addresses in R
  • Importing and visualizing shapefiles
  • Point-in-polygon analysis that merges location data and boundaries for deeper insights

Video Class


Readings


Optional Materials





Publishing for Reproducibility


In this module you will learn how to use RMarkdown to present your analysis in a narrative format. You’ll also learn how to log changes to your project with version-control software and publish your analysis on the Internet.


This module will cover:

  • The git version control software and its integration with GitHub
  • How data journalists use GitHub and RMarkdown and other notebooks to publish their work
  • How to use the Markdown markup language to annotate RMarkdown
  • How to create a new git code repository and start tracking code
  • How to connect the repository to GitHub and publish to GitHub Pages

Google Hangout with Andrew Ba Tran and Hadley Wickham


Video Class

1. Publishing Intro
2. R Markdown
3. More R Markdown
4. Workflow Practices
5. Git
6. GitHub Pages
7. Best Practices & Bye


Readings