August 8, 2022

Grow your data journalism toolkit with Knight Center course on R programming language

After more than 3,300 students from 131 countries registered for the last Knight Center course on statistical programming language R, we heard the calls for more advanced training on this powerful tool for journalists.

That’s why instructor Andrew Ba Tran, an investigative data reporter at The Washington Post, is back to teach another course for us on the topic. 

“Advanced data journalism – Doing more with R” runs from Sept. 5 to Oct. 2, 2022, and registration is now open!


“By the end of the course, students should be comfortable importing and wrangling large data sets and experimenting with different visualizations to find patterns and insights hidden in the numbers,” Tran said. “They should also get a better understanding of how to bring, collect and analyze data from APIs or scraped from websites, as well as how to apply statistical methods to their stories.”

Each week of the course will focus on a different topic:

  • Week 1 gets you acquainted with data and R, going over the building blocks you’ll need for your projects
  • Week 2 goes over common data wrangling tasks and provides an intro to exploratory data visualization
  • Week 3 teaches how to pull and transform unstructured data online into structured data that can be analyzed and turned into stories
  • Week 4 focuses on regression analysis and modeling, looking at how they’ve been used for journalism

This is a big online course (BOC), which means the lessons will be more advanced and the course will be limited to a few hundred students, instead of thousands. As a result, there will be more room for interaction between students and the instructor. Unlike MOOCs, which are free and attract thousands of people, BOCs cost US $95, including full access to the course and a certificate of completion for those who meet course requirements. There is no formal academic credit associated with the certificate.

Tran said this course will differ from his previous massive open online course (MOOC) with the Knight Center in that it will focus more on obtaining data through APIs and scrapers, “as well as how to think about integrating statistics into news stories.”

“There are plenty of great tutorials out there explaining how to make sophisticated data visualizations with R, but not many on equipping journalists with the tools necessary to obtain data via R programming,” Tran continued.

The data journalist said he has seen some new and exciting developments with R in newsrooms since he taught the last course, including more spatial analysis, text mining and sentiment analysis, and 3D mapping. Tran will teach this course using video lectures, tutorials and exercises, discussion forums and quizzes.

Tran is a data reporter for The Washington Post’s rapid response investigative team. He previously worked as data editor at The Connecticut Mirror’s TrendCT.org and as a data producer at The Boston Globe. He has contributed to investigative projects and breaking news stories that were awarded the Pulitzer Prize. He’s a Metpro Fellow and a Chips Quinn Scholar. He’s also taught data journalism as a Koeppel Fellow at Wesleyan University and at American University.

“We are delighted to be offering this advanced course with Andrew Tran, who will help journalists take their data journalism skills to the next level through practical, hands-on learning materials,” said Mallary Tenore, associate director of the Knight Center for Journalism in the Americas. “The course, which will focus on some of the newest R developments, will be relevant to data journalists who are already familiar with R, as well as others who want to become more well-versed in this programming language.”

Anyone is welcome to register for the course, but it’s designed for people with some exposure to R. If you are new to the programming language, Tran has prepared material for you to get up to speed before the course starts. That material is accessible once you register.

For the course, students will need the free statistical programming language R, RStudio Desktop and various R packages. Tran recommends that students have a data set from a previous package or a potential project to use in the discussion boards.

The course is asynchronous, meaning you can complete the activities on the days and during the times that best suit your schedule. However, there are recommended deadlines so you don’t fall behind.

So, start on your journey with R or grow your skills with this powerful programming language, and sign up for our latest BOC today!