Course details

On Demand

Language

English

Alternative

Modules

4

Generative AI for journalists: Discovering what data can do

This course is for journalists who may have heard of generative AI before and would like to begin engaging with these technologies on a more practical basis, whether to be better prepared for the future of computing or to improve their data journalism practice with new capabilities made possible by machine learning.

No coding experience is necessary, nor will technical skills be assumed. Course videos will introduce all prerequisite skills at your pace, and we’ll make sure your computing environment is properly set up to download and run your own language models.

We’ll also provide you with plenty of exercises to practice the skills using your own data at your own pace. These exercises will be available asynchronously, and the instructor will be around to answer any questions you may have.

Through hands-on tutorials over the next four weeks, we want to help you get in-the-know by training you on practical applications and concepts integral to generative technologies. This course will introduce you to the machine learning domain by building up specific skills week over week, leaving you prepared for the next generation of generative AI applications and workflows only now entering newsrooms.

We will build up your AI expertise such that you will be able to participate in AI policy formulation and implementation in your organization. Finishing this course will allow you to:

  • Understand what generative AI is and is not
  • Clearly articulate when and where to deploy generative AI technologies
  • Convert your data to formats suitable for language models
  • Learn the basics of prompt engineering
  • Embed your documents in a vector database to search through them with natural language
  • Quickly develop prototype workflows to assess their potential

Goals

Introduction Module – Generative AI For Journalists

Welcome to the course! We’ll begin by diving into the recent history of generative AI through a study of successful AI projects instructor Sil Hamilton has observed while working with newsrooms and organizations across the industry. Next, we’ll get you set up with the required tools we’ll be using to discover AI during the course. We’ll also set aside time to go through what you’ll be learning — the exercises and discussions throughout the course will encourage you to try these techniques on your own datasets.

This module will cover:

  • Defining generative AI and understanding what makes a successful implementation
  • An overview of the course structure
  • Getting set up with our required tools and applications
  • Tips on how to make the best of this course

Module 1 – But What Are Models? (November 20 – 26, 2023)

What is called generative AI today is built on the success of machine learning models capable of understanding the world around us through text and images. We’ll develop an intuitive understanding of what is, and is not, possible with generative AI models today by looking at what makes these models tick. 

This module will cover:

  • Prediction tasks: how generative models are trained 
  • Natural language processing fundamentals
  • How ChatGPT works — and why
  • Why understanding modeling matters
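
As a rough illustration of what a "prediction task" means, the sketch below trains a tiny model to guess the next word from the word before it. It is an assumption-laden toy, not course material: real generative models learn from vastly more text using neural networks, and the function names and sample sentence here are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def predict_next(model, word):
    """Return the most frequently observed next word, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical training text, invented for this illustration.
corpus = "the council approved the budget and the council adjourned"
model = train_bigram_model(corpus)

print(predict_next(model, "the"))    # "council" follows "the" most often
print(predict_next(model, "mayor"))  # never seen in training, so None
```

The model can only predict words it has seen in its training text, which hints at why understanding how a model was trained matters when judging what it can and cannot do.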

Office Hours: Wednesday at 2 PM CST.

Module 2 – Discover The Data In Your Documents (November 27 – December 3, 2023)

Generative models talk to each other through text. Learn how to see your data in new ways by making your data, and your newsroom, “AI ready”: convert your unstructured documents into structured formats via optical character recognition (OCR) and embeddings, the fundamental unit of meaning for generative AI models. Embed your articles, documents, sources, and more.

This module will cover:

  • What sorts of data machine learning models expect
  • Converting your non-textual data to structured formats suitable for language models
  • Ways to “embed” your data with the help of embedding models and vector stores
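
To make the idea of embedding and searching documents concrete, here is a minimal sketch of an in-memory vector store. It rests on a loud assumption: the `embed` function below is a toy word-count vector, standing in for a learned embedding model, and `TinyVectorStore` is a hypothetical name rather than any real library. The headlines are invented.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real embedding model is learned, not counted."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Score how closely two vectors point in the same direction (0.0 to 1.0 here)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: keeps each text next to its vector."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(
            self.items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = TinyVectorStore()
store.add("city council votes on housing budget")
store.add("local team wins championship game")
store.add("storm brings heavy rain to the coast")

print(store.search("housing budget vote"))
```

A learned embedding model would also match a query like "rent prices" to the housing story even with no shared words, which this word-count stand-in cannot do.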

Office Hours: Tuesday and Wednesday at 2 PM CST.

In Conversation: John Keefe, weather data editor at the New York Times.

Module 3 – Run And Use AI Models (December 4 – 10, 2023)

With your data cleaned and structured, it is now time to use generative models to transform your data in interesting and useful ways. Learn how to run a variety of multimodal models both in the cloud and on your local computer with LangChain, a framework for turning language models into conversational “agents” capable of many things: trawling your archives, summarizing documents, and rearranging your sources in new ways.

This module will cover:

  • Creating an agent with LangChain, a framework for developing applications with AI
  • Plugging your new agent into your vector store to create your very own research assistant
  • Giving your agent a custom personality
  • Extending your agent with new capabilities via tools and external APIs
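
The general shape of an agent with tools and a personality can be sketched in plain Python. This is not LangChain code and makes no claim about its API: the `choose_tool` function is a hard-coded stand-in for the decision a language model would make, and every name below is hypothetical.

```python
def word_count(text):
    """Tool: report how many words the text contains."""
    return str(len(text.split()))


def shout(text):
    """Tool: return the text in upper case."""
    return text.upper()


# The agent's available tools, keyed by the name used to request them.
TOOLS = {"count": word_count, "shout": shout}


def choose_tool(request):
    """Stand-in for the language model: treat the first word as the tool name."""
    name, _, argument = request.partition(" ")
    return name, argument


def run_agent(request, persona="Desk assistant"):
    """Route a request to a tool and answer in the agent's persona."""
    name, argument = choose_tool(request)
    tool = TOOLS.get(name)
    if tool is None:
        return f"{persona}: I have no tool named '{name}'."
    return f"{persona}: {tool(argument)}"


print(run_agent("count the mayor resigned today"))
print(run_agent("shout breaking news", persona="Night editor"))
```

Adding a capability means adding one entry to `TOOLS`; in a real agent, the model reads a description of each tool and decides which one to call.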

Office Hours: Wednesday at 2 PM CST.

Module 4 – Putting It All Together (December 11 – 17, 2023)

Now that you’ve created your very own agent using LangChain, learn how to share it with the wider world by packaging and deploying it with the help of Hugging Face Spaces — an easy-to-use hosting platform for machine learning applications suitable for use in your newsroom.

This module will cover:

  • Giving your LangChain application a stylish interface with the help of Gradio
  • Customizing and styling the front end
  • Hosting your application online on your very own Hugging Face space

In Conversation: Freddy Boulton, software developer at Hugging Face.

Office Hours: Wednesday at 2 PM CST.

Sil Hamilton

Sil Hamilton is AI researcher-in-residence at Hacks/Hackers, a network of journalists who rethink the future of news through talks, hackathons, and conferences. A machine learning researcher at McGill University exploring the intersection of AI and culture, Sil has published research at NLP conferences like ACL, AAAI, and COLING. His work exploring the limits of language models has been discussed by Wired, The Financial Times, and Le Devoir. Sil has given talks on AI and the newsroom at the Nieman Foundation for Journalism at Harvard; the Brown Institute for Media Innovation at Columbia; the Computer History Museum in Mountain View, California; and The Knight Center for Journalism in the Americas at the University of Texas at Austin. Sil has consulted for The Associated Press on AI policies and serves as technology advisor at Health Tech Without Borders, a non-profit seeking to mitigate healthcare crises with digital tools.