Natural Language Processing in the Social Sector — Supplemental Materials

Text Classification

When a collection of texts comes with labels, it’s possible to train an algorithm to label new, unlabeled text. These labels might be, for example, the section of a newspaper that an article came from, whether an email was marked as spam, or the topic of a bill.

Below, we explain how text classification works and apply it to finding health-related state bills.
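As a rough illustration of the idea, here is a minimal sketch of one common approach, a multinomial Naive Bayes classifier with add-one smoothing, written with only the Python standard library. It is not the sample code for this demo and may not match the method used there. The bill titles in `TOY_BILLS`, the label names `"health"` and `"other"`, and the function names `tokenize`, `train`, and `classify` are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, hand-written examples (not real bills), used only for illustration.
TOY_BILLS = [
    ("An act relating to hospital licensing and patient care", "health"),
    ("An act concerning Medicaid coverage for mental health services", "health"),
    ("An act relating to highway construction funding", "other"),
    ("An act concerning property tax assessment", "other"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_texts):
    """Count words per label and documents per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in labeled_texts:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability under Naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Start from the log prior: the share of training texts with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # Words never seen in training carry no evidence.
            # Add-one smoothing keeps unseen (word, label) pairs from zeroing the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TOY_BILLS)
print(classify("An act relating to hospital patient safety", model))
print(classify("An act concerning highway tax funding", model))
```

With these four toy examples, the first new title is labeled `health` and the second `other`. A real application would need far more labeled text, and a held-out set of labeled examples to check how often the predictions are right.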

Sample code

More demos

Pre-Processing, Parts of Speech, and Named Entities — “Give me just the nouns.” and “Who did what to whom?”
Term Frequencies — “Let’s rank words by importance.”
Text Summarization — “What’s the TL;DR version of this text?”
Topic Modeling — “I think I really have five categories of text here.”
