Text and Data Mining at University of Toronto Libraries

Text mining and data mining are related methods for identifying patterns within large bodies of text or data, respectively. Each method encompasses a number of different techniques.

"What is Text Mining?" from Elsevier

"How does Text Mining Work?" from Elsevier

Resources and Training

Voyant Tools is a web-based platform that generates statistical summaries of text corpora and can give you a preliminary overview of your text(s). To build text-wrangling and text mining skills, consult the University of Southern California's excellent list of training resources. Programming Historian also offers many tutorials on working with text and textual data.
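
As a rough illustration of the kind of preliminary statistics a text analysis tool can report, the following Python sketch counts the most frequent words in a short passage. It is not how Voyant Tools works internally; the function name word_frequencies, the sample sentence, and the simple tokenization rule are assumptions made for this example only.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most common lowercase word tokens in text."""
    # Hypothetical, deliberately simple tokenizer: runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top_n)


# Illustrative sample text, not drawn from any licensed dataset.
sample = "The whale surfaced. The crew watched the whale dive again."
top_words = word_frequencies(sample, top_n=2)
print(top_words)
```

For real corpora you would normally also remove common stop words (such as "the") and read the text from files rather than from a string.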

Getting Textual Datasets

For Text and Data Mining (TDM) information at the University of Toronto, consult our TDM Guide, which includes a list of licensed and free textual datasets as well as a list of APIs.

For help with using APIs or to inquire about available materials for text mining, contact us.