Collection 33: Articles classified as being about the humanities or the sciences from U.S. top-circulating newspapers and student newspapers, c. 1998-2018

A collection of word-frequency and other data representing 13,214 unique articles (no duplicate or close-variant documents) classified as being about the humanities or science and published from 1998 to 2018 in 507 U.S. top-circulating and student newspapers and their associated blogs. The collection includes 2,477 articles from U.S. top-circulating newspapers and 10,737 articles from student newspapers. Using supervised classification models, 2,869 articles in the collection have been classified as being about the humanities, and 10,345 articles have been classified as being about science. WE1S and other researchers use this data to look for broad patterns and to help guide closer study.
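As a quick sanity check on the figures above, the following minimal Python sketch confirms that both breakdowns (by source type and by classification) add up to the full collection, and reports each as a percentage. The variable and function names are illustrative only and are not part of any WE1S tool; the counts are copied from the description above.

```python
# Counts as stated in the collection description (not read from the data files).
TOTAL_ARTICLES = 13_214

by_source = {"top-circulating newspapers": 2_477, "student newspapers": 10_737}
by_label = {"humanities": 2_869, "science": 10_345}


def shares(counts, total):
    """Return each count as a percentage of total, rounded to one decimal place."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Each breakdown should partition the whole collection.
assert sum(by_source.values()) == TOTAL_ARTICLES
assert sum(by_label.values()) == TOTAL_ARTICLES

source_shares = shares(by_source, TOTAL_ARTICLES)
label_shares = shares(by_label, TOTAL_ARTICLES)

print(source_shares)
print(label_shares)
```

By these counts, roughly four in five articles come from student newspapers, and a similar proportion are classified as being about science, which is worth keeping in mind when comparing word frequencies across the two groups.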

News sources in Collection 33 include 15 top-circulating U.S. newspapers: Boston Globe, Chicago Tribune, Daily News (New York), Dallas Morning News, Denver Post, Houston Chronicle, Los Angeles Times, New York Post, New York Times (and its blogs), Newsday (New York), Seattle Times, Star Tribune (Minneapolis, MN), Tampa Bay Times, USA Today, Washington Post. Also included are documents from 491 U.S. campus newspapers, among which the top 15 sources in the collection are: The Stanford Daily (Stanford University), The California Aggie (UC Davis), The Daily Californian (UC Berkeley), The Daily Bruin (UC Los Angeles), The Kaleidoscope (U Alabama Birmingham), The Tartan (Carnegie Mellon), Michigan Independent (UM Ann Arbor), The Daily Princetonian (Princeton), The Harvard Crimson (Harvard), The Dartmouth (Dartmouth), Cornell Daily Sun (Cornell), Colorado Daily (U Colorado Boulder), The Daily Texan (UT Austin), The Daily Cardinal (UW Madison), Indiana Daily Student (Indiana University).

Kinds of Sources (by Tags)

Sources in Collection 33 are associated with the following non-exclusive metadata categories, which describe the kinds of sources in the collection. Categories are listed in order from those associated with the most documents to those associated with the fewest: education/institution/Doctoral, education/funding/US public college, region/US/North East, education/funding/US private college, region/US/Midwest, region/US/West Coast, reach/US/top-circulating, media/US newspaper, region/US/South, education/institution/Liberal Arts, education/demographic/Hispanic-serving Institution, education/affiliation/UC system, education/affiliation/Ivy League, region/US/Rockies and Southwest, education/emphasis/Science Tech and Ag school, education/demographic/religion/Catholic, education/affiliation/Cal State system, education/demographic/religion/Christian, education/institution/Community College, education/demographic/Historically Black Colleges and Universities, education/non-US college, region/Europe, education/demographic/religion/Latter-day Saints, region/US/non-contiguous, region/US/mulitregional, education/demographic/religion/Jewish, education/demographic/Womens College, education/, media/website, identity/race ethnicity and cultural heritage, region/Canada. Sources are assigned to categories based solely on explicit publication information and/or self-identification.

Suggested Citation

WhatEvery1Says (WE1S) Project. (May 15, 2020). Collection 33: Articles classified as being about the humanities or the sciences from U.S. top-circulating newspapers and student newspapers. Zenodo. DOI 10.5281/zenodo.4940725.


Collection Metadata


Topic Models of This Collection

Model Family 1 (created May 15, 2020): models for 25, 50, 100, 150, 200, 250 topics

Visualizations for this model family:

25 topics 50 topics 100 topics 150 topics 200 topics 250 topics
Dfr-browser
TopicBubbles
pyLDAvis
DendrogramViewer
Diagnostics


This start page for the collection last revised: June 13, 2021