A collection of word-frequency and other data representing 2,477 unique articles (no duplicate or close-variant documents) classified as being about the humanities or science, published from 2000 to 2018 in 15 top-circulating U.S. newspapers and their associated blogs. Supervised classification models labeled 260 of these articles as being about the humanities and 2,217 as being about science. WE1S and other researchers use this data to look for broad patterns and to help guide closer study.
News sources in Collection 34 include 15 top-circulating U.S. newspapers: Boston Globe, Chicago Tribune, Daily News (New York), Dallas Morning News, Denver Post, Houston Chronicle, Los Angeles Times, New York Post, New York Times (and its blogs), Newsday (New York), Seattle Times, Star Tribune (Minneapolis, MN), Tampa Bay Times, USA Today, Washington Post.
Sources in Collection 33 are associated with the following non-exclusive metadata categories, which describe the kinds of sources in the collection. Categories are listed in order from those associated with the most documents to those associated with the fewest: media/US newspaper, reach/US/top-circulating, total docs in project, region/US/North East, region/US/Midwest, region/US/West Coast, region/US/South, region/US/multiregional, region/US/Rockies and Southwest. Sources are assigned to categories based solely on explicit publication information and/or self-identification.
WE1S Project, (Articles classified as being about the humanities or the sciences from U.S. top-circulating newspapers), 2020, doi:[TBD].
| | 25 topics | 50 topics | 75 topics | 100 topics | 150 topics | 200 topics | 250 topics |
|---|---|---|---|---|---|---|---|
| Dfr-browser | | | | | | | |
| TopicBubbles | | | | | | | |
| pyLDAvis | | | | | | | |
| Metadata7D | | | | | | | |
| GeoD | | | | | | | |
| DendrogramViewer | | | | | | | |
| Diagnostics | | | | | | | |