Proceedings Paper

Analysis of Wikipedia pageviews to identify popular chemicals
Author(s): Yuru Cao; Hely Mehta; Ann E. Norcross; Masahiko Taniguchi; Jonathan S. Lindsey

Paper Abstract

A new approach to assess popularity relies on analysis of the number of times a web article is viewed. Here, a strategy is described to identify chemicals of widespread interest. The strategy makes use of Wikipedia, a rapidly growing, publicly editable web encyclopedia that has become an influential knowledge base. While the total number of chemicals mentioned in Wikipedia is unknown, use of the Wikipedia Chemical Structure Explorer (WCSE) developed by Novartis enables identification of those that are described in an Infobox or Chembox along with a Simplified Molecular-Input Line-Entry System (SMILES) code. Using a Python script, all so-listed chemicals (16,243) in Wikipedia were identified and then sorted on the basis of their pageview rankings. Of the 16,243 chemicals, 846 (5.2%) were controlled substances (as listed by the United States Drug Enforcement Administration), WHO essential medicines, or among the top 300 US drugs. These 846 chemicals received 220 million pageviews, which is 41.4% of the pageviews for all members of the Wikipedia chemical list. The number of chemicals described in the entire corpus of Wikipedia remains a tiny fraction of the <10^7 known chemicals. Much remains to be done to make the venerable literature and data of chemistry readily accessible. Regardless, identification of popular chemicals in this manner can be used to create selected databases, to tailor educational curricula, or to create targeted informational materials (such as safety brochures); such considerations of public demand are likely to engender corresponding widespread interest.
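
The ranking and share calculation summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' script, which is not shown here. The `Chemical` record, the function names, and the sample entries are hypothetical, and the pageview counts are invented placeholders rather than data from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chemical:
    """Hypothetical record for one Wikipedia chemical article."""
    title: str
    pageviews: int
    in_curated_list: bool  # e.g. controlled substance, essential medicine, top drug


def rank_by_pageviews(chemicals):
    """Return the chemicals sorted from most to least viewed."""
    return sorted(chemicals, key=lambda c: c.pageviews, reverse=True)


def curated_share(chemicals):
    """Return (fraction of chemicals, fraction of pageviews) for the curated subset."""
    if not chemicals:
        return 0.0, 0.0
    total_views = sum(c.pageviews for c in chemicals)
    curated = [c for c in chemicals if c.in_curated_list]
    curated_views = sum(c.pageviews for c in curated)
    count_fraction = len(curated) / len(chemicals)
    view_fraction = curated_views / total_views if total_views else 0.0
    return count_fraction, view_fraction


# Placeholder entries; the names and counts are illustrative only.
sample = [
    Chemical("Compound A", 500, True),
    Chemical("Compound B", 300, False),
    Chemical("Compound C", 200, False),
    Chemical("Compound D", 1000, True),
]

ranked = rank_by_pageviews(sample)
count_fraction, view_fraction = curated_share(sample)
print([c.title for c in ranked])
print(f"{count_fraction:.1%} of chemicals, {view_fraction:.1%} of pageviews")
```

Applied to the figures reported in the abstract, the count fraction is 846 / 16,243, or about 5.2%, while the view fraction is the reported 41.4%.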

Paper Details

Date Published: 21 February 2020
PDF: 18 pages
Proc. SPIE 11256, Reporters, Markers, Dyes, Nanoparticles, and Molecular Probes for Biomedical Applications XII, 112560I (21 February 2020); doi: 10.1117/12.2542835
Author Affiliations
Yuru Cao, The Univ. of North Carolina at Chapel Hill (United States)
Hely Mehta, The Univ. of North Carolina at Chapel Hill (United States)
Ann E. Norcross, North Carolina State Univ. (United States)
Masahiko Taniguchi, North Carolina State Univ. (United States)
Jonathan S. Lindsey, North Carolina State Univ. (United States)

Published in SPIE Proceedings Vol. 11256:
Reporters, Markers, Dyes, Nanoparticles, and Molecular Probes for Biomedical Applications XII
Samuel Achilefu; Ramesh Raghavachari, Editor(s)

© SPIE.