Structured Citations for the Sum of All Human Knowledge

We are building a structured repository of citable sources that can serve Wikipedia and other open knowledge projects.


Project overview

Verifiability is critical to open knowledge projects like Wikipedia, and citations are central to their success. Yet the methods used to cite information are outdated: citations in Wikimedia projects are hard to discover, maintain, reuse, analyze, and discuss. The contributor community has long agreed that citation management needs improvement.

Efforts are now underway to build a stronger technological foundation for representing sources and managing citations in Wikimedia projects. The introduction of Wikidata marked the beginning of an organized attempt to improve structured citations.

In 2016, we introduced an initiative called WikiCite. Its aim is to grow and support a community of contributors who are invested in creating technological solutions to the verifiability issue. We plan to build an open repository of sources—using Wikidata as the primary infrastructure—that can serve all open knowledge projects, including Wikipedia.

We strive to develop a deeper understanding of how Wikimedia contributors use sources and to improve technological support for sources and citations.
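
To make the idea of a shared repository of sources more concrete, here is a minimal sketch in Python. It is not the WikiCite or Wikidata data model. It assumes a hypothetical `Source` record keyed by an identifier such as a DOI, and a hypothetical `Citation` that points to a source by that identifier instead of copying its metadata. All names and sample values are illustrative.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A citable work, stored once in the shared repository (hypothetical model)."""
    identifier: str  # e.g. a DOI; placeholder values are used below
    title: str


@dataclass(frozen=True)
class Citation:
    """A use of a source in an article; it references the source by identifier."""
    article: str
    source_id: str


def build_repository(sources):
    """Index sources by identifier so every project can look up the same record."""
    return {s.identifier: s for s in sources}


def most_cited(citations, repository, n=1):
    """Return (title, count) pairs for the n most frequently cited sources."""
    counts = Counter(c.source_id for c in citations if c.source_id in repository)
    return [(repository[sid].title, k) for sid, k in counts.most_common(n)]


# Illustrative data only: identifiers, titles, and article names are made up.
repo = build_repository([
    Source("doi:10.0000/example-a", "Example paper A"),
    Source("doi:10.0000/example-b", "Example paper B"),
])
cites = [
    Citation("Article 1", "doi:10.0000/example-a"),
    Citation("Article 2", "doi:10.0000/example-a"),
    Citation("Article 2", "doi:10.0000/example-b"),
]
print(most_cited(cites, repo))
```

Because each citation in this sketch stores only an identifier, a correction to a source's metadata is made once and applies wherever the source is cited, and questions such as "which sources are cited most often?" reduce to a simple count.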

Recent updates

  1. Citation needed coverage

    VentureBeat covered our Citation Needed study of which statements in English Wikipedia lack citations, and why.
  2. Citation needed blog post

    We are using machine learning to predict whether, and why, any given sentence on Wikipedia may need a citation. The goal is to help editors identify content that violates the verifiability policy.
  3. Reader trust survey

    The first round of surveys went out for research on the role of citations in how readers evaluate Wikipedia articles.
  4. WikiCite 2018

    The third annual WikiCite conference wrapped up in Berkeley, California. Stay tuned for reports.
  5. WikiCite at TechStorm

    Miriam Redi and Antonin Delpeuch presented some fun with WikiCite in Wikidata.
  6. WikiCiteVis: exploring citations of Wikipedia

    Find out how scholarly articles are cited on Wikipedia with WikiCiteVis.
  7. Accessibility of Wikipedia references

    How many Wikipedia references are available to read? We measured the proportion of open access sources across languages and topics.
  8. Characterizing Wikipedia Citation Usage

    We're starting a new collaboration with researchers at Stanford University and EPFL to understand the role that external citations play for Wikipedia readers.
  9. Wikipedia’s top-cited scholarly articles — revealed

    "Gene collections and astronomy studies dominate the list of the most-cited publications with DOIs on the popular online encyclopaedia." Nature on our dataset of citations by identifier in Wikipedia.
  10. The most-cited authors of Wikipedia had no idea

    "A single academic paper, published by three Australian researchers in 2007, has been cited by Wikipedia editors over 2.8 million times. And the researchers behind it didn't have a clue." Our dataset and analysis of citations by identifier in Wikipedia got featured in Wired.
  11. What are the ten most cited sources on Wikipedia? Let’s ask the data.

    We released a dataset with fifteen million records, documenting source usage in Wikipedia by identifier across nearly 300 languages.
  12. Unsourced statements in Wikipedia

    We are kicking off a new project and a collaboration with a team at Leibniz Universität Hannover to identify statements in Wikipedia that need an inline citation to a reliable source, using a machine-assisted framework.
  13. The WikiCite 2017 report

    We published our annual report, highlighting the accomplishments the community and our network of partner organizations have achieved this past year.
  14. Citations with context

    We published a dataset representing structured metadata and contextual information about every reference added in the history of English Wikipedia.
  15. Unlocking citations from tens of millions of scholarly papers

    We gave a keynote on our progress in liberating open citation data, and reusing it in projects like Wikidata, at SWIB17, the 2017 Conference on Semantic Web in Libraries.
  16. Wikidata as a structured repository of bibliographic data

    A video of our session on WikiCite at WikidataCon 2017 and an overview of why we're building an open knowledge base of citable sources to support free knowledge.
  17. Initiative for Open Citations

    We launched the Initiative for Open Citations (I4OC): an advocacy initiative and coalition co-founded by the Wikimedia Foundation, promoting the unrestricted availability of citation data.

Project team

Dario Taraborelli, Miriam Redi, Aaron Halfaker


Besnik Fetahu (Leibniz Universität Hannover), Michele Catasta (Stanford University), Anna Filippova (Carnegie Mellon University), Andrea Forte (Drexel University), Meen Chul Kim (Drexel University), Jure Leskovec (Stanford University), Tiziano Piccardi (EPFL), Robert West (EPFL)


Resources and links