Hackathon 2013/Citations

From TaxonWorks Wiki
Revision as of 14:42, 30 September 2013 by Gaurav (Talk | contribs)


This pitch covers making sure that citations can flow easily in and out of TaxonWorks. It will do this in the following ways:

  1. Finding an existing data model that can hold citation information, with a standard format for representing it.
  2. Writing Ruby code to read and write that format, possibly just using a standard library.
  3. Writing code to resolve micro-citations. Rod Page has created a working example of this that links generic names from Nomenclator Zoologicus with references from the Biodiversity Heritage Library.
    • Micro-citations should be differentiated from verbatim references. A micro-citation is just an author and date (as seen in a full taxon name). A verbatim reference is the full reference as it appears in documentation, not yet broken into normalized pieces. Both versions need to be tracked and searchable. We also need a way to export a listing of both versions from TW, so they can be normalized into full sources.
    • Lists of journals.
    • BHL API to convert BHL URLs into references
  4. Finding out more about a particular citation
    • Getting abstract, keywords from PubMed
    • Pulling in citation from BHL
    • ImpactStory information
  5. Automatically parsing citations into authors, title, etc.
  6. Designing a user interface to make it easy to resolve micro-citations
    • Autocompletion
    • Journal name identification
    • Searching on Google Scholar/Wikipedia for books/authors
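To make the micro-citation idea in item 3 concrete, here is a minimal Ruby sketch that pulls the author and year off the end of a full taxon name. It is only an illustration under stated assumptions: the `MicroCitation` struct, the `extract_micro_citation` method, and the regular expression are hypothetical names, not TaxonWorks code. The pattern assumes a capitalized genus followed by lowercase epithets, and it will mis-handle authors whose names begin with a lowercase particle (for example "de Haan").

```ruby
# Hypothetical value object for illustration only; not part of TaxonWorks.
MicroCitation = Struct.new(:taxon_name, :author, :year)

# Assumed shape: "Genus epithet Author, Year", with optional parentheses
# around the authorship, as in "Felis leo (Linnaeus, 1758)".
TAXON_WITH_AUTHORSHIP =
  /\A(?<name>\p{Lu}\p{Ll}+(?:\s+\p{Ll}+)+)\s+\(?(?<author>[^,()]+?),?\s+(?<year>\d{4})\)?\z/

# Returns a MicroCitation, or nil when the string carries no authorship.
def extract_micro_citation(full_name)
  match = TAXON_WITH_AUTHORSHIP.match(full_name.strip)
  return nil unless match

  MicroCitation.new(match[:name], match[:author].strip, match[:year].to_i)
end

citation = extract_micro_citation("Felis leo (Linnaeus, 1758)")
puts "#{citation.author} #{citation.year}" if citation
```

Strings that do not match return `nil`, so they can be kept as verbatim references and queued for manual resolution rather than being silently dropped.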

Deliverables

  1. Use bibtex-ruby to read, write and round-trip bibliographic information.
  2. Write a Rails system for storing citations, and integrate it with bibtex-ruby.

Terms

  • Citation: An individual, unnormalized use of a source.
    • Each citation will have a unique identifier that the rest of the system can reference.
    • It must be possible to have a citation which consists ONLY of a single identifier. We could treat this as a verbatim reference.
  • Source: Something you want to credit for providing the data.
    • A person can be a source.
    • TW needs to support both private and public sources.
  • Global source: a common pool of sources. These should be published and non-private.
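The terms above could be sketched in plain Ruby roughly as follows. This is not the Rails model from the deliverables, only a rough illustration of how the definitions relate. All class, field, and method names here (`Source`, `Citation`, `visibility`, `global?`, `normalized?`, `global_pool`) are assumptions made for the example.

```ruby
require 'securerandom'

# A source is anything being credited; a person can be one too.
# The visibility and published fields are assumed names.
Source = Struct.new(:label, :visibility, :published, keyword_init: true) do
  # Global sources are published and non-private.
  def global?
    published == true && visibility == :public
  end
end

# A citation is one unnormalized use of a source. It always has an
# identifier and may consist of nothing else.
class Citation
  attr_reader :id, :verbatim, :source

  def initialize(id: SecureRandom.uuid, verbatim: nil, source: nil)
    @id = id
    @verbatim = verbatim
    @source = source
  end

  # A citation counts as normalized once it is linked to a source.
  def normalized?
    !source.nil?
  end
end

# The common pool is simply the subset of sources that qualify as global.
def global_pool(sources)
  sources.select(&:global?)
end
```

Giving every citation an identifier up front means the rest of the system can reference it before anyone has worked out which source it points to.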

Members

Datasets we can play with

  • ITIS
  • GNUB
  • UCD

Input/output formats

URLs and identifiers of taxonomic significance

A single source may have multiple identifiers associated with it.

  • ISBN/ISSN
  • BHL URLs
  • PubMed ID/URLs
  • DOI ID/URLs
  • Handle ID/URLs?
  • Mendeley/Zotero/EndNote ID/URLs
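One possible way to handle several of the identifier types listed above is to classify an incoming string by pattern before storing it against a source. The sketch below is an assumption-laden illustration: the `IDENTIFIER_PATTERNS` table and `classify_identifier` method are hypothetical, the URL shapes are approximations of those used by doi.org, PubMed, BHL and hdl.handle.net, and no checksum validation is done for ISBN or ISSN. Mendeley, Zotero and EndNote identifiers are left out.

```ruby
# Hypothetical pattern table; order matters because the first match wins.
IDENTIFIER_PATTERNS = {
  doi:    %r{\A(?:https?://(?:dx\.)?doi\.org/|doi:\s*)?(10\.\d{4,9}/\S+)\z}i,
  pubmed: %r{\A(?:https?://(?:www\.)?ncbi\.nlm\.nih\.gov/pubmed/|pmid:\s*)(\d+)/?\z}i,
  bhl:    %r{\Ahttps?://(?:www\.)?biodiversitylibrary\.org/((?:page|item|part|bibliography)/\d+)/?\z}i,
  handle: %r{\A(?:https?://hdl\.handle\.net/|hdl:)(\S+/\S+)\z}i,
  issn:   /\A(?:issn:?\s*)?(\d{4}-\d{3}[\dX])\z/i,
  isbn:   /\A(?:isbn:?\s*)?((?:\d[\s-]?){9}[\dX]|(?:\d[\s-]?){12}\d)\z/i
}.freeze

# Returns { type:, value: } for a recognized identifier, or nil otherwise.
def classify_identifier(raw)
  text = raw.strip
  IDENTIFIER_PATTERNS.each do |type, pattern|
    match = pattern.match(text)
    return { type: type, value: match[1] } if match
  end
  nil
end

# A single source can then carry a list of classified identifiers.
identifiers = ["doi:10.1000/182", "0028-0836"].map { |s| classify_identifier(s) }
puts identifiers.inspect
```

Storing the type alongside the bare value keeps the ID and URL forms of the same identifier from being recorded as two different things.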

Relevant code and projects

Projects and gems that may be of interest

APIs available

Links