
For more information, please contact Mark Frankel.

 
Scientific Freedom, Responsibility & Law Program
 

IT'S NOT JUST SINGLE PAPERS ANYMORE.
Peter B. Boyce
American Astronomical Society

The scholarly information exchange paradigm is changing before our eyes, yet the tenets of how science is done remain unchanged. This contradiction is leading to a diversity of approach in the dissemination of electronic scholarly information.  Mats Lindquist is right in stating that progress in science is built as scientists rely upon what has been done before to establish a base from which they can develop new knowledge. Fundamental to this process is the production of research results which are fixed in time and which can form the basis of future work. The integrity of such "building blocks" in our edifice of knowledge is a prime requisite. We have to have specific results to build on, specific papers to refer to, specific information which can be tested and verified or found wanting. It is incumbent upon scientists to refer to the previous knowledge and to establish the context in which their work falls.

One of the few places where I disagree with Mats is when he says, "...the scientific journal is gradually being transformed and becoming less paper bound...". I believe the change is happening with breakneck speed, compared with the time scale on which publishers have generally been able to evolve. After all, the first "real" electronic journals with effective links incorporated appeared just three years ago: our Astrophysical Journal Letters and the Journal of Biological Chemistry.

Now, with the information revolution made possible by the World Wide Web, which provides a common interface for every computer on the Internet, several things are happening. More and more publishers are providing electronic hyperlinks which bring access to the references with a heretofore unheard-of immediacy. These links add enormous value to electronic scholarly articles.

Along with that, the character of articles is changing, departing from the norms established when ink on paper was all we had available as a medium for exchanging information. Links within articles are needed to connect the text with video, audio, live math fragments, machine-readable tabular data, etc.

As readers and discipline-specific communities have begun using electronic methods for information interchange, an expectation has arisen that the information on the Web will be the very latest information, with up-to-date corrections having been made. Within the astronomical community, the fact that the paper version differs from the electronic version and is wrong is less important than the fact that the electronic version is the most up-to-date, and hence assumed to be the authoritative version.
 
The problem of changing versions adds complexity to the process of linking from scholarly articles to references, a process which is not simple in the first place. Let's look at the criteria for links and identifiers. Links must be stable, meaning that they have to point to a logical identifier, not a physical location (i.e., a URN, not a URL). There are a number of "standard" identifiers in use among publishers. (Standards are good, everybody should have one!) The linked electronic material, be it past references, forward citations, or electronic-only material, will soon make up a significant portion of the value of an article to the reader. As the connection mechanism which makes this material accessible, links are critical to electronic articles, and must be part of any electronic journal archive. Links must also be able to be assigned automatically during the process of preparing the article and its reference page for publication. It is impossibly expensive to add links by hand.

This implies that the article/object identifier must be findable by readers and other publishers. In other words, a reader who remembers a reference must be able to query a database somewhere and retrieve the identifier, and preferably get back a pointer/hyperlink to the article or other electronic object. This is also germane to the publishing process when a publisher wants to build links to other references. The first publisher's choice of identifier may either facilitate this process or render it impossible.

The ACS is using the Digital Object Identifier (DOI). The problem with the DOI is that it was designed to track usage and return revenue to the publisher. So far, it is not useful for facilitating the transfer of information. The DOI Foundation claims to be working on a query service, but they have been saying this for a year, and I do not yet see the result. It will certainly come, but until then the DOI is not of real use for linking references in scholarly publications.

Links must also work across discipline boundaries and between journals from different publishers, who may be using different standard naming conventions. I think there is a role here in the future for services which manage the article identifiers. Certainly, within the field of astronomy, we have had great success by having our reference links (except to our own journals) go through the searchable abstract database operated by the Astrophysics Data System (ADS). The ADS has been, and continues to be, supported by NASA as a service to the scientific field. Our article naming standard (I said everyone should have one) is based upon the volume, page, year structure with the first initial of the first author's last name. We call it a "Bibcode". It has its limitations, but it had already been in use in astronomy for ten years, and we needed some standard in order to start making the links. The bibcode, indeed any volume, page, year scheme, does not work well for monographs and conference proceedings. In astronomy, where 3/4 of the references are to scholarly journals, the bibcode is adequate for now.

The advantage of the bibcode is that it is calculable from the reference. It does not require that the user query some "authority" to find the identifier. In these early days, this simple approach has served the astronomical community well. The Pub Med number is another "standard" identifier, but it is not calculable. It is assigned. In order to look up a reference, the database has to be queried to find the Pub Med number. Sometimes this type of identifier is called an "opaque string", since the number itself carries no meaning (or is opaque) to the user. The DOI has both a publisher identifier and an opaque string assigned by the publisher. Eventually, the AAS will have to go to an identifier which covers more than the serial literature, but we will still use the bibcode as our not-so-opaque string for the serial literature.
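To make "calculable from the reference" concrete, here is a minimal sketch in Python. The field widths and dot padding follow the 19-character layout as I understand the ADS convention (year, journal abbreviation, volume, a one-character qualifier, page, author initial); treat that layout, the function name `make_bibcode`, and the sample reference as assumptions for illustration, not as the definitive ADS rules.

```python
def make_bibcode(year, journal, volume, page, first_author_surname, qualifier="."):
    """Build a bibcode-style identifier purely from the parts of a reference.

    Assumed layout (19 characters): YYYY JJJJJ VVVV M PPPP A
    Nothing here queries a database; every field comes from the citation.
    """
    if len(journal) > 5:
        raise ValueError("journal abbreviation must be at most 5 characters")
    if len(str(volume)) > 4 or len(str(page)) > 4:
        raise ValueError("volume and page must fit in 4 characters each")
    if len(qualifier) != 1:
        raise ValueError("qualifier must be a single character")

    year_field = f"{int(year):04d}"
    journal_field = journal.ljust(5, ".")      # left-justified, dot padded
    volume_field = str(volume).rjust(4, ".")   # right-justified, dot padded
    page_field = str(page).rjust(4, ".")
    initial = first_author_surname.strip()[0].upper()
    return year_field + journal_field + volume_field + qualifier + page_field + initial


# Hypothetical reference, invented for this example only.
print(make_bibcode(1997, "ApJ", 123, 45, "Doe"))
```

An assigned, opaque identifier such as the Pub Med number cannot be produced this way; the only route to it is a lookup in the assigning database.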


A limited number of identifiers is no problem for us. Since we have a name resolver to convert the bibcode to the current URL where the desired article (or other electronic object) is stored, it is simple to also make the translation for a limited number of other identifiers. As in all areas of electronic publishing, we must plan to ensure that we are able to incorporate changes as they occur.
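The idea of a name resolver can be sketched as follows. This is an illustrative toy, not a description of the resolver we actually run: the class `NameResolver`, its method names, the sample identifiers, and the `example.org` URLs are all hypothetical.

```python
class NameResolver:
    """Maps logical identifiers to the current location of an article."""

    def __init__(self):
        self._locations = {}  # bibcode -> current URL
        self._aliases = {}    # (scheme, value) -> bibcode

    def register(self, bibcode, url):
        """Record (or update) where an article currently lives."""
        self._locations[bibcode] = url

    def add_alias(self, scheme, value, bibcode):
        """Let an identifier from another scheme point at a known bibcode."""
        if bibcode not in self._locations:
            raise KeyError(f"unknown bibcode: {bibcode}")
        self._aliases[(scheme, value)] = bibcode

    def resolve(self, value, scheme="bibcode"):
        """Return the current URL for an identifier, or None if unknown."""
        if scheme == "bibcode":
            bibcode = value
        else:
            bibcode = self._aliases.get((scheme, value))
        return self._locations.get(bibcode)


# Hypothetical data: one article known under two identifier schemes.
resolver = NameResolver()
resolver.register("1997ApJ...123...45D", "https://journals.example.org/apj/123/45")
resolver.add_alias("doi", "10.9999/example.45", "1997ApJ...123...45D")

# The article moves; links that cite the identifier keep working.
resolver.register("1997ApJ...123...45D", "https://archive.example.org/apj/v123/p45")
print(resolver.resolve("10.9999/example.45", scheme="doi"))
```

Because published links carry the logical identifier rather than the URL, moving an article means updating one table entry, and supporting another identifier scheme means adding aliases rather than rewriting links.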
 
To return to the critical issue, single journal articles without live links to other electronic articles and digital information resources will become a dying breed. To be an effective researcher, a scientist will have to have access to all the relevant information. The journal article, with its long list of germane references, will become the gateway for a scientist to information which the author and the editors find to be both important and relevant to the subject. With time such an important commodity in the modern world, good scholarly journals will be islands of quality information in an ever-growing sea of trivial papers and self-published junk. Good reference lists will be beacons and roadmaps to quality information and will be appreciated as such within a very short time.

I predict that a system of links to references, citations and digital objects will be seen as absolutely essential by everyone within a short time. The power of a fully interlinked system is enormous. Witness the degree to which such a system has been accepted as necessary by the astronomical community, which has had a fully functioning, interlinked example for just three years.
 
Astronomy now has the core literature available on line. The ADS has scanned images of the historical core literature available back to 1965. All the abstracts are available in the ADS, along with the backward references and forward citation lists. In addition, the critical data tables are available in machine-readable format in several online astronomical databases which serve the worldwide astronomical community. The databases are organized by astronomical object, so that a researcher may enter an object name, such as the Orion Nebula, and get back a list of data which have been published about that object, complete with links to the individual papers. Once in the literature system, one may search the ADS for similar papers, and follow the reference tree forward and backward in time to eventually have all the papers on a given astronomical object within a couple of mouse clicks, or even printed out to read on the bus.
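Following the reference tree "forward and backward in time" amounts to walking a graph in both directions. The sketch below assumes only that each paper's reference list is known; forward citations are derived by inverting those lists. The function names and the four-paper data set are hypothetical, and this is not a description of how the ADS is implemented.

```python
from collections import deque


def invert(references):
    """Derive forward citations (who cites a paper) from reference lists."""
    cited_by = {paper: set() for paper in references}
    for paper, refs in references.items():
        for ref in refs:
            cited_by.setdefault(ref, set()).add(paper)
    return cited_by


def reachable(start, references, max_hops):
    """Breadth-first walk over references and citations.

    Returns {paper: hops}, where hops is the number of clicks from start.
    """
    cited_by = invert(references)
    seen = {start: 0}
    queue = deque([start])
    while queue:
        paper = queue.popleft()
        hops = seen[paper]
        if hops == max_hops:
            continue  # do not expand beyond the click budget
        neighbours = set(references.get(paper, ())) | cited_by.get(paper, set())
        for other in sorted(neighbours):
            if other not in seen:
                seen[other] = hops + 1
                queue.append(other)
    return seen


# Hypothetical reference lists: A cites B, B cites C, D cites B.
refs = {"A": ["B"], "B": ["C"], "C": [], "D": ["B"]}
print(reachable("A", refs, max_hops=2))
```

Starting from paper A, one click reaches B; a second click reaches both the older paper C (backward, via B's references) and the later paper D (forward, via B's citations).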
 
Astronomy has a powerful system, much appreciated within the community. The question is how to make such a system of interlinked, online knowledge resources available to all disciplines. It can be done.

One final comment about printouts. I defy anyone to print out a video clip of two colliding galaxies, complete with profiles of temperature, density, ionization state and star formation rate, step by step. Rather than cling to our paper, and our paper metaphor (like PDF), we have to plan for the day in the not-too-distant future when such electronic capability becomes available on our Palm Pilots, downloaded as we ride. That's the day we should be preparing for, three or five years in the future.

Is this group capable of thinking five years into the future? Or even three years?