

Toward a Stable Web:
Current Initiatives Add Stability to a Dynamic Environment

The Internet has a way of condensing time almost unlike any invention we have created to date. To illustrate the point, look at the evolution of the scholarly journal. It took approximately two hundred years from the invention of print until the first scholarly journals appeared in London and Paris in 1665. Now, a scant few decades since the concept of the electronic journal was developed, ejournals are a growing and expected component of the scholarly communication model.

Even more quickly than they appeared, electronic journals are becoming more robust in their offerings of value-added features. The addition of interactive content such as 3D models, video, and audio files is just now beginning to meet the promise that electronic publishing holds for the scholarly community. Perhaps one of the more important developments in the burgeoning ejournal space is the emergence of stable linking from references, to indexing and abstracting databases, and directly to the articles themselves. Links are so ubiquitous on the Internet that we often overlook their navigational importance and the architecture that allows us to use them. In the formative days of the Internet, objects were often identified with URLs like http://mywebsite/filename.pdf. Many items are still identified in this way. This method does little to make the web a more stable medium for publishing. It does, however, increase the concerns of the many who use the web and have vested interests in seeing it develop in a durable manner. A recent NISO Linking Workshop outlined the breadth of linking issues when it noted that: " . . . scholars and researchers, expect seamless access to network-accessible materials; authors and publishers want wide awareness of and use of their intellectual property; repositories and vendors require mechanisms to facilitate linking access both into and out of their systems, and libraries must provide and manage a wide range of search tools and information sources containing links and serve those who use them." [1]

Because easy and stable access to content is such an important factor on the web, it is important to understand a bit about links themselves. What constitutes a stable link, and why are some links potentially unstable? This is actually a complex issue that includes both identifying and locating the item.

To start, let's deconstruct an unstable linking structure. These links often use a structure like http://mywebsite/filename.pdf. URLs like these are prone to change as the content they point to moves from one machine to another over time. More than likely, these links are fully functional when the author creates the work but become troublesome in the long haul. For instance, it is common for a publisher to create an online article or book with hyperlink pointers to supporting web resources. It is just as common for at least a handful of these links to be inoperative by the time the article or book goes to press. Other links will become problematic soon after the work is in circulation. In all probability, neither the author nor the publisher will know of these broken links until users report them.

Many authors and publishers don't currently give enough forethought to the creation of stable linking structures. As illustrated above, this inevitably leads to the fragmentation of web-based literature and the inability to locate items over time. Once an article is published on the web, its URL should never change. Unfortunately, all too often, URLs do change. Some publishers think far enough ahead to notify linking partners of widespread link changes, minimizing the pain but not eliminating the problem. Others fail to make these notifications, and broken links go unnoticed until a researcher needing the information stumbles into a dead end.

In the pursuit of stable links, consideration should be given to many decisions that, unfortunately, are sometimes overlooked. The identification of the article is an important component in the URL configuration. Publishers can add more structure to the URL by using established, somewhat calculable article identifiers such as the PII or SICI. The Publisher's Item Identifier (PII) and the Serial Item and Contribution Identifier (SICI) both identify articles in a consistent manner. The ability to calculate links is important because it helps to form a uniform resource name, ensuring that the URL remains constant over time. Calculable links also allow other publishers to develop automated mechanisms to construct links to items without having to manually determine the location of each article for that publisher. These links can be script-generated based on some predictable structure for that publisher. This method is more structurally sound than filename.pdf, but even this is not the ideal model because it requires each publisher to determine every other publisher's structure and keep this knowledge current. This problem becomes staggering when a publisher tries to keep track of even a small set (one hundred, say) of other publishers. While providing much-needed consistent article identification, identifiers like these still do not address the location of the article.
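To make the idea of script-generated, calculable links concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the publisher names, the URL templates, and the identifier value are hypothetical placeholders, not real publisher structures or real PII or SICI values.

```python
# Hypothetical per-publisher URL templates. Each linking partner would have to
# collect these and keep them current, which is the maintenance burden
# described above.
LINK_TEMPLATES = {
    "publisher-a": "http://publisher-a.example/articles/{item_id}",
    "publisher-b": "http://publisher-b.example/{issn}/{volume}/{first_page}.pdf",
}


def build_link(publisher, **fields):
    """Construct an article URL from a publisher's predictable link structure."""
    template = LINK_TEMPLATES.get(publisher)
    if template is None:
        # No known structure for this publisher, so no link can be calculated.
        raise KeyError(f"no link template known for {publisher!r}")
    return template.format(**fields)


# The identifier below is a made-up placeholder, not a real PII.
link = build_link("publisher-a", item_id="EXAMPLE-ITEM-0001")
print(link)
```

The table of templates is the weak point of this approach: when a publisher restructures its site, every partner's copy of that publisher's template must change as well.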

To assist in the creation of stable locations for Internet content, publishers and other content creators can employ Digital Object Identifiers (DOIs). The DOI is an identification system for intellectual property in the digital environment. It is both a persistent identifier and a system that processes that identifier on the Internet to deliver services. The DOI initiative does much to stabilize article location on the web. As content moves from place to place over time, publishers update the location of the object with the central DOI registration service, thus ensuring stability. While the DOI initiative helps to stabilize the web, there are still drawbacks. For instance, the DOI system has yet to develop a lookup system that facilitates discovery of stable links. Lookup tools allowing others to identify an article's DOI or other persistent identifier are now a hot topic and have opened a market niche for newly emerging organizations and initiatives. During the past year, lookup initiatives have emerged spanning the entire spectrum of web publishing, from cross-disciplinary to discipline-specific and publisher-specific. Standardization will be the key to the long-term stability of these tools.
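The indirection that makes a persistent identifier stable can be sketched as a small in-memory registry. This is only a toy model of the idea described above, not the DOI system's actual implementation; the class name, the identifier, and the URLs are all hypothetical.

```python
class ToyRegistry:
    """A toy central registry mapping persistent identifiers to current URLs."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        # Registering an existing identifier again simply updates its location.
        self._locations[identifier] = url

    def resolve(self, identifier):
        # Citing works store only the identifier; the URL is looked up on use.
        return self._locations[identifier]


registry = ToyRegistry()

# A publisher registers an article under a made-up identifier.
registry.register("10.9999/example.1", "http://old-host.example/filename.pdf")

# A citing article records the identifier, never the URL.
cited_identifier = "10.9999/example.1"

# Later the content moves; only the registry entry is updated.
registry.register("10.9999/example.1", "http://new-host.example/articles/1")

print(registry.resolve(cited_identifier))
```

The citing article never changes. Only the single registry entry is edited when the content moves, which is why the link stays usable.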

CrossRef is spearheading the cross-disciplinary effort. This is a newly emerging initiative for publishers of Scientific and Technical Information (STI) and is based on the DOI. The CrossRef scheme uses a limited set of metadata which, when queried, becomes a unique citation with a stable URL pointing to the requested object. For a fee, individual users will be able to access the system to generate uniform citations accompanied by stable links to objects on the web. Publishers will use a batch download capability to harvest larger sets of stable links and citations. CrossRef's management expects that at the outset, more than three million articles across thousands of journals will be linked through this initiative, and more than half a million more articles will be linked each year thereafter. With this level of linking, users will rapidly have the ability to navigate through a huge universe of related scholarly materials. With fewer concerns about discovery, users will have more time to focus on valuable content. No doubt, this will be a winning situation for publishers, authors, libraries, and users alike.
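A lookup of this kind can be pictured as a query that takes a small set of citation metadata and returns a persistent identifier. The sketch below is a guess at the general shape only: the article does not say which metadata fields CrossRef uses, so the four fields, the normalization rule, and the sample record are assumptions, and this is not CrossRef's actual interface.

```python
from typing import NamedTuple, Optional


class Citation(NamedTuple):
    """An assumed minimal metadata set for identifying an article."""
    journal: str
    volume: str
    first_page: str
    year: str


def normalize(citation):
    # Ignore case and surrounding whitespace so small variations still match.
    return Citation(*(part.strip().lower() for part in citation))


# A made-up index with a single made-up record.
INDEX = {
    normalize(Citation("Journal of Examples", "12", "345", "1999")): "10.9999/example.1",
}


def lookup(citation) -> Optional[str]:
    """Return the persistent identifier for a citation, or None if unknown."""
    return INDEX.get(normalize(citation))


print(lookup(Citation("  journal of EXAMPLES ", "12", "345", "1999")))
```

Batch harvesting, as described for publishers, would amount to running the same lookup over a whole list of citations.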

Other discipline-specific initiatives are also under development. For instance, the National Library of Medicine's PubMed/PubRef system works in conjunction with publishers of biomedical literature and provides a search tool for accessing literature citations and linking to full-text journals at the web sites of participating publishers. The AMS is also creating its own lookup tool for mathematical literature. MRLookup will parallel CrossRef in terms of publisher requirements, allowing authors and publishers to implement stable links to reviews and original articles in the mathematical sciences.

As the web continues to expand, scholarly communication in electronic format will become more accepted. Judging from the explosive growth and rapid acceptance of the Internet in general, this recognition is imminent. The need for stable linking structures and lookup tools is just as quickly becoming a critically important component of web development. Publishers' independent initiatives and broader-based initiatives like CrossRef and DOI will do much to add stability to the web. More importantly, the emergence of, and adherence to, standards will be the key to developing stability in the ever-expanding space of electronic scholarly communications.

Notes:
1. Needleman, Mark. Meeting Report of the NISO Linking Workshop. February 11, 1999, Washington DC with contributions from the Workshop Steering Committee. http://www.niso.org/linkrpt.html

by Wendy Bucci, Special Project Administrator, American Mathematical Society (wab@ams.org) and Timothy E. McMahon, Electronic Publishing Specialist, American Mathematical Society (txm@ams.org).

 

©2009 Special Libraries Association. All rights reserved.
331 South Patrick Street Alexandria, VA 22314-3501 USA