Anne
Prestamo, MLIS, Ed.D.
Associate Professor and Head, Digital Library Services
University Libraries
Oklahoma State University
As the Oklahoma State University Libraries entered into more and more agreements with publishers and aggregators for online access to journals, magazines, and newspapers, it became increasingly challenging for librarians, let alone our patrons, to keep up with all of the available options. In May 1999 the initial version of the “Full-Text Periodical Titles and Coverage List” went live on the Library’s Website. Initially, the list contained approximately 5,500 entries. Today the list has grown to nearly 18,000 access points.
At the time this project began, the OSU Library was using NOTIS for its Online Catalog. Since NOTIS was a text-based, telnet system, adding links to the MARC 856 field was not particularly helpful. In July 2000, migration to Endeavor’s Voyager system was completed. With the introduction of a Web-based catalog the 856 links have become much more utile, and efforts are underway to continue adding the URL information to records as quickly as possible. Whether the “Full-Text Periodical Titles and Coverage List” will be maintained once the 856 project reaches a critical mass is currently a topic of discussion.
This paper will detail the
processes used, including:
1)
gathering and “normalizing” of data elements from vendors and
publishers
2)
creating the master file of all merged titles
3)
converting to a Web-deliverable format
4)
maintenance and update procedures
5)
attempted migration to new format, including Access/SQL and ProCite/
Reference Web Poster
6)
856 links vs. “Full-Text Periodical Titles and Coverage List” ongoing
discussion
7)
lessons learned - what we’d do differently if we were starting over
I. Background
As the Oklahoma State University Libraries entered into more and more agreements with publishers and aggregators for online access to journals, magazines, and newspapers, it became increasingly challenging for librarians, let alone our patrons, to keep up with all of the available options. When a patron was looking for a known periodical or article it was necessary to search numerous resources to determine if and where it was available. At the time this project began, the OSU Library was using NOTIS for its Online Catalog. Since NOTIS was a text-based, telnet system, adding links to the MARC 856 field was not particularly helpful. Creating a local database of Full-Text Periodical access points was undertaken as a stopgap measure until the Libraries’ migration to Endeavor Voyager was completed, at which time 856 links would be added to periodical records and be truly functional.
In May 1999 the initial version of the Full-Text Periodical Titles and Coverage List went live on the Library’s Website. Initially, the list contained approximately 5,500 entries. In addition to the periodical titles, this list contained information on the formats available and the coverage dates for each title. The majority of these entries represented the full-text access points to periodical titles from two aggregators: Bell and Howell Information and Learning’s ProQuest Direct and Gale’s InfoTrac databases. Initially, we were able to provide links to the “front door” of these resources. Once there, it was necessary to do a search for the particular periodical title or article within that resource.
In the months that followed online full-text titles from Dow Jones Interactive, Project MUSE, JSTOR, Elsevier ScienceDirect, MCB Press Emerald Library, Lexis/Nexis Academic Universe, IEEE/IEE Electronic Library (IEL), Academic Press IDEAL, Oxford University Press, Cambridge University Press, Institute of Physics, American Institute of Physics, American Chemical Society, ACM, University of Chicago Press, SIAM, Wiley InterScience, Blackwell, and Springer LINK, as well as numerous society and single subscription titles have been added. Some of the this content is distributed via Stanford University’s HighWire Press, while others are direct links to the respective publishers. Currently, there are over 18,000 entries in the list.
As the list grew and procedures were refined, direct links to the periodical title within a given resource were incorporated through the use of durable or persistent URLs, streamlining the process of locating a particular periodical title.
The procedures for obtaining the raw data for each resource have varied. Whenever possible, we have requested that the vendor provide a delimited text file containing periodical title, ISSN, coverage dates, persistent or durable URL to the periodical title-level. In a few cases the vendor has furnished this file to us, while in other cases we have been able to download the required information from the vendor’s Website as a delimited text file. In yet other cases, it was necessary to retrieve the needed information on a title-by-title basis from the vendor’s Website.
A separate Excel file with
common field structure was created for each resource.
Particularly in the case of aggregators these files have proved very
useful when we have considered new subscriptions, in that we have a readily
usable file against which we can run checks for duplicate titles and/or
coverage.
The following list contains links to sources of the necessary data from several
of our vendors and notes on the ease of use for each.
|
Resource |
Data
Source |
Ease
of Use |
|
ProQuest |
http://www.umi.com/cgi-bin/TitleForm?cfg=LibTitles.cf |
To
create the title-level URL it was necessary to subscribe to ProQuest
SiteBuilder at additional cost. Building
each durable link was a multiple-step process. |
|
InfoTrac |
http://www.galegroup.com/servlet/ |
In
February 2001 durable links to the title-level (InfoMarks) and
downloadable MARC records containing the 856 field were made available.
|
|
ScienceDirect |
Excel
file from vendor, including durable URL for each title |
Normalizing
fields to our local format was all that was required. |
|
AcademicUniverse |
Title
and coverage dates provided by vendor in delimited text file. |
List
included many titles that were not, in fact, full-text.
Best link is to the “Source List”, with no current method to
provide durable links to the title-level. |
|
Emerald |
Subscribed
title list provided by vendor in as text file. |
Coverage
dates and title-level URLs were retrieved on a title-by-title basis. |
|
Oxford
University Press |
http://www3.oup.co.uk/jnls/online/ |
This
is a new feature. Save page
as .html and open in Excel to edit. It
is still necessary to copy and paste title-level URLs into the Excel file. At the time we subscribed it was necessary to build the
list from scratch. |
|
Cambridge
University Press |
http://www.journals.cambridge.org/bin/ Note:
this URL requires authentication |
Once
subscribed, this URL displays a page showing your subscribed titles only.
This feature was not available when our local list was originally created.
Coverage dates and title-level URLs must be retrieved on a title-by-title
basis. |
|
IEL |
http://ieeexplore.ieee.org/lpdocs/epic03/ Note:
this URL requires authentication |
Once
subscribed titles were copied and pasted from the source list and coverage
dates and title-level URLs were retrieved on a title-by-title basis. |
|
Project
MUSE |
http://muse.jhu.edu/journals/ |
Save
Webpage list as .txt file and open in Excel.
Coverage dates and title-level URLs were retrieved on a
title-by-title basis. |
|
JSTOR |
http://www.jstor.org/cgi- |
Save
Webpage list as .txt file and open in Excel.
Coverage dates and title-level URLs were retrieved on a
title-by-title basis. |
Figure 1 Sources of Raw Data
Once the file for each
individual resource was created, a copy of each file was merged into the master
Excel file. The file was sorted
alphabetically by Periodical Title (see Figure 2).
This master file was then subdivided into separate worksheets, each
containing approximately 100-150 titles. Column
headings containing the field names were inserted into the worksheets every
15-20 titles for clarity (see Figure 3). The
worksheets were named with the appropriate alphabetic subdivision for the titles
it contained (i.e. A-Ag).

Figure 2 Master File in Excel
In
1999 the first version of the list consisted of 50 worksheets.
Today, that number has grown to 105, with most worksheets containing
150-200 titles.
Once
the worksheets were created each Excel worksheet was saved as an HTML file.
While the “Save as HTML” feature of Excel proved very useful in
quickly converting the worksheets to Webpages, the resulting files were bloated
with a great deal of extraneous HTML tagging and attributes.
To
strip the extraneous HTML we used HotDog Pro, an HTML editor.
Using its MultiFile Find and Replace utility, a variety of schemes were
created to find and replace all unnecessary code.
While creating these schemes was a challenge, this utility saved a
tremendous amount of time, without having to open and edit each individual file. Some schemes made more than 80,000 replacements across all
files in a single operation. Once
the extraneous code was stripped the file sizes were reduced by as much as
two-thirds.
The
“TitleURL” and “VendorProductName” fields from the Excel file were
blended into the “Access Via” field in the Webpages using schemes built with
the MultiFile Find and Replace utility as well (see Figure 3).

Figure 3 Sample of Webpage version
An
index page (see Figure 4) was created with a grid linking to each section of the
alphabet. Once again using
HotDog’s MultiFile Find and Replace this grid was inserted at the top and
bottom of each individual Webpage to facilitate browsing.
The
work described above, as well as the maintenance and updating that will be
discussed later was accomplished with one librarian, a half-time graduate
assistant, and two student employees. Overall,
approximately 10% of the librarian’s time is devoted to this project, although
the time-per-week varies tremendously. This
will be further addressed below. The
graduate assistant works approximately 15 hours per week on an ongoing basis on
the project. Student employees are
involved primarily when new title and coverage lists are being constructed for a
particular vendor. This has
required as much as 30 hours per week of student labor at various points in the
project.

Figure 4
Index Page showing navigation grid that repeats on each page
http://www.library.okstate.edu/database/
V. Maintenance and Update Procedures
As discussed above, we have a graduate assistant who devotes approximately 15 hours per week to the retrieval and entering of updates. On a week-to-week basis these updates are entered manually in both the Excel file and their respective HTML files. Updated HTML files are uploaded to the server on an as-needed basis.
When
large numbers of new titles are added a complete regeneration of the HTML files
from the Excel file is done. This
has averaged approximately once per semester.
When this is done it requires 60-80 hours of the librarian’s time.
Because the complete regeneration is incredibly labor intensive and done
rather infrequently, there are periods when the list does not include recently
added titles for a period of up to 8 weeks.
An alternative solution was explored that involved migrating the Excel master file to ProCite, a bibliographic management utility. A companion product, WebReference Poster enables searching of ProCite databases via Web forms. While not a simple, straightforward conversion, after some trial and error the Excel file was successfully converted to ProCite. WebReference Poster was installed in test mode on the Libraries’ Web server. It functioned well, with the exception that the URL in each record appeared only as text – not as a clickable link. Through further editing, the necessary HTML tags were inserted with each URL in an attempt to force the link. This merely resulted in the HTML tags being displayed as text in the WebReference Poster record screens. Technical support at RIS stated that they are aware clickable links are an issue, but that have no plans to incorporate that functionality in the near future. As a result of this shortcoming we abandoned this solution.
Again, it appears that the Access/SQL solution is the most viable and we are in the process of trying to develop or acquire personnel with the expertise necessary to move in this direction.
As discussed earlier, the creation of the Full-Text Periodical Titles and Coverage List was initially viewed as a stopgap measure until such time the 856 links could be provided in the Libraries’ OPAC records. It seems, however, that we are the victims of our own success. At the current time there is great support for maintaining the Full-Text Periodical Titles and Coverage List in tandem with having the 856 links in the OPAC records.
The OSU Libraries’
Cataloging Department has made great strides in the highly labor-intensive work
of incorporating 856 links. They
are following the CONSER Guidelines for a single-record approach.
(See http://www.loc.gov/acq/conser/mod31pt1.html)
In a nutshell, this means that when the Libraries have no holdings in
other formats a new bibliographic record is created and the 856 field is
inserted in the bibliographic record. When
the Libraries have existing print or microform holdings a new holdings record is
created with the 856 field and linked to the existing bibliographic record.
In
some cases, this massive undertaking has been facilitated by the availability of
MARC records for bulk import. Some vendors, like the Gale Group are now providing the MARC
856 records at no additional cost, while others like ProQuest have MARC records
available for purchase. Information
on the Gale Group’s MARC records may be found at http://www.galegroup.com/catalog/marc.htm
Information on MARC records for ProQuest Direct is available at http://www.umi.com/hp/Features/Marc/MARCdata.html
Some
libraries are still taking the a highly conservative approach with 856 links,
holding firm to the idea that only items which are owned by a library,
and over which the library has physical control, should be cataloged.
In our case, a decision has been made to catalog anything to which we
have access, regardless of its format and physical location.
However, one issue that we are still wrestling with involves the question
of whether of not 856 links for titles included in consortial subscriptions
should be cataloged. For example,
we benefit from a statewide subscription to the Gale Group’s InfoTrac databases.
As an individual library we have no direct control over the decision to
renew or cancel access to these databases.
The titles in the InfoTrac databases have been assigned the lowest
priority in the Cataloging Department’s workflow, as the statewide contract is
up for renewal at the present time. Once
a decision is made on renewal, we will make a decision on whether or not records
for these titles will be added to the catalog.
The
OSU Libraries will be migrating to Voyager2000 over the next few months. Currently under study is the possibility that Voyager2000
will allow us to build “canned searches” utilizing the 856 field
information. These canned searches
may allow on-the-fly-generation of browsable lists of full-text periodicals and
could replace the locally maintained Full-Text
Periodical Titles and Coverage List. Until
that option can be fully tested it is unlikely that migration of the Full-Text
Periodical Titles and Coverage List to Access/SQL will be undertaken.
When the initial work on the Full-Text Periodical Titles and Coverage List was undertaken it was envisioned as short-term solution. While low-tech solutions may appear expedient in the beginning, their ability to “scale” as demands increase is a question that should have been asked from the beginning. In its earliest stages, the methodology adopted was labor intensive, but not prohibitively so. No one involved in the project imagined that the Libraries’ access to online full-text periodical would grow at the rapid rate we’ve experienced. As the number of titles doubled and then more than tripled it became clear that the initial method of creating and delivering the list was becoming increasingly impractical.
As the Cataloging Department has begun entering 856 links we have had to develop procedures for gathering and disseminating update information. At the current time, we are updating records in three places: the Excel master file, the Full-Text Periodical Titles and Coverage List Webpages, and the OPAC records. An administrative decision will have to be made as to whether or not we can continue to commitment the human resources necessary to provide multiple means of access to full-text periodical links.
The reassuring thing in all of this is that, although the maintenance of the Full-Text Periodical Titles and Coverage List has become highly labor-intensive, it has become an important tool for librarians and Library patrons. Analysis of the log files for 2000 indicates that the Full-Text Periodical Titles and Coverage List was the third most frequently accessed Webpage on the Libraries’ Web server. It use was surpassed only by the Libraries’ homepage, which displays every time a browser is opened on any of the Libraries’ 400+ computers. The second most frequently accessed Webpage is the Libraries’ list of Electronic Indexes and Databases.
Facilitating access to online materials will continue to pose challenges for libraries for the foreseeable future. Seamless and simple access has yet to be achieved. The availability of full-text periodicals online has certainly streamlined the ways in which we retrieve information, but has brought with it a whole new set of challenges in bibliographic control, organization, and access.
Contact the Author
Anne Prestamo, Coordinator
Digital Library Services
Edmon Low Library
Oklahoma State University
Phone (405) 744-9161
FAX (405) 744-7579
Email prestam@okstate.edu