Projekt Runeberg is a digital library initiated in December 1992 by Lysator, a students' computer club, in cooperation with Linköping University, Sweden. It is an open and voluntary initiative to create and collect free electronic editions of classic Nordic literature and art. Around 200 titles are available in full text, and there is also data on more than 6,000 Nordic authors.
Some digital libraries are organized around an author, for example The Complete Works of William Shakespeare, The Dante Project or The Marx/Engels Internet Archive (MEIA).
Begun in 1996, The Marx/Engels Internet Archive (MEIA) "is continually expanding, as one work after another is brought on-line […] Pictures/photos now adorn the site, with many more to come". The Marx & Engels WWW Library gives a chronology of the collected works of Karl Marx and Frederick Engels, and access to a number of them. The Photo Gallery presents the Marx and Engels clan from 1839 to 1894, and their dwellings from 1818 to 1895.
The MEIA Search allows searching in the entire Marx/Engels Internet Library. "As larger works come on-line, they will also have small search pages made for them alone - for instance, Capital will have a search page for that work alone." The biographical archive gives access to biographies of Marx and Engels, and also short notices and photographs of the members of their family and their friends. The link "Others" gives access to a short biography and the works of Marxist writers, including: James Connolly, Daniel DeLeon, and Hal Draper. The MEIA Non-English Archive lists the works of Marx and Engels in other languages (Danish, French, German, Greek, Italian, Japanese, Polish, Portuguese, Spanish, and Swedish), with links to them. The following statement is posted on the website:
"There's no way to monetarily profit from this project. 'Tis a labor of love undertaken in the purest communitarian sense. The real 'profit' will hopefully manifest in the form of individual enlightenment through easy access to these classic works. Besides, transcribing them is an education in itself… Let me also add that this is not a sectarian/One-Great-Truth effort. Help from any individual or any group is welcome. We have but one slogan: 'Piping Marx & Engels into cyberspace!'"
7.3. Digital Image Collections
Other digital libraries include pictures, for example the impressive Gallica. Available since 1997, Pictures and Texts of French 19th Century are the first part of the massive project of the French National Library (Bibliothèque nationale de France) which is digitizing thousands of texts and images relating to French history, life and culture.
The digital collections of American Memory are a major component of the Library of Congress's National Digital Library Program. The National Digital Library Program (NDLP) is an effort to digitize and deliver electronically the distinctive, historical Americana holdings at the Library of Congress, including photographs, manuscripts, rare books, maps, recorded sound, and moving pictures.
"The Library of Congress National Digital Library Program (NDLP) is assembling a digital library of reproductions of primary source materials to support the study of the history and culture of the United States. Begun in 1995 after a five-year pilot project, the program began digitizing selected collections of Library of Congress archival materials that chronicle the nation's rich cultural heritage. In order to reproduce collections of books, pamphlets, motion pictures, manuscripts and sound recordings, the Library has created a wide array of digital entities: bitonal document images, grayscale and color pictorial images, digital video and audio, and searchable texts."
There are currently over 30 collections in American Memory, for example:
(a) African American Perspectives: Pamphlets from the Daniel A. P. Murray Collection, 1818-1907: 351 rare pamphlets offering insight into attitudes and ideas of African Americans between Reconstruction and the First World War;
(b) Architecture and Interior Design for 20th Century America: Photographs by Samuel Gottscho and William Schleisner, 1935-1955: Approximately 29,000 photographs of buildings, interiors, and gardens of renowned architects and interior designers.
The New York Public Library Digital Collections provide the public with digital versions of books, manuscripts, photographs, engravings, and other items as well as tools to browse, search, and analyze these materials remotely via the Internet. Four general sections allow the browsing of the collections: Digital Schomburg (Center for Research in Black Culture); Archival finding aids; Cooperative projects; and On-Line Exhibitions.
SPIRO (UC Berkeley Architecture/Slide Library Slide and Photograph Collection) is the visual on-line public access catalog (VOPAC) for the UC (University of California) Berkeley's Architecture Slide Library (ASL) collection of 200,000 35mm slides.
"SPIRO can be accessed using either Image Query, a powerful database retrieval package, or the World Wide Web. ImageQuery2.0 was developed originally by UC Berkeley's Information Systems and Technology, Advanced Technology Planning (ATP) Office under the direction of Barbara Morgan. ImageQuery2.0 is currently maintained by the Museum Informatics Project (MIP). ImageQuery SPIRO permits access to the collection by ten access points: period; place; creator name; object name; view type; subject terms from the Art and Architecture Thesaurus; source of image; creation dates; classification number; image identification number. The vast majority of images in SPIRO are copyrighted."
IMAGES 1 (on-line images of the National Library of Australia's Pictorial Collection) contains over 15,000 historical and contemporary images relating to Australia and its place in the world, including paintings, drawings, rare prints, objects and photographs. The images have been selected from more than 40,000 paintings, drawings and prints and more than 550,000 photographs held in the National Library's Pictorial Collection. Topics covered include first impressions of Australia, convict days, gold mining and Australian towns.
IMAGES 1 offers a number of search options to enhance access to the images, including searching by creator (for example, photographer or artist); other names associated with a work or collection; title; subject; the image number in the database; and format (for example, watercolor or photograph).
Founded in 1989 by Bill Gates, the head of Microsoft, Corbis is a major provider of visual content and services in the digital age, offering more than 20 million photographs and fine-art images (1.3 million of them on-line) for access worldwide via the Internet, on CD-ROM, and through traditional stock catalogs. The images include contemporary stock photography, photojournalism, archival photography, and royalty-free images, available to both creative professionals and private consumers.
7.4. Future Trends for Digital Libraries
The rapid development of digital libraries leads us to define the role of the digital library, a very recent concept, in relation to the much older "traditional" library, and vice versa.
In the same way that the paper document is not going to be "killed" by the electronic document, at least not in the near future, many librarians believe the "traditional" library is not going to be "killed" by the digital library.
When interviewed by Jérôme Strazzulla in Le Figaro of June 3, 1998, Jean-Pierre Angremy, president of the French National Library (Bibliothèque nationale de France) stated: "We cannot, we will not be able to digitize everything. In the long term, a digital library will only be one element of the whole library".
Digital libraries give instant access to many works in the public domain. They also give instant access to old and rare texts and images. Full-screen images still take a long time to download, so many sites have fallen back on presenting small images, so as not to ask too much of the cybernaut's patience. Most of the time a larger format can be requested by clicking on the selected image. This problem should be solved in the future with improvements in data transmission.
Digital libraries also further textual research on one or several works at the same time, such as the works of Shakespeare, Dante's Divine Comedy, different versions of The Bible, etc.
The major problem of the cyberlibrary is the fact that recent documents cannot be posted because they don't belong to the public domain. Some projects, like DOI: The Digital Object Identifier System, an identification system for digital media, will enable automated copyright management systems.
Another problem is format harmonization, to allow the downloading of the texts by any hardware and software. Libraries often choose the ASCII format (ASCII: American standard code for information interchange) or the SGML format (SGML: standard generalized markup language).
Many organizations are involved in research relating to digital libraries.
Sponsored by The Library of UC Berkeley and Sun Microsystems, SunSITE is the site where the Berkeley Digital Library builds digital collections and services while providing information and support to others doing the same. Its contents are: catalogs and indexes; help/search tools and administrative info; Java corner; teaching and training; text and image collections; information for digital library developers; research and development: where digital libraries are being built; tools: software for building digital libraries.
The Digital Library Technology (DLT) Project supports the development of new technologies to facilitate public access to the data of NASA (National Aeronautics and Space Administration) via computer networks, particularly technologies that develop tools, applications, and software and hardware systems that are able to scale upward to accommodate evolving user requirements and order-of-magnitude increases in user access.
The Stanford University Digital Libraries Project deals primarily with computing literature, with a strong focus on networked information sources. It is one participant among five universities of the Digital Library Initiative, supported by the NSF (National Science Foundation), DARPA (Defense Advanced Research Projects Agency), and NASA (National Aeronautics and Space Administration). "The Initiative's focus is to dramatically advance the means to collect, store, and organize information in digital forms, and make it available for searching, retrieval, and processing via communication networks - all in user-friendly ways."
Library 2000 gives the historical record of a project held by the MIT Laboratory for Computer Science (MIT: Massachusetts Institute of Technology) between Fall 1995 and February 1998. Library 2000 was a computer systems research project that explored the implications of large-scale on-line storage using the future electronic library as an example. The project was pragmatic, developing a prototype using the technology and system configurations expected to be economically feasible in the year 2000.
Based at the Corporation for National Research Initiatives (CNRI), the D-Lib Program supports the community of people with research interests in digital libraries and electronic publishing. D-Lib Magazine, the magazine of digital library research, is a monthly compilation of contributed stories, commentary, and briefings.
The International Federation of Library Associations and Institutions (IFLA) provides a very interesting section, Electronic Collections and Services.
[In this chapter:]
[8.1. Library Catalogs / 8.2. International Bibliographic Databases / 8.3. Future Trends for On-line Catalogs]
Why a whole chapter on catalogs? Because, even if most of them are not yet user-friendly and are still in the domain of information specialists, they are essential to students, researchers, and anybody who needs a particular document or wants to know more about a specific topic.
Until now, catalogs could fairly be criticized for being complicated to use, and above all for giving the references of documents without ever giving access to their contents and full text. All this is now changing. Catalogs on the Web have become more attractive and user-friendly. And, in an emerging trend, catalogs have begun to give instant access to some documents, for example, the works listed in The Universal Library which can be accessed through the Experimental Search System (ESS) of the Library of Congress.
8.1. Library Catalogs
Two catalogs, those of The British Library and the Library of Congress, are impressive bibliographic tools, freely available to all Internet users. They include many documents published in non-English languages.
In May 1997, The British Library launched OPAC 97, which provides free access via the World Wide Web to the catalogs of the major British Library collections in London and Boston Spa. For a wider range of databases and many additional facilities, the British Library offers Blaise, an on-line bibliographic information service (which you must pay for), and Inside, article title records from 20,000 journals and 16,000 conferences. As explained on the website:
"The Library's services are based on its outstanding collections, developed over 250 years, of over one hundred and fifty million items representing every age of written civilisation, every written language and every aspect of human thought. At present individual collections have their own separate catalogues, often built up around specific subject areas. Many of the Library's plans for its collections, and for meeting its users' needs, require the development of a single catalogue database. This is being pursued in the Library's Corporate Bibliographic Programme which seeks to address this issue."
The reference collections represented on OPAC 97 comprise:
a) Modern books and periodicals from Britain and overseas;
b) Humanities and Social Sciences collection (from 1975), which includes: humanities and social sciences information; popular science and psychology holdings; modern oriental holdings; rich resources relating to Africa; Hispanic materials relating to Spain, Portugal, Portuguese North Africa and Latin America; one of Europe's largest collections relating to Slavonic, East European and Soviet studies;
c) Science, Technology and Business collection (from 1975);
d) Music collection (1980- ), one of the world's finest collections of printed music;
e) Older books and periodicals from Britain and overseas;
f) Older reference material collection (to 1975 only), with incomparable holdings of early printing from Britain and overseas, and Western and Oriental materials from the beginning of writing, including: archives and materials assembled by the former India Office; rich resources relating to Africa; Hispanic materials relating to Spain, Portugal, Portuguese North Africa and Latin America; one of Europe's largest collections relating to Slavonic, East European and Soviet studies; historical resources for scientific, technological and business information; and musical works.
The Document Supply collections represented on OPAC 97 comprise:
a) Books and reports collection (from 1980), which covers millions of British and overseas books, reports and UK theses;
b) Journals/Serials collection (from 1700), including half a million British and overseas periodicals (journals and serials);
c) Conference collection (from 1800), which is the world's largest collection of conference proceedings.
Parts of the current systems are now 20 years old. The basic design of the systems is no longer in line with current business needs and the fact that the British Library's software is out of date is often a hindrance, particularly as concerns cooperation with other organizations. The British Library has therefore decided to replace these systems, and the Corporate Bibliographic Programme is charged with implementing this decision.
The key objectives of the Programme, as summarized on their website, are:
"- To ensure the continuation of essential processes and services, i.e. creating, maintaining and providing access to catalogue data;
- to make these processes and services more efficient and effective; and
- to provide a basis for future developments which will support the Library's strategic objectives and be in line with the Library's information systems strategy."
The Library of Congress Catalogs can be searched using four different methods: a) Word Search; b) Browse Search; c) Command Search; and d) Experimental Search System (ESS).
a) The Word Search's Z39.50 Gateway provides a simple search form for author and title queries and an advanced search form allowing the use of Boolean operators (and, or, and not), with searches for subjects, names, titles, series, notes, and various numbers. Some of these records have direct links to digitized materials.
b) The Browse Search allows the user to browse and then select from alphabetical indexes for the Library's catalogs, including subject cross references. One can browse by subject, author (personal, corporate), conference, title, series, Library of Congress Classification (partial call number), Dewey Decimal Number, and standard numbers like the ISBN (international standard book number), the ISSN (international standard serial number), and the LCCN (Library of Congress control number).
c) The Command Search allows the use of commands which can be typed to search for words and to browse indexes for the Library's catalogs, and for additional non-catalog files. This method provides access to LOCIS (the Library of Congress Information System, which is the original mainframe-based retrieval system), with browsable indexes, word searches, Boolean combinations, various display options, set creation, and advanced features for limiting and refining search results. This method requires the Internet Telnet function (either Telnet or tn3270) in order to connect to LOCIS. The Telnet capability comes with most WWW browsers, but must be configured.
d) The Experimental Search System (ESS), currently located in the LC Web research and development area, supports relevancy-ranked searching of catalog records, as well as sorting and e-mailing search results. Special search features include analyzing results by subject heading and "browsing" the shelf for items with similar LC call numbers. Some of these records have direct links to digitized materials, including selected full-text, image, video and audio files, at the Library of Congress and elsewhere. This is a test system and results may not be all inclusive.
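As a rough illustration of how the Boolean operators mentioned under the Word Search combine search terms, the sketch below builds a tiny inverted index and evaluates "and", "or" and "not" as set operations. The sample records and all function names are invented for this example; it is not the Library of Congress's gateway or search software.

```python
# Hypothetical sample records (id -> title words); not real catalog data.
RECORDS = {
    1: "history of printing in england",
    2: "history of french literature",
    3: "printing and french literature",
}


def build_index(records):
    """Map each word to the set of record ids whose title contains it."""
    index = {}
    for rec_id, title in records.items():
        for word in title.split():
            index.setdefault(word, set()).add(rec_id)
    return index


INDEX = build_index(RECORDS)


def hits(word):
    """Records containing the word (empty set if the word is unknown)."""
    return INDEX.get(word, set())


def search_and(a, b):
    """Records containing both words."""
    return hits(a) & hits(b)


def search_or(a, b):
    """Records containing at least one of the words."""
    return hits(a) | hits(b)


def search_not(a, b):
    """Records containing a but not b."""
    return hits(a) - hits(b)


print(sorted(search_and("history", "french")))   # [2]
print(sorted(search_or("printing", "french")))   # [1, 2, 3]
print(sorted(search_not("history", "french")))   # [1]
```

Each operator narrows or widens the result set: "and" intersects, "or" unites, and "not" subtracts.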
The catalog records relate to books (9,543,910 as of December 10, 1998), maps (171,756), serials (825,664), prints and photographs (68,135), manuscripts (10,698), music (209,142), visual materials (278,771) and software (6,318). As explained on the website:
"The Experimental Search System (ESS) is one of the Library of Congress' first efforts to make selected cataloging and digital library resources available over the World Wide Web by means of a single, point-and-click interface. The interface consists of several search query pages (Basic, Advanced, Number, and a Browse screen) and several search results pages (an item list of brief displays and an item full display), together with brief help files which link directly from significant words on those pages. By exploiting the powerful synergies of hyperlinking and a relevancy-ranked search engine (InQuery from Sovereign Hill Software), we hope the ESS will provide a new and more intuitive way of searching the traditional OPAC (on-line public access catalog). […]
Besides the cataloging records for over 4 million books (including JACKPHY records not currently available through SCORPIO); 263,000 motion pictures, videos, filmstrips and other visual work; 200,000 sound recordings and musical scores; more than 150,000 maps; and 4,300 computer files - i.e., LC cataloging records created since 1968 - ESS also contains the cataloging for almost 140,000 photographs and manuscripts in the National Digital Library Program's American Memory, linking to more than 70,000 digital photographs and images available on-line. By indexing the works selected and organized by The On-Line Books Page at Carnegie Mellon University, links are also provided to the full-text of over 2,500 on-line books from sites across the Internet. Even early motion pictures are available for searching and viewing once the proper viewer is installed. (Hint: try searching on the subject heading 'shorts' in the Photographs, Manuscripts, Movies collection.)"
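The difference between a Boolean search and the relevancy-ranked searching offered by ESS can be sketched as follows. The source does not describe how the InQuery engine computes relevance, so this example substitutes a deliberately simple score (the number of times the query words occur in a record); the records and the function name are hypothetical.

```python
# Hypothetical sample records (id -> description); not real catalog data.
RECORDS = {
    1: "maps of early america",
    2: "early motion pictures",
    3: "motion pictures and motion studies",
}


def rank(records, query):
    """Return ids of matching records, best match first.

    Score = total occurrences of the query words in the record.
    This is an illustrative stand-in, not the InQuery algorithm.
    """
    terms = query.lower().split()
    scored = []
    for rec_id, text in records.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scored.append((score, rec_id))
    # Highest score first; ties broken by record id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rec_id for _, rec_id in scored]


print(rank(RECORDS, "motion pictures"))  # [3, 2]
```

Unlike a Boolean search, which only says whether a record matches, a ranked search orders the matches so that the most relevant records appear at the top of the list.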
Except for their prohibitive costs, the commercial databases give us an idea of what the catalogs could be in the future: for the past several years the Dialog Corporation, Lexis-Nexis or UnCover have been using their catalogs to provide on-line documents.
Based in London, United Kingdom, with regional headquarters in Mountain View, California, and Hong Kong, the Dialog Corporation is a major on-line information company, with 900 main databases (the best known being Dialog and Profound) serving over 20,000 corporate clients in 120 countries. Content areas include: news and media; medicine; pharmaceuticals; chemicals; reference; social sciences; business and finance; food and agriculture; intellectual property; government and regulations; science and technology; and energy and environment.
LEXIS-NEXIS is an international provider of enhanced information services and management tools using on-line, Internet, CD-ROM and hardcopy formats for a variety of professionals. It serves customers in more than 60 countries. The 25-year-old company has introduced Web products for business, legal and academic research, current awareness, and both standard and customizable tracking of competitive and business subjects and companies on a daily basis.
A service of CARL Corporation, UnCover is both a fax reprint service and the world's largest database of magazine and journal articles, with current article information taken from well over 17,000 multidisciplinary journals. UnCover contains brief descriptive information for over 7,000,000 articles which have appeared since Fall 1988. Any Internet surfer can use the free keyword access to article titles and summaries.
8.2. International Bibliographic Databases
Two organizations, the OCLC Online Computer Library Center and the Research Libraries Information Network (RLIN), run international databases of bibliographic information through the Internet.
The OCLC Online Computer Library Center is a nonprofit, membership, library computer service and research organization dedicated to the public purposes of furthering access to the world's information and reducing information costs. More than 27,000 libraries in 65 countries use OCLC services to manage their collections and to provide on-line reference services. The site is available in English, Chinese, French, German, Portuguese, and Spanish.
OCLC Services include: access services; collections and technical services; reference services; resource sharing; Dewey Decimal Classification (published by OCLC Forest Press); and preservation resources. From its headquarters in Dublin, Ohio, OCLC operates one of the world's largest library information networks. Libraries in the United States join OCLC through their OCLC-affiliated Regional Networks. Libraries outside the United States receive OCLC services through OCLC Asia Pacific, OCLC Canada, OCLC Europe, OCLC Latin America and the Caribbean, or via international distributors.
OCLC also runs WorldCat, the name of the OCLC Online Union Catalog, which is a merged electronic catalog of libraries around the world, and probably the world's largest bibliographic database with its 38 million records (at the beginning of 1998) in 400 languages (with transliteration for non-Roman languages), and an annual increase of 2 million bibliographic records.
WorldCat is based on a concept common to all union catalogs: saving time by avoiding the cataloging of the same document by many catalogers worldwide. When they are about to catalog a publication, the catalogers of the member libraries search the OCLC catalog. If they find the corresponding record, they copy it into their own catalog and add some local information. If they don't find the record, they create it in the OCLC catalog, and this new record is immediately available to all the catalogers of the member libraries worldwide.
Unlike RLIN, another international bibliographic database (see below) which accepts several records for the same document, the OCLC Online Union Catalog keeps only one record per document, and emphatically requests its members not to create duplicate records for documents which have already been cataloged. The records are created in USMARC format (MARC: machine readable catalog) according to the Anglo-American Cataloguing Rules, 2nd edition (AACR2).
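The shared-cataloging workflow described above (search the union catalog, copy the record if it exists, otherwise create it) can be sketched in a few lines. This is a minimal model under stated assumptions: the class, the function, the document key and the record fields are all hypothetical, and the flag `one_record_per_document` simply contrasts the one-record policy with a catalog that accepts several records for the same document.

```python
class UnionCatalog:
    """Toy union catalog: document key -> list of shared records."""

    def __init__(self, one_record_per_document=True):
        self.one_record_per_document = one_record_per_document
        self.records = {}

    def search(self, doc_key):
        """Return the shared records for a document (possibly empty)."""
        return self.records.get(doc_key, [])

    def contribute(self, doc_key, record):
        """Add a record, unless policy forbids a second one for the same document."""
        existing = self.records.setdefault(doc_key, [])
        if existing and self.one_record_per_document:
            return existing[0]  # keep the record already in the catalog
        existing.append(record)
        return record


def catalog_item(union, local_catalog, doc_key, description, local_info):
    """Copy the shared record if found, otherwise create it, then add local data."""
    found = union.search(doc_key)
    if found:
        shared = found[0]
    else:
        shared = union.contribute(doc_key, {"description": description})
    local_catalog[doc_key] = {**shared, "local": local_info}
    return local_catalog[doc_key]


union = UnionCatalog()
library_a, library_b = {}, {}
catalog_item(union, library_a, "doc-1", "First description", "Shelf A12")
catalog_item(union, library_b, "doc-1", "Second description", "Shelf B07")
print(library_b["doc-1"]["description"])  # First description
```

The second library never re-describes the document: it reuses the first library's record and only adds its own shelf information.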
What is the history of OCLC? According to the website:
"In 1967, the presidents of the colleges and universities in the state of Ohio founded the Ohio College Library Center (OCLC) to develop a computerized system in which the libraries of Ohio academic institutions could share resources and reduce costs.
OCLC's first offices were in the Main Library on the campus of the Ohio State University (OSU), and its first computer room was housed in the OSU Research Center. It was from these academic roots that Frederick G. Kilgour, OCLC's first president, oversaw the growth of OCLC from a regional computer system for 54 Ohio colleges into an international network. In 1977, the Ohio members of OCLC adopted changes in the governance structure that enabled libraries outside Ohio to become members and participate in the election of the Board of Trustees; the Ohio College Library Center became OCLC, Inc. In 1981, the legal name of the corporation became OCLC Online Computer Library Center, Inc. Today, OCLC serves more than 27,000 libraries of all types in the U.S. and 64 other countries and territories."
Both complementary to and different from the OCLC Online Union Catalog (WorldCat) with its 38 million records (with one record per document), the Research Libraries Information Network (RLIN) includes 88 million records (with several records per document).
RLIN is run by the Research Libraries Group (RLG). The central RLIN database is a union catalog of nearly 88 million items held in comprehensive research libraries and special libraries in RLG member institutions, plus over 100 additional law, technical, and corporate libraries using RLIN. It includes:
a) Records that describe works cataloged by the Library of Congress, the National Library of Medicine, the U.S. Government Printing Office, CONSER (Conversion of Serials Project), The British Library, the British National Bibliography, the National Union Catalog of Manuscript Collections, and RLG's members and users;
b) Comprehensive representation of books cataloged since 1968 and rapidly expanding coverage for older materials;
c) Information about non-book materials ranging from musical scores, films, videos, serials, maps, and recordings, to archival collections and machine-readable data files;
d) Unique on-line access to special resources, such as the United Nations' DOCFILE and CATFILE records, and the Rigler and Deutsch Index to pre-1950 commercial sound recordings; and
e) International book vendors' in-process records that can be transferred by bibliographers, acquisitions librarians, and catalogers to create citations, order records, and cataloging in their local systems.
In RLIN, particularly valuable sources of processing information are available on-line:
a) A catalog of computer files: Machine-readable data files are of value to a growing number of disciplines. RLIN contains records describing a wide array of such files, from the full-text French literary works in the ARTFL Database to the statistical data collected by the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan;
b) A catalog of archives and special collections: The archival and manuscript collections of research libraries, museums, state archives, and historical societies contain essential primary resources, but information about their contents has often been elusive. Archivists and curators worked with RLG to create an automated format for these collections. There are close to 500,000 records available in RLIN for archival collections located throughout North America. These records analyze many collections by personal name, organization, subject, and format.
Complementing the central bibliographic files of RLIN is the English Short Title Catalogue (ESTC), an invaluable research tool for scholars in English culture, language, and literature. This file provides extensive descriptions and holdings information for letterpress materials printed in Great Britain or any of its dependencies in any language, from the beginnings of print to 1800 - as well as for materials printed in English anywhere else in the world. Produced by the ESTC editorial offices at the University of California, Riverside, and the British Library, in partnership with the American Antiquarian Society and over 1,600 libraries worldwide, the file continues to be updated and expanded daily. ESTC serves as a comprehensive bibliography of the hand-press era and as a census of surviving copies.
ESTC included 420,000 records as of June 1998. It contains records for items of all types published in Great Britain and its dependencies or in English anywhere in the world from the beginnings of print (1473) through the 18th century - including materials ranging from Shakespeare and Greek New Testaments to anonymous ballads, broadsides, songs, advertisements and other ephemera. Extensive indexing includes imprint word, place, genre, and year as well as copy-specific notes. Searches may also be limited by date, language and country of publication.
8.3. Future Trends for On-Line Catalogs
The future of catalogs is linked to the harmonization of the MARC format. While MARC is an acronym for Machine Readable Catalogue or Cataloguing, this general description is rather misleading as MARC is neither a kind of catalogue nor a method of cataloguing. According to UNIMARC: An Introduction, a document of the Universal Bibliographic Control and International MARC Core Programme, MARC is "a short and convenient term for assigning labels to each part of a catalogue record so that it can be handled by computers. While the MARC format was primarily designed to serve the needs of libraries, the concept has since been embraced by the wider information community as a convenient way of storing and exchanging bibliographic data."
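To make the idea of "assigning labels to each part of a catalogue record" concrete, here is a small sketch of a record held as labelled fields. The tags 100, 245 and 260 follow common USMARC usage (main entry, title statement, publication details), but the record itself is invented, and the in-memory layout and function name are assumptions made for this example rather than the actual MARC exchange format.

```python
# A made-up sample record: a list of (tag, subfields) pairs.
# Tags follow common USMARC usage; the values are illustrative only.
RECORD = [
    ("100", {"a": "Dante Alighieri"}),             # main entry: personal name
    ("245", {"a": "The Divine Comedy"}),            # title statement
    ("260", {"a": "London", "c": "1998"}),          # place and date (invented)
]


def get_subfield(record, tag, code):
    """Return the first value stored under the given tag and subfield code."""
    for field_tag, subfields in record:
        if field_tag == tag and code in subfields:
            return subfields[code]
    return None


print(get_subfield(RECORD, "245", "a"))  # The Divine Comedy
print(get_subfield(RECORD, "260", "c"))  # 1998
```

Because every element carries a label, a program can pick out the title or the date without having to interpret the text of the record.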
MARC II established certain principles which have been followed consistently over the years. In general terms, the MARC communication format is intended to be:
"- hospitable to all kinds of library materials;
- sufficiently flexible for a variety of applications in addition to catalogue production; and
- usable in a range of automated systems."
Over the years, however, despite cooperation efforts, several versions of MARC emerged, e.g. UKMARC, INTERMARC and USMARC, whose paths diverged because of different national cataloguing practices and requirements. Since the early 1970s an extended family of more than 20 MARC formats has evolved. Differences in data content mean that editing is required before records can be exchanged.
One solution to the problem of incompatibility was to create an international MARC format (UNIMARC) which would accept records created in any MARC format. Records in one MARC format could be converted into UNIMARC and then be converted into another MARC format, so that each national agency would need to write only two programs - one to convert into UNIMARC and one to convert from UNIMARC - instead of one program for each other MARC format (e.g. INTERMARC to UKMARC, USMARC to UKMARC, etc.).
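The saving from a common exchange format can be worked out with simple arithmetic and a toy conversion. In the sketch below the format names, field labels and function names are all hypothetical; real MARC formats differ in far more than label names. With 20 formats, direct conversion needs 20 x 19 = 380 one-way programs in total, while conversion through a hub needs only 2 x 20 = 40.

```python
def direct_converters(n_formats):
    """One-way programs needed if every format converts directly to every other."""
    return n_formats * (n_formats - 1)


def hub_converters(n_formats):
    """One-way programs needed if every format converts only to and from a hub."""
    return 2 * n_formats


# Hypothetical label mappings from two national formats to a hub format.
TO_HUB = {
    "FORMAT_A": {"ttl": "title", "aut": "author"},
    "FORMAT_B": {"T": "title", "A": "author"},
}


def to_hub(fmt, record):
    """Rename a record's labels from a national format to the hub labels."""
    return {TO_HUB[fmt][label]: value for label, value in record.items()}


def from_hub(fmt, hub_record):
    """Rename hub labels back to a national format's labels."""
    reverse = {hub: label for label, hub in TO_HUB[fmt].items()}
    return {reverse[name]: value for name, value in hub_record.items()}


def convert(src, dst, record):
    """Convert between two national formats by passing through the hub."""
    return from_hub(dst, to_hub(src, record))


print(direct_converters(20), hub_converters(20))  # 380 40
print(convert("FORMAT_A", "FORMAT_B", {"ttl": "Capital", "aut": "Marx"}))
```

Adding a new national format to the hub scheme requires two new programs, whatever the number of formats already in use.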
In 1977 the International Federation of Library Associations and Institutions (IFLA) published UNIMARC: Universal MARC format, followed by a second edition in 1980 and a UNIMARC Handbook in 1983, all focussed primarily on the cataloguing of monographs and serials, and taking advantage of international progress towards the standardization of bibliographic information reflected in the ISBDs (international standard bibliographic descriptions). In the mid-1980s it was considered necessary to expand UNIMARC to cover documents other than monographs and serials, so a new description of the format - the UNIMARC Manual - was produced in 1987. By this time UNIMARC had been adopted by several bibliographic agencies as their in-house format. But developments did not stop there. Increasingly, a new kind of format - an authorities format - was being used. As described on the website:
"Previously agencies had entered an author's name into the bibliographic format as many times as there were documents associated with him or her. With the new system they created a single authoritative form of the name (with references) in the authorities file; the record control number for this name was the only item included in the bibliographic file. The user would still see the name in the bibliographic record, however, as the computer could import it from the authorities file at a convenient time. So in 1991 UNIMARC/Authorities was published."
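The mechanism in this quotation can be sketched as a small data structure: the bibliographic file stores only a control number, and the name is imported from the authorities file when a record is displayed. All names, control numbers and field keys below are hypothetical placeholders, not UNIMARC/Authorities fields.

```python
# Hypothetical in-memory model of an authorities file and a bibliographic file.

AUTHORITIES = {
    "A0001": {
        "authorized_name": "Example, Author",  # the single authoritative form
        "see_from": ["Example, A."],           # references from variant forms
    },
}

# Each bibliographic record carries only the control number, not the name.
BIBLIOGRAPHIC = [
    {"title": "First example title", "author_control_number": "A0001"},
    {"title": "Second example title", "author_control_number": "A0001"},
]


def display_record(bib: dict, authorities: dict) -> dict:
    """Build the record shown to the user by importing the authorized name."""
    authority = authorities[bib["author_control_number"]]
    return {"title": bib["title"], "author": authority["authorized_name"]}


def rename_author(authorities: dict, control_number: str, new_name: str) -> None:
    """Correct the authorized form once; every linked record picks it up."""
    authorities[control_number]["authorized_name"] = new_name


if __name__ == "__main__":
    for bib in BIBLIOGRAPHIC:
        print(display_record(bib, AUTHORITIES))
```

The design choice is the usual one for avoiding duplicated data: a correction made through `rename_author` is entered once in the authorities file rather than once per document.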
The Permanent UNIMARC Committee, charged with regularly supervising the development of the format, came into being that year, as users realized that continuous maintenance - not just the occasional rewriting of manuals - was needed. In maintaining the format, care is taken to make changes upwardly compatible.
In the context of MARC harmonization, The British Library (using UKMARC), the Library of Congress (using USMARC) and the National Library of Canada (using CAN/MARC) are in the process of harmonizing their national MARC formats. A three-year program to achieve a common MARC format was agreed on by the three libraries in December 1995.
Other organizations recommend the use of SGML (standard generalized markup language) as a common format for the bibliographic records and the corresponding hypertextual and multimedia documents.
As most publishers use the SGML format to store their documents, a convergence between MARC and SGML is expected to occur. The Library of Congress set up the DTD (document type definition, which defines a document's logical structure) for the USMARC format, because it will probably sell more and more data both in SGML and in USMARC. A DTD for the UNIMARC format has also been developed within the European Union. In his study L'accès aux catalogues des bibliothèques par Internet (The Access to Library Catalogs through the Internet), Thierry Samain specifies that some libraries choose the SGML format to encode their bibliographic data. In the Belgian Union Catalog, for example, the use of SGML allows one first to add descriptive elements stemming from the MARC format and other formats, and second to facilitate the production of the annual CD-ROM.
The libraries also have to adapt their thesauri and their key-word lists. In international bibliographic databases like the OCLC Online Union Catalog, the absence of a universal thesaurus is a real problem when you try to find documents using the search by subjects. In Europe, each country uses thesauri or key-word lists in its own language, whereas multilingual thesauri would be essential.
Another problem is the harmonization of software. From January to December 1997, ONE (OPAC Network in Europe) was a collaborative project involving 15 organizations in eight European countries. This project provided library users with better ways to access library OPACs (online public access catalogs) and national catalogs, and stimulated and facilitated interworking between libraries in Europe.
Because of international rules, catalog records are often much more difficult to establish today than in the past. That is why nowadays libraries often hire full-time catalogers. Because of the knowledge and the training it requires, cataloging has become a specialty in librarianship.
In a few years, catalogs on the Web will no longer be "only" a collection of records. Today, finding a record is often a prelude to a difficult time finding the document itself, because of the forms to fill out and the difficulties of interlibrary loans. Catalogs on the Web will give instant access to the documents on the screen. This is already true in an experimental way for a few thousand documents, but has to be progressively widened to all catalogs.
[In this chapter:]
[9.1. Print Media and the Internet / 9.2. Intellectual Property / 9.3. Multimedia Convergence / 9.4. The Information Society]
9.1. Print Media and the Internet
As shown throughout this study, the Internet is opening new perspectives in all the sectors of the print media.
In any field (literature, sciences, technology, etc.), authors can create a website to post their works - they no longer need to wait for a publisher to distribute them. And, thanks to e-mail, communication with their readers has become much easier.
On-line booksellers are able not only to sell books published in their own country, but also sell foreign books or sell abroad, or both. The readers can read on their screen excerpts or full texts of books. Many on-line bookstores offer an extensive literary magazine with an editorial content which changes every day.
The dream of catalog managers to be able to give access to a document through its bibliographic record is no longer totally utopian. It is already the case for a few thousand works belonging to the public domain. Organizations are also studying the possibility of posting commercial documents on the Web in return for a royalty fee corresponding to the copyright, which could be paid by credit card.
Libraries have a new tool for letting the public know their collections better, and for developing projects for real or potential users. The Internet is also a gigantic encyclopedia, easily available for consultation by the libraries' staff and readers.
The latest issues of many newspapers and magazines are available on-line, as well as "dossiers" on current events and archives equipped with a search engine to find information from previous issues. We are also witnessing the first steps of an on-line press which would be different from the paper version and would have its own criteria. Some publishers of specialized periodicals, as well as academic and research works, are thinking about becoming "only" electronic to escape the paper publishing crisis, or making only small print runs when necessary.
Besides this gigantic and lively encyclopedia, the people working in these different fields can increase exchanges thanks to electronic mail and discussion forums. For once, a (relatively) cheap new tool permits people to communicate quickly and worldwide with no concern for time and boundaries.
The disruption of the print media by the Internet has led to new perspectives for intellectual property and regulations about cyberspace. The so-called "multimedia convergence" has led to major changes in jobs. We are living through the first years of the information society. Will this society provide any changes for the better?
9.2. Intellectual Property
The massive arrival of electronic texts on the Web is a real problem for applying the rules relating to intellectual property. Digital libraries, for example, would like to post commercial documents but can't do so yet, until there is a system allowing the surfer to pay the equivalent royalties. With a few clicks, any text or article posted on the Internet can be very easily retrieved and copied - much more easily than by photocopying - without its author being paid for the use of his text. And what about all the hyperlinks giving access to all kinds of documents from one website?
The World Intellectual Property Organization (WIPO), an intergovernmental organization which is one of the 16 specialized agencies of the United Nations System of Organizations, says on its website:
"As regards the number of literary and artistic works created worldwide, it is difficult to make a precise estimate. However, the information available indicates that at present around 1,000,000 books/titles are published and some 5,000 feature films are produced in a year, and the number of copies of phonograms sold per year presently is more than 3,000 million."
WIPO is responsible for the promotion of the protection of intellectual property throughout the world through cooperation among States, and for the administration of various multilateral treaties dealing with the legal and administrative aspects of intellectual property. Intellectual property comprises two main branches: (1) industrial property, chiefly in inventions, trademarks, industrial designs, and appellations of origin; and (2) copyright, chiefly in literary, musical, artistic, photographic and audiovisual works.
Copyright protection generally means that certain uses of the work are lawful only if they are done with the authorization of the owner of the copyright. As explained by WIPO in International Protection of Copyright and Neighboring Rights, the most typical are the following:
"the right to copy or otherwise reproduce any kind of work; the right to distribute copies to the public; the right to rent copies of at least certain categories of works (such as computer programs and audiovisual works); the right to make sound recordings of the performances of literary and musical works; the right to perform in public, particularly musical, dramatic or audiovisual works; the right to communicate to the public by cable or otherwise the performances of such works and, particularly, to broadcast, by radio, television or other wireless means, any kind of work; the right to translate literary works; the right to rent, particularly, audiovisual works, works embodied in phonograms and computer programs; the right to adapt any kind of work and particularly the right to make audiovisual works thereof."
Under some national laws, some of these rights - which together are referred to as 'economic rights' - are not exclusive rights of authorization but, in certain specific cases, merely rights to remuneration. In addition to economic rights, authors (whether or not they own the economic rights) enjoy 'moral rights' on the basis of which authors have the right to claim their authorship and require that their names be indicated on the copies of the work and in connection with other uses thereof, and they have the right to oppose the mutilation or deformation of their works.
Started in July 1993, the International Trade Law (ITL) Monitor was one of the very first law-related WWW sites, and the first dedicated to a particular area of law. The site is run by Ralph Amissah, and hosted by the Law Faculty of the University of Tromso, Norway. The section relating to Protection of Intellectual Property gives access to various documents, including the European Commission Legal Advisory Board (LAB): Intellectual Property.
Until the payment of royalties for copyright is possible on the Web, digital libraries focus on 19th-century texts, or older texts, which belong to the public domain. In many countries, a text enters the public domain 50 years after its author's death.
In Clearing an Etext for Copyright, Michael Hart gives Project Gutenberg's volunteers some rules of thumb for them to determine when works enter the public domain. For the United States:
"a) Works first published before January 1, 1978 usually enter the public domain 75 years from the date copyright was first secured, which is usually 75 years from the date of first publication. (This is the rule Project Gutenberg uses most often.)
b) Works first created on or after January 1, 1978 enter the public domain 50 years after the death of the author if the author is a natural person. (Nothing will enter the public domain under this rule until at least January 1, 2023.)
c) Works first created on or after January 1, 1978 which are created by a corporate author enter the public domain 75 years after publication or 100 years after creation whichever occurs first. (Nothing will enter the public domain under this rule until at least January 1, 2053.)
d) Works created before January 1, 1978 but not published before that date are copyrighted under rules 2 and 3 [here b) and c)] above, except that in no case will the copyright on a work not published prior to January 1, 1978 expire before December 31, 2002. (This rule copyrights a lot of manuscripts that we would otherwise think of as public domain because of their age.)
e) If a substantial number of copies were printed and distributed in the U.S. without a copyright notice prior to March 1, 1989, the work is in the public domain in the U.S."
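The five rules of thumb above can be read as a small decision procedure. The following is only a sketch of the rules as quoted, working at year granularity. It is not legal guidance, it ignores the copyright extension discussed further on, and the `Work` fields and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

CUTOFF_YEAR = 1978      # rules a) to d) hinge on January 1, 1978
RULE_D_FLOOR = 2002     # rule d): no expiry before December 31, 2002


@dataclass
class Work:
    created: int                          # year of creation
    published: Optional[int] = None       # year of first publication, if any
    author_death: Optional[int] = None    # year of death (natural-person author)
    corporate: bool = False               # True for a corporate author
    no_notice_before_march_1989: bool = False  # rule e) applies


def classify(work: Work) -> Tuple[str, Optional[int]]:
    """Return (rule letter, estimated expiry year); None means already public domain."""
    if work.no_notice_before_march_1989:
        return ("e", None)
    if work.published is not None and work.published < CUTOFF_YEAR:
        return ("a", work.published + 75)

    # Remaining works were created from 1978 on, or created earlier
    # but not published before 1978.
    if work.corporate:
        candidates = [work.created + 100]
        if work.published is not None:
            candidates.append(work.published + 75)
        rule, year = "c", min(candidates)   # whichever occurs first
    else:
        if work.author_death is None:
            raise ValueError("rule b) needs the author's year of death")
        rule, year = "b", work.author_death + 50

    if work.created < CUTOFF_YEAR:
        return ("d", max(year, RULE_D_FLOOR))
    return (rule, year)


if __name__ == "__main__":
    print(classify(Work(created=1900, published=1900)))
    print(classify(Work(created=1850, author_death=1900)))
```

The second call shows the effect of rule d): an old, unpublished manuscript is still treated as protected until the floor year, which is why the quoted text warns that age alone is not a reliable guide.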
When Project Gutenberg distributes in the United States, U.S. law applies. When it distributes to other countries, local law applies.
Project Gutenberg and The On-Line Books Page, among others, are concerned with the new Copyright Extension. On October 28, 1998, John Mark Ockerbloom wrote in the News of The On-Line Books Page:
"The copyright extension bill mentioned in the October 9 news item is now law, having been signed by President Clinton on October 27. This will prevent books published in 1923 and later that are not already in the public domain from entering the public domain in the United States for at least 20 years.
I have started a page to provide access to copyright renewal records, which eventually should make it easier to find books published after 1922 that have entered the public domain due to nonrenewal. I welcome contributions of additional records, in page image, text, or HTML format.
Although the bill has become law, I would encourage readers to speak loudly in support of the public domain. Congressional testimony indicates that some in the entertainment industry favor even longer copyright periods, effectively preventing anything further from ever entering the public domain. Your voice is needed to help stop this from happening."
Journalists, too, are particularly concerned by this problem of intellectual property rights. During the ILO Symposium on Multimedia Convergence held in January 1997, Bernie Lunzer, Secretary-Treasurer of the Newspaper Guild, United States, stated:
"There is a huge battle over intellectual property rights, especially with freelancers, but also with our members who work under collective bargaining agreements. The freelance agreements that writers are asked to sign are shocking. Bear in mind that freelance writers are paid very little. They turn over all their future rights - reuse rights - to the publisher and very little in exchange. Publishers are fighting for control and ownership of product, because they are seeing the future."
Another participant in this Symposium, Heinz-Uwe Rübenach, of the Federal Association of German Newspaper Publishers (Bundesverband Deutscher Zeitungsverleger), said:
"Copyright is one of the keys to the future information society. If a publishing house which offers the journalist work, even on an on-line service, is not able to manage and control the use of the resulting product, then it will not be possible to finance further investments in the necessary technology. Without that financing, the future becomes less positive and jobs can suffer. If, however, publishers see that they are able to make multiple use of their investment, then obviously this is beneficial for all. Otherwise the costs associated with on-line services would increase considerably. As far as the European market is concerned, this would only increase competitive pressures, since United States publishers do not have to pay for multiple uses."
DOI: The Digital Object Identifier System is an identification system for intellectual property in the digital environment. Developed by the DOI Foundation on behalf of the publishing industry, its goals are to provide a framework for managing intellectual content, link customers with publishers, facilitate electronic commerce, and enable automated copyright management.
The Introduction to the Digital Object Identifier specifies:
"The Internet represents a totally new environment for commerce. As such, it requires new enabling technologies to protect both customer and publisher. Systems will have to be developed to authenticate content to insure that what the customer is requesting is what is being delivered. At the same time, the creator of the information must be sure that the copyright in the content is respected and protected.
In considering the new systems required, international book and journal publishers realized that a first step would be the development of a new identification system to be used for all digital content. This Digital Object Identifier (DOI) system not only provides a unique identification for that content, but also a way to link users of the materials to the rights holders themselves to facilitate automated digital commerce in the new digital environment.
Developed and tested over the last year, the DOI system is now being used by more than a dozen U.S. and European publishers in a pilot program that has been running since July. Participation in Phase Two of the Prototype was extended to all publishers at the Frankfurt Book Fair in October 1997."
Penny Pagano, a former Washington correspondent for the Los Angeles Times, is a Washington, D.C.-based freelance writer. In Intellectual Property Rights and the World Wide Web, an article published in AJR/NewsLink, she wrote: "Today, those who create information and those who publish, distribute and repackage it are finding themselves at odds with each other over the control of electronic rights."
Among the many comments quoted by Penny Pagano is that of Dan Carlinsky, writer and vice president of the American Society of Journalists and Authors, in New York:
"'The electronic explosion has changed the entire nature of the business,' Carlinsky says. In the past, articles sold to a periodical essentially 'turned into a pumpkin with no value' once they were published. 'But the electronic revolution has extended the shelf life of content of periodicals. You can now take individual articles and put them into a virtual bookstore or put them on a virtual newsstand.'
The second major change in recent years, he says, is 'an increasing trend to more and more publications being owned by fewer larger and larger companies that tend to be international media conglomerates. They are connected corporately with an enormous array of enterprises that might be interested in secondary use of materials'."
To get secondary rights, "the National Writers Union has created a new agency called the Publication Rights Clearinghouse (PRC). Based on the music industry's ASCAP [American Society of Composers, Authors, and Publishers], PRC will track individual transactions and pay out royalties to writers for secondary rights for previously used articles. For $20, freelance writers who have secondary rights to previously published articles can enroll in PRC. These articles become part of a PRC file that is licensed to database companies." Several companies participate, including UnCover, both a fax reprint service and the world's largest database of magazine and journal articles.
9.3. Multimedia Convergence
Because of computerization and communication technologies, previously distinct information-based industries, such as printing and publishing, graphic design, the media, sound recording and film-making, are converging into one industry. Information is their common product.
Wilfred Kiboro, Managing Director and Chief Executive of Nation Printers and Publishers Ltd, Kenya, made the following comments during the ILO Symposium on Multimedia Convergence held in January 1997:
"In content creation in the multimedia environment, it is very difficult to know who the journalist is, who the editor is, and who the technologist is that will bring it all together. At what point will telecom workers become involved as well as the people in television and other entities that come to create new products? Traditionally in the print media, for instance, we had printers, journalists, sales and marketing staff and so on, but now all of them are working on one floor from one desk."
Journalists and editors working on-screen could go directly from text to page make-up, which eliminated the need for rekeying and shifted preliminary typesetting functions from the production to the editorial staff. In book publishing, digitization has speeded up the editorial process, which used to be sequential, by allowing the copy editor, the art editor and the layout staff to work at the same time on the same book.
Employers try to convince us that the use of new information and communication technologies will create new jobs, whereas unions are sure of the contrary.
Heinz-Uwe Rübenach, of the Federal Association of German Newspaper Publishers (Bundesverband Deutscher Zeitungsverleger), carried out an inquiry relating to the on-line services and the staff of European newspaper publishers.
"The responses revealed that in the United Kingdom, Denmark, Sweden, Finland and France there were on average three employees, that is journalists, in each on-line service. These were newly employed people who had not originally come from more conventional newspaper activities. In Germany, an average of six permanent jobs are created per on-line service and roughly five freelance positions as well. There were no jobs lost in publishing houses as a result of the new activities of newspapers in on-line services. These figures, while not totally representative or complete, do indicate a general trend, which is that when newspapers add on-line services to their activities, jobs are created."
However, it is hard to accept that the information society will generate jobs, and it is already widely reported that multimedia convergence leads to massive job losses. At the same Symposium, Michel Muller, Secretary-General of the French Federation of Book, Paper and Communication Industry (Fédération des industries du livre, du papier et de la communication), stated that, in France, the graphics industry had lost 20,000 jobs - falling from 110,000 to 90,000 - within the last decade, and that very expensive social plans had been necessary to re-employ those people. He explained:
"If the technological developments really created new jobs, as had been suggested, then it might have been better to invest the money in reliable studies about what jobs were being created and which ones were being lost, rather than in social plans which often created artificial jobs. These studies should highlight the new skills and qualifications in demand as the technological convergence process broke down the barriers between the printing industry, journalism and other vehicles of information. Another problem caused by convergence was the trend towards ownership concentration. A few big groups controlled not only the bulk of the print media, but a wide range of other media, and thus posed a threat to pluralism in expression. Various tax advantages enjoyed by the press today should be re-examined and adapted to the new realities facing the press and multimedia enterprises. Managing all the social and societal issues raised by new technologies required widespread agreement and consensus. Collective agreements were vital, since neither individual negotiations nor the market alone could sufficiently settle these matters."
Quite theoretical compared to the unionists' interventions, the answer of Walter Durling, Director of AT&T Global Information Solutions, was that humanity must not fear technology:
"Technology would not change the core of human relations. More sophisticated means of communicating, new mechanisms for negotiating, and new types of conflicts would all arise, but the relationships between workers and employers themselves would continue to be the same. When film was invented, people had been afraid that it could bring theatre to an end. That has not happened. When television was developed, people had feared that it would do away with cinemas, but it had not. One should not be afraid of the future. Fear of the future should not lead us to stifle creativity with regulations. Creativity was needed to generate new employment. The spirit of enterprise had to be reinforced with the new technology in order to create jobs for those who had been displaced. Problems should not be anticipated, but tackled when they arose."
Is it true? People are not so much afraid of the future as they are afraid of losing their jobs. The problem is more the context of a society with a high rate of unemployment, which was not the case when film was invented and television developed. In the information society, what is, and what will be, the ratio of jobs created to jobs lost?
Unions fight worldwide for job creation through investment and innovation, vocational training in the use of new technologies, retraining of workers whose jobs are abolished, fair conditions for the setting-up of contracts and collective agreements, the defense of copyright, better protection of workers in the artistic field, and the defense of teleworkers as full workers. According to the estimates of the European Commission, there should be 10 million European teleworkers in the year 2000, which would represent 20% of the number of teleworkers worldwide.
Despite all the unions' efforts, will the situation become as tragic as the one described in a report of the International Labour Organization (ILO) suggesting that "in the information age individuals will be 'forced to struggle for survival in an electronic jungle' with 'survival mechanisms' which have been developed over previous decades 'sorely tested by change'…"?
In Cyberplanète: notre vie en temps virtuel (Cyberplanet: our life in virtual time) (Paris, Editions Autrement, 1998), Philip Wade and Didier Falkand state that the United States, Canada and Japan, which are the countries investing the most in new technologies, are also the ones that create the most jobs. A study carried out in February 1997 by Booz.Allen & Hamilton for European Ministers of Industry showed that the European delay cost one million jobs in 1995 and 1996, because technological growth was only 2.4% (compared to 9.3% in the United States). According to another study made in January 1997 for the European Commission, 1.3 million jobs could be maintained or created in the European Union between 1997 and 2005. The 300,000 jobs lost in traditional companies would be compensated by 93,000 jobs created by their competitors and 1.2 million jobs created in the sectors of telecommunications, electric and electronic construction, equipment, and distribution of communication products.
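The figures from the second study can be checked with simple arithmetic. The sketch below assumes, as one plausible reading, that the 1.3 million figure is the rounded sum of the two job-creation numbers, and that the net balance is that sum minus the jobs lost; the function name is hypothetical.

```python
def job_balance(lost: int, created_by_competitors: int, created_in_new_sectors: int):
    """Return (gross jobs created, net jobs after subtracting losses)."""
    gross = created_by_competitors + created_in_new_sectors
    net = gross - lost
    return gross, net


if __name__ == "__main__":
    gross, net = job_balance(
        lost=300_000,
        created_by_competitors=93_000,
        created_in_new_sectors=1_200_000,
    )
    print("gross jobs created:", gross)   # close to the 1.3 million quoted
    print("net of jobs lost:  ", net)
```

On this reading the gross figure rounds to the 1.3 million quoted, while the net gain after the 300,000 losses is a little under one million.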
Will the traditional distinction between library, publishing house, press publisher or bookstore still exist in a few years? Any writer can create a website, and any website can already create a digital library. More and more libraries, bookstores and publishing houses have no walls, no windows and no shelves. Their premises are their websites, and all the transactions are made on the Web. As for distribution, it is still possible to buy newspapers and magazines at the newsstand or to receive them in the letterbox, but more and more people read them on the Web, and more and more periodicals are "only" electronic.
Will the traditional professional groups (booksellers, editors, librarians, publishers, journalists, etc.) established many years ago stay the same while being more cyberspace-oriented and become cyberlibrarians, cyberpublishers, cyberjournalists, cyberbooksellers, etc.? Or will all these professional tasks be restructured into new professions? With the explosion of the Internet, some information specialists and others decided to move over to companies specialized in computing and the Internet.
Even if information specialists or journalists, for example, convince us they will always be useful, on a general scale the employment trends for the future are far from exciting. Will all the people working in the print media be able to get training and retraining in new technologies, or will they be violently hit by unemployment?
9.4. The Information Society
Jean-Paul, a French musician and writer, wrote in his e-mail of June 21, 1998:
"[…] surfing on the Web operates in rays (I have a centre of interest and I methodically click on all the links included in home pages) or in hops and jumps (from one click to another, as they appear). Of course, it is possible with the print medium. But the difference is striking. So the Internet didn't change my life, but my writing. You don't write the same way for a site as for a script, a play, etc."
He also notes that all the Internet functionalities could already be found in the first Macintosh, which revolutionized the relationship between the user and the information.
"It is not the Internet which changed the way I write, it is the first Mac that I discovered through the self-learning of Hypercard. I still remember how astonished I was during the month when I was learning about buttons, links, surfing by analogies, objects or images. The idea that a simple click on one area of the screen allowed me to open a range of piles of cards, and each card could offer new buttons and each button opened on to a new range, etc. In brief, the learning of everything on the Web that today seems really banal, for me it was a revelation (it seems Steve Jobs and his team had the same shock when they discovered the ancestor of the Mac in the laboratories of Rank Xerox).
Since then I write directly on the screen: I use the print medium only occasionally, to fix up a text, or to give somebody who is allergic to the screen a kind of photograph, something instantaneous, something approximate. It's only an approximation, because print forces us to have a linear relationship: the text is developing page after page (most of the time), whereas the technique of links allows another relationship to the time and the space of the imagination. And, for me, it is above all the opportunity to put into practice this reading/writing 'cycle', whereas leafing through a book gives only an idea - which is vague because the book is not conceived for that."
A very important factor too is the radical change between the book culture and the digital culture. Moving from one to the other as we are doing now deeply changes our relationship to knowledge, because we move from stable information to moving information. During the September 1996 meeting of the International Federation of Information Processing, Dale Spender explained this phenomenon in a very interesting lecture about Creativity and the Computer Education Industry:
"Throughout print culture, information has been contained in books - and this has helped to shape our notion of information. For the information in books stays the same - it endures.
And this has encouraged us to think of information as stable - as a body of knowledge which can be acquired, taught, passed on, memorised, and tested of course.
The very nature of print itself has fostered a sense of truth; truth too is something which stays the same, which endures. And there is no doubt that this stability, this orderliness, has been a major contributor to the huge successes of the industrial age and the scientific revolution. […]
But the digital revolution changes all this. Suddenly it is not the oldest information - the longest lasting information that is the most reliable and useful. It is the very latest information that we now put the most faith in - and which we will pay the most for. […]
Education will be about participating in the production of the latest information. This is why education will have to be ongoing throughout life and work. Every day there will be something new that we will all have to learn. To keep up. To be in the know. To do our jobs. To be members of the digital community. And far from teaching a body of knowledge that will last for life, the new generation of information professionals will be required to search out, add to, critique, 'play with', and daily update information, and to make available the constant changes that are occurring."
The Internet will not do away with the print media, the cinema, the radio or the television. As a new information and communication medium, it is creating its own space while adapting itself to the other media, and vice versa.
From my point of view, the greatest contribution of the Internet to the print media is that people no longer have to run after information: the information is there, available on their screen, and in truly impressive quantity. While connecting to the Internet was at first rather complicated for the average user, it has now become simple (for example, with the iMac). One improvement we are all waiting for, however, is a shorter connection time when accessing a website or the individual pages we wish to consult, especially those with many pictures. Let us hope that is coming soon.
But, once more, we have to remember that, as revolutionary as it may be, the Internet is still only a means, as stated in Technorealism Overview: "Regardless of how advanced our computers become, we should never use them as a substitute for our own basic cognitive skills of awareness, perception, reasoning, and judgment."
ABU: la bibliothèque universelle (ABU: Association des bibliophiles universels)
AJR/NewsLink (AJR: American Journalism Review)
Amazon.com
American Memory
Athena
Barnesandnoble.com
Bertelsmann
Bielefeld University Library
BitBlioteca
Blackwell's Book Services
British Library
British Library Catalogue
Chaptersglobe.com
Computer Industry Almanac (CIA)
Corbis
Dawson
Dialog
Digital Library Technology (DLT)
D-Lib Program
DOI: The Digital Object Identifier System
EDventure Holdings
E.Journal
Electronic Frontier Foundation (EFF)
English Short Title Catalogue (ESTC)
ETEXT Archives (The)
Everybook
E-Zine-List
Gabriel
Gallica
Gutenberg, see: Project Gutenberg
I*m Europe
Images 1
International Federation of Library Associations and Institutions (IFLA)
International Telecommunication Union (ITU)
International Trade Law (ITL) Monitor
Internet Bookshop (iBS)
Internet Public Library (IPL)
Japanese Text Initiative (The) (JTI)
Liber Liber
Libraries Programme / European Union
Library 2000
Library and Information Science Resources / Library of Congress
Library and Related Resources / University of Exeter
Library Journal Digital (LJDigital)
Library of Congress Catalog
Library of Congress Catalog / Experimental Search System (ESS)
Librius
LibWeb: Library Servers via WWW
Logos
Manuzio Project