Scholarship in the Digital Age: Information, Infrastructure, and the Internet

Author: Christine L. Borgman
Publisher: Cambridge, MA: MIT Press, 2007
Review Published: December 2009

REVIEW 1: Denise N. Rall

Christine L. Borgman has produced a carefully constructed book with nine succinct chapters, and a gratifyingly large set of references. These chapters are constructed logically around her major points, tied to appropriate references to information theory and current research projects, populated with examples from specific databases and web-based repositories as well as illustrations of scholarly pitfalls in pursuing information without appropriate guidelines or even ethical misconduct (usually from the sciences).

Borgman's first chapter, "Scholarship at the crossroads," deals with the familiar issues of the data deluge, and the problems of scale, trust and reciprocity, and preservation. In brief, as she mentions throughout, data users and content providers have two different agendas. This chapter lays out the conflicts between the various players, the academics, libraries, publishers, content providers, database services, information managers, governments, policy makers, and the problematic nature of technology itself. The data deluge is a tsunami that threatens to swamp us all, and the issues involved are many: peer review, speed of dissemination, who pays, retrieval mechanisms and the ease of access, as well as issues of cost and preservation (9). Our current academic research practices could be defined as evolutionary, revolutionary, or in crisis mode. How these conflicts evolve or are resolved (even partially) will determine the shape of e-Research infrastructures to come.

Chapter 2, "Building the scholarly infrastructure," details the shape of past and present infrastructures, with a few paragraphs on previous organizational strategies, followed by the modern incarnations of systems theory, the Internet, the World Wide Web, the Grid, and digital libraries. In my opinion, grid computing has never been satisfactorily explained, and I am no further enlightened by its inclusion here (it is alluded to as the subject of an upcoming book). As could be anticipated, the section on digital libraries is clear and well-defined. The other sections detail the many initiatives on scholarly infrastructure from several countries, predominantly the US and the UK, but the EU initiatives and those from Japan are also covered. A helpful graphic, entitled "Cyberstructure: Layered Model," is included from the US National Science Foundation (NSF) (24). It is evident that much funding has been devoted to a number of infrastructural projects, including large-scale partnerships in the fields of astronomy and geoscience, meteorology and oceanography. She concludes that these initiatives will be extremely important for those who pursue "highly distributed, collaborative, multidisciplinary research and learning that relies on large volumes of digital resources" (31).

The third chapter, "Embedded everywhere," addresses theoretical concepts. Borgman includes a brief discussion of Merton's norms of scientific practices, including citation, peer-review, "disinterestedness," universalism, and scholarly publication (36). Next, approaches from STS (science & technology studies) and the social studies of science are followed by longer sections on information theory from Michael Buckland (1991), which may be new to some readers. Borgman cites Hey and Trefethen's view (2003) that data and documents form a continuum rather than a dichotomy. (Why this is newsworthy is not clear; scientists especially rely on their data to generate documents.) Geoff Bowker and S.L. Star round out her discussion of informational infrastructures (1999). In this section Borgman also differentiates between infrastructure of and infrastructure for information (emphasis original). Infrastructure of information depends on the database itself; infrastructure for is the sociotechnical system over which any kind of information can (or should) flow. Finally, Paul Wouters (2004) provides an approach to e-Research as infrastructure, identifying the political movement of e-Research, the technological assessment studies, the case studies, and, finally, the studies of networks and connections. Authors active in e-Research literature are mentioned, particularly those in Christine Hine's recent collection (2006).

Chapter 4, "The continuity of scholarly communication," is a very detailed analysis of scholarly publishing. While this could become slightly tedious for those who are not specialists, it is greatly enhanced by a significant graphic, "The dissemination of scientific information in Psychology" (Garvey & Griffith, 1964). From the date of this graphic it is clear that scholarly publishing proceeds much the same as it did four decades ago, excepting the speed of online publishing. There are sections on formal and informal publication venues, the consideration of online publishing, and important contributions on "process" vs. "structure" of publishing from authors such as Kling and Callahan (2003), Lievrouw (2002), and Nentwich (2003). The discussions of the structure of "invisible colleges" (Crane 1972) and of sociotechnical networks reference MacKenzie & Wajcman (1999) and Latour & Woolgar (1979/1986). Here, the issues of peer review, legitimization, dissemination, access, preservation and curation are dealt with from the information science perspective. This breaks down authorship into various structural components: writer, submitter, linker, and citer (69-74).

Chapter 5, "The discontinuity of scholarly publishing," carries on the theoretical work begun in the previous chapter by reviewing the principles of open science, particularly from the view of policy makers and economists. Borgman notes that "transformative social change is rarely apparent while it is in progress" (76), yet it is evident that the publishing industry is undergoing many transformations -- or crises, depending on one's perspective. This is linked with institutional changes in the educational system, in the US, UK, and around the world. Here is where Borgman details the major arguments so dear to publishing: the issues of copyright, intellectual property, patents and trademarks, and further problems in dissemination, maintenance, and archival methods for printed and online materials. A single concept, such as legitimization, is subsequently broken down into the following categories: "authority, quality control, certification, registration in the scholarly record, priority and trustworthiness" (84). These concepts are vital to information scholars (librarians, data managers, publishers) and perhaps less so to the rest of us. Borgman details pertinent histories of automated information seeking and retrieval, and of cataloging. The enormity of "databasing the world" requires breakthroughs in preservation, and new economic models for storage. While Borgman leaves nothing to chance, these sections are succinct. Not being a specialist in this area, I was especially pleased with the brevity of the metadata section.

Chapter 5 continues with a very significant survey of the distinctions between open science models, open access, public domain, and informal communication methods (e.g., blogs, SNSs). I found Borgman's sections on the definitions of, motivations behind, and technology and services for Open Access, and her brief discussions of copyright and the Open Commons -- the "information commons" as it is now called -- particularly helpful (100-114). I was less interested in the economic models, and those are not elaborated.

Chapter 6, "Data: Input and output of scholarship," begins with a significant definition that Borgman takes almost as a standpoint: "the value chain of scholarship" (116). The concept began with Michael Porter (1985) to explain value-added activities in business, but Borgman makes it relevant for scholarship. I believe that this section will be cited again and again by those who need a way to talk about the cumulative processes that have always existed in research (depending on one's area) and that are exacerbated by the speed and intensity of scholarship in the digital age.

Again, Chapter 6 is well-researched and configured. It includes many definitions that will be useful to information scholars, discussing data by types, levels, sources, and policies for sharing. She includes most of the usual caveats for data storage (although more follow in Chapter 8, which differentiates between three broad types of disciplines). Data dissemination, preservation, storage, the growth of data, data interpretation and the key roles of legitimization and trust are discussed here. Data preservation and data sharing are increasingly a condition for funding by many federal granting agencies, both in the US and abroad. As examples, pages 126-127 include a long list of large national and international database repositories. A very good point is that migrating ever more materials to online repositories allows scholars to spend more time using, rather than locating, data. The need to maintain better links between data and documents is a theme repeated throughout the next two chapters.

If there is a weakness in this book, it might lie in the organization of Chapter 7, "Building an infrastructure for information." Many specialists will care a great deal about how these infrastructures will evolve, and about the appropriate economic models that must precede or follow these transformations to allow them to take place. Borgman's overview of disciplines seems more simple than succinct, but since this is one of my areas of research, I am culpable here of prejudice. Most of the vital literature is cited and probably quite useful to the non-specialist. The section on academic cultures and her discussion of boundaries, barriers, and bridges seem too brief, and the development of professional identities and scholarly practices is short as well. When Borgman moves on to information-seeking behavior the discussion opens up, and I found the research cited on the current reading patterns of academics to be fascinating. The brief overview of index citation guides was necessary, but outlining the styles of referencing in various academic disciplines seemed out of place.

Later in Chapter 7, the discussion of collaboration and social networks was much more interesting, particularly the section called "Making knowledge mobile" where social scientists have viewed knowledge ontologies to reconcile terminological differences (Bowker 2005, among others). For me, this section was excellent and made up for any perceived shortcomings. However, the section on collaboration was overlong in that it stressed the legal aspects vs. the control aspects of data ownership, although the discussion of the market value for data was improved by citing Wouters' study of science policy, scientists, and motivations for sharing data (2004). Curiously absent from the section on collaboration was a meaningful analysis of the relationship between working academics and their students. Here, the potential for collaboration is the carrot attracting students to professors, who may not have funding or other incentives. It is the student roster that drives both the classroom and the laboratory and defines much of the shape of research practice. Students were not even mentioned.

Any disappointment with Chapter 7 was overcome by Chapter 8, "Disciplines, documents and data," a thoroughly enjoyable review. The chapter centers on how academics differ in their use of information and publications. Borgman considers academics in their primary roles as writers, submitters, citers, and linkers among her three major demarcations: scientists, social scientists and scholars in the humanities. Sometimes one fact fleshes out the picture, and here, the fact that "by 2003 83 per cent of science, technology and medicine journals were reported to be available online" (Cox and Cox 2003; Garson 2004) makes the information seeking behavior of the scientist an ongoing inbox of citations to analyze, utilize or discard. The fact that not all information is equal is alluded to in each section but not emphasized enough for my liking. The priority here seems to be that great quantities of information (repositories) should be available to scholars, rather than further refinement of search methods. Chapter 8 will be utilized by those interested in scholarly practices, and details can be fleshed out by the references that Borgman readily provides. Further useful testimony regarding the increased usage of larger online repositories is included here (199).

In the ninth chapter, "The view from here," Borgman concludes, "what is clear at this stage is that information is more crucial to scholarship than is the infrastructure per se" (227). She offers a helpful graphic on the Information Life Cycle (229) to tease out the stages of information superimposed on top of six types of information uses or processes, including creating, modifying, indexing, storing and retrieving, distributing, and filtering or accessing. Here, Borgman's concerns with the interoperability of data formats and storage seem slightly misplaced in an academic climate of extreme specialization. The premise of information as social is explored through her discussion of information institutions (libraries), publishers, and funding agencies linked to her concept of the "information commons," where the exchange-value of information is negotiated. In particular, the business and organizational models, the investment in digital content, and the issues in curation (among other objects, museum specimens remain unclassified) require metadata that can be read across a variety of formats. A major theme reappears as Borgman concludes that the true worth of any system requires that content and infrastructure be mediated by the middleware.

As mentioned above, the book is a valuable resource to anyone with an interest in the significance of e-Research infrastructures in both academe and within the community at large, with an emphasis on the necessity of appropriate policies and economic models to identify and maintain the "value chain" of scholarly work. It is a rare commodity -- a book both brief and thorough. The only drawback for me was a slightly skewed point of view. For example, Borgman states: "Scientists want e-Research infrastructure in order to do science" (201) and continues that social scientists want e-Research to do social science, but they also want to interact with the framework of knowledge that infrastructure holds in place. In my opinion, these confident assertions seem misplaced. Some scientists might want infrastructure, but many only tolerate the information requirements of their profession, and some despise the reporting aspects of their research. Scientists generally are interested in solving particular problems, not trolling through databases, and sometimes the literature review is delegated to students. Here, the omission of students within the framework of academic research is a small but significant oversight. However, Borgman's book will be in demand as a reference and starting point for many important conversations about the Academy, NGOs and governments, and how they interact to frame the future's e-Research infrastructures.

Denise N. Rall:
Denise N. Rall holds a PhD from Southern Cross University for her thesis entitled "Locating four pathways to internet scholarship." In the past, she has assisted scientists with their information requirements in academic computing centers in the United States (Wisconsin, Purdue, and Northern Arizona University) and field research stations in Costa Rica, Ghana, and American Samoa. For the past ten years she has lived and worked in Lismore, New South Wales. Her current interests include textiles, search engine logics, and how new media and science interact. <denise.rall@scu.edu.au>

