Early European Books: The Role of Private Partnerships in Heritage Preservation
Francesca Petricca
In the context of cultural heritage preservation, two questions remain important: what role private institutions play in the conservation and dissemination of that heritage, and whether preservation can be considered a business. This paper examines these questions from the standpoint of a private company, using the example of the Early European Books (EEB) project curated by ProQuest. The project officially started in 2009 as a natural extension of its cognate programme, Early English Books Online, which began almost 90 years ago, in 1938. Today, this global endeavour is known as Early Modern Books, and it allows scholars to view materials from over 225 source libraries worldwide. To date, EEB contains records for 78,556 editions printed between 1450 and 1700. The actors involved in the project are ProQuest; the Universal Short Title Catalogue at St Andrews University; and five major European collections, namely the National Library of France, the National Library of the Netherlands, the Royal Danish Library, the National Central Library in Florence, and the Wellcome Library in London. In 2025, two new source libraries will join the EEB project: the British Library and University College London.
In a paper published in 2021, Andrew Pettegree and Arthur der Weduwen outline the framework of the partnership between ProQuest and the Universal Short Title Catalogue (USTC) (Pettegree & der Weduwen, 2021). The study discusses the parallel destinies of Alfred Pollard and Eugene Power, whose efforts led to the birth of Early Modern Books. In 1918, Alfred Pollard, a curator at the British Museum, had the initial idea of creating a checklist of all the books published in Great Britain during the first two centuries of printing. The memories of the fear caused by German bombs falling from the sky were still fresh. If these early print treasures were to be lost, an inventory of the books would at least help to recreate the collection. The Short Title Catalogue (STC) initiative came out of this idea and was completed in 1927. Although it inspired other similar initiatives, the STC remained the core of what is known today as the English Short Title Catalogue (ESTC). A short title catalogue is a list of printed works designed to identify editions. It typically records a range of bibliographical information: a shortened version of the title, publication information (‘imprint’), subject headings, genre terms, pagination and format, and references to other catalogues. On the other side of the Atlantic Ocean, Eugene Power, the founder of University Microfilms International (UMI), the company known today as ProQuest, pioneered the application of microfilm to scholarly work (Pack, 1994; Pennavaria, 2015). The idea was to remove physical barriers to research, facilitate access to information, and reveal meaningful content. The imminence of World War II and the threat hanging over the masterpieces in the British Library made the creation of manageable copies of the collections more urgent. Eugene Power sailed to London to capture on microfilm the books listed by Pollard a few years before.
Early English Books Online was born and with it, a new concept of business and heritage preservation emerged (Gadd, 2009).
Today, Early English Books Online includes over 150,000 titles, which represent 98% of the ESTC, and is the most comprehensive collection of early printed books available to the scholarly world. EEBO has had a major impact on the preservation landscape, and the project has stimulated heated debates in the scholarly community. While many historians have underlined its limits, others have recognised its major contribution to computational humanities and quantitative research (Gavin, 2017, 2021; Herman, 2020; Kichuk, 2007). EEB is a cognate project which aims to preserve European collections in an age when digitisation methods have evolved sufficiently to capture the entire book, cover to cover, in full colour and to film fragile works with care. The endeavour of covering continental early printed books, however, is larger and more complex than surveying the English press. ProQuest has been working with the USTC since 2009, and a consultancy partnership began in 2015. The USTC, based at St Andrews University since 1995, is a bibliography of books printed worldwide between 1450 and 1700. Extending the scope of the ESTC, the USTC team records European editions by physically visiting libraries in Europe and meticulously examining their catalogues on site. The cooperation between ProQuest and the USTC is pivotal to establishing a roadmap for digitisation and developing metadata. In 2024, ProQuest released the 25th collection of EEB. Collections are the units used to organise and commercialise EEB, and each collection contains substantial holdings from one or more libraries (Kibble, 2011). Collections 1 to 10 are comprehensive and can be considered the foundation of the project because they contain over 60,000 titles. Collections from 11 onwards are smaller and thematic, so that they align with faculty subjects, and USTC subject tags are used to create the themes.
The Publisher Workflow and the Role of Metadata
Each partner library in the EEB project enters into a legal agreement with ProQuest, whereby the library is remunerated through royalties calculated on subscriptions to the collection. To date, ProQuest has agreements with the five above-mentioned libraries, which were selected for the relevance of their collections. Once the agreement is signed, ProQuest and the partner library select the supplier in charge of digitising the books. First, a preliminary study is conducted with the library and the supplier to prepare the digitisation; this is followed by a negotiation with the library on the terms of delivery of the books. The contract concluded between the parties regulates the working conditions in minute detail, for example daylight exposure and the hours worked per day, and humidity insurance is stipulated to protect the collections.
The scanning lab is set up on the library premises to minimise the risks to the collections and to facilitate any interaction required with the library staff. The pages are turned manually because the books come in many different sizes. Since some pages are folded and some books have a very tight binding, a cradle is used to open the books at a 45° angle to maintain their integrity. The spine, cover, and edges of the books are captured to give the user a sense of being in the presence of the book.
ProQuest creates a list of the titles in MODS format and sends it to the digitisation partner. The list is generally based on the original MARC records from the source library, where these are available. After digitisation, the supplier returns the data as MODS records stored in METS.[1] The data is sent via FTP or on hard disk, depending on the supplier, along with the scanned images of the book. In addition to bibliographic information, the file gives details of the equipment used in the digitisation process at the level of each page. Page features, such as coats of arms, illuminated letters, and manuscripts, are highlighted in the code describing each page. For work identification, links to VIAF and the HPB Database are added to obtain a standard form of the author’s name. These links are added manually by the ProQuest team in the Cambridge (UK) offices.
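To make the shape of such a delivery concrete, the following is a minimal sketch of reading a MODS record wrapped in a METS file. It is not ProQuest's actual pipeline: the sample record, its title, author, place, date and the VIAF identifier are all invented for illustration, and the page-level equipment and page-feature data described above are not modelled. Only the METS and MODS namespaces and element names come from the Library of Congress schemas.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the Library of Congress METS and MODS schemas.
NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
}

# Hypothetical sample file: every value below is invented for illustration.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:mods="http://www.loc.gov/mods/v3">
  <mets:dmdSec ID="DMD1">
    <mets:mdWrap MDTYPE="MODS">
      <mets:xmlData>
        <mods:mods>
          <mods:titleInfo><mods:title>Example title</mods:title></mods:titleInfo>
          <mods:name type="personal" authority="viaf"
                     valueURI="https://viaf.org/viaf/000000000">
            <mods:namePart>Example, Author</mods:namePart>
          </mods:name>
          <mods:originInfo>
            <mods:place><mods:placeTerm type="text">Venice</mods:placeTerm></mods:place>
            <mods:dateIssued>1550</mods:dateIssued>
          </mods:originInfo>
        </mods:mods>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""


def extract_mods_records(mets_xml):
    """Return one dict per MODS record found inside a METS <xmlData> wrapper."""
    root = ET.fromstring(mets_xml)
    records = []
    for mods in root.iterfind(".//mets:xmlData/mods:mods", NS):
        name = mods.find("mods:name", NS)
        records.append({
            "title": mods.findtext("mods:titleInfo/mods:title", default="", namespaces=NS),
            "author": name.findtext("mods:namePart", default="", namespaces=NS)
            if name is not None else "",
            # Link to an authority file (here a placeholder VIAF URI), if present.
            "authority_uri": name.get("valueURI") if name is not None else None,
            "place": mods.findtext(
                "mods:originInfo/mods:place/mods:placeTerm", default="", namespaces=NS
            ),
            "date": mods.findtext("mods:originInfo/mods:dateIssued", default="", namespaces=NS),
        })
    return records


records = extract_mods_records(SAMPLE_METS)
for record in records:
    print(record)
```

In this sketch the authority link is carried in the `valueURI` attribute of the MODS `name` element, which is one way the schema allows a name to point at an external authority record; how the EEB files encode their VIAF and HPB links is not stated in this paper.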
The enhanced metadata are the main value provided by ProQuest. The EEB project aims to improve the discoverability of the works through enriched metadata, make unknown materials available to scholars, and reveal connections between works that have not been considered previously. Most libraries digitise their fragile books as part of a conservation and dissemination remit. They either make them available themselves on library websites or share them with a wider community with a minimum of metadata. ProQuest-enhanced metadata allow the retrieval of known items by keyword, author name, and title, and they support the discovery of new volumes via USTC subject terms, page features, and source libraries. Every book has an identifier that has been added to the items in EEB. As of today, the USTC features 39 themes.
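The difference between retrieving a known item and discovering new volumes can be illustrated with a small sketch. It assumes a simplified record structure; the `Record` class, the function names, and the sample records are hypothetical and do not describe the EEB platform itself. Only the three discovery facets (USTC subject term, page feature, source library) and the ‘women in publishing’ subject are taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical, simplified view of an enriched catalogue record."""
    title: str
    author: str
    source_library: str
    subjects: set = field(default_factory=set)       # e.g. USTC subject terms
    page_features: set = field(default_factory=set)  # e.g. coats of arms


def known_item_search(records, keyword):
    """Retrieve records whose title or author contains the keyword."""
    needle = keyword.casefold()
    return [
        r for r in records
        if needle in r.title.casefold() or needle in r.author.casefold()
    ]


def browse(records, subject=None, page_feature=None, source_library=None):
    """Discover records by facet; a facet left as None is not applied."""
    hits = []
    for r in records:
        if subject is not None and subject not in r.subjects:
            continue
        if page_feature is not None and page_feature not in r.page_features:
            continue
        if source_library is not None and r.source_library != source_library:
            continue
        hits.append(r)
    return hits


# Invented sample records, for illustration only.
RECORDS = [
    Record("Example herbal", "Example, Author", "Library A",
           {"Women in Publishing"}, {"coat of arms"}),
    Record("Example sermon", "Sample, Writer", "Library B",
           {"Women in Publishing"}, set()),
    Record("Example almanac", "Other, Person", "Library A",
           set(), {"illuminated letter"}),
]

by_keyword = known_item_search(RECORDS, "herbal")
by_subject = browse(RECORDS, subject="Women in Publishing")
by_subject_and_library = browse(
    RECORDS, subject="Women in Publishing", source_library="Library A"
)
```

A keyword search presupposes that the scholar already knows what to look for, whereas the facet filters return groups of works that share a subject, a physical feature, or a holding library.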
Currently, scholars who are looking for a specific item will most likely find versions of the book on the web, either as scattered content or as part of digitised collections, but those who are still formulating their research questions will find the metadata an essential tool. ProQuest-enhanced metadata offer extensive opportunities not only to identify a book but also to gain an overview of the research landscape on different topics in Europe. These features allow users to assess properly the context of a topic and the other works that belong to the same continuum of output from the Early Modern period. This functionality is possible because of the scale of Early Modern Books, a resource that represents one of the largest corpora of content from a single historical period and that covers a large region, from Scandinavia to Spain.
Women in Publishing and Inclusive Metadata
The unique features of EEB and the cooperation with the USTC have made it possible to conduct new research and build a more inclusive scholarship. One of the most significant examples is the creation of the subject classification ‘women in publishing’ in 2021.[2] The debate on the ability of metadata to influence and bias research is a long-standing one, first documented in 1971 by Sanford Berman in the monograph Prejudices and Antipathies: A Tract on the LC Subject Heads Concerning People (P&A). Classification and the creation of metadata have never been neutral acts. More recently, representing diversity and establishing cataloguing ethics have become a necessity and a gauge of scholarly integrity.[3] The attribution of gendered metadata to the collections contributes to the representation of traditionally marginalised groups, such as women, and ultimately sheds light on different aspects of social life in early modern Europe. Elise Watson’s paper on the early modern period states that reassessing metadata can provide new perspectives in the study of printing and publishing (Watson, 2024).
Women played a far more important role in the book industry than is acknowledged by the imprint data on the books. Female authors, such as Margaret of Navarre, belonged to elite families, where there was a greater chance of an excellent education. In contrast, it was far easier for non-aristocratic women to contribute to the book world as producers rather than as authors. Since almost all print shops were family businesses, the wife would frequently act as business manager. If they outlived their husbands, women could step out of the shadows and print under their own names.[4] The same was true in families without male heirs, where the daughters became the business successors and ran the firms. Indeed, women often used their late husbands’ or fathers’ names, but the USTC aims to include the given names of the female printers when it is possible to find them via archival research (Watson, 2024). Labelling books whose imprints mention widows and heirs is one of the best ways to acknowledge the work of women in the book industry. As of February 2025, 1,823 works have been tagged under the USTC subject classification ‘women in publishing’ in EEB and 4,421 in Early English Books Online, representing almost 3% of the entire catalogue. This tag captures women engaged in the publishing and printing trade as listed in the imprints or colophons of early printed books. It provides insights into the many different roles women played in the book trade, not only in the print shop but also in management and bookselling. This endeavour can be seen as the first step in a new way of treating metadata. The USTC’s Elise Watson speaks of feminist bibliographical data.[5] The recognition of women’s role in publishing paves the way for the acknowledgement of other marginalised groups.
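The ‘almost 3%’ figure can be checked with a short calculation. The sketch below assumes that ‘the entire catalogue’ means EEB and Early English Books Online combined, and it uses the sizes quoted earlier in this paper (78,556 editions in EEB and over 150,000 titles in EEBO) as the denominator. Both the interpretation and the variable names are assumptions made for illustration; because the EEBO figure is a lower bound, the result is an upper bound on the share.

```python
# Tagged works, as reported for February 2025.
eeb_tagged = 1_823
eebo_tagged = 4_421

# Catalogue sizes quoted earlier in the paper (EEBO figure is "over 150,000").
eeb_editions = 78_556
eebo_titles_lower_bound = 150_000

tagged_total = eeb_tagged + eebo_tagged
catalogue_lower_bound = eeb_editions + eebo_titles_lower_bound

# Upper bound on the tagged share, since the true catalogue is at least this large.
share_upper_bound = tagged_total / catalogue_lower_bound

print(f"{tagged_total} tagged works out of at least {catalogue_lower_bound}")
print(f"Share: at most {share_upper_bound:.2%}")
```

Under these assumptions the 6,244 tagged works amount to roughly 2.7% of the combined catalogue, which is consistent with the figure given above.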
In addition to the printer, publisher, editor, and author, who are well known, the binder, compositor, seller, illustrator, translator, pressman/woman, compiler, dedicatee, and papermaker are also important. If represented in this way, the making of a book becomes a collective and communal process. Thus, acknowledging the full working team goes beyond the binary distinction between men and women and shapes a more faithful image of the workforce in the book industry in the early modern era. Metadata are far from being neutral; they reflect the intentions of the cataloguer and determine the interpretative framework in which early printed books are read. The more information we can add to describe a document, the more information we will be able to provide to scholars investigating the period.
Preservation reaches its peak when it brings together curation, dissemination, and discovery. Partnerships between public libraries and private stakeholders can prove reliable, despite their commercial goals, if they are conducted rigorously and transparently. ProQuest’s EEB project aims to contribute to the debate around the printed output of the early modern era and to propose a collaborative approach to heritage preservation.
Conclusion
The EEB project exemplifies how private–public partnerships can contribute to the preservation and dissemination of cultural heritage while balancing commercial imperatives with scholarly objectives. By leveraging enhanced metadata and digitisation technologies, ProQuest and its partners have expanded access to early printed materials, enabling more comprehensive research opportunities. The inclusion of gendered metadata and recognition of historically marginalised contributors illustrate the evolving role of cataloguing in shaping historical narratives. Ultimately, this study underscores the importance of meticulous curation and collaborative efforts in safeguarding early modern printed heritage. Thus, the study demonstrates that commercial initiatives can, when managed transparently, serve the broader interests of academia and cultural conservation.
References
Gadd, I. (2009). The use and misuse of Early English Books Online. Literature Compass, 6(3), 680–692. https://doi.org/10.1111/j.1741-4113.2009.00632.x
Gavin, M. (2017). How to think about EEBO. Textual Cultures, 11(1/2), 70–105. https://www.jstor.org/stable/26662793
Gavin, M. (2021). EEBO and us. Textual Cultures, 14(1), 270–278. https://www.jstor.org/stable/48647122
Herman, P. C. (2020). EEBO and me: An autobiographical response to Michael Gavin, ‘How to think about EEBO’. Textual Cultures, 13(1), 207–216. https://www.jstor.org/stable/26954245
Kibble, M. (2011). Sponsored article: ProQuest’s Early European Books project: A collaborative approach to the digitisation of rare texts. LIBER Quarterly: The Journal of the Association of European Research Libraries, 20(3–4), 372–381. https://doi.org/10.18352/lq.8003
Kichuk, D. (2007). Metamorphosis: Remediation in Early English Books Online (EEBO). Literary and Linguistic Computing, 22(3), 291–303. https://doi.org/10.1093/llc/fqm018
Martin, J. M. (2021). Records, responsibility, and power: An overview of cataloging ethics. Cataloging & Classification Quarterly, 59(2–3), 281–304. https://doi.org/10.1080/01639374.2020.1871458
Pack, T. (1994). UMI—History in the making. Library Hi Tech, 12(3), 91–100. https://doi.org/10.1108/eb047931
Pennavaria, K. (2015). Genealogy gems: The evolution of ProQuest. Kentucky Libraries, 79(1), 18–19.
Pettegree, A., & der Weduwen, A. (2021). ProQuest’s Early Modern Books: A celebration. ProQuest. https://pq-static-content.proquest.com/collateral/media2/documents/article-earlymodernbooks-celebration.pdf
Watson, E. (2024). Queering the language of dynasty in imprints and bibliographic metadata. The Papers of the Bibliographical Society of America, 118(2), 223–243. https://doi.org/10.1086/730317
Abstract
This paper examines the role of private institutions in the preservation and dissemination of cultural heritage, with a particular focus on the Early European Books (EEB) project led by ProQuest. Originating as an extension of Early English Books Online, EEB has evolved into a global initiative aimed at digitising and cataloguing early printed materials from major European collections. Through partnerships with institutions such as the Universal Short Title Catalogue (USTC) and prominent libraries, this project enhances scholarly accessibility to rare texts while maintaining rigorous digitisation and metadata standards. The study highlights the significance of metadata enrichment in facilitating research and fostering inclusivity, particularly through the indexing of traditionally underrepresented contributors, such as women in publishing. By examining the intersection of commercial enterprise and heritage preservation, this research underscores the potential of public–private collaborations in advancing academic scholarship and ensuring the longevity of early printed materials.
Keywords
Digital cultural heritage; Public–private partnerships; Enhanced metadata; Feminist bibliography; History of the book; Digitisation and preservation
[1] MODS stands for Metadata Object Description Schema (https://www.loc.gov/standards/mods/mods-overview.html), while METS stands for Metadata Encoding & Transmission Schema (https://www.loc.gov/standards/mets/METSOverview.v2.html).
[2] Refer to this blog entry, published at the time of the launch of the Women in Publishing initiative in 2021: https://pq-static-content.proquest.com/collateral/media2/documents/brochure-eeb-discoverwomenprinters.pdf.
[3] A meaningful and comprehensive contribution on cataloguing ethics is provided by Jennifer Martin (2021).
[4] For further information on early modern women printers – with particular attention to sister printers – refer to: https://www.ustc.ac.uk/news/sorority-before-sororities-early-modern-sister-printers.
[5] Watson refers to Kate Ozment’s work on ‘Feminist Bibliography’; see https://www.youtube.com/watch?v=razNzIQXUqE.