I Review 6 Digital Preservation Models So You Don’t Have To

When digging through the ol’ desktop/hard-drive/thumb-drives/external-drives/google-drive/dropbox/evernote/oh-god-please-help-me recently I came across the notes (filename: “levels_notes.doc” — good grief) I had taken when Andrea Goethals, Trevor Owens, Meg Phillips and I were working on the paper, The NDSA Levels of Digital Preservation: An Explanation and Uses.

As the paper’s abstract says:

This paper presents the Levels, explains the context of the 
project’s development within the NDSA, describes the rationale 
behind each of the guidelines and why they were prioritized the 
way they were, suggests how the guidelines may be used, and 
compares and contrasts the Levels to other ways of assessing 
stages of digital preservation. Other assessment models include 
Nancy McGovern and Anne Kenney’s “The Five Organizational 
Stages of Digital Preservation,” Charles Dollar and Lori 
Ashley’s “Digital Preservation Capability Maturity Model,” and 
OCLC Research’s 2012 report, “You’ve Got to Walk Before You 
Can Run: First Steps for Managing Born-Digital Content 
Received on Physical Media.”

Having happened across the notes (over a year and a half later, naturally), I thought to myself, “hey, maybe it would be useful to share these with people; and, were it not useful, perhaps posting these notes would make the neglect of my blog appear slightly less criminal.”

These notes are basically my “lit review” of the six models I took a look at and compared to the Levels. The published paper consolidates all our reviews of most of these models, so it may be worth a look as well (and is decidedly more concise). It is worth noting that “model” is used, here and in the paper, rather broadly. It is also worth mentioning the caveat that these reading notes were for the specific purpose of writing that paper and comparing things to the Levels, so don’t expect a lot of whiz-bang, high-falutin’ cross-model analysis. There are, however, a number of one-liners. Enjoy!

[Note that the below was written in December 2012/January 2013]

——

Tessella, Digital Archiving Maturity Model (2012)

Audience: Tessella’s document is intended to define a gradated and progressional approach to the fundamental processes necessary to create and maintain a “digital archive.” In stressing information management and noting that digital preservation is “business critical,” the document is clearly drafted for an audience beyond cultural heritage – organizations for which the preservation of digital content is an operational or legal necessity. As such, the document is focused on resource and componential needs and not practical steps or activities. The document’s arguments are intended for administrators and executives and not necessarily practitioners.

Purpose/Focus: Given its business focus, the DAMM does not necessarily focus on the long-term preservation of digital content, but instead on what is required for ensuring accessibility and “creating value from” digital materials. Its focus is on the “key components” in a gradated approach.

Scope: The DAMM is organized into six levels within two broader areas: durable storage and information management. Levels 1-3 comprise durable storage and focus on creating a system wherein information can be managed, accessed, and its security ensured. Levels 4-6 comprise information management and ensure that digital content can be located now and into the future and that processes are in place so that all six levels can be automated. Levels: 1) Safe Storage; 2) Storage Management; 3) Storage Validation; 4) Information Organization; 5) Information Processes; 6) Information Preservation.

Similarities/Differences with Levels: Though the six levels of the DAMM are presented as progressive and are visually arranged in a pyramid (the dreaded pyramid graphic), there is an interdependency to them that limits different approaches to implementation. It is hard to see how, for instance, Level 3 can be met without incorporating descriptive metadata. While the document does note that certain upper levels require meeting other levels in the pyramid, the dictates of each Level are broadly defined in a way that would make it difficult to build a preservation program at lower levels. The document takes a very high-level approach to its criteria and is intended more for use in planning, strategic mapping, and resource allotment and not in day-to-day digital preservation. With its presumption of an audience that may need only near-line, short-term asset storage, the document is useful; however, the Levels’s focus on preservation through time provides both more detail and a more scalable and flexible approach to implementing certain preservation activities.

Becker et al., A Capability Model for Digital Preservation (Paper presented at iPRES, November 2011)

Audience: The capability model predicates its approach on expanding digital preservation requirements into non-archival, non-cultural heritage contexts such as business, enterprise architecture, and other “non-traditional DP settings.”

Purpose/Focus: This model looks to broaden the criteria and compliance requirements of digital preservation to include the increasing number of fields in which digital preservation is needed, such as “information systems, governance, compliance, and risk, and organizational engineering.” It aims for a “cross-cutting” model to bring digital preservation into concert with non-LAM disciplines, especially ones in which “information plays a key role.” The capability model’s purpose is to move digital preservation models beyond their focus on memory institutions and the long-term preservation of digital content. Their model notes the lack of guidance on governance, implementation, improvement, and control in TRAC, etc.

Scope: Builds upon the Software Engineering Institute’s Capability Maturity Model Integration paradigm (name = ugh), which focuses on governance and improvement. Looks to take knowledge from the field of Enterprise Architecture and apply it to digital preservation needs (or at least graft DP needs into those models). The scope is to orient digital preservation models to “strategic processes and capability improvement.” The document is cogent and useful in defining relevant standards, domains, and audiences for its model – something the Levels doc does not do (intentionally). The CMDP adopts the COBIT framework for IT governance and management and applies it to the “preservation operation” by inserting (some) digital preservation concerns such as “metadata” and “provenance.” As the conclusion acknowledges, this model is an attempt to insert digital preservation into the “overall architecture” of an organization.

Similarities/Differences with Levels: In taking an architectural approach to digital preservation compliance models, the CMDP acknowledges the evolutionary aspect of building systems and processes for digital preservation. By emphasizing ongoing improvement, the CMDP is similar to the LoP’s stepping-stone approach to building a workflow that can grow and become more robust. However, the CMDP, due to its abstract, holistic approach, focuses largely on “capabilities” (organizational, staffing, technological) that the Levels intentionally avoids. By focusing on regulatory constraints, business needs and values, and enterprise concerns, the CMDP targets large, well-financed organizations operating largely in the TRAC/ISO domain. This makes it not useful for smaller organizations. Its difference is largely one of audience. CMDP is intended for corporate or bureaucratic organizations and not the environment of cultural heritage or academia. The CMDP’s model is exhaustively defined, which is useful, but that is the opposite of the simple, pared-down and (I/we would argue) practical approach of the Levels. This model is indisputably useful to larger organizations with an existing technical infrastructure. But, given the Levels’s origins in the need to find a middle space between simplistic personal archiving practices and TRAC-compliance, the CMDP model will be of utility only to those organizations with the resources to operate at the highest levels o’ Levels. In addition, the CMDP is focused on organizations operating outside the cultural, educational, or memory domains that primarily drive the NDSA and its guidance documentation.

[Editorializing here, but this paper contains a visualization that may actually be more bewildering than DCC’s curation lifecycle model. Secondary editorializing, lord knows I love acronyms, but this paper may contain more acronyms than the alphabet soup in a government agency cafeteria.]

Pardo et al., Building State Government Digital Preservation Partnerships: A Capability Assessment and Planning Toolkit (2005)

Audience: As the title suggests, it was created to advise “state government partnerships” but the Executive Summary makes clear that it is intended for library, archives, RM, and IT professionals to use when considering or planning a digital preservation initiative. The doc is primarily a self-assessment tool and intended to “facilitate discussion” and do planning. As such, and upon examining the criteria, it is aimed at upper-level managers, executives, and administrators and not practitioners. Of course, that is its purpose, so that’s not a fault! But even the sample case study is built around an initiative of the State Librarian, State Archivist, and State CIO.

Purpose/Focus: To produce results that:

• inform planning and design of digital preservation initiatives;
• identify both strengths and weaknesses;
• focus investments in specific capability-building efforts;
• help identify risk and risk mitigation strategies; and
• highlight what additional information is needed to make sound decisions.

Scope: Examines 19 “dimensions and definitions of digital preservation capabilities” broken up into two levels (they don’t call them levels, but I’m not sure what else to call them), “threshold capabilities” and “additional capabilities.” Unlike the Levels, these are not task-based but are principle-based, for instance “maintaining comprehension and accountability.” Many of the “capabilities” seem (IMHO) pretty watery and ill-defined, like “leaders & champions” (someone give that executive a trophy!) or too overlapping, like “technology acceptance” and “technology knowledge” (not mentioned, “technology regret”). Like some of the other models, it is very organizationally-driven and includes many capabilities that aren’t specific to digital preservation and could be applied to any activity – really only a third of the 19 dimensions are specific to digital preservation. That said, when you dig into the doc, some of the tables (to wit Tables 4-6) are quite useful. Also, it has a crazy level of sample documents, including draft emails, workshop schedules and worksheets, and question sets in Appendix 4 & 8.

Similarities/Differences with Levels: Besides the audience and other differences mentioned above, the document also stresses both self-assessment and collaborative initiatives. While LoP can be used for self-assessment, we don’t stress it that much (which is good, I think). Also, this doc focuses largely on collaborative work, which the Levels don’t address (explicitly at least). I seem to remember a Levels blog comment asking if we were thinking of including “partners” or something similar anywhere in the levels. I’m not sure we need to, but food for thought. Otherwise, not too many similarities here though there is good stuff in this document.

Charles M. Dollar and Lori Ashley, “A Digital Preservation Capability Maturity Model in Action” (slides from PASIG October 2012)

Audience: Is geared more towards resource allocators and “C-level decision makers” (whatever the hell that means) and IT providers and support staff. It mentions archivists and RMs, but seems more geared towards executives. It also seems geared more towards the RM community and its needs than to the cultural heritage community.

Purpose/Focus: Very strategically-focused and uses self-assessment to identify risks. It does allow for different activities to be achieved at different levels, but this is driven more by self-assessment and business requirements than the Levels, which could be thought of as being more useful to the bewildered and self-motivated.

Scope: Inspired by a 5-stage “organizational records management model” of nominal, minimal, intermediate, advanced, optimal. These are organized around 15 “infrastructure elements”:

1. Policy
2. Strategy
3. Governance
4. Collaboration
5. Technical Expertise
6. Open Source Neutral Formats
7. Designated Community
8. Electronic Records
9. Ingest
10. Storage
11. Device/Media Renewal
12. Integrity
13. Security
14. Metadata
15. Access

In the visualization, these 15 areas are oriented around Digital Preservation Infrastructure, Trustworthy Repository, and Digital Preservation Processes, but the key is 15 areas at 5 levels.
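To make the “15 areas at 5 levels” grid concrete, here is a minimal Python sketch of a self-assessment recorded as one stage per element. It only illustrates the shape of the grid: the stage and element names come from the lists above, but the function name `weakest_elements` and the idea of flagging the lowest-rated elements are assumptions for illustration, not the scoring method of the Dollar/Ashley model.

```python
# Stage names and element names are taken from the notes above.
STAGES = ["nominal", "minimal", "intermediate", "advanced", "optimal"]

ELEMENTS = [
    "Policy", "Strategy", "Governance", "Collaboration",
    "Technical Expertise", "Open Source Neutral Formats",
    "Designated Community", "Electronic Records", "Ingest", "Storage",
    "Device/Media Renewal", "Integrity", "Security", "Metadata", "Access",
]


def weakest_elements(assessment):
    """Return the elements rated at the lowest stage found in the assessment.

    `assessment` maps every element name to one of the five stage names.
    This helper is hypothetical; the model itself defines its own metrics.
    """
    missing = [name for name in ELEMENTS if name not in assessment]
    if missing:
        raise ValueError(f"no stage recorded for: {missing}")
    lowest = min(STAGES.index(assessment[name]) for name in ELEMENTS)
    return [name for name in ELEMENTS if STAGES.index(assessment[name]) == lowest]
```

An organization that rates itself “intermediate” everywhere except Ingest and Access, both at “nominal”, would get those two elements back as the places to start.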

Similarities/Differences with Levels: It is similar in that it “supports incremental improvement” and “enables the settings of priorities based on risk, requirements, and resources.” It is also componential, much like LoP. In its gradated approach, it has some similarities, but when you dig into the document and look at the actual “metrics” by which users are self-evaluating, they are very TRAC-looking: very rigorous, with tangled, hard-to-parse verbiage. This model is very different in its usability, simplicity, and flexibility from the LoP.

[The Dollar/Ashley model can be "seen in action" in the COSA/SERI project, specifically in the SERI Phase 1 Final Report.]

Anne Kenney and Nancy McGovern, “The Five Organizational Stages in Digital Preservation”. (In Digital Libraries: A Vision for the 21st Century, 2003).

Audience: The paper is quite accessible for many audiences, but as most of the stats cited involve R1 universities, and the focus is largely on organizational issues, the implicit (and intended) audience is large research universities.

Purpose/Focus: Clearly stated, it is to “understand the organizational impediments to digital preservation practice” and “to identify and describe digital preservation stages from an organizational perspective.” The “stages” model attests to the need to better understand how to get started in digital preservation and that what may be a partial plan at one place may be a very robust plan for another. The doc stresses the need to know your particular institutional requirements and abilities. Organizational readiness (not technology) is the great inhibitor of digital preservation programs.

Scope: Explicates the “five stages of organizational response” to digital preservation:

1. Acknowledge: Understanding that digital preservation is a local concern;
2. Act: Initiating digital preservation projects;
3. Consolidate: Segueing from projects to programs;
4. Institutionalize: Incorporating the larger environment; and
5. Externalize: Embracing inter-institutional collaboration and dependency.

These are then broken down into 3 “key indicators”: policy and planning, technological infrastructure, and content and use.

Similarities/Differences: Like other models in this review, it relies on self-assessment and is geared towards organizational stages and aimed at larger institutions and higher-level administrators and strategic planning. Similarities include the authors’ desire “to define a metric for quantifying progress towards a comprehensive digital preservation program,” which is very similar to the Levels’ intention of providing benchmarks to denote progress or minimal best practices in certain areas of activity. Offering a “staircase rather than an unassailable wall” is very much in the spirit of the Levels, even if our document is not organizationally oriented.

Erway (OCLC), “You’ve Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media” (2012)

Audience: This doc aims to provide guidance to collection stewards managing or beginning to acquire born-digital materials on physical media. It assumes little in the way of expertise and is oriented towards institutions that have not yet developed policies or practices towards managing, preserving, or (one could even argue) collecting, born-digital materials.

Purpose/Focus: The brief document’s primary intent is to provide practical, “modest measures” to the general – the titular “first steps” – in managing born-digital content currently housed on physical media. To this end, the document is not necessarily focused on digital preservation, but more on moving born-digital content off of decaying media and into a more stable preservation environment.

Scope: The document is divided into two sections. The first section focuses on the steps that can be taken to survey and inventory the physical media itself. This section has little to do with the digital content itself, though some steps, such as recording operating system and hardware metadata, could be seen as correlates to the metadata recommendations of the Levels guidance. The second section lists eleven specific “technical” steps for acquiring digital content off of physical media. A number of these steps, such as generating checksums and inventories and documenting original environments as well as preservation actions, overlap with Level 1 and 2 guidance.
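For readers who have never generated fixity information, the “checksums and inventories” step can be sketched in a few lines of Python. This is a minimal illustration, not a procedure taken from the OCLC report: the function names (`sha256_of`, `build_inventory`, `changed_files`) are hypothetical, and SHA-256 is an assumed choice of algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(old_inventory, new_inventory):
    """List paths from the old inventory that are missing or altered in the new one."""
    return sorted(
        name
        for name, checksum in old_inventory.items()
        if new_inventory.get(name) != checksum
    )
```

Building an inventory when content comes off the media, then rebuilding it later and comparing the two, is one simple way to notice files that have gone missing or changed.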

Similarities/Differences: In offering guidance on “first steps” geared towards those with little digital preservation experience or expertise, and potentially limited resources and technical infrastructure, the document bears similarities to the Levels by offering non-experts an entry point and basic guidance on digital preservation requirements. The document articulates a number of core tenets of digital preservation – generating fixity information and technical and preservation metadata, as well as transferring bits to a stable archival environment including multiple non-collocated copies – that can be found in the Levels document. The primary similarity is the document’s attempt to demystify digital preservation basics for non-practitioners. The primary difference between OCLC’s document and the Levels is OCLC’s admittedly limited scope (that said, the document is the first in a five-part series, so that is intentional). OCLC’s document does not look beyond “first steps” of acquiring bits off of physical media and into stable storage, and much of the document is given over to managing the physical media itself and not the digital content contained therein. [Update: These overall notes were written in Nov/Dec 2012 and since then other papers in this series have been published and can be found on the Demystifying Born Digital project page.]

 

Creative Commons License
This work, unless otherwise expressly stated, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

2 Comments

  1. This is a nice overview of digital preservation models for people who are just getting familiar with the community landscape. Two general things I would note: 1) it would be very useful to note the date when the model was released – a number of these are really recent so it’s not yet possible to know what kind of impact they will have and a number of the models on the list were informed by pre-existing models; and 2) the models were developed for different purposes so in using them, it’s not a matter of this model vs that model but which model for which purpose. For example, our Five Stages model is a general model for any organization to use in developing their digital preservation program and we refer people to other models as appropriate, e.g., to the levels of preservation as a means to measure the development of what we call the technology leg of our three-legged stool. And one specific thing, though our Five Stages model was born at and then hosted by large (or medium) research universities, the implicit or intended audience is not large research universities as stated – for more than a decade, the five stages model has proven effective (through dozens of workshops, the online tutorial, the adoption of the five stages model by the DPTP program in the UK, and general use) in helping any organization of any size managing any kind of digital content across generations of technology. People who are interested in models may want to read more about them. For those who are interested, the Five Stages model is at the center of the Digital Preservation Management workshop’s online tutorial at: http://dpworkshop.org/ and there is a more recent discussion of the application of the five stages in the ANADP volume, which was published by Educopia and in part sponsored by the Library of Congress and is available as a free PDF download at: http://www.educopia.org/publishing/anadp.

  2. Jefferson

    Hi Nancy!

    I’m thrilled you read the post and commented. Adding dates to the resources is a great idea and they’re now on there. I couldn’t agree more that all these models inform each other and all have value; anyone doing (or hoping to do) digital preservation will find value in all of them. Also, as noted in the intro, these are merely (old) reading notes forked into a blog post, so they don’t presume currency and certainly don’t presume comprehensiveness. Readers are well advised to research further and the DPM workshop you linked to is a great place to start as it’s a fantastic resource.

    Thanks again for the comment and I hope to one day do a far more exhaustive and analytical review of digital preservation “models.”

    Jefferson
