Chapter 7. Metadata Quality Measurement and Improvement

Chapter 7 section titles

7.1 Quality of Metadata
7.2 Meeting the Functional Requirements
7.3 Quality Measurement with Different Granularities
7.4 Measurement Indicators: CCCD
7.4.1 Completeness
7.4.2 Correctness
7.4.3 Consistency
7.4.4 Duplication Analysis
7.5 Metadata Evaluation
7.6 Enhancing Quality of Metadata
7.7 Entity-level Quality for Reusable Metadata
7.8 New Challenges and Demands
7.9 Conclusion

Links to sources

US Geological Survey’s Formal Metadata: Information and Software ...... 342
- A number of software tools are available for creating metadata that conforms to the Content Standard for Digital Geospatial Metadata (CSDGM), developed by the Federal Geographic Data Committee (FGDC).

Europeana Data Model (EDM) ...... 343
- The EDM XML Schema (xsd) allows for automatic validation of EDM metadata.
- The EDM Validation (pdf) explains how to use the validation rules with the Oxygen XML editor. (See also chapter 5, section 5.4.3.)

Resources mentioned in the section "Layer III: Obtaining and using machine-processable URIs/IRIs" ...... 347-348

OpenRefine tool ...... 347; 350-351
- Learn from DCMI webinars by watching the recordings: (1) Introduction to OpenRefine and (2) Getty Vocabulary Program’s OpenRefine reconciliation service.

Table 7-8-2 FAIR-based Quality Assessment for Open Data’s Metadata ...... 351
- Compiled from the EU Open Data Portal's Metadata Quality Assessment (MQA) Methodology, whose "Dimensions" section lists all the dimensions that the MQA examines to determine quality, based on the FAIR principles.
- Study the "Metadata Quality" overview and the contents of the "Top Catalogues" section.

Exercises

1. Search a digital collection’s portal using different queries (e.g., “Mars” and “Mars exploration” at https://nsdl.oercommons.org/ or “WWII” and “Second World War” at https://classic.europeana.eu/portal/en). Sort by title or filter by media. If there are duplicate records, analyze and determine whether they are true duplicates. Find and examine a set of records that describe the same resource.

2. Write a brief, well-structured report about your findings. Discuss how the overall metadata quality could be improved and what the barriers to quality improvement are. Attach the records that you corrected or enhanced, and mark your corrections or additions on the original records. Suggest three to five steps for improving the quality of your chosen portal.
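
For the duplicate analysis in Exercise 1, a first pass can be automated before records are examined by hand. The following Python sketch is only one possible approach, not a method prescribed by the chapter: it normalizes titles and flags record pairs whose titles are nearly identical. The sample records, the field names ("id", "title", "creator"), and the 0.9 threshold are assumptions made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def similarity(a, b):
    """Return a similarity ratio between 0 and 1 for two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_duplicates(records, threshold=0.9):
    """Yield (id_a, id_b, score) for record pairs whose titles look alike."""
    for a, b in combinations(records, 2):
        score = similarity(a["title"], b["title"])
        if score >= threshold:
            yield a["id"], b["id"], round(score, 2)


# Hypothetical records, invented for illustration only.
records = [
    {"id": "r1", "title": "Mars Exploration Rover Mission", "creator": "NASA"},
    {"id": "r2", "title": "Mars exploration rover mission.", "creator": "NASA JPL"},
    {"id": "r3", "title": "Second World War posters", "creator": "Unknown"},
]

pairs = list(candidate_duplicates(records))
print(pairs)
```

A title match only marks a pair as a candidate. Deciding whether two records are true duplicates still requires comparing the remaining elements (creator, date, identifier, format) and checking whether they describe the same resource, as the exercise asks.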

Readings

Bruce, Thomas R., and Diane I. Hillmann. 2004. "The Continuum of Metadata Quality: Defining, Expressing, Exploiting." In Metadata in Practice, edited by Diane I. Hillmann and Elaine L. Westbrooks, 238–56. Chicago: American Library Association.

EU Open Data Portal. n.d. “Metadata Quality Assessment Methodology.” Accessed July 24, 2021. https://data.europa.eu/mqa/methodology.

Guy, Marieke, Andy Powell, and Michael Day. 2004. "Improving the Quality of Metadata in EPrint Archives." Ariadne, no. 38. http://www.ariadne.ac.uk/issue38/guy.

Hillmann, Diane I., Naomi Dushay, and Jon Phipps. 2004. "Improving Metadata Quality: Augmentation and Recombination." In DCMI International Conference on Dublin Core and Metadata Applications: DC-2004—Shanghai Proceedings. https://dcpapers.dublincore.org/pubs/article/view/770.

Neumaier, Sebastian, Jürgen Umbrich, and Axel Polleres. 2016. “Automated Quality Assessment of Metadata across Open Data Portals.” Journal of Data and Information Quality 8 (1): 1–29. https://doi.org/10.1145/2964909.

O'Neill, Edward T., and Diane Vizine-Goetz. 1988. "Quality Control in Online Databases." In Annual Review of Information Science and Technology, edited by M. Williams, 23: 125–56. Medford, NJ: Learned Information.

Ryan, Maggie, Ruiwen Zhang, Matthew Durward, Krystyna Matusiak, and Peter Organisciak. 2020. “Challenges and Solutions of Identifying Similarities and Duplication in Digital Libraries.” In JCDL ’20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, 507–8. New York: Association for Computing Machinery. https://doi.org/10.1145/3383583.3398586.

van Hooland, Seth, and Ruben Verborgh. 2014. Linked Data for Libraries, Archives and Museums: How to Clean, Link and Publish Your Metadata, Chapters 3–6. London: Facet Publishing.