{"id":18826,"date":"2020-04-14T10:47:00","date_gmt":"2020-04-14T14:47:00","guid":{"rendered":"https:\/\/michigan.it.umich.edu\/news\/?p=18826"},"modified":"2024-07-08T06:05:01","modified_gmt":"2024-07-08T10:05:01","slug":"combine-metadata-harvester-aggregate-all-the-data","status":"publish","type":"post","link":"https:\/\/michigan.it.umich.edu\/news\/2020\/04\/14\/combine-metadata-harvester-aggregate-all-the-data\/","title":{"rendered":"Combine Metadata Harvester: Aggregate ALL the data!"},"content":{"rendered":"\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"479\" src=\"https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575.jpg\" alt=\"Combine harvesting wheat field.\" class=\"wp-image-18827\" srcset=\"https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575.jpg 640w, https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575-267x200.jpg 267w, https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575-187x140.jpg 187w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><figcaption>(Rick Crowley \/ <a href=https:\/\/commons.wikimedia.org\/wiki\/File:Cereal_Harvest_in_Somerset_-_geograph.org.uk_-_1484575.jpg>Cereal Harvest in Somerset<\/a> \/ CC BY-SA 2.0)<\/figcaption><\/figure><\/div>\n\n\n\n<p>The\u00a0Digital Public Library of America\u00a0displays over 36 million records. While a large share come from \u2018Content Hubs\u2019 like the Smithsonian or HathiTrust, there are still millions of records ingested from a wide range of smaller institutions across America. 
&#8220;The technologies we use to transform and validate XML records, like XSLT, are well-established and highly reliable, but software for handling records at this scale, and performing mass transformation and validation operations, is a little harder to come by,&#8221; writes <strong>Esty Thomas<\/strong> of the U-M Library I.T. Division in a <a href=\"https:\/\/www.lib.umich.edu\/blogs\/library-tech-talk\/combine-metadata-harvester-aggregate-all-data\">recent blog post<\/a>.<\/p>\n\n\n\n<p>Thomas&#8217;s work involves using and developing Combine, software originally created by Wayne State University Libraries. Combine offers flexibility and repeatability to users handling diverse streams of metadata.\u00a0According to Thomas, the Library is working to incorporate several different types of ingestion processes, so that it\u2019s as easy to pull in a spreadsheet of metadata from a very small local history museum as it is to fetch records from selected U-M collections.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The\u00a0Digital Public Library of America\u00a0displays over 36 million records. While a large share come from \u2018Content Hubs\u2019 like the Smithsonian or HathiTrust, there are still millions of records ingested from a wide range of smaller institutions across America. 
&#8220;The technologies we use to transform and validate XML records, like XSLT, are well-established and highly reliable, but software for\u2026 <span class=\"read-more\"><a href=\"https:\/\/michigan.it.umich.edu\/news\/2020\/04\/14\/combine-metadata-harvester-aggregate-all-the-data\/\">Read More &raquo;<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":18827,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","_umich_oidc_access":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_ef_editorial_meta_date_first-draft-date":"","_ef_editorial_meta_paragraph_assignment":"","_ef_editorial_meta_checkbox_needs-photo":"","_ef_editorial_meta_number_word-count":"","footnotes":""},"categories":[5],"tags":[43,640,651,127],"class_list":["post-18826","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-campus-news","tag-data","tag-library","tag-productivity","tag-software"],"uagb_featured_image_src":{"full":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575.jpg",640,479,false],"thumbnail":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575-187x140.jpg",187,140,true],"medium":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575-267x200.jpg",267,200,true],"medium_large":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575.jpg",640,479,false],"large":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575.jpg",600,449,false],"1536x1536":["https:\/\/michigan.it.umich.e
du\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575.jpg",640,479,false],"2048x2048":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575.jpg",640,479,false],"excerpt-thumbnail":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575-200x140.jpg",200,140,true],"themonic-thumbnail":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575-60x42.jpg",60,42,true],"ioslider-thumbnail":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575-640x300.jpg",640,300,true],"post-thumbnail":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575.jpg",640,479,false],"400x250-crop":["https:\/\/michigan.it.umich.edu\/news\/wp-content\/uploads\/2020\/05\/Cereal_Harvest_in_Somerset_-_geograph.org_.uk_-_1484575.jpg",334,250,false]},"uagb_author_info":{"display_name":"News Staff","author_link":"https:\/\/michigan.it.umich.edu\/news\/author\/mitnewsadm\/"},"uagb_comment_info":0,"uagb_excerpt":"The\u00a0Digital Public Library of America\u00a0displays over 36 million records. While a large share come from \u2018Content Hubs\u2019 like the Smithsonian or HathiTrust, there are still millions of records ingested from a wide range of smaller institutions across America. 
&#8220;The technologies we use to transform and validate XML records, like XSLT, are well-established and highly reliable,&hellip;","_links":{"self":[{"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/posts\/18826","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/comments?post=18826"}],"version-history":[{"count":2,"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/posts\/18826\/revisions"}],"predecessor-version":[{"id":18829,"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/posts\/18826\/revisions\/18829"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/media\/18827"}],"wp:attachment":[{"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/media?parent=18826"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/categories?post=18826"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/michigan.it.umich.edu\/news\/wp-json\/wp\/v2\/tags?post=18826"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}