Develop Workflow for Data Export-Ingest for SCLFind

Description: We need a workflow that lets users from the Hargrett and Russell Libraries:

  • Export EAD.xml file(s) from ArchivesSpace on demand (this can be done using the ArchivesSpace API - see code for details)
    • The following export settings need to be set by default:
      • Include unpublished components = False
      • Include <dao> tags = True (aka digital objects)
      • Use numbered <c> tags = True (aka numbered container levels)
      • Convert to EAD3 = False
  • Run a series of edits on the exported EAD.xml file(s) (Edits outlined in Details)
  • Upload said file(s) to the sclfind-eads GitLab repo
  • Kick off an indexing job on SCLFind/ArcLight that indexes only the file(s) just uploaded (not the whole site)
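
A minimal sketch of the export step, assuming the standard ArchivesSpace backend API: login via `POST /users/:username/login`, EAD export via `GET /repositories/:repo_id/resource_descriptions/:id.xml`, and the query parameters `include_unpublished`, `include_daos`, `numbered_cs`, and `ead3` for the four settings above. The function names, base URL, and IDs are hypothetical and are not taken from the project's existing code.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Default export settings from this issue, mapped to what are assumed to be
# the matching ArchivesSpace EAD export query parameters.
DEFAULT_EXPORT_PARAMS = {
    "include_unpublished": "false",  # Include unpublished components = False
    "include_daos": "true",          # Include digital objects = True
    "numbered_cs": "true",           # Use numbered container levels = True
    "ead3": "false",                 # Convert to EAD3 = False
}


def ead_export_url(base_url, repo_id, resource_id, params=None):
    """Build the EAD export URL, applying the defaults unless overridden."""
    merged = {**DEFAULT_EXPORT_PARAMS, **(params or {})}
    path = f"/repositories/{repo_id}/resource_descriptions/{resource_id}.xml"
    return f"{base_url.rstrip('/')}{path}?{urlencode(merged)}"


def login(base_url, username, password):
    """Log in and return the session token (performs a network call)."""
    url = f"{base_url.rstrip('/')}/users/{username}/login"
    data = urlencode({"password": password}).encode()
    with urllib.request.urlopen(urllib.request.Request(url, data=data)) as resp:
        return json.load(resp)["session"]


def export_ead(base_url, session, repo_id, resource_id):
    """Download one resource as EAD XML bytes (performs a network call)."""
    req = urllib.request.Request(
        ead_export_url(base_url, repo_id, resource_id),
        headers={"X-ArchivesSpace-Session": session},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

Keeping the defaults in one dictionary means a user can export on demand without choosing settings, while a one-off override (for example `params={"ead3": "true"}`) remains possible.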

Additionally, users should be able to do the following with the above workflow:

  • Delete EAD.xml files from the sclfind-eads GitLab repo and kick off an indexing job to remove the collection and its data from SCLFind
  • Export PDF files from ArchivesSpace, upload them to a GitLab repo or equivalent, and kick off an indexing job on SCLFind that indexes each PDF and attaches it to the appropriate collection (see #118)
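
The upload and delete steps could both go through the GitLab Repository Files API (`POST`, `PUT`, or `DELETE` on `/projects/:id/repository/files/:file_path`). The sketch below only builds the request. The helper name, project path, branch, and file path are hypothetical, and the indexing trigger is left out because this issue does not specify its mechanism.

```python
import json
import urllib.request
from urllib.parse import quote


def gitlab_file_request(api_url, project, file_path, token, method,
                        branch="main", content=None, message=None):
    """Build a Repository Files API request.

    method: "POST" to create, "PUT" to update, "DELETE" to remove a file.
    The project path and file path must be URL-encoded, slashes included.
    """
    url = (
        f"{api_url.rstrip('/')}/projects/{quote(str(project), safe='')}"
        f"/repository/files/{quote(file_path, safe='')}"
    )
    payload = {
        "branch": branch,  # assumed default branch name
        "commit_message": message or f"{method} {file_path}",
    }
    if content is not None:
        payload["content"] = content  # file text for create/update
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method=method,
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
    )


# Example: a delete request for one EAD file (nothing is sent here;
# passing the request to urllib.request.urlopen would execute it).
req = gitlab_file_request(
    "https://gitlab.example.com/api/v4",  # hypothetical GitLab host
    "group/sclfind-eads",                 # hypothetical project path
    "hargrett/ms1.xml",                   # hypothetical file path
    "TOKEN",
    "DELETE",
)
```

Building the request separately from sending it keeps the add, update, and delete paths in one place and makes the step easy to check without touching the repo.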

Related #7

Documentation (More Info): https://docs.google.com/document/d/1XxN1iflwrP15Vy9y-w-7d_ISkN8pxpGQvdUi4_N-0hI/edit?usp=sharing