Move GHNP infrastructure to cloud
Migrate the following components into AWS (checkboxes to track completion)
- Webserver (Apache 2.4.46/Django ?.?.?)
- Database (PostgreSQL 11.1)
- Solr Index (Solr 6.6.6)
Retain the following components at UGA
- IIIF Image server (Cantaloupe 4.0.3)
- Newspaper JP2s (stored on a ZFS filesystem)
Components to test and/or infrastructure decisions to be made
- Speed and function of search when run on a single EC2 instance as opposed to a distributed SolrCloud
- Speed and function of the database when run as an Aurora RDS instance vs. a self-managed EC2 instance
- Speed and function of Apache/Django running on the Ubuntu 16 Amazon Machine Image (AMI)
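To make the speed comparisons above repeatable, the same timing harness could be pointed at each candidate setup. The following is only a sketch: `benchmark` and `run_query` are hypothetical names, not part of the existing GHNP code, and `run_query` stands in for whatever single request (a Solr search, a database query, a page load) is being compared.

```python
import math
import statistics
import time
from typing import Callable, Dict


def benchmark(run_query: Callable[[], object], repeats: int = 20) -> Dict[str, float]:
    """Call run_query `repeats` times and summarize latency in milliseconds.

    run_query is a hypothetical zero-argument callable that performs one
    request against the system under test.
    """
    if repeats < 2:
        raise ValueError("repeats must be at least 2")

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }
```

Running it once per configuration with an identical set of queries would give comparable median and tail latencies. It measures speed only; checking function (correct results) still needs a separate comparison of the responses.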
Current potential issues
- The IIIF server is currently served through a reverse proxy, so image data would flow through AWS unnecessarily. Because AWS charges for data egress, this would be costly. A fix would be to have the user's browser access the IIIF server directly. This enhancement would require light sysadmin work and potentially a firewall exception from UGA.
- It is unclear how long AMI lifetimes are and whether AMIs will be forcibly upgraded the way RDS databases are. I'll investigate this.
- The newspaper batch load process currently depends heavily on the webserver. We will need to carefully introduce a way to keep loading newspapers locally at UGA, so that they are retained in the IIIF-attached storage, while also 'injecting' them into the cloud-hosted service.
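The egress concern in the first item above can be roughed out with simple arithmetic before deciding how urgent the direct-access fix is. This is a sketch only: the function name, the request volume, the average response size, and the per-GB price are all placeholder assumptions, not measured GHNP traffic or quoted AWS rates.

```python
def estimate_monthly_egress_cost(
    image_requests_per_month: int,
    avg_response_bytes: int,
    price_per_gb_usd: float,
) -> float:
    """Estimate the monthly cost of proxying IIIF image responses out of AWS.

    All three inputs are assumptions to be replaced with real figures.
    Uses decimal gigabytes (1 GB = 10**9 bytes).
    """
    if min(image_requests_per_month, avg_response_bytes) < 0 or price_per_gb_usd < 0:
        raise ValueError("inputs must be non-negative")
    total_gb = image_requests_per_month * avg_response_bytes / 1e9
    return total_gb * price_per_gb_usd


# Placeholder numbers purely for illustration.
cost = estimate_monthly_egress_cost(
    image_requests_per_month=1_000_000,
    avg_response_bytes=500_000,
    price_per_gb_usd=0.09,
)
print(f"Estimated egress cost: ${cost:,.2f} per month")
```

Plugging in request counts from the current Apache logs and the current AWS data transfer price would show whether the proxied traffic is large enough to justify the firewall exception.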
Edited by Kevin Cottrell