Hermes Software Configuration Changes in 2018

The links below served as guides for what was done. Assume that all of the indicated steps were executed as root.

The first post-acquisition update, in 2018... simply took Hermes from CentOS 7.3 to the latest OS release, 7.4.

A straightforward "yum update" initially failed. Error messages pointed to libgpod, which led to this page of known issues:
Yum update issues

The following easy workaround, which was originally documented in the CentOS-7 (1708) Release Notes, solved the problem:

yum downgrade libgpod
yum update
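
For anyone repeating this, the same two steps can be wrapped so the downgrade only happens if the plain update actually fails. This is just a minimal sketch: the function name update_with_workaround is made up for illustration, and it assumes that yum exits non-zero on the failure (the -y flag merely answers yum's prompts).

```shell
# Hypothetical helper: try "yum update"; if it fails, downgrade one
# blocking package (libgpod by default) and try the update once more.
update_with_workaround() {
    pkg=${1:-libgpod}
    if yum -y update; then
        return 0
    fi
    echo "yum update failed; downgrading $pkg and retrying" >&2
    yum -y downgrade "$pkg" && yum -y update
}

# Usage (as root):
#   update_with_workaround libgpod
```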

Then yum downloaded 829 packages and made 1585 updates, including the kernel. So as of 2/15/18, everything was up-to-date.
The kernel update seemed desirable in light of the recent Spectre and Meltdown vulnerabilities at the processor level.
A complete fix would involve updating microcode in BIOS, but a stable patch from Intel was still pending at the time of this update.

After a reboot, the usual login screen did not appear on the console! Instead, the tail of the boot-time messages appeared.
But Hermes was still accessible via ssh and even X2Go with MATE desktop. The console command line opened with ctrl-alt-F6.
It was not immediately clear whether the GNOME desktop broke during the update, or whether some other issue was to blame.
A scan through /var/log/messages turned up only one sign of trouble:

hermes gdm: GdmLocalDisplayFactory: maximum number of X display failures reached: check X server log for errors
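
A scan like this one can be roughly sketched with grep. The two function names below are invented for illustration. The second one relies on the X server's convention of marking error lines with (EE), and it assumes each log line may start with a bracketed timestamp.

```shell
# Hypothetical helper: show gdm entries in a syslog-style file.
gdm_lines() {
    grep -E ' gdm(\[[0-9]+\])?: ' "$1"
}

# Hypothetical helper: show only the error-level lines of an X server log.
# Anchoring at the start of the line skips the log's legend, which also
# mentions "(EE)" but not as the first marker on the line.
xorg_errors() {
    grep -E '^(\[[ 0-9.]*\] )?\(EE\)' "$1"
}

# Usage (as root):
#   gdm_lines /var/log/messages
#   xorg_errors /var/log/Xorg.0.log
```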

From delving into /var/log/Xorg.0.log, it immediately became clear that the problem had to do with the NVIDIA drivers.
Downloads of the original drivers were still present on disk, but it seemed logical to install the latest releases from NVIDIA instead.
Red Barn said that this alternative would be fine and recommended a simple installation procedure based on NVIDIA's scripts:

chmod +x *.run
./cuda_9.1.85_387.26_linux.run
./NVIDIA-Linux-x86_64-390.25.run

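CUDA is usually installed first because its installer typically bundles an older driver (387.26 here) than the one in NVIDIA's standalone driver download (390.25 here), so running the standalone installer second leaves the newer driver in place: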
Unix Drivers (the choice was "Linux x86_64... Latest Long Lived Branch version: 390.25")
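
The version numbers embedded in the two installer file names are enough to check the ordering argument. The sketch below uses made-up function names, assumes the file names follow the patterns seen above, and assumes GNU sort (for its -V version ordering).

```shell
# Hypothetical helper: driver version bundled with a CUDA installer named
# cuda_<cuda version>_<driver version>_linux.run
bundled_driver_version() {
    f=${1##*/}              # drop any leading directory
    f=${f#cuda_*_}          # drop "cuda_<cuda version>_"
    printf '%s\n' "${f%_linux.run}"
}

# Hypothetical helper: version of a standalone driver installer named
# NVIDIA-Linux-x86_64-<driver version>.run
standalone_driver_version() {
    f=${1##*/}
    f=${f#NVIDIA-Linux-x86_64-}
    printf '%s\n' "${f%.run}"
}

# Print whichever of two version strings sorts later (GNU sort -V).
newer_version() {
    printf '%s\n' "$1" "$2" | sort -V | tail -n 1
}

# Usage:
#   a=$(bundled_driver_version cuda_9.1.85_387.26_linux.run)
#   b=$(standalone_driver_version NVIDIA-Linux-x86_64-390.25.run)
#   newer_version "$a" "$b"
```

If the standalone driver comes out as the newer one, it makes sense to run that installer last.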

After a further reboot, the console went back to normal. And upon logging in at the login screen, a working desktop appeared!

The next set of steps was performed in March, starting on 3/1/18.

To prevent the NVIDIA problem from recurring, the DKMS (Dynamic Kernel Module Support) package was installed.
DKMS enables kernel modules such as device drivers to be automatically rebuilt and installed when a new kernel is installed.
You have to configure DKMS so that it knows which modules it is responsible for. Luckily NVIDIA makes it super-easy for you:
DKMS on CentOS 7 (One command is generally all it takes!)

dkms add -m nvidia -v 390.25
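
It is worth confirming afterward what DKMS thinks it has. The sketch below wraps "dkms status" in a made-up function, nvidia_dkms_state. The exact format of the status line differs between DKMS versions, so the function only looks for the state words (added, built, installed) somewhere in the output.

```shell
# Hypothetical helper: report the DKMS state of the nvidia module for a
# given driver version, based on the words in "dkms status" output.
nvidia_dkms_state() {
    version=$1
    status=$(dkms status -m nvidia -v "$version")
    case $status in
        *installed*) echo installed ;;
        *built*)     echo built ;;
        *added*)     echo added ;;
        *)           echo absent ;;
    esac
}

# Usage (as root):
#   nvidia_dkms_state 390.25
```

A result of "added" means the module is only registered; in that case "dkms build" and "dkms install" with the same -m and -v options may still be needed before the module is rebuilt automatically for new kernels.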

Files from Perseus were recovered... The first stage took place in March; the second stage was done in September.

The Perseus server had been refusing to boot for several months, and the archival data had to be retrieved from its hard drives.
Old HDDs were inserted one by one into an open drive bay on Hermes for data recovery. Five bays were available, but only one worked.
Another issue was that the old HDDs are 2.5", while the bays are 3.5". Red Barn came to the rescue with a couple of converters.
What Are the Dimensions of a 2.5 SATA Drive?

Users could retrieve their world-readable files from the Perseus HDDs, but old owner-exclusive files were inaccessible. Why?
Mappings from UIDs to usernames did not line up on old and new systems. This made the file owners "wrong" for Hermes users.
It was especially a problem for the old /home mount. Root had to copy over all those directories and run chown/chgrp manually.
Furthermore, the old boot drive with /home had to be hot-swapped in because Hermes refused to boot when it was present.

mkdir /data2/jason/oldhome
rsync -avz /data3/jason/ /data2/jason/oldhome
chown -R jason:jason /data2/jason/oldhome
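
The UID mismatch itself can be seen by comparing the two systems' passwd files, assuming the old one can still be read from the recovered boot drive. This is only a sketch: the function name uid_mismatches and the mount point in the usage line are hypothetical.

```shell
# Hypothetical helper: for every username present in both passwd-format
# files, print "user old_uid new_uid" when the numeric UIDs differ.
uid_mismatches() {
    old_passwd=$1
    new_passwd=$2
    awk -F: '
        NR == FNR { old[$1] = $3; next }          # first file: remember old UIDs
        ($1 in old) && old[$1] != $3 { print $1, old[$1], $3 }
    ' "$old_passwd" "$new_passwd"
}

# Usage (the /mnt/old mount point is an assumption):
#   uid_mismatches /mnt/old/etc/passwd /etc/passwd
```

Any user listed by this comparison would find their old owner-exclusive files unreadable until root runs chown on the copies, as was done above.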


Last updated on 8/7/20 by Steve Lantz (steve.lantz ~at~ cornell.edu)