[clug] The Tale of The Very Dead Ubuntu box

Stephen Hocking stephen.hocking at gmail.com
Thu Jun 6 20:22:29 UTC 2019


Yeah, I should've mentioned that we did that. It was useful to take
snapshots at a few milestones along the way.

On Thu., 6 Jun. 2019, 23:25 George at Clug via linux, <linux at lists.samba.org>
wrote:

>
>
> On Thursday, 06-06-2019 at 18:54 Stephen Hocking via linux wrote:
> > Hi all,
> >
> > Gather around for an entertaining story….
> >
> > We had a box, which was not supported by or known to us, that we were
> > asked to help with.
> >
> > It was used by our client, both in-country & overseas.
> > It was not backed up, except for an occasional DB dump.
> > The source code for the website it ran wasn’t backed up either, as far as
> > anyone knew.
> > It wasn’t documented anywhere.
> >
> > It had crashed at some stage, and was so badly mangled that it was sitting
> > at the grub prompt, because the grub config that specified which kernel to
> > load had itself been trashed. After some research, we determined that
> > we should use a rescue CD to see what kernels were actually on the box.
> > There were hundreds. Some were missing their initrd files. We picked an
> > intact one and typed the following:
> >
> > set root=(hd0,1)
> > linux /boot/vmlinuz-3.13.0-98-generic
> > initrd /boot/initrd.img-3.13.0-98-generic
> > boot
> >
> > The box booted, then quickly panicked, because it didn’t know what its root
> > partition was. Altering the “linux” line to:
> >
> > linux /boot/vmlinuz-3.13.0-98-generic root=/dev/sda1
> >
> > fixed that error, but another crash occurred because /sbin/init could not
> > load a shared library (why it wasn’t linked statically is a mystery to me).
> > This posed something of a problem. By looking at another Ubuntu box (my
> > laptop) we could determine which package provided that shared library (by
> > using the apt-file utility), so that we could reinstall it. To do this, we
> > needed to boot off the rescue CD and install packages from it. The usual
> > method of mounting the box’s root filesystem on /mnt, chrooting to /mnt,
> > and then using dpkg to install packages didn’t work, because various shared
> > libraries that the packaging utilities used were missing. Getting out of
> > the chroot environment and running dpkg or apt-get pointed at an install
> > root other than / was a bit beyond us at this point (we were getting a
> > little tired and were on a steep learning curve).
> >
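
One way to shorten the "boot, see which library is missing, repeat" loop is to read the list of needed libraries straight out of the binary while still booted from the rescue CD. This is only a rough sketch: the function names and the two search directories are my own assumptions, it needs readelf (binutils) on the rescue media, and a file that exists may of course still be corrupt.

```shell
#!/bin/sh
# Sketch: list shared libraries a binary needs that cannot be found under a
# mounted root. Function names and search directories are assumptions.

# Read `readelf -d` output on stdin and print each NEEDED library name.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# Read library names on stdin and print those with no file of that name
# under the usual library directories of the root given as $1.
missing_under_root() {
    root=$1
    while IFS= read -r lib; do
        found=$(find "$root/lib" "$root/usr/lib" -name "$lib" 2>/dev/null | head -n 1)
        [ -n "$found" ] || printf '%s\n' "$lib"
    done
}

# Possible use from the rescue CD, with the damaged root mounted on /mnt:
#   readelf -d /mnt/sbin/init | needed_libs | missing_under_root /mnt
# Then, on a healthy machine of the same release (library name is made up):
#   apt-file search libexample.so.1
```

Plain ldd is less useful here, since from the rescue CD it resolves against the rescue system's libraries rather than the ones under /mnt.
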
> > Now Debian/Ubuntu packages are created using the “ar” utility, which is
> > normally used to create library archives for programs to be linked against.
> > The .deb file is an archive of three components: debian-binary,
> > control.tar.gz and data.tar.xz. The file data.tar.xz holds all the files
> > that make up the package. We extract these components using the “ar x
> > somepkg.deb” command. Sometimes, for older packages, the data.tar file
> > has a .gz extension. One can do a pseudo install by changing to the root
> > directory of the installation (which, if we’re in rescue CD mode and have
> > mounted the root filesystem under /mnt, is /mnt) and unpacking the data.tar
> > file. This, of course, will not update the package utility databases.
> >
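
For anyone wanting to try the pseudo install described above, it could be wrapped up roughly like this. It is a sketch, not what was actually typed on the day: the function name and the example path are hypothetical, and it assumes ar and a GNU tar that detects the payload's compression are available on the rescue media.

```shell
#!/bin/sh
# Sketch: unpack a .deb's payload into a mounted root without dpkg.
# The function name is hypothetical; needs ar (binutils) and tar.
deb_pseudo_install() {
    deb=$1      # path to the .deb file
    target=$2   # mounted root filesystem, e.g. /mnt
    # Make the .deb path absolute, because "ar x" extracts into the
    # current directory and we change directory below.
    case $deb in
        /*) ;;
        *) deb=$PWD/$deb ;;
    esac
    work=$(mktemp -d) || return 1
    ( cd "$work" && ar x "$deb" ) || { rm -rf "$work"; return 1; }
    status=1
    # The payload may be data.tar.xz, data.tar.gz, and so on.
    for payload in "$work"/data.tar*; do
        [ -f "$payload" ] || continue
        tar -xf "$payload" -C "$target" && status=0
    done
    rm -rf "$work"
    return "$status"
}

# Example (made-up package path):
#   deb_pseudo_install /cdrom/pool/main/e/example/example_1.0_amd64.deb /mnt
```

As noted above, nothing here touches the dpkg database, so the package should be reinstalled properly once apt-get works again.
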
> > Iterating through the process of booting the box from the grub prompt,
> > seeing what shared libraries were missing for /sbin/init, and installing
> > the relevant packages allowed us to get past /sbin/init causing a crash.
> > The boot process then continued to a point where it would spontaneously
> > reboot. This, obviously, was not desirable. We got past this by changing
> > the “linux” command line to the old standby of
> >
> > linux /boot/vmlinuz-3.13.0-98-generic root=/dev/sda1 init=/bin/bash
> >
> > This revealed that there were a few other shared library packages that
> > needed reinstalling to run bash, so we do the dance of booting off the
> > rescue CD and extracting files to put the shared libraries in place until
> > we end up with a bash shell running. We remount the root filesystem as
> > read/write, try running a couple of utilities, and discover that the libc
> > version that we installed off the rescue CD isn’t the right one. It turns
> > out we’re on an Ubuntu 14.04 system, whereas the rescue CD is Ubuntu 12.04.
> > Whoops. A 14.04 CD is procured, and we re-install the packages we’d
> > ghetto-installed, including libc. Fortunately there were only 4 packages
> > installed via that method. This fixed the libc problem.
> >
> > Obviously the ghetto install of packages really isn’t viable as a solution,
> > so after a couple of false starts, we manage to get the network up and
> > running and attempt to reinstall packages from a repository on the network.
> > We have to ghetto-install from the mounted CD a few more times to get
> > things to the point where apt-get doesn’t complain about missing shared
> > libraries, and then we re-install all the grub packages. This rebuilds the
> > grub config so that we can reboot successfully without having to type the
> > manual grub commands above. A number of pam library modules have to be
> > re-installed before we can log in at the console, and a bunch of other
> > libraries have to be re-installed before the sshd daemon will start and
> > talk to LDAP.
> >
> > There’s a bunch of work to do with the app – it appears that a lot of the
> > shared libraries on the system have been thoroughly mangled, but we can
> > reboot the box and do the basic OS things.
> >
> > Enter the debsums utility. This fine thing examines the package database
> > and reports back if any of the files (excluding the config files, which can
> > be expected to be edited locally) are corrupt. If a file isn’t there, it
> > reports that to stderr. Redirecting stderr through a suitable pipeline of
> > grep, sed & sort -u gives us a list of packages that have missing files. A
> > series of “apt-get install --reinstall somepkgs” later, we can now start
> > mysql & apache2.
> >
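
The exact pipeline isn't given above, but something along these lines would do the job. It is a sketch that assumes debsums reports missing files on stderr as "debsums: missing file PATH (from PKG package)"; check that against the output of the installed version first. The function name is my own.

```shell
#!/bin/sh
# Sketch: turn debsums "missing file" reports into a unique package list.
# Assumes input lines of the form:
#   debsums: missing file /usr/bin/example (from example package)
missing_file_packages() {
    sed -n 's/^debsums: missing file .* (from \(.*\) package)$/\1/p' | sort -u
}

# Possible use (stdout discarded, stderr sent down the pipe):
#   debsums 2>&1 >/dev/null | missing_file_packages
# and then, after reviewing the list:
#   apt-get install --reinstall pkg1 pkg2 ...
```

Reviewing the list by hand before reinstalling seems wiser than piping it straight into apt-get on a box in this state.
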
> > We didn’t know the root password of the mysql database, so we cracked that
> > via the usual stackoverflow advice, which then allowed one of my colleagues
> > to install a dump of the DB that had been taken a few days earlier. The box
> > is then snapshotted (it’s running under VMware). After making sure all the
> > package installs are complete, we do an “apt-get update && apt-get
> > dist-upgrade”, reboot again, verify that the app is working, and hand it
> > back to a deeply grateful client.
> >
>
> "reboot again, verify that the app is working, and hand it back to a
> deeply grateful client.", you mean this is a story with a happy ending?  I
> am so very impressed, good work to the team.
>
> Just to ask, if the computer was a virtual machine, did anyone do a clone
> as the first step, and then proceed to work on repairing the clone?
>
> Not that it seems taking a clone was required, so congratulations.
>
> As Bryan commented, "did the client learn from this experience", and have
> they now implemented backups and system documentation? (That was a
> rhetorical question. I do hope they have, but ... I ask myself, do I?)
>
> George.
>
> >
> > --
> >
> >   "I and the public know
> >   what all schoolchildren learn
> >   Those to whom evil is done
> >   Do evil in return"          W.H. Auden, "September 1, 1939"
> > --
> > linux mailing list
> > linux at lists.samba.org
> > https://lists.samba.org/mailman/listinfo/linux
> >
>
>

