Monday, September 30, 2019

CentOS 8 in the Cloud the Hard Way

Hello Internet,

Update 2020-02-08: Don't do this, there's proper images published now.

Consider this a draft. I know the formatting is pretty terrible. Not all the commands are clearly marked in italics, and italics turned out to be a bad way to show which parts are meant to be typed in anyway. In particular, the blocks where you enter text up to the EOF don't copy and paste nicely because Blogger's editor appended stray spaces. I'll need to clean up the HTML by hand, but it's been a while since I last used this. Sorry. This isn't a beginner's guide and you should expect to do some troubleshooting on your own.

It's been a while, but CentOS 8 was released this past week and I wanted to play with it in the cloud.

This post documents how I got it mostly running in Rackspace's cloud. I doubt anyone would want to use this for anything besides testing because, as you'll find out, not all the pieces you need for a good experience are available for el8 yet. Treat it as more of a just-for-fun toy.

The "right" way to import a custom image into that cloud is not what I'll be doing here. That's basically building an image locally and then uploading.

The approach I'll be doing here doesn't require any local build infrastructure since we're just going to boot into the network installer on the Emergency Console manually.

I'll be using a 4 GB General Purpose Cloud server. Probably could get away with running the installer on a 2 GB but I didn't bother trying.

There's also a boot.rackspace.com image that I'm not gonna use here.

Let's get started.

Build Instructions

Get a Cloud Server

I'm creating a new server:
  • Named "centos8-template"
  • Using the CentOS 7 image.
  • 4 GB General Purpose v1 Flavor
  • Local Boot Source (a CBS volume may have been a good idea too)
  • I set an SSH key
  • Left PublicNet and ServiceNet attached
  • Un-checked all the Recommended Installs
I didn't bother recording the root password since I'm installing over this image and the initial prep work uses my SSH key to log in.

Prep the System

I'm trying to omit the public IP. Realistically I'm deleting this server when I'm done and I don't think it really matters but that's why you're not seeing it. It's hopefully redacted everywhere as "publicIP" or later I got lazier and just redacted the last octet with "xyz" so sub in your IP if you're following along.

The next step is to update the grub config to make it easy to boot into the CentOS 8 installer.
Set the timeout to something long enough to catch at the console:
[root@centos8-template ~]# sed -i.bak 's/^GRUB_TIMEOUT=.*/GRUB_TIMEOUT=128/' /etc/default/grub
[root@centos8-template ~]# grep ^GRUB_TIMEOUT= /etc/default/grub
GRUB_TIMEOUT=128


That's normally 1 second.
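
The sed above only rewrites an existing GRUB_TIMEOUT line; if an image had no such line, it would silently do nothing. A minimal sketch that handles both cases (set_grub_timeout is a made-up helper name, not something grub ships):

```shell
# Hypothetical helper: set GRUB_TIMEOUT in a grub defaults file,
# appending the line if it is not already there. Keeps a .bak copy.
set_grub_timeout() (
    file=$1
    seconds=$2
    if grep -q '^GRUB_TIMEOUT=' "$file"; then
        # Same substitution as the sed one-liner in the post.
        sed -i.bak "s/^GRUB_TIMEOUT=.*/GRUB_TIMEOUT=${seconds}/" "$file"
    else
        cp -p "$file" "$file.bak"
        printf 'GRUB_TIMEOUT=%s\n' "$seconds" >> "$file"
    fi
)
```

Calling set_grub_timeout /etc/default/grub 128 would match the step above. Either way the new timeout only takes effect once grub2-mkconfig regenerates the config.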

I need to add a new boot entry to grub to run the installer, but because I'm writing this as I'm figuring it out, now is a good time to get the net installer kernel and initrd.
I'm going to use mirror.rackspace.com since it's nice and fast for that cloud: http://mirror.rackspace.com/centos/8/BaseOS/x86_64/os/images/pxeboot/

[root@centos8-template ~]# cd /boot/
[root@centos8-template boot]# wget https://mirror.rackspace.com/centos/8/BaseOS/x86_64/os/images/pxeboot/initrd.img https://mirror.rackspace.com/centos/8/BaseOS/x86_64/os/images/pxeboot/vmlinuz
...snip...
Total wall clock time: 0.7s
Downloaded: 2 files, 65M in 0.6s (103 MB/s)
[root@centos8-template boot]#


And for no good reason, I'm renaming both files:
[root@centos8-template boot]# mv {,centos8-}vmlinuz
[root@centos8-template boot]# mv {,centos8-}initrd.img
[root@centos8-template boot]# ls -al /boot/centos8-*
-rw-r--r--. 1 root root 60484580 Aug 15 21:22 /boot/centos8-initrd.img
-rw-r--r--. 1 root root  7872760 Jun  4 09:27 /boot/centos8-vmlinuz
[root@centos8-template boot]#
[root@centos8-template boot]# sha256sum  /boot/centos8-*
7d2374d0f91c2003b31711dc29f288fcea085af66c5ed639c0a726efe82ca926  /boot/centos8-initrd.img
44ac0373e729e06dd52be8ad925d2789f9a6e822aa9508b1694d58593f32d7a6  /boot/centos8-vmlinuz
[root@centos8-template boot]#


Ideally I should verify these against something authoritative...
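
A rough sketch of what that check could look like once a trusted hash is in hand. verify_sha256 is a hypothetical helper, and where the expected value comes from is the real problem: I believe the tree's .treeinfo file lists sha256 sums for the pxeboot images, but fetching it from the same mirror only catches a corrupted download, not a bad mirror.

```shell
# Hypothetical helper: succeed only if FILE hashes to EXPECTED
# (a hex sha256 string obtained from somewhere you trust).
verify_sha256() (
    file=$1
    expected=$2
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        exit 1
    fi
)
```

For example, verify_sha256 /boot/centos8-vmlinuz followed by the expected hash would return non-zero on a mismatch, so it can gate the rest of a script.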

With those, I'm going to build out the grub menu entry. 

To make that a little easier, I'm prep'ing some environment variables. Actually it's not easier, I just did it by hand my first pass through this attempt and didn't have this generalized scripting.

The installer will need the network configuration so building that up first:
[root@centos8-template boot]# export IFCFG=/etc/sysconfig/network-scripts/ifcfg-eth0
[root@centos8-template boot]# echo $IFCFG
/etc/sysconfig/network-scripts/ifcfg-eth0
[root@centos8-template boot]# export IPLINE="ip=$(awk -F= '/^IPADDR=/{print $2}' $IFCFG)::$(awk -F= '/^GATEWAY=/{print $2}' $IFCFG):$(awk -F= '/^NETMASK=/{print $2}' $IFCFG):centos8-template:eth0:none:$(awk -F= '/^DNS1=/{print $2}' $IFCFG)"
[root@centos8-template boot]# echo $IPLINE
ip=104.239.142.xyz::104.239.142.1:255.255.255.0:centos8-template:eth0:none:72.3.128.241
[root@centos8-template boot]#


That ip= line was probably the most annoying part of all this.
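
For reference, the field order used above is ip=client::gateway:netmask:hostname:interface:none:dns1, with the empty second field being the unused peer address. The same thing as a small function, which is a bit easier to read than the one-liner. This is only a sketch: ifcfg_get and build_ip_arg are made-up names, and it assumes the ifcfg file has plain IPADDR, GATEWAY, NETMASK and DNS1 lines like the CentOS 7 image here.

```shell
# Hypothetical helper: print the value of KEY from an ifcfg-style file,
# stripping optional double quotes around the value.
ifcfg_get() {
    awk -F= -v key="$1" '$1 == key { gsub(/"/, "", $2); print $2; exit }' "$2"
}

# Hypothetical helper: build the static ip= argument for the installer.
# Arguments: ifcfg file, hostname, interface name.
build_ip_arg() (
    cfg=$1 host=$2 iface=$3
    printf 'ip=%s::%s:%s:%s:%s:none:%s\n' \
        "$(ifcfg_get IPADDR "$cfg")" \
        "$(ifcfg_get GATEWAY "$cfg")" \
        "$(ifcfg_get NETMASK "$cfg")" \
        "$host" "$iface" \
        "$(ifcfg_get DNS1 "$cfg")"
)
```

With that, IPLINE=$(build_ip_arg "$IFCFG" centos8-template eth0) should give the same result as the export above.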

In this cloud we're on a Xen Hypervisor so I really need two kernel modules loaded early, or at least one of them: xen_netfront and xen_blkfront.

For reference, here's what the existing CentOS 7 system is using:
[root@centos8-template boot]# lsmod | grep xen
xenfs                  12667  1
xen_privcmd            13206  1 xenfs
xen_blkfront           26922  2
xen_netfront           27082  0
[root@centos8-template boot]#


How do I get those to load early in the boot?
http://man7.org/linux/man-pages/man7/dracut.cmdline.7.html
"rd.driver.pre" didn't seem to work or I did it wrong so I'm going another direction.

[root@centos8-template boot]# grep -H "" /sys/module/xen_netfront/parameters/*
/sys/module/xen_netfront/parameters/max_queues:4


I'm going to force setting that parameter and as a side effect get this driver to load early.

[root@centos8-template boot]# export FORCEMOD="xen_netfront.max_queues=4"
[root@centos8-template boot]#
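
To see what could go on the kernel command line in module.parameter=value form, the parameters a loaded module exposes can be read straight from sysfs, which is what the grep above does. A small sketch (module_param_args is a hypothetical helper, and the optional second argument exists only so it can be pointed at a test directory):

```shell
# Hypothetical helper: print "<module>.<param>=<value>" for every readable
# parameter of a loaded module. BASE defaults to /sys/module.
module_param_args() (
    mod=$1
    base=${2:-/sys/module}
    for p in "$base/$mod/parameters"/*; do
        # Skips the unexpanded glob when the module has no parameters.
        [ -r "$p" ] || continue
        printf '%s.%s=%s\n' "$mod" "$(basename "$p")" "$(cat "$p")"
    done
)
```

On the CentOS 7 system above, module_param_args xen_netfront should print the same xen_netfront.max_queues=4 that goes into FORCEMOD.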


[root@centos8-template boot]# export REPO="inst.repo=https://mirror.rackspace.com/centos/8/BaseOS/x86_64/os/"
[root@centos8-template boot]#


[root@centos8-template boot]# export CMDLINE="$IPLINE $REPO $FORCEMOD"
[root@centos8-template boot]# echo $CMDLINE
ip=104.130.127.xyz::104.130.127.1:255.255.255.0:centos8-template:eth0:none:72.3.128.241 inst.repo=https://mirror.rackspace.com/centos/8/BaseOS/x86_64/os/ xen_netfront.max_queues=4
[root@centos8-template boot]#



Assuming you're starting from this:
[root@centos8-template boot]# cat /etc/grub.d/40_custom
#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change
# the 'exec tail' line above.
[root@centos8-template boot]#


Run:
[root@centos8-template boot]# cat >> /etc/grub.d/40_custom <<EOF
menuentry 'CentOS 8 Installer $(date)' {
 load_video
 insmod gzio
 insmod part_msdos
 insmod ext2
 set root='hd0,msdos1'
 linux16 /boot/centos8-vmlinuz $CMDLINE
 initrd16 /boot/centos8-initrd.img
}
EOF


And you'll see:
[root@centos8-template boot]# cat /etc/grub.d/40_custom
#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change
# the 'exec tail' line above.
menuentry 'CentOS 8 Installer Mon Sep 30 05:14:13 UTC 2019' {
 load_video
 insmod gzio
 insmod part_msdos
 insmod ext2
 set root='hd0,msdos1'
 linux16 /boot/centos8-vmlinuz ip=104.130.127.xyz::104.130.127.1:255.255.255.0:centos8-template:eth0:none:72.3.128.241 inst.repo=https://mirror.rackspace.com/centos/8/BaseOS/x86_64/os/ xen_netfront.max_queues=4
 initrd16 /boot/centos8-initrd.img
}
[root@centos8-template boot]#
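
One thing to watch with the heredoc: because EOF is unquoted, $CMDLINE and $(date) are expanded when the text is written, which is why 40_custom ends up holding the literal values. If CMDLINE happens to be unset (a fresh ssh session, say), the entry is written with no kernel arguments and nothing complains. A sketch of a helper that prints the entry to stdout for review and refuses an empty command line; render_installer_entry is a made-up name and the paths are the ones used in this post:

```shell
# Hypothetical helper: print a grub menu entry for the installer.
# Arguments: menu title, kernel command line. Fails if the latter is empty.
render_installer_entry() (
    title=$1
    cmdline=$2
    if [ -z "$cmdline" ]; then
        echo "refusing to write an entry with an empty kernel command line" >&2
        exit 1
    fi
    printf "menuentry '%s' {\n" "$title"
    printf ' load_video\n insmod gzio\n insmod part_msdos\n insmod ext2\n'
    printf " set root='hd0,msdos1'\n"
    printf ' linux16 /boot/centos8-vmlinuz %s\n' "$cmdline"
    printf ' initrd16 /boot/centos8-initrd.img\n'
    printf '}\n'
)
```

Run render_installer_entry "CentOS 8 Installer $(date)" "$CMDLINE" on its own first, and append the output to /etc/grub.d/40_custom with >> once it looks right.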


Now to actually update the grub config.
[root@centos8-template boot]# grub2-mkconfig -o /etc/grub2.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-957.21.3.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-957.21.3.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-f938a40b07ef4f51af489d035b0c5605
Found initrd image: /boot/initramfs-0-rescue-f938a40b07ef4f51af489d035b0c5605.img
done
[root@centos8-template boot]#


And it's in there:
[root@centos8-template boot]# grep centos8 /boot/grub2/grub.cfg
 linux16 /boot/centos8-vmlinuz ip=104.130.127.xyz::104.130.127.1:255.255.255.0:centos8-template:eth0:none:72.3.128.241 inst.repo=https://mirror.rackspace.com/centos/8/BaseOS/x86_64/os/ xen_netfront.max_queues=4
 initrd16 /boot/centos8-initrd.img
[root@centos8-template boot]#
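
The grep only shows the kernel and initrd lines. To see where the new entry will sit in the menu before heading to the console, the titles can be pulled out of grub.cfg. A quick sketch (list_menuentries is a hypothetical helper; it only counts top-level lines that start with menuentry, so anything nested in a submenu is ignored):

```shell
# Hypothetical helper: list top-level menuentry titles from a grub.cfg,
# numbered from 0 in the order they appear in the file.
list_menuentries() {
    awk -F\' '/^menuentry / { print (n++) ": " $2 }' "$1"
}
```

Here list_menuentries /boot/grub2/grub.cfg should show the two CentOS 7 entries first and the installer entry last, so it's the bottom item in the menu.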


The fun begins.


Start the Installer

First, open the Emergency Console for the Cloud Server. You'll have to reconnect after the reboot, but might as well make sure it's working before starting the harder-to-debug section. The lazy way to reboot is to click the Ctrl-Alt-Del button on the console page.

The console will get disconnected. Wait 15 seconds, then click Connect or hit F5. If it's not ready, don't spam it or your console request may get rate-limited; just wait 10 seconds between attempts. You have a two-minute timeout window in grub anyway (the 128 seconds set earlier). Arrow down to the installer entry, cross your fingers, and hit Enter.

The waiting game begins. Will it actually work the first time? In my case, amazingly, yes... if we don't count all the tries from when I first attempted this write-up the other day.

You can probably figure out how to use the installer yourself.
The big changes I made:
  • Turn off KDUMP. It uses kexec and AFAIK doesn't work on Xen guest VMs.
  • Mess with the partitioning. I used custom partitioning with a single 18 GiB / partition and nothing else. That matches how the CentOS 7 image is laid out, and keeping it at 18 GiB makes it easier to copy this disk to a smaller server size later.
    • I ignored the lack of swap.

  • Then finally, I changed the Software Selection to "Server" without the GUI and checked Guest Agents.
  • Begin Install! (Good news for readers: I messed something up at this point (I was being silly on the partitioning screen), removed it from this write-up, and started over using the steps above, so they probably work right... except where Blogger split my long lines awkwardly and otherwise ruined a bunch of the pre-formatted text copied from ssh.)
  • Set a root password on the next screen, and wait for the install to complete.
    I picked an OK password, but as a precaution, while this was running I assigned both Cloud Networks to security groups whitelisting only my IP, the Rackspace monitoring poller ranges, and the DNS server IPs. (For DFW that means 72.3.128.241 and .240; the .241 address is the last part of the kernel command line's ip= section.)
  • Click the reboot button!
  • Open the console again. Your session for the tab probably timed out so close the console tab, refresh the server info page, and then open the console again.
  • You should now have a CentOS Linux 8 (Core) prompt!

Post-Installer Configuration

Next I need nova-agent working. There's no el8 package that I could find, so we're building from source!
Working from the console isn't any fun so I'm ssh'ing back in. You'll need to delete the old known_hosts entry first, since the host key changed with the reinstall.
The first thing I did was add my authorized_keys file so I don't have to enter the root password I set in the installer.
You might be tempted to take an image at this point, but without nova-agent I don't think it's going to be as useful.
First, install git.
[root@centos8-template ~]# dnf -y install git

And some python3 packages:
[root@centos8-template ~]# dnf install python3-pip python3-virtualenv python3-devel

Change directory over to /opt:
[root@centos8-template ~]# cd /opt/
[root@centos8-template opt]#

Get nova-agent:
[root@centos8-template opt]# git clone https://github.com/Rackspace-DOT/nova-agent.git

Create a virtual-environment for nova-agent to run out of:
[root@centos8-template opt]# python3 -m venv env-nova-agent

[root@centos8-template opt]# source env-nova-agent/bin/activate
(env-nova-agent) [root@centos8-template opt]#

Install some more missing pre-req's:
(env-nova-agent) [root@centos8-template opt]# dnf group install 'Development Tools'

Now build and install nova-agent!
(env-nova-agent) [root@centos8-template opt]# pip3 install nova-agent/

When it's done:
(env-nova-agent) [root@centos8-template opt]# which nova-agent
/opt/env-nova-agent/bin/nova-agent

Copy the systemd unit file to manually install it:
(env-nova-agent) [root@centos8-template opt]# cp nova-agent/etc/nova-agent.service /etc/systemd/system/

And fix the path:
(env-nova-agent) [root@centos8-template opt]# sed -i.bak 's#/usr/bin/nova-agent#/opt/env-nova-agent/bin/nova-agent#' /etc/systemd/system/nova-agent.service

(env-nova-agent) [root@centos8-template opt]#

(env-nova-agent) [root@centos8-template opt]# cat /etc/systemd/system/nova-agent.service
[Unit]
Description=Nova Agent for xenstore
DefaultDependencies=no
Wants=cloud-init-local.service
Before=network-online.target
Before=cloud-init.service

[Service]
Type=notify
TimeoutStartSec=360
ExecStart=/opt/env-nova-agent/bin/nova-agent --no-fork True -o /var/log/nova-agent.log -l info

[Install]
WantedBy=multi-user.target
(env-nova-agent) [root@centos8-template opt]#

Notice the ExecStart line now refers to the result of the earlier "which" command.
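
The sed above quietly does nothing if the unit file ever ships with a different path than /usr/bin/nova-agent. A slightly more defensive sketch that replaces whatever binary ExecStart points at and then checks the result. patch_execstart is a made-up helper, and it assumes the new path contains no # characters since that is the sed delimiter:

```shell
# Hypothetical helper: point ExecStart in a unit file at NEW_BIN, keeping
# the existing arguments, and fail loudly if the line did not change.
patch_execstart() (
    unit=$1
    new_bin=$2
    # Replace only the first word after "ExecStart=" (the binary path).
    sed -i.bak "s#^ExecStart=[^ ]*#ExecStart=${new_bin}#" "$unit"
    if ! grep -qF "ExecStart=${new_bin}" "$unit"; then
        echo "ExecStart was not updated in $unit" >&2
        exit 1
    fi
)
```

Something like patch_execstart /etc/systemd/system/nova-agent.service "$(which nova-agent)" would tie the two steps together while the virtualenv is still active.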

Get it started!
(env-nova-agent) [root@centos8-template opt]# systemctl daemon-reload
(env-nova-agent) [root@centos8-template opt]# systemctl enable --now nova-agent.service
(env-nova-agent) [root@centos8-template opt]# systemctl status nova-agent.service
nova-agent.service - Nova Agent for xenstore
  Loaded: loaded (/etc/systemd/system/nova-agent.service; enabled; vendor preset: disabled)
  Active: active (running) since Mon 2019-09-30 02:12:27 CDT; 7s ago
Main PID: 29252 (nova-agent)
   Tasks: 2 (limit: 25014)
  Memory: 14.5M
  CGroup: /system.slice/nova-agent.service
          └─29252 /opt/env-nova-agent/bin/python3 /opt/env-nova-agent/bin/nova-agent --no-fork True -o /var/log/nova-agent.log -l info

Sep 30 02:12:27 centos8-template systemd[1]: Starting Nova Agent for xenstore...
Sep 30 02:12:27 centos8-template systemd[1]: Started Nova Agent for xenstore.
(env-nova-agent) [root@centos8-template opt]#

You probably want to install cloud-init as well.
First disconnect ssh and reconnect to get out of the virtualenv (running deactivate works too).
[root@centos8-template ~]# dnf install cloud-init

After that's done installing, add a custom config file:

[root@centos8-template ~]# cat > /etc/cloud/cloud.cfg.d/10_rackspace.cfg <<EOF
datasource_list: [ ConfigDrive, None ]
disable_root: False
ssh_pwauth: True
ssh_deletekeys: False
resize_rootfs: noblock
growpart:
 mode: auto
 devices: ['/']

cloud_config_modules:
- disk_setup
- mounts
- ssh-import-id
- locale
- set-passwords
- package-update-upgrade-install
- yum-add-repo
- timezone
- puppet
- chef
- salt-minion
- mcollective
- disable-ec2-metadata
- runcmd
- byobu

cloud_init_modules:
- migrator
- bootcmd
- write-files
- growpart
- resizefs
- set_hostname
- update_hostname
- update_etc_hosts
- rsyslog
- users-groups
- ssh

cloud_final_modules:
- rightscale_userdata
- scripts-per-once
- scripts-per-boot
- scripts-per-instance
- scripts-user
- phone-home
- final-message
network: {config: disabled}
EOF

And then enable it:
[root@centos8-template ~]# systemctl enable cloud-init

nova-agent doesn't know NetworkManager so install the legacy ifup scripts:
[root@centos8-template ~]# dnf install network-scripts

And turn on the old network service unit:
[root@centos8-template ~]# systemctl enable network.service

If you're fine with only ever building servers of 4GB and larger from this image, skip this step.
Otherwise, to keep growpart from expanding / (for now), make a small second partition right after it.

[root@centos8-template ~]# parted -a opt /dev/xvda mkpart primary ext4 37750784s 37754784s
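
Those sector numbers aren't explained above, but they line up if / starts at sector 2048 (the usual 1 MiB alignment, which is an assumption here since the partition table isn't shown) and sectors are 512 bytes. A sketch of the arithmetic, with first_sector_after as a hypothetical helper:

```shell
# Assumed logical sector size; not read from the device.
SECTOR_BYTES=512

# Hypothetical helper: first sector after a partition that starts at
# START_SECTOR and is SIZE_GIB gibibytes long.
first_sector_after() {
    echo $(( $1 + $2 * 1024 * 1024 * 1024 / SECTOR_BYTES ))
}
```

first_sector_after 2048 18 gives 37750784, the start used in the parted command. The end sector is 4000 sectors later, so the guard partition is only about 2 MB; it just has to sit directly behind / so there's no free space for growpart to grow into.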

Let's reboot!

Just before rebooting, make sure you can log in via the console. We're about to mess with networking.
Good right, ok.

[root@centos8-template ~]# reboot

After about 30 seconds it should be back up. Check the console if it's not.

Once it's back up, we'll try triggering nova-agent using a reset-network api call.
The easy way to make the API call is with "Pitchfork":
https://pitchfork.rax.io/servers/#resize_server-cloud_servers
Click the Log In link in the upper right and log in. I used my Username and API key.
Set the region to the correct one for your cloud server.

Hit Details on the "Reset Network" line. Put your cloud server's UUID in the box and click Send API Call.
The network should drop for a moment and when it's back, "ip a" should show eth0 and eth1 configured. Reboot again. Regret that you forgot to enable network.service the first time.

So that network and CBS hotplug work, let's install some xentools:
# dnf -y install golang
# cd /opt/
# git clone https://github.com/xenserver/xe-guest-utilities.git
# cd xe-guest-utilities
# go get golang.org/x/sys/unix
# make
# cd /
# tar -hzxf /opt/xe-guest-utilities/build/dist/xe-guest-utilities_6.*_x86_64.tgz
# chkconfig xe-linux-distribution on
# reboot

Extra credit

I wanted to rebuild this as a 1 GB server. This is why I limited / to 18 GiB (to stay safely away from 20 GiB which is the default storage size for the 1GB GP v1 server).

The idea here is to get two servers into rescue mode, then dd /dev/xvdb from the new centos8 server to a new 1 GB centos7 server's /dev/xvdb. Then delete the guard partition (#2), exit rescue mode on the smaller one, and send it the reset-network API call again. Finally, take an image of that one and call it the CentOS 8 Template.
I'm at my quota limit, so I'm dd'ing to a file on another existing server, deleting this 4GB server, then building the 1GB server to dd onto in rescue mode.

Power down this server:
[root@centos8-template ~]# poweroff

Put the server in Rescue Mode and note the password. (Your ssh key should also work, but known_hosts will complain again since rescue mode has a different host key than the normal running OS.)
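
Every time the host key changes (after the reinstall, and again going in and out of rescue mode), ssh will refuse to connect until the stale known_hosts entry is gone. Rather than editing the file by hand, ssh-keygen -R can drop it. A tiny wrapper as a sketch (forget_host_key is a made-up name; the second argument is optional and defaults to the current user's known_hosts):

```shell
# Hypothetical helper: remove all stored keys for HOST from a known_hosts
# file. ssh-keygen keeps the previous contents in <file>.old.
forget_host_key() {
    ssh-keygen -R "$1" -f "${2:-$HOME/.ssh/known_hosts}"
}
```

For example, forget_host_key publicIP before reconnecting, with your server's address in place of publicIP.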

Build or log into a second 1 GB Cloud Server in the same datacenter.

If you're going direct:

SSH in to the small server with agent forwarding enabled (ssh -A).
At this point both servers should be in rescue mode so get the ServiceNet IP of the centos8-template server.
Run:
# ssh root@SERVICENETIPHERE
If this isn't working, a security group on the cloud network might be getting in your way.
The lazy fix is to remove it. The better fix is to add both servers' ServiceNet IPs to a whitelist for both.

You should now be on the centos8-template in rescue mode. Install pv and pigz (pigz dropped the transfer time from 3 minutes 30 seconds to 1 minute 19 seconds).
# yum -y install pv pigz
Then exit back to the 1GB server's rescue mode
# exit

You're back on the 1GB in rescue mode (and know ssh works so start the copy)
# ssh root@SERVICENETIPHERE "pv -f -Ss 19330449920 /dev/xvdb | pigz --fast" | zcat > /dev/xvdb

You'll get a nice little progress indicator which should vary wildly depending on how much the image is compressing at the moment. 
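
The byte count handed to pv isn't arbitrary. It appears to be everything from the start of the disk through the last sector of the guard partition (37754784 in the parted command earlier), assuming 512-byte sectors. A sketch of that calculation, with bytes_through_sector as a hypothetical helper:

```shell
# Hypothetical helper: number of bytes from the start of the disk up to and
# including END_SECTOR, assuming 512-byte sectors.
bytes_through_sector() {
    echo $(( ($1 + 1) * 512 ))
}
```

bytes_through_sector 37754784 gives 19330449920. If your partition sizes differ, "parted /dev/xvdb unit s print" on the source server shows the end sector to plug in.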

Once it's done, delete the partition /dev/xvdb2.
# parted /dev/xvdb rm 2

Power off
# poweroff

Give it a minute and then exit rescue mode on this new 1GB server.
Check the console to see when it's up.

When it's up, try ssh'ing in. If it doesn't work, make the reset-network api call again using this server's uuid.

SSH should work at this point. Make a server image.
Try building a server from that image. Assuming this third server built from the image turns out well, you should be able to delete the first two now.