Systemd boot order, Tomcat7 and nslcd

I had a problem yesterday with one of our servers not starting up tomcat after a reboot. I eventually tracked it down to an error in the boot ordering. I thought it might be useful to write down the steps I took to work out what was happening and how I fixed it.

First thing is tracking down the error. The logs had the following line:

start-stop-daemon: user 'xyz' not found

Now this suggests that the user I am trying to run tomcat as is not available. That fits, because xyz is a user that comes from LDAP. My hunch is that this is a boot order thing. So how do we start to find out what is going on?

Luckily systemd comes with quite a lot of tools to help sort this out. But first of all, a quick scan of the logs shows that nslcd, the daemon that provides LDAP users, does indeed start after tomcat.
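If you want to repeat that log scan, something like the following should do it. This is only a sketch: the helper name boot_order_log is made up for illustration, and it assumes both services log to the journal under their unit names.

```shell
# Hypothetical helper: show this boot's journal entries for two units,
# interleaved in time order, with monotonic (seconds since boot) timestamps.
boot_order_log() {
    journalctl -b -o short-monotonic --no-pager -u "$1" -u "$2"
}

# Usage on the affected host:
#   boot_order_log nslcd.service tomcat7.service
```

Whichever unit logs its first line with the smaller timestamp was started first.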

Let's start to look at what is happening. The first tool I picked is 'systemd-analyze' and this shows a little information.

systemd-analyze dot "tomcat7*"
digraph systemd {
        "graphical.target"->"tomcat7.service" [color="green"];
        "graphical.target"->"tomcat7.service" [color="grey66"];
        "multi-user.target"->"tomcat7.service" [color="green"];
        "multi-user.target"->"tomcat7.service" [color="grey66"];
        "shutdown.target"->"tomcat7.service" [color="green"];
        "tomcat7.service"->"network-online.target" [color="green"];
        "tomcat7.service"->"local-fs.target" [color="green"];
        "tomcat7.service"->"remote-fs.target" [color="green"];
        "tomcat7.service"->"systemd-journald.socket" [color="green"];
        "tomcat7.service"->"sysinit.target" [color="green"];
        "tomcat7.service"->"basic.target" [color="green"];
        "tomcat7.service"->"nss-lookup.target" [color="green"];
        "tomcat7.service"->"system.slice" [color="green"];
        "tomcat7.service"->"sysinit.target" [color="black"];
        "tomcat7.service"->"system.slice" [color="black"];
        "tomcat7.service"->"network-online.target" [color="grey66"];
        "tomcat7.service"->"shutdown.target" [color="red"];
}

This is a dot file and you can view it in graphical form using something like dotty, but for this small section I can read it fine. In this output the green edges are After= ordering, grey66 is Wants=, black is Requires= and red is Conflicts=. We can see here that there is no dependency between the two services that we are interested in. Now it may be that there are other things in play here, but let's continue to look at nslcd.

systemd-analyze dot "nslcd*"
digraph systemd {
        "atd.service"->"nslcd.service" [color="green"];
        "courier-pop-ssl.service"->"nslcd.service" [color="green"];
        "apache2.service"->"nslcd.service" [color="green"];
        "courier-ldap.service"->"nslcd.service" [color="green"];
        "kdm.service"->"nslcd.service" [color="green"];
        "mail-transport-agent.target"->"nslcd.service" [color="green"];
        "masqmail.service"->"nslcd.service" [color="green"];
        "courier-pop.service"->"nslcd.service" [color="green"];
        "graphical.target"->"nslcd.service" [color="green"];
        "graphical.target"->"nslcd.service" [color="grey66"];
        "kolab-cyrus-common.service"->"nslcd.service" [color="green"];
        "multi-user.target"->"nslcd.service" [color="green"];
        "multi-user.target"->"nslcd.service" [color="grey66"];
        "nullmailer.service"->"nslcd.service" [color="green"];
        "nslcd.service"->"system.slice" [color="green"];
        "nslcd.service"->"time-sync.target" [color="green"];
        "nslcd.service"->"basic.target" [color="green"];
        "nslcd.service"->"network-online.target" [color="green"];
        "nslcd.service"->"remote-fs.target" [color="green"];
        "nslcd.service"->"nss-lookup.target" [color="green"];
        "nslcd.service"->"slapd.service" [color="green"];
        "nslcd.service"->"sysinit.target" [color="green"];
        "nslcd.service"->"systemd-journald.socket" [color="green"];
        "nslcd.service"->"shishi-kdc.service" [color="green"];
        "nslcd.service"->"heimdal-kcm.service" [color="green"];
        "nslcd.service"->"heimdal-kdc.service" [color="green"];
        "nslcd.service"->"krb5-kdc.service" [color="green"];
        "nslcd.service"->"systemd-journald-dev-log.socket" [color="green"];
        "nslcd.service"->"sysinit.target" [color="black"];
        "nslcd.service"->"system.slice" [color="black"];
        "nslcd.service"->"network-online.target" [color="grey66"];
        "nslcd.service"->"shutdown.target" [color="red"];
        "citadel.service"->"nslcd.service" [color="green"];
        "courier-mta.service"->"nslcd.service" [color="green"];
        "cyrus-imapd.service"->"nslcd.service" [color="green"];
        "sendmail.service"->"nslcd.service" [color="green"];
        "cron.service"->"nslcd.service" [color="green"];
        "wdm.service"->"nslcd.service" [color="green"];
        "xdm.service"->"nslcd.service" [color="green"];
        "courier-mta-ssl.service"->"nslcd.service" [color="green"];
        "am-utils.service"->"nslcd.service" [color="green"];
        "slim.service"->"nslcd.service" [color="green"];
        "autofs.service"->"nslcd.service" [color="green"];
        "shutdown.target"->"nslcd.service" [color="green"];
        "display-manager.service"->"nslcd.service" [color="green"];
        "gdm3.service"->"nslcd.service" [color="green"];
        "exim4.service"->"nslcd.service" [color="green"];
        "dovecot.service"->"nslcd.service" [color="green"];
}
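
With a graph this long it is easy to miss a line by eye. A small sketch for checking whether two units share an edge in either direction; the function name has_edge is made up, and the sample lines are copied from the output above so the snippet runs without systemd.

```shell
# Hypothetical helper: read dot output on stdin and succeed if there is an
# edge between unit $1 and unit $2, in either direction.
has_edge() {
    grep -qE "\"$1\"->\"$2\"|\"$2\"->\"$1\""
}

# Two lines taken from the nslcd graph above.
sample='"apache2.service"->"nslcd.service" [color="green"];
"nslcd.service"->"slapd.service" [color="green"];'

for unit in apache2.service tomcat7.service; do
    if printf '%s\n' "$sample" | has_edge nslcd.service "$unit"; then
        echo "$unit: edge to nslcd.service"
    else
        echo "$unit: no edge to nslcd.service"
    fi
done

# On a live system the input would come from systemd itself:
#   systemd-analyze dot "nslcd*" | has_edge nslcd.service tomcat7.service
```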

Again no link between the two, but notice all those other services? I think we are heading in the right direction. Time for a different tool now. Let's look at the config for some of these services.

systemctl cat tomcat7
    # /run/systemd/generator.late/tomcat7.service
    # Automatically generated by systemd-sysv-generator

    [Unit]
    Documentation=man:systemd-sysv-generator(8)
    SourcePath=/etc/init.d/tomcat7
    Description=LSB: Start Tomcat.
    Before=multi-user.target
    Before=multi-user.target
    Before=multi-user.target
    Before=graphical.target
    Before=shutdown.target
    After=local-fs.target
    After=remote-fs.target
    After=network-online.target
    After=nss-lookup.target
    Wants=network-online.target
    Conflicts=shutdown.target

    [Service]
    Type=forking
    Restart=no
    TimeoutSec=5min
    IgnoreSIGPIPE=no
    KillMode=process
    GuessMainPID=no
    RemainAfterExit=yes
    ExecStart=/etc/init.d/tomcat7 start
    ExecStop=/etc/init.d/tomcat7 stop

This tells us a couple of things. First off, systemd is using the old sysv init script to start tomcat. And second, there is little in there to indicate a dependency on anything more than a basic system. Now let's look at nslcd.

systemctl cat nslcd
    # /run/systemd/generator.late/nslcd.service
    # Automatically generated by systemd-sysv-generator

    [Unit]
    Documentation=man:systemd-sysv-generator(8)
    SourcePath=/etc/init.d/nslcd
    Description=LSB: LDAP connection daemon
    Before=multi-user.target
    Before=multi-user.target
    Before=multi-user.target
    Before=graphical.target
    Before=shutdown.target
    Before=mail-transport-agent.target
    Before=display-manager.service
    Before=am-utils.service
    Before=apache2.service
    Before=atd.service
    Before=autofs.service
    Before=citadel.service
    Before=courier-ldap.service
    Before=courier-mta.service
    Before=courier-mta-ssl.service
    Before=courier-pop.service
    Before=courier-pop-ssl.service
    Before=cron.service
    Before=cyrus-imapd.service
    Before=dovecot.service
    Before=exim4.service
    Before=gdm3.service
    Before=kdm.service
    Before=kolab-cyrus-common.service
    Before=mail-transport-agent.target
    Before=masqmail.service
    Before=nullmailer.service
    Before=sendmail.service
    Before=slim.service
    Before=wdm.service
    Before=xdm.service
    After=remote-fs.target
    After=systemd-journald-dev-log.socket
    After=time-sync.target
    After=nss-lookup.target
    After=network-online.target
    After=slapd.service
    After=krb5-kdc.service
    After=heimdal-kdc.service
    After=heimdal-kcm.service
    After=shishi-kdc.service
    Wants=network-online.target
    Conflicts=shutdown.target

    [Service]
    Type=forking
    Restart=no
    TimeoutSec=5min
    IgnoreSIGPIPE=no
    KillMode=process

Ah okay, another sysv init script, and this is where my deps are defined. Note that this info is different from the systemd-analyze output, as it shows the config from the files and not a full dependency tree. So let's have a look at the top of the nslcd init script:

head -35 /etc/init.d/nslcd |tail -15

### BEGIN INIT INFO
# Provides:          nslcd
# Required-Start:    $remote_fs $syslog $time
# Required-Stop:     $remote_fs $syslog
# Should-Start:      $named $network slapd krb5-kdc heimdal-kdc heimdal-kcm shishi-kdc
# Should-Stop:       $network
# X-Start-Before:    $mail-transport-agent $x-display-manager am-utils apache2 atd autofs citadel courier-ldap courier-mta courier-mta-ssl courier-pop courier-pop-ssl cron cyrus-imapd dovecot exim4 gdm3 kdm kolab-cyrus-common mail-transport-agent masqmail nullmailer sendmail slim wdm xdm
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: LDAP connection daemon
# Description:       nslcd is a LDAP connection daemon that is used to
#                    do LDAP queries for the NSS and PAM modules.
### END INIT INFO

These are the LSB headers that systemd uses to work out the start order for old sysv init scripts, and the X-Start-Before header is the one we want to fix. Adding "tomcat7 tomcat8" to the end of that line will "fix" this, and I have filed a bug against Ubuntu to try and get that done: https://bugs.launchpad.net/ubuntu/+source/nss-pam-ldapd/+bug/1605167
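
Counting lines with head and tail only works for one particular version of the script. Here is a sketch that pulls a single header out by name instead. The function name lsb_header is made up, and the demo writes a cut-down copy of the header block to a temporary file so it runs anywhere.

```shell
# Hypothetical helper: print the value of LSB header $1 from init script $2.
# Only lines between the BEGIN/END INIT INFO markers are considered.
lsb_header() {
    sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/p' "$2" |
        sed -n "s/^# $1:[[:space:]]*//p"
}

# Demo input: a few of the nslcd headers shown above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/sh
### BEGIN INIT INFO
# Provides:          nslcd
# Required-Start:    $remote_fs $syslog $time
# Should-Start:      $named $network slapd krb5-kdc
### END INIT INFO
EOF

lsb_header Required-Start "$demo"
rm -f "$demo"

# Against the real script this would be:
#   lsb_header X-Start-Before /etc/init.d/nslcd
```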

In actual fact I ended up adding "nslcd" to the end of the Required-Start line in the tomcat7 init script as it felt like a better fit for my setup.

I think the correct systemd way is to add a drop-in file '/etc/systemd/system/tomcat7.service.d/override.conf' with the following content:

[Unit]
After=nslcd.service

This can be done simply by running 'systemctl edit tomcat7'.
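
Whichever fix you pick, it is worth checking that systemd has actually picked up the new ordering before the next reboot. A sketch of how that might look; the function name is_ordered_after is made up, and it only relies on 'systemctl show -p After', which prints the effective After= list for a unit.

```shell
# Hypothetical helper: succeed if unit $1 is ordered after unit $2 in
# systemd's loaded configuration.
is_ordered_after() {
    # Output looks like "After=a.target b.service ..."; put one name per
    # line and look for an exact match.
    systemctl show -p After "$1" | tr ' =' '\n\n' | grep -qxF "$2"
}

# Usage on the affected host:
#   is_ordered_after tomcat7.service nslcd.service && echo "ordering in place"
#
# If you edited an init script rather than using 'systemctl edit', run
# 'systemctl daemon-reload' first so the generated unit is rebuilt.
```

Note that After= only affects ordering. It does not cause nslcd to be started, which is fine here because nslcd is already enabled at boot.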

Either of those solutions seems a little odd to me; having all those services hard coded in one place feels wrong. Maybe a better solution would be to have another $ facility such as $networkusers and have services depend on that?
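
For what it's worth, systemd.special(7) describes a target that sounds a lot like this idea: nss-user-lookup.target, a synchronisation point for user and group lookups. Providers are meant to order themselves before it and pull it in, and consumers order themselves after it. I have not tried this on the server above, so treat the following as an untested sketch. The drop-in file name users.conf is made up, and the files are written to a scratch directory rather than the live config.

```shell
# Scratch directory so the sketch is safe to run anywhere. On a real host
# the drop-ins would go under /etc/systemd/system, followed by
# 'systemctl daemon-reload'.
unit_root="${UNIT_ROOT:-$(mktemp -d)}"

mkdir -p "$unit_root/nslcd.service.d" "$unit_root/tomcat7.service.d"

# The provider of LDAP users is ordered before the target and pulls it in.
cat > "$unit_root/nslcd.service.d/users.conf" <<'EOF'
[Unit]
Before=nss-user-lookup.target
Wants=nss-user-lookup.target
EOF

# The consumer is only ordered after the target and names no provider.
cat > "$unit_root/tomcat7.service.d/users.conf" <<'EOF'
[Unit]
After=nss-user-lookup.target
EOF

echo "drop-ins written under $unit_root"
```

The appeal is that tomcat7 no longer needs to know which daemon supplies the users, and nslcd no longer needs a list of everything that might want them.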
