Saturday, March 4, 2017

SSSD: {DBus,Socket}-activated responders (2nd try!)

Second time's the charm! :-)


Since the first post on this topic, some improvements have been made to fix a bug found and reported by a Debian user (thanks, Stric!).

The fix is part of the SSSD 1.15.1 release, which also brings some other robustness improvements. Let's go through the changes ...


Avoid starting the responders before SSSD is up!


I found out that the NSS responder was being started before SSSD, which is quite problematic during the boot process, as libc does initgroups on pretty much any account (checking all NSS modules in order, to be precise).

By calling sss_nss, the NSS responder is triggered and tries to talk to the data providers (which are not up yet, as SSSD is not up yet ...), causing the boot process to hang until libc gives up (and causing a timeout on services like systemd-logind and every service depending on it).

The fix for this issue looks like:
@@ -1,6 +1,7 @@
  [Unit]
  Description=SSSD @responder@ Service responder socket
  Documentation=man:sssd.conf(5)
+ After=sssd.service
  BindsTo=sssd.service
 
  [Socket]

And, as the systemd developers told me that "BindsTo=" must always come together with "After=" (although this is not documented yet ...), this fix has been applied to all responders' unit files.
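
The "BindsTo= must come with After=" rule is easy to check mechanically. The helper below is only a sketch: check_bindsto_has_after is a hypothetical name, not something shipped by SSSD or systemd, and it assumes each directive sits on its own line with space-separated unit names.

```shell
#!/bin/sh
# Hypothetical lint helper (not part of SSSD or systemd): for every unit
# named in a BindsTo= line, check that some After= line names it too.
check_bindsto_has_after() {
    unit_file=$1
    status=0
    # Collect the space-separated values of all BindsTo= lines.
    for dep in $(sed -n 's/^[[:space:]]*BindsTo=//p' "$unit_file"); do
        # Look for exactly the same unit name among the After= values.
        if ! sed -n 's/^[[:space:]]*After=//p' "$unit_file" \
                | tr ' ' '\n' | grep -Fqx "$dep"; then
            echo "$unit_file: BindsTo=$dep has no matching After=$dep"
            status=1
        fi
    done
    return $status
}
```

Calling it with the path of an installed responder unit (the location depends on the distribution) prints one line per missing ordering dependency and returns non-zero if any is found.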


Avoid starting the responders' sockets before SSSD is up!


We really want (at least for now) to have the responders' sockets completely tied to the SSSD service. We want the responders to be socket-activated only after SSSD is up; the previous section explains why we want this kind of control.

In order to achieve this, some changes were needed in the sockets' units, as systemd automatically adds "Before=sockets.target" to any socket unit by default (and sockets.target is started in a really early phase of the boot process).

So I went back to the systemd developers to discuss the best approach to avoid starting the responders' sockets before SSSD is up, and the patch that came out of that discussion looks like:

@@ -3,6 +3,8 @@
  Documentation=man:sssd.conf(5)
  After=sssd.service
  BindsTo=sssd.service
+ DefaultDependencies=no
+ Conflicts=shutdown.target

  [Socket]
  ListenStream=@pipepath@/@responder@

With this change the sockets are no longer started before sockets.target, but only after the SSSD service is started. The downside of this approach is that we have to deal with conflicts on our own, which is why "Conflicts=shutdown.target" has been added.


Be more robust against misconfigurations!


Now that we have two completely different ways to manage the services, we really have to be robust so that admins do not mix them up incorrectly.

So far we have been flexible enough to allow admins to have some of the services started by the monitor, while other services are left to systemd. And that's okay! The problem starts when the monitor has been told to start a responder (by having the responder listed in the services line of sssd.conf) and this very same responder is also supposed to be socket-activated (the admin ran systemctl enable sssd-@responder@.socket).

In the situation described above we could end up with two responder services running (for the very same responder). The best fix we found is adding a simple program that checks whether the socket-activated responder is also mentioned in the services line of sssd.conf. If it is mentioned there, the socket is simply not started and the whole responsibility is left to the monitor. Otherwise, we take advantage of the systemd machinery!
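
To make the idea concrete, here is a rough shell sketch of such a check. It is only an illustration: responder_in_services is a hypothetical name, the real sssd_check_socket_activated_responders helper is shipped by SSSD and may work quite differently, and the sketch assumes a plain "services = a, b, c" line inside the [sssd] section.

```shell
#!/bin/sh
# Illustrative sketch only, not the helper shipped by SSSD.
# Returns 0 when the responder is listed in the "services =" line of the
# [sssd] section of the given config file, 1 otherwise.
responder_in_services() {
    conf=$1
    responder=$2
    awk -v want="$responder" '
        # Track whether the current section header is [sssd].
        /^[ \t]*\[/ { in_sssd = ($0 ~ /^[ \t]*\[sssd\]/) }
        in_sssd && /^[ \t]*services[ \t]*=/ {
            sub(/^[^=]*=/, "")              # keep only the value
            n = split($0, items, ",")
            for (i = 1; i <= n; i++) {
                gsub(/[ \t]/, "", items[i]) # trim blanks around each name
                if (items[i] == want) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$conf"
}
```

A pre-start check built on this would fail when responder_in_services /etc/sssd/sssd.conf pam succeeds, so that the socket is not started and the monitor stays in charge of that responder.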

The change on the sockets' unit looks like:
@@ -7,6 +7,7 @@
  Conflicts=shutdown.target

  [Socket]
+ ExecStartPre=@libexecdir@/sssd/sssd_check_socket_activated_responders -r @responder@
  ListenStream=@pipepath@/@responder@
  SocketUser=@SSSD_USER@
  SocketGroup=@SSSD_USER@


Also, I've decided to be a little stricter on our side and refuse manual startup of the responders' services. The change for this looks like:
@@ -3,6 +3,7 @@
  Documentation=man:sssd.conf(5)
  After=sssd.service
  BindsTo=sssd.service
+ RefuseManualStart=true

  [Install]
  Also=sssd-@responder@.socket


And how can I start using the socket-activated services?


As we still use the monitor to manage services by default, a small configuration change is needed.

See the example below explaining how to enable the PAM and AutoFS services to be socket-activated.

Considering your /etc/sssd/sssd.conf has something like:

[sssd]
services = nss, pam, autofs
...

Enable PAM and AutoFS responders' sockets:
# systemctl enable sssd-pam.socket
# systemctl enable sssd-autofs.socket

Remove both PAM and AutoFS responders from the services' line, like:

[sssd]
services = nss
...

Restart SSSD service:
# systemctl restart sssd.service

And you're ready to go!
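
If you would rather script the sssd.conf edit than do it by hand, something along these lines could work. This is a sketch under assumptions: drop_service is a hypothetical helper, it assumes a single "services =" line with comma-separated names, and it rewrites the file in place, so try it on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: drop one responder from the "services =" line.
# The result is written back with cat so that the ownership and mode of
# the original file are preserved.
drop_service() {
    conf=$1
    responder=$2
    tmp=$(mktemp) || return 1
    awk -v drop="$responder" '
        /^[ \t]*services[ \t]*=/ {
            sub(/^[^=]*=/, "")              # keep only the value
            n = split($0, items, ",")
            out = ""
            for (i = 1; i <= n; i++) {
                gsub(/[ \t]/, "", items[i]) # trim blanks around each name
                if (items[i] != "" && items[i] != drop) {
                    sep = (out == "" ? "" : ", ")
                    out = out sep items[i]
                }
            }
            print "services = " out
            next
        }
        { print }
    ' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

For the example above, running drop_service /etc/sssd/sssd.conf pam and then drop_service /etc/sssd/sssd.conf autofs as root leaves "services = nss", after which the sockets can be enabled and SSSD restarted as described.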


Is there any known issue that I should be aware of?


Yes, there is! For now, you should avoid having the PAC responder, needed by IPA domains, socket-activated. The reason is that, due to an ugly hack in the SSSD code, this responder is added to the services list any time an IPA domain is detected.

As a result, the service is always started by the monitor, and there is nothing we can do in our socket units to detect this situation and avoid starting the PAC socket.

A possible way to fix this issue is patching ipa-client-install to either explicitly add the PAC responder to the services list (in case the admin wants to keep using the monitor) or enable the PAC responder's socket (in case the admin wants to take advantage of socket activation).

Once that's done on the IPA side, we will be able to drop the code in SSSD that enables the PAC responder automatically. However, doing this right now would break backwards compatibility!
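
Until then, a quick pre-flight check can warn you before you enable the PAC socket by mistake. The sketch below is hypothetical: has_ipa_domain is not an SSSD tool, and it assumes that an IPA domain shows up in sssd.conf as a line setting id_provider to ipa.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeeds when the given sssd.conf contains
# a line that sets id_provider to ipa.
has_ipa_domain() {
    grep -Eq '^[[:space:]]*id_provider[[:space:]]*=[[:space:]]*ipa[[:space:]]*$' "$1"
}
```

If has_ipa_domain /etc/sssd/sssd.conf succeeds, it is safer to leave sssd-pac.socket disabled and let the monitor start the PAC responder.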


Where can I find more info about SSSD?


More information about SSSD can be found on the project page: https://pagure.io/SSSD/sssd/

If you want to report a bug, please follow this web page and file an issue in the SSSD Pagure instance.

Please keep in mind that we're currently in the middle of a migration from FedoraHosted to Pagure, and it will take a while to have everything in place again.

Even so, you can find more info about SSSD's internals here.

In case you want to contribute to the project, please read this webpage and feel free to approach us at #sssd on freenode (irc://irc.freenode.net/sssd).
