Linux 6.5 Merge Window

Linux v6.4 was released this past Sunday, with the Linux v6.5 merge window opening immediately afterwards. Below are the highlights of the SELinux and audit pull requests which Linus merged this week.

SELinux

  • Fixed a longstanding issue with Multipath TCP (MPTCP) where the MPTCP subflows were not labeled properly. Starting in Linux v6.5, MPTCP subflows are labeled using the main MPTCP socket instead of the currently executing task. Special thanks to Paolo Abeni and the other MPTCP developers for their help on this issue.

  • Fixed an issue where labeled NFS mounts that were mounted prior to the initial SELinux policy load were not properly labeled once the policy was loaded. Now these existing labeled NFS mounts are labeled using the same deferred labeling mechanisms we use for local filesystems.

  • The “fs” object context was deprecated. SELinux policy parser support for this object context was included in the original SELinux kernel patches, but the object context was never utilized and was either ignored or marked as deprecated by all of the available SELinux policy we could find.

  • We continued the SELinux makefile improvements and cleanups we started in Linux v6.4.

  • A small number of code cleanups to remove dead code and generally improve the quality of the SELinux kernel code.

Audit

  • A minor fix to resolve some missing function prototype warnings when compiling the kernel.

Linux 6.4 Released

Linux v6.4 was released on Sunday, June 25th; there were no changes to the audit subsystem, but the SELinux highlights are below. Beyond these highlights, LWN.net has summarized the major changes merged during the first and second weeks of this release's merge window.

SELinux

  • After several years of work by the userspace and distro folks, we are finally in a place where we feel comfortable removing the runtime disable functionality, which was initially deprecated at the start of 2020. This was done to improve the security of all the LSMs in the kernel, not just SELinux, by hardening the LSM hook infrastructure. In addition to an LWN.net article on the removal, the commit description has some additional information as well as notes on what users who manage their own SELinux configuration might expect with this change:

    The existing kernel deprecation notice explains the functionality and why we want to remove it:

    The selinuxfs “disable” node allows SELinux to be disabled at runtime prior to a policy being loaded into the kernel. If disabled via this mechanism, SELinux will remain disabled until the system is rebooted.

    The preferred method of disabling SELinux is via the "selinux=0"
    boot parameter, but the selinuxfs "disable" node was created to
    make it easier for systems with primitive bootloaders that did not
    allow for easy modification of the kernel command line.
    Unfortunately, allowing for SELinux to be disabled at runtime makes
    it difficult to secure the kernel's LSM hooks using the
    "__ro_after_init" feature.
    

    It is that last sentence, mentioning the ‘__ro_after_init’ hardening, which is the real motivation for this change, and if you look at the diffstat you’ll see that the impact of this patch reaches across all the different LSMs, helping prevent tampering at the LSM hook level.

    From a SELinux perspective, it is important to note that if you continue to disable SELinux via “/etc/selinux/config” it may appear that SELinux is disabled, but it is simply in an uninitialized state. If you load a policy with load_policy -i, you will see SELinux come alive just as if you had loaded the policy during early-boot.

    It is also worth noting that the “/sys/fs/selinux/disable” file is always writable now, regardless of the Kconfig settings, but writing to the file has no effect on the system, other than to display an error on the console if a non-zero/true value is written.

  • In addition to removing the runtime disable functionality, we also removed the checkreqprot functionality. The reason for removing the checkreqprot tunable, as well as what administrators can expect from this change, is explained in the commit description:

    We originally promised that the SELinux ‘checkreqprot’ functionality would be removed no sooner than June 2021, and now that it is March 2023 it seems like it is a good time to do the final removal. The deprecation notice in the kernel provides plenty of detail on why ‘checkreqprot’ is not desirable, with the key point repeated below:

    This was a compatibility mechanism for legacy userspace and
    for the READ_IMPLIES_EXEC personality flag.  However, if set to
    1, it weakens security by allowing mappings to be made executable
    without authorization by policy.  The default value of checkreqprot
    at boot was changed starting in Linux v4.4 to 0 (i.e. check the
    actual protection), and Android and Linux distributions have been
    explicitly writing a "0" to /sys/fs/selinux/checkreqprot during
    initialization for some time.
    

    Along with the official deprecation notice, we have been discussing this on-list and directly with several of the larger SELinux-based distros and everyone is happy to see this feature finally removed. In an attempt to catch all of the smaller, and DIY, Linux systems we have been writing a deprecation notice URL into the kernel log, along with a growing ssleep() penalty, when admins enabled checkreqprot at runtime or via the kernel command line. We have yet to have anyone come to us and raise an objection to the deprecation or planned removal.

    It is worth noting that while this patch removes the checkreqprot functionality, it leaves the user visible interfaces (kernel command line and selinuxfs file) intact, just inert. This should help prevent breakages with existing userspace tools that correctly, but unnecessarily, disable checkreqprot at boot or runtime. Admins that attempt to enable checkreqprot will be met with a removal message in the kernel log.

  • Restructured the avc_has_perm_noaudit() function to improve performance. The avc_has_perm_noaudit() function is on the critical path for SELinux access control decisions, so any change which impacts performance can be significant. Unfortunately, despite being explicitly marked with an inline tag, the function had grown large enough that the compilers were no longer inlining the function, resulting in noticeable slowdowns when performance was measured during a kernel compile. The changes in Linux v6.4 not only reduce the size of avc_has_perm_noaudit() by relocating the slow path to a separate function, they also improve the related RCU locking, cleaning up the code and improving performance at the same time.

  • A minor change to stop passing the internal SELinux state as a function parameter and instead reference the global instance directly. While the change is conceptually very small, the scope of the change meant the patch was quite large. This change simplifies the code and in theory should help boost SELinux performance a small amount.

  • Small makefile improvements to correct dependency issues.

  • Minor code cleanups.
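
As an illustration of the checkreqprot discussion above, the following Python sketch models which protection value SELinux checks for a memory mapping. It is a simplified model, not kernel code: the function names are hypothetical, the PROT_* values are the usual Linux constants, and the READ_IMPLIES_EXEC handling ignores details such as noexec mounts.

```python
# Simplified model of the removed checkreqprot tunable; not kernel code.
# All function names here are hypothetical and exist only for illustration.
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4  # usual Linux values


def actual_prot(reqprot, read_implies_exec):
    """Protection the kernel applies to the mapping.

    With the READ_IMPLIES_EXEC personality flag set, a readable mapping
    is also made executable (details such as noexec mounts are ignored).
    """
    if read_implies_exec and reqprot & PROT_READ:
        return reqprot | PROT_EXEC
    return reqprot


def prot_checked(reqprot, read_implies_exec, checkreqprot):
    """Protection value that the access check is performed against."""
    if checkreqprot:
        # Legacy behavior: check only what the application requested.
        return reqprot
    # Linux v6.4 and later: always check the actual protection.
    return actual_prot(reqprot, read_implies_exec)


def execute_check_required(reqprot, read_implies_exec, checkreqprot):
    """True if the checked protection includes PROT_EXEC."""
    checked = prot_checked(reqprot, read_implies_exec, checkreqprot)
    return bool(checked & PROT_EXEC)
```

In this model, a read-only request made under READ_IMPLIES_EXEC skips the execute check when checkreqprot is 1, which is the weakening the deprecation notice describes. With the tunable removed, the kernel always behaves as in the checkreqprot=0 case.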

Linux 6.3 Released

Linux v6.3 was released on Sunday, April 23rd; the SELinux and audit highlights are below. Beyond these highlights, LWN.net has summarized the major changes merged during the first and second weeks of this release's merge window.

SELinux

  • Minor changes to support the ID-mapped mounts work and some newly created virtual memory flag accessor functions.

Audit

  • The AUDIT_FANOTIFY record was updated to record the full event response. The patch’s author, Richard Guy Briggs, provides a description of the change, as well as sample records, in the commit description:

    Currently the only type of fanotify info that is defined is an audit rule number, but convert it to hex encoding to future-proof the field. Hex encoding suggested by Paul Moore.

    The {subj,obj}_trust values are {0,1,2}, corresponding to no, yes, unknown.

    type=FANOTIFY msg=audit(1600385147.372:590): resp=2 fan_type=1 fan_info=3137 subj_trust=3 obj_trust=5
    type=FANOTIFY msg=audit(1659730979.839:284): resp=1 fan_type=0 fan_info=0 subj_trust=2 obj_trust=2
    
  • Minor changes to support the ID-mapped mounts work and the conversion of the kernel’s capabilities data type from a u32[2] array to a single u64.

  • Updated the upstream Linux kernel audit mailing list address in MAINTAINERS to avoid the moderation problems with the old list. The new mailing list can be found in the MAINTAINERS file under the AUDIT SUBSYSTEM section.
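
The sample AUDIT_FANOTIFY records quoted above can be split into fields with a few lines of Python. This is a minimal sketch rather than a reference parser: the function names are hypothetical, the trust labels are taken from the commit description, and fan_info is kept as the raw string because the text above does not spell out how to decode it.

```python
# Minimal sketch of parsing an AUDIT_FANOTIFY record; the function names
# are hypothetical and this is not part of any audit userspace tool.
TRUST_LABELS = {0: "no", 1: "yes", 2: "unknown"}  # from the commit text


def trust_label(raw):
    """Map a {subj,obj}_trust value to its documented meaning."""
    value = int(raw)
    # Values outside the documented set are reported as-is.
    return TRUST_LABELS.get(value, "undocumented (%d)" % value)


def parse_fanotify_record(line):
    """Parse one AUDIT_FANOTIFY record line into a dictionary."""
    header, sep, body = line.partition("): ")
    if not sep:
        raise ValueError("not an audit record: %r" % line)
    # header looks like: type=FANOTIFY msg=audit(1659730979.839:284
    record_type = header.split()[0].split("=", 1)[1]
    timestamp, _, serial = header.split("audit(", 1)[1].partition(":")
    fields = dict(item.split("=", 1) for item in body.split())
    return {
        "type": record_type,
        "timestamp": timestamp,
        "serial": int(serial),
        "resp": int(fields["resp"]),
        "fan_type": int(fields["fan_type"]),
        "fan_info": fields["fan_info"],  # raw string, left undecoded
        "subj_trust": trust_label(fields["subj_trust"]),
        "obj_trust": trust_label(fields["obj_trust"]),
    }
```

Passing the second sample record to parse_fanotify_record() yields a dictionary whose subj_trust and obj_trust entries are both "unknown".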