Pidfd

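Since Linux 5.1, a pidfd (process file descriptor) lets a program refer to a process through a file descriptor rather than a PID. A PID can be recycled once its process has been reaped, whereas a pidfd always refers to the same process, which eliminates a class of race conditions and ambiguities.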

The first "pidfds" were created by opening a /proc/PID directory and were used with pidfd_send_signal(2). These pidfds are now considered incomplete: they cannot be polled for process termination, nor can they be used with the waitid(2) system call.

Kernel

At the heart of the modern pidfd abstraction is the CLONE_PIDFD flag to the clone(2) system call, added in Linux 5.2 (it reuses the bit of the obsolete CLONE_PID flag). CLONE_PIDFD cannot be combined with CLONE_THREAD, so the created process is always a thread group leader. It also cannot be combined with the (deprecated) CLONE_DETACHED flag. clone(2) stores the resulting file descriptor in parent_tid, which is why the flag cannot be combined with CLONE_PARENT_SETTID there; clone3(2) stores it in cl_args.pidfd.
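The clone3(2) path described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: fork_with_pidfd() and run_child_and_wait() are hypothetical helper names, the fallback values for SYS_clone3 and P_PIDFD are assumptions for older headers, and error handling is kept to a minimum.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/sched.h>   /* struct clone_args, CLONE_PIDFD */
#include <poll.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_clone3
#define SYS_clone3 435     /* assumed fallback for old headers */
#endif
#ifndef P_PIDFD
#define P_PIDFD 3          /* assumed fallback for old headers */
#endif

/* Hypothetical helper: fork-like clone3(2) that also yields a pidfd.
 * Returns the child's PID in the parent, 0 in the child, -1 on error. */
static pid_t fork_with_pidfd(int *pidfd_out)
{
    int pidfd = -1;
    struct clone_args args;

    memset(&args, 0, sizeof(args));
    args.flags = CLONE_PIDFD;
    args.pidfd = (uint64_t)(uintptr_t)&pidfd;  /* kernel stores the fd here */
    args.exit_signal = SIGCHLD;

    long pid = syscall(SYS_clone3, &args, sizeof(args));
    if (pid > 0)
        *pidfd_out = pidfd;
    return (pid_t)pid;
}

/* Start a child that exits with `code`, wait for it through the pidfd,
 * and return the exit status seen by the parent (-1 on error). */
static int run_child_and_wait(int code)
{
    int pidfd = -1;
    pid_t pid = fork_with_pidfd(&pidfd);

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);

    /* A pidfd becomes readable when the process terminates. */
    struct pollfd pfd = { .fd = pidfd, .events = POLLIN };
    while (poll(&pfd, 1, -1) < 0 && errno == EINTR) {
    }

    siginfo_t info;
    memset(&info, 0, sizeof(info));
    int r = waitid(P_PIDFD, (id_t)pidfd, &info, WEXITED);
    int saved = errno;
    close(pidfd);
    errno = saved;

    if (r < 0 || info.si_code != CLD_EXITED)
        return -1;
    return info.si_status;
}
```

Calling run_child_and_wait(7) should yield 7 on a kernel that supports clone3(2) and waitid(2) with P_PIDFD; on older kernels or in restrictive sandboxes the call is expected to fail with errno set (for example to ENOSYS).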

Kernel 5.3 introduced pidfd_open(2), which opens a pidfd for an arbitrary existing process. If clone3(2) cannot be used, a parent can call fork(2) and then pidfd_open(2) on the child's PID. This is race-free because the child's PID cannot be recycled until the parent has waited on it (see pidfd_spawn(3) below).
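A rough sketch of the fork(2) plus pidfd_open(2) pattern follows. It uses raw syscall(2) invocations so that it does not depend on a particular glibc version; kill_child_via_pidfd() is a hypothetical helper name, and the fallback syscall numbers are assumptions for older headers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424  /* assumed fallback for old headers */
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434         /* assumed fallback for old headers */
#endif

/* Hypothetical helper: fork a child that sleeps forever, obtain a pidfd
 * for it, kill it through the pidfd, and return the signal that ended it.
 * Returns -1 with errno set if the pidfd system calls are unavailable. */
static int kill_child_via_pidfd(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();
    }

    /* The child has not been waited on, so `pid` still names it. */
    int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
    if (pidfd < 0 ||
        syscall(SYS_pidfd_send_signal, pidfd, SIGKILL, NULL, 0) < 0) {
        int saved = errno;
        kill(pid, SIGKILL);        /* fall back to the PID-based call */
        waitpid(pid, NULL, 0);
        if (pidfd >= 0)
            close(pidfd);
        errno = saved;
        return -1;
    }

    int status = 0;
    waitpid(pid, &status, 0);
    close(pidfd);
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

On a kernel with both system calls, kill_child_via_pidfd() is expected to return SIGKILL.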

Kernel 5.6 added pidfd_getfd(2), supporting duplication of an existing file descriptor in a process identified by a pidfd. This operation requires PTRACE_MODE_ATTACH_REALCREDS.
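The following sketch shows the shape of a pidfd_getfd(2) call. dup_fd_via_pidfd() is a hypothetical helper, and the fallback syscall numbers are assumptions for older headers. For simplicity, the usage note targets the calling process itself, which sidesteps the ptrace access check; duplicating from another process works the same way but is subject to the permission requirement above.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434   /* assumed fallback for old headers */
#endif
#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438  /* assumed fallback for old headers */
#endif

/* Hypothetical helper: duplicate descriptor `target_fd` of process `pid`
 * into the calling process. Returns the new descriptor, or -1 with errno
 * set on failure. */
static int dup_fd_via_pidfd(pid_t pid, int target_fd)
{
    int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
    if (pidfd < 0)
        return -1;

    /* The flags argument is currently unused and must be 0. */
    int fd = (int)syscall(SYS_pidfd_getfd, pidfd, target_fd, 0);
    int saved = errno;
    close(pidfd);
    errno = saved;
    return fd;
}
```

For example, dup_fd_via_pidfd(getpid(), fd) behaves much like dup(fd): the returned descriptor is a new number that refers to the same open file description.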

Kernel 5.8 added support for supplying a pidfd to setns(2).

Kernel 5.10 brought process_madvise(2), generalizing madvise(2) to multiple memory regions and allowing it to be applied to another process's address space. When used on another process (as opposed to oneself), it requires PTRACE_MODE_READ_FSCREDS and CAP_SYS_NICE.

As of kernel 6.5, a unix(7) socket can identify the sending process with a pidfd, delivered through the SCM_PIDFD ancillary message type, rather than with a PID as in SCM_CREDENTIALS. In addition, getsockopt(2) with SO_PEERPIDFD returns a pidfd for the peer process on a unix(7) socket.
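A minimal sketch of the SO_PEERPIDFD query is given below. peer_pidfd() is a hypothetical helper, and the fallback value 77 for SO_PEERPIDFD is an assumption taken from the generic socket option numbering; some architectures use a different value, so current kernel headers are preferable.

```c
#define _GNU_SOURCE
#include <errno.h>        /* callers inspect errno, e.g. ENOPROTOOPT */
#include <sys/socket.h>
#include <unistd.h>

#ifndef SO_PEERPIDFD
#define SO_PEERPIDFD 77   /* assumed asm-generic value for old headers */
#endif

/* Hypothetical helper: return a pidfd for the peer of a connected
 * unix(7) socket, or -1 with errno set (ENOPROTOOPT is expected on
 * kernels older than 6.5). The caller must close() the result. */
static int peer_pidfd(int sock)
{
    int pidfd = -1;
    socklen_t len = sizeof(pidfd);

    if (getsockopt(sock, SOL_SOCKET, SO_PEERPIDFD, &pidfd, &len) < 0)
        return -1;
    return pidfd;
}
```

With a pair created by socketpair(AF_UNIX, SOCK_STREAM, 0, sv), peer_pidfd(sv[0]) should return a pidfd referring to the calling process itself, since both ends belong to it.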

Kernel 6.9 introduced pidfdfs, a pseudo-filesystem similar to nsfs. It is internal to the kernel: it is never mounted and is not visible from userspace. It provides the filesystem backing that some operations require, so statx(2) now works on pidfds, as do LSMs (Linux Security Modules).

Glibc

GNU libc added wrappers for the pidfd system calls in 2.36. Glibc 2.39 added two functions that complement posix_spawn(3). Instead of a PID, they store a pidfd for the new process through their first argument:

pidfd_spawn(3): analogous to posix_spawn(3)
pidfd_spawnp(3): analogous to posix_spawnp(3)

The posix_spawnattr_t argument allows (among other things) spawning the process within a different control group.

Also added was pidfd_getpid(3):

pidfd_getpid(3): get the PID of the process that a pidfd refers to
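On systems whose glibc predates 2.39, a similar lookup can be approximated by reading the Pid: field that the kernel exposes in /proc/self/fdinfo for a pidfd. The sketch below is a hand-rolled stand-in under that assumption, not glibc's implementation; pid_from_pidfd() is a hypothetical name, it requires /proc to be mounted, and the fallback syscall number is an assumption for older headers.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434   /* assumed fallback for old headers */
#endif

/* Hypothetical stand-in for pidfd_getpid(3): parse the "Pid:" line of
 * /proc/self/fdinfo/<pidfd>. Returns the PID, or -1 if it cannot be
 * determined (the kernel also reports -1 for a process that is gone). */
static pid_t pid_from_pidfd(int pidfd)
{
    char path[64];
    char line[256];
    long pid = -1;

    snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    while (fgets(line, sizeof(line), f) != NULL) {
        /* Matches "Pid:\t1234"; the "NSpid:" line does not match. */
        if (sscanf(line, "Pid: %ld", &pid) == 1)
            break;
    }
    fclose(f);
    return (pid_t)pid;
}
```

For a pidfd obtained with pidfd_open(2) on getpid(), pid_from_pidfd() should return the caller's own PID.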

External links

Race-free process creation in the GNU C library, LWN
GNU C Library version 2.39, LWN
A new filesystem for pidfds, LWN
