Uniquely identifying processes in Linux

Sun, 28 Jun 2015 12:20:50 +0000

If you are doing any kind of analysis of a Linux-based system, you quickly come to the point of needing to collect statistics on a per-process basis. One difficulty here, however, is how to uniquely identify a process.

The simplest answer is to use the process identifier (PID). Unfortunately, this is not a good answer in the face of PID reuse. The total number of PIDs in the system is finite, and once they are exhausted the kernel will start reusing PIDs that were previously allocated to different processes.

You might expect this reuse to be rare, or to happen only over long periods, but it can happen quickly. On a system under load I've seen PID reuse in under 5 seconds.

The practical consequence is that if you are, for example, collecting statistics about a process via any interface that identifies processes by PID, you need to be careful to ensure you are collecting the right statistics! For example, if you are collecting statistics for a process identified by PID 37, you might read /proc/37/stat at one instant and receive valid data, but at any later time /proc/37/stat could return data about a completely different process.

Thankfully, the kernel associates another useful piece of information with a process: its start time. The combination of PID and start time provides a reasonably robust way of uniquely identifying processes over the lifetime of a system. (For the pedantic: if a process can be created and correctly reaped all within the granularity of the clock, then it would be theoretically possible for multiple different processes with the same PID and start time to have existed in the system, but that is unlikely to be a problem in practice.)

The start time is one of the fields present in /proc/<pid>/stat (field 22, starttime, measured in clock ticks since boot), so it can be used to ensure you are correctly matching up statistics.
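As a rough illustration, here is a minimal Python sketch of that check. The function names (parse_start_time, process_identity, read_stat_checked) are made up for this example and are not part of any library. It assumes the start time is field 22 of /proc/<pid>/stat, as documented in proc(5), and that the command name in field 2 may itself contain spaces or parentheses, which is why the line is split on the last ')' rather than on whitespace alone.

```python
def parse_start_time(stat_line):
    """Return field 22 (starttime, in clock ticks since boot) of a stat line."""
    # Field 2 (comm) is wrapped in parentheses and may contain spaces or ')',
    # so everything up to the last ')' is discarded before splitting.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    # 'rest' begins at field 3 (state), so field 22 sits at index 22 - 3 = 19.
    return int(fields[19])


def process_identity(pid):
    """Return a (pid, start_time) pair for a currently running process."""
    with open("/proc/%d/stat" % pid) as f:
        return pid, parse_start_time(f.read())


def read_stat_checked(identity):
    """Read /proc/<pid>/stat, or return None if the PID has changed hands.

    'identity' is a (pid, start_time) pair captured earlier.
    """
    pid, start_time = identity
    try:
        with open("/proc/%d/stat" % pid) as f:
            line = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None  # the process has exited
    if parse_start_time(line) != start_time:
        return None  # same PID, but a different process
    return line
```

A collector would call process_identity(pid) once when it first sees a process, then use read_stat_checked(identity) for each later sample and drop the process when it returns None. Because the start time is taken from the same read as the statistics, the check and the data always refer to the same process.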
