Operating system latency

The preceding example describes a somewhat unrealistic system, in which 64 bytes of data can be sent directly to the transmission medium without interruption. Realistically, an operating system (OS) handles the requests from application and system software to send that hypothetical packet, and it does so while simultaneously processing thousands of other requests from hundreds of other pieces of software running on the host machine. Almost all modern OSes interleave operations from multiple requests so that no one process is unreasonably delayed by the execution of another. As a result, we should never expect to achieve latency as low as the minimum mechanical latency defined by our clock speed. Instead, what might realistically happen is that the first byte of our packet is queued for transport, the OS switches to servicing another operation in its queue and spends some time executing it, and only then does it come back and ready the second byte of our packet for transport. So, if your software is trying to send a packet on an OS that is also executing a piece of long-running or blocking software, you may experience substantial latency that is entirely out of your control. The latency introduced by how the OS prioritizes and processes your software's requests is called, unsurprisingly, OS latency.
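One rough way to see this effect is to ask the OS for a short pause and measure how much longer than requested it takes to regain control. The following is a minimal sketch in Python, not a rigorous benchmark. The function name `measure_scheduling_overshoot` and its parameters are hypothetical, chosen only for illustration, and the numbers it reports depend on the host OS, its timer resolution, and whatever else the machine is running.

```python
import time


def measure_scheduling_overshoot(requested_s=0.001, samples=50):
    """Return, for each sample, how much longer than requested a sleep took.

    Hypothetical helper for illustration: the extra delay is time the OS
    spent on other work (or waiting on its timer) before rescheduling us.
    """
    overshoots = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_s)  # ask the OS to pause this process briefly
        elapsed = time.perf_counter() - start
        overshoots.append(elapsed - requested_s)
    return overshoots


if __name__ == "__main__":
    results = measure_scheduling_overshoot()
    # Report in microseconds; values vary from run to run and host to host.
    print(f"min overshoot: {min(results) * 1e6:.1f} us")
    print(f"max overshoot: {max(results) * 1e6:.1f} us")
    print(f"avg overshoot: {sum(results) / len(results) * 1e6:.1f} us")
```

Running the script while the machine is idle, and again while it is under heavy load, will usually show a wider spread between the minimum and maximum overshoot in the loaded case. That spread is delay your own code did nothing to cause.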