{{Short description|Linux system call for I/O event notification mechanism}}
{{Lowercase title}}
'''<code>epoll</code>''' is a Linux kernel system call for a scalable I/O event notification mechanism, first introduced in version 2.5.45 of the Linux kernel in October 2002.<ref>{{Cite web |last=Corbet |first=Jonathan |date=October 30, 2002 |title=Kernel development |url=https://lwn.net/Articles/13587/ |access-date=2025-07-15 |website=LWN.net |language=en}}</ref><ref>{{Cite web |last=Torvalds |first=Linus |date=October 30, 2002 |title=Linux v2.5.45 |url=https://lore.kernel.org/lkml/Pine.LNX.4.44.0210301651120.6719-100000@penguin.transmeta.com/ |access-date=July 15, 2025 |website=linux-kernel.vger.kernel.org archive mirror}}</ref> Its function is to monitor multiple file descriptors to see whether I/O is possible on any of them. It is meant to replace the older POSIX <code>select(2)</code> and <code>poll(2)</code> system calls and to achieve better performance in more demanding applications, where the number of watched file descriptors is large. The older system calls examine every watched descriptor on each call and so operate in ''O''(''n'') time in the number of descriptors, whereas the cost of waiting on an <code>epoll</code> object does not grow with the number of watched descriptors and is commonly described as ''O''(1).<ref>{{cite web |author=Oleksiy Kovyrin |date=2006-04-13 |title=Using epoll() For Asynchronous Network Programming |url=http://kovyrin.net/2006/04/13/epoll-asynchronous-network-programming |access-date=2014-03-01 |publisher=Kovyrin.net}}</ref>
<code>epoll</code> is similar to FreeBSD's <code>kqueue</code>, in that it consists of a set of user-space functions, each taking a file descriptor argument denoting the configurable kernel object, against which they cooperatively operate. <code>epoll</code> uses a red–black tree (RB-tree) data structure to keep track of all file descriptors that are currently being monitored.<ref>{{cite web |url=https://idndx.com/the-implementation-of-epoll-1/ |title=The Implementation of epoll (1) |website=idndx.com |date=September 2014 }} </ref>
==API==
<syntaxhighlight lang="c">int epoll_create1(int flags);</syntaxhighlight>
Creates an <code>epoll</code> object and returns its file descriptor. The <code>flags</code> parameter allows epoll behavior to be modified. It is either 0 or <code>EPOLL_CLOEXEC</code>, the only flag defined. <code>epoll_create()</code> is an older variant of <code>epoll_create1()</code> and is deprecated as of Linux kernel version 2.6.27 and glibc version 2.9.<ref>{{cite book|last=Love|first=Robert|title=Linux System Programming|year=2013|publisher=O’Reilly|isbn=978-1-449-33953-1|pages=97, 98|edition=Second}}</ref>

<syntaxhighlight lang="c">int epoll_ctl(int epfd, int op, int fd, struct epoll_event* event);</syntaxhighlight>
Controls (configures) which file descriptors are watched by this object, and for which events. <code>op</code> can be <code>EPOLL_CTL_ADD</code>, <code>EPOLL_CTL_MOD</code> or <code>EPOLL_CTL_DEL</code>.

<syntaxhighlight lang="c">int epoll_wait(int epfd, struct epoll_event* events, int maxevents, int timeout);</syntaxhighlight>
Waits for any of the events registered with <code>epoll_ctl</code>, until at least one occurs or the timeout elapses. Returns the occurred events in <code>events</code>, up to <code>maxevents</code> at once. <code>maxevents</code> is the maximum number of <code>epoll_event</code> structures a single call may return, not the number of file descriptors being monitored, and it must be greater than zero.<ref>{{cite web |url=https://stackoverflow.com/questions/2969425/epoll-wait-maxevents |title=epoll_wait: maxevents |date=Jun 3, 2010|access-date=2023-07-06}}</ref><ref>{{cite web |url=https://man7.org/linux/man-pages/man2/epoll_wait.2.html |title=epoll_wait(2) — Linux manual page |date=2023-03-30|access-date=2023-07-06}}</ref> In most cases, <code>maxevents</code> is set to the number of elements in the array passed as the <code>events</code> argument.
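The three calls are typically used together. The following is a minimal sketch rather than a reference implementation: the helper name <code>epoll_pipe_demo</code> and the constant <code>MAX_EVENTS</code> are illustrative assumptions, and a pipe stands in for whatever descriptor an application would actually watch.

<syntaxhighlight lang="c">
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 8 /* illustrative array size, not part of the API */

/* Hypothetical helper: watches the read end of a pipe, makes it readable,
 * and returns the number of ready events reported (-1 on error). */
int epoll_pipe_demo(void)
{
    int pipefd[2];
    int ready = -1;

    if (pipe(pipefd) == -1)
        return -1;

    /* Create the epoll object; EPOLL_CLOEXEC closes it across exec(). */
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd != -1) {
        /* Ask to be told when the read end becomes readable. */
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = pipefd[0] };
        struct epoll_event events[MAX_EVENTS];

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0 &&
            write(pipefd[1], "x", 1) == 1) {
            /* Wait up to 1000 ms; at most MAX_EVENTS entries are filled. */
            ready = epoll_wait(epfd, events, MAX_EVENTS, 1000);
            /* The data field comes back exactly as it was registered. */
            if (ready > 0 && events[0].data.fd != pipefd[0])
                ready = -1;
        }
        close(epfd);
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return ready;
}
</syntaxhighlight>

Under these assumptions, the call returns 1 because exactly one registered descriptor is readable. A real event loop would call <code>epoll_wait</code> repeatedly and dispatch on each returned entry.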
==Triggering modes==
<code>epoll</code> provides both edge-triggered and level-triggered modes. In edge-triggered mode, a call to <code>epoll_wait</code> will return only when a new event is enqueued with the <code>epoll</code> object, while in level-triggered mode, <code>epoll_wait</code> will return as long as the condition holds.
For instance, if a pipe registered with <code>epoll</code> has received data, a call to <code>epoll_wait</code> will return, signaling the presence of data to be read. Suppose the reader consumes only part of the data in the buffer. In level-triggered mode, further calls to <code>epoll_wait</code> will return immediately, as long as the pipe's buffer contains data to be read. In edge-triggered mode, however, <code>epoll_wait</code> will return only once new data is written to the pipe.<ref name="epoll(7) - Linux manual page">{{cite web |date=2012-04-17 |title=epoll(7) - Linux manual page |url=http://man7.org/linux/man-pages/man7/epoll.7.html |access-date=2014-03-01 |publisher=Man7.org}}</ref>
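The pipe scenario can be reproduced with a short sketch. The helper name <code>ready_after_partial_read</code> is a hypothetical one chosen for illustration. It registers the read end of a pipe, writes two bytes, reads only one of them, and then reports what a second, non-blocking <code>epoll_wait</code> returns.

<syntaxhighlight lang="c">
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper. mode_flags is 0 for level-triggered mode or
 * EPOLLET for edge-triggered mode. Returns the result of an epoll_wait
 * issued after a partial read, or -1 on error. */
int ready_after_partial_read(uint32_t mode_flags)
{
    int pipefd[2];
    int result = -1;
    char byte;
    struct epoll_event ev, out;

    if (pipe(pipefd) == -1)
        return -1;

    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd != -1) {
        ev.events = EPOLLIN | mode_flags;
        ev.data.fd = pipefd[0];

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0 &&
            write(pipefd[1], "ab", 2) == 2 &&
            /* First wait: reports the new data in both modes. */
            epoll_wait(epfd, &out, 1, 1000) == 1 &&
            /* Consume only one of the two buffered bytes. */
            read(pipefd[0], &byte, 1) == 1) {
            /* Second wait, with a zero timeout so it never blocks. */
            result = epoll_wait(epfd, &out, 1, 0);
        }
        close(epfd);
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return result;
}
</syntaxhighlight>

With <code>mode_flags</code> set to 0 the second wait reports one ready descriptor, because unread data remains in the buffer. With <code>EPOLLET</code> it reports none, because nothing new has been written since the first notification. For this reason edge-triggered descriptors are usually made non-blocking and read until <code>EAGAIN</code>, as the epoll(7) manual page recommends.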
==Bugs==
Bryan Cantrill pointed out that <code>epoll</code> had mistakes that could have been avoided, had it learned from its predecessors: input/output completion ports, event ports (Solaris) and kqueue.<ref>Archived at [https://ghostarchive.org/varchive/youtube/20211205/l6XQUciI-Sc Ghostarchive]{{cbignore}} and the [https://web.archive.org/web/20151202133409/https://www.youtube.com/watch?v=l6XQUciI-Sc Wayback Machine]{{cbignore}}: {{cite web| url = https://www.youtube.com/watch?v=l6XQUciI-Sc&t=57m| title = Ubuntu Slaughters Kittens {{!}} BSD Now 103 | website=YouTube| date = 20 August 2015 }}{{cbignore}}</ref> However, a large part of his criticism was addressed by <code>epoll</code>'s <code>EPOLLONESHOT</code> and <code>EPOLLEXCLUSIVE</code> options. <code>EPOLLONESHOT</code> was added in version 2.6.2 of the Linux kernel mainline, released in February 2004. <code>EPOLLEXCLUSIVE</code> was added in version 4.5, released in March 2016.<ref>{{cite web |url=https://idea.popcount.org/2017-02-20-epoll-is-fundamentally-broken-12/ |title=Epoll is fundamentally broken 1/2 |publisher=idea.popcount.org |date=2017-02-20 |access-date=2017-10-06}}</ref>
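As an illustration of the <code>EPOLLONESHOT</code> option, the following sketch shows that a descriptor registered with it is disabled after its first reported event and stays silent until it is re-armed with <code>EPOLL_CTL_MOD</code>. The helper name <code>oneshot_demo</code> and its output array are assumptions made for this example only.

<syntaxhighlight lang="c">
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper. Fills results[] with the return values of three
 * epoll_wait calls: after the first write, after a second write without
 * re-arming, and after re-arming. Returns 0 on success, -1 on error. */
int oneshot_demo(int results[3])
{
    int pipefd[2];
    int status = -1;
    struct epoll_event ev, out;

    if (pipe(pipefd) == -1)
        return -1;

    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd != -1) {
        ev.events = EPOLLIN | EPOLLONESHOT;
        ev.data.fd = pipefd[0];

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0 &&
            write(pipefd[1], "a", 1) == 1) {
            /* The first event is delivered, then the descriptor is disabled. */
            results[0] = epoll_wait(epfd, &out, 1, 1000);

            /* More data arrives, but the disabled descriptor is not reported. */
            if (write(pipefd[1], "b", 1) == 1) {
                results[1] = epoll_wait(epfd, &out, 1, 0);

                /* Re-arm with EPOLL_CTL_MOD; the pending data is reported again. */
                if (epoll_ctl(epfd, EPOLL_CTL_MOD, pipefd[0], &ev) == 0) {
                    results[2] = epoll_wait(epfd, &out, 1, 1000);
                    status = 0;
                }
            }
        }
        close(epfd);
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return status;
}
</syntaxhighlight>

Under these assumptions the three waits report 1, 0 and 1 ready descriptors respectively. In a multithreaded server this pattern lets one thread finish handling a descriptor before any other thread can be woken for it.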
== See also ==
{{Portal|Linux}}
* Input/output completion port (IOCP)
* kqueue
* libevent

==References==
{{reflist}}

== External links ==
* [http://man7.org/linux/man-pages/man7/epoll.7.html epoll manpage]
* [http://www.xmailserver.org/linux-patches/nio-improve.html epoll patch]
{{Linux kernel}}
Category:Events (computing)
Category:Linux kernel features
Category:System calls