Possibly unnecessary polling in gthread #3317

ankush · 2024-10-29T09:50:52Z

I was tracing system calls for an unrelated reason and noticed that an idle gthread worker makes an epoll_wait system call every second.

λ sudo strace -p `pgrep -f "gunicorn: worker" | head -n1`
strace: Process 30815 attached
epoll_wait(7, [], 1, 666)               = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */, {tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000)              = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */, {tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000)              = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */, {tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000)              = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */, {tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */], 0) = 0
epoll_wait(7, ^Cstrace: Process 30815 detached
 <detached ...>

Here is the relevant code:

gunicorn/gunicorn/workers/gthread.py

Line 210 in bacbf8a

events = self.poller.select(1.0)

This poller is used to wake up the process to:

Accept incoming connections -

gunicorn/gunicorn/workers/gthread.py

Line 201 in bacbf8a

self.poller.register(sock, selectors.EVENT_READ, acceptor)

Read incoming data -

gunicorn/gunicorn/workers/gthread.py

Lines 261 to 262 in bacbf8a

self.poller.register(conn.sock, selectors.EVENT_READ,
                     partial(self.on_client_socket_readable, conn))

Since the same poller is used for both, a longer timeout cannot delay either one: the blocking epoll call returns as soon as any registered socket becomes readable. And if blocking really were delaying anything, a 1-second delay would be unacceptable anyway.
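To illustrate the point about blocking, here is a minimal sketch (not gunicorn code) showing that `select()` on a `selectors` poller returns as soon as a registered socket is readable, well before a long timeout expires. The helper name `wait_for_readable` and the use of `socket.socketpair()` to stand in for a client connection are assumptions made for the example.

```python
import selectors
import socket
import threading
import time


def wait_for_readable(timeout, send_after):
    """Run one select() call; return (elapsed_seconds, ready_event_count)."""
    sel = selectors.DefaultSelector()
    # A connected socket pair stands in for a client connection (assumption).
    reader, writer = socket.socketpair()
    try:
        sel.register(reader, selectors.EVENT_READ)
        # Simulate a client sending data shortly after we start waiting.
        timer = threading.Timer(send_after, writer.send, args=(b"x",))
        timer.start()
        start = time.monotonic()
        events = sel.select(timeout)
        elapsed = time.monotonic() - start
        timer.join()
        return elapsed, len(events)
    finally:
        sel.close()
        reader.close()
        writer.close()


if __name__ == "__main__":
    elapsed, ready = wait_for_readable(timeout=5.0, send_after=0.1)
    print(f"select() returned after {elapsed:.2f}s with {ready} ready event(s)")
```

With a 5-second timeout and data arriving after roughly 0.1 seconds, the call should return shortly after the data arrives rather than after the full timeout.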

This doesn't happen with the sync worker, whose effective timeout is the same as the request timeout. Even that is only there to let the master process know the worker is still alive.

gunicorn/gunicorn/workers/sync.py

Line 36 in bacbf8a

ret = select.select(self.wait_fds, [], [], timeout)

My question is: why does the timeout need to be so short? It causes the process to wake up every second even though nothing has happened.

Is this done to clean up futures frequently?

gunicorn/gunicorn/workers/gthread.py

Lines 216 to 217 in bacbf8a

result = futures.wait(self.futures, timeout=0,
                      return_when=futures.FIRST_COMPLETED)
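For context on what that call does: `futures.wait(..., timeout=0)` does not block. It only reports which futures are already done at that moment. The sketch below is not gunicorn code; `reap_completed` and `demo` are hypothetical names used to show the behaviour with a thread pool.

```python
import threading
from concurrent import futures


def reap_completed(pending):
    """Non-blocking check: split futures into (done, not_done) sets."""
    result = futures.wait(pending, timeout=0,
                          return_when=futures.FIRST_COMPLETED)
    return result.done, result.not_done


def demo():
    """Return (fast_is_done, slow_is_still_pending) for two sample tasks."""
    release = threading.Event()
    with futures.ThreadPoolExecutor(max_workers=2) as pool:
        fast = pool.submit(lambda: 1)
        slow = pool.submit(release.wait)  # blocks until released
        fast.result()  # make sure the fast task has finished
        done, not_done = reap_completed([fast, slow])
        release.set()  # let the slow task finish so the pool can shut down
    return fast in done, slow in not_done


if __name__ == "__main__":
    print(demo())
```

If this cleanup ran less often, a finished future would simply stay in the list until the next wake-up, which is the trade-off the question is asking about.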


ankush · 2024-10-29T09:55:36Z

Oof, I meant to post it in https://github.com/benoitc/gunicorn/discussions 😐

Anyway, this is kind of somewhere in the middle.

gthread calls epoll_wait (and 2 other syscalls) every second because it specifies a timeout of 1 second.

```
λ sudo strace -p `pgrep -f "gunicorn: worker" | head -n1`
strace: Process 30815 attached
epoll_wait(7, [], 1, 666) = 0
getppid() = 30800
utimensat(6, NULL, [{tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */, {tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000) = 0
getppid() = 30800
utimensat(6, NULL, [{tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */, {tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000) = 0
getppid() = 30800
utimensat(6, NULL, [{tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */, {tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000) = 0
getppid() = 30800
utimensat(6, NULL, [{tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */, {tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */], 0) = 0
epoll_wait(7, ^Cstrace: Process 30815 detached
<detached ...>
```

Timing out every second wakes up the process and puts it on the CPU even if there is nothing to service. This can be detrimental when total workers >> total cores, and in multi-tenant setups where certain tenants might be sitting idle (but not "idle enough" because of the 1s polling timeout).

This can possibly keep a completed future in the queue for a short while, but I don't see any obvious problem with that except a few bytes of extra memory usage. I could be wrong here.

fixes benoitc#3317
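As a rough illustration of the idea named in the linked PR title (request timeout / 2 as the poll timeout), here is a minimal sketch. It is not the PR's actual code: the helper name `poll_timeout`, the `fallback` parameter, and the choice to fall back to 1 second when no timeout is configured are all assumptions for the example.

```python
def poll_timeout(request_timeout, fallback=1.0):
    """Pick a poll timeout in seconds from the configured request timeout.

    Hypothetical helper: uses half the request timeout when one is set,
    otherwise falls back to a fixed interval (assumed here to be 1 second).
    """
    if request_timeout and request_timeout > 0:
        return request_timeout / 2
    return fallback


if __name__ == "__main__":
    # With a 30-second request timeout, an idle worker would wake up
    # every 15 seconds instead of every second.
    print(poll_timeout(30))
    print(poll_timeout(0))
```

The point of halving is that the worker still wakes up often enough to signal liveness within the request timeout, while an idle worker wakes far less often than once per second.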

ankush changed the title ~~Potential CPU cycles waste in gthread polling mechanism?~~ Possibly unnecessary polling in gthread Oct 29, 2024

ankush mentioned this issue Oct 29, 2024

perf(gthread): Use request timeout / 2 for epoll timeout ankush/gunicorn#1

Closed

ankush linked a pull request Oct 29, 2024 that will close this issue

perf(gthread): Use request timeout / 2 for epoll timeout #3319

Open
