perf(gthread): Use request timeout / 2 for epoll timeout #3319

ankush · 2024-10-29T14:06:57Z

gthread calls epoll_wait (and 2 other syscalls) every second because it
specifies timeout to be 1 second.

λ sudo strace -p `pgrep -f "gunicorn: worker" | head -n1`
strace: Process 30815 attached
epoll_wait(7, [], 1, 666)               = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */, {tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000)              = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */, {tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000)              = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */, {tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000)              = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */, {tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */], 0) = 0
epoll_wait(7, ^Cstrace: Process 30815 detached
 <detached ...>

Timing out every second wakes up the process and loads it on CPU even
if there is nothing to service.

This can be detrimental when you have total workers >> total cores and
a multi-tenant setup where most tenants might be sitting idle. (but not
"idle enough" because of 1s polling timeout)

This can possibly keep a completed future in the queue until the next
request arrives, but I don't see any obvious problem with it except a few
bytes of extra memory usage? I could be wrong here, please check this.

fixes #3317 (more details on issue)

gthread calls epoll_wait (and 2 other syscalls) every second because it specifies timeout to be 1 second. ``` λ sudo strace -p `pgrep -f "gunicorn: worker" | head -n1` strace: Process 30815 attached epoll_wait(7, [], 1, 666) = 0 getppid() = 30800 utimensat(6, NULL, [{tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */, {tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */], 0) = 0 epoll_wait(7, [], 1, 1000) = 0 getppid() = 30800 utimensat(6, NULL, [{tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */, {tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */], 0) = 0 epoll_wait(7, [], 1, 1000) = 0 getppid() = 30800 utimensat(6, NULL, [{tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */, {tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */], 0) = 0 epoll_wait(7, [], 1, 1000) = 0 getppid() = 30800 utimensat(6, NULL, [{tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */, {tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */], 0) = 0 epoll_wait(7, ^Cstrace: Process 30815 detached <detached ...> ``` Timing out every second wakes up the process and loads it on CPU even if there is nothing to service. This can be detrimental when you have total workers >> total cores and multi-tenant setup where certain tenants might be sitting idle. (but not "idle enough" because of 1s polling timeout) This can possibly keep a completed future in queue for a small while, but I don't see any obious problem with it except few bytes of extra memory usage. I could be wrong here. fixes benoitc#3317

pajod · 2024-10-29T22:55:37Z

There is a mismatch between the title and content. I suspect you meant to apply some fraction of the timeout.

I wonder if TTFB for a burst of request after long idle is meaningfully impacted by cleaning up keep-alive sockets all at once. Either also consider the potentially lower poll time for that, or document the changed --keep-alive behavior.

ankush · 2024-10-30T06:05:13Z

@pajod timeout is divided by 2 when the worker is created. So self.timeout is half of request timeout.

gunicorn/gunicorn/arbiter.py

Lines 590 to 592 in 787c914

    
           worker = self.worker_class(self.worker_age, self.pid, self.LISTENERS, 
        
                                      self.app, self.timeout / 2.0, 
        
                                      self.cfg, self.log)

I'll look into 2nd part of your comment in some time.

ankush · 2024-10-30T08:31:17Z

@pajod I think best course of action is to keep timeout as min(req_timeout/2, keep_alive) when there are kept alive sockets, this addresses your concern and mine.

I can set keep-alive to a much higher value. The setup I have like most others, runs gunicorn behind a reverse proxy.
TTFB has minimal or no effect from this change.
keep-alive keeps working as advertised... with a worst-case deviation of up to 2x keep-alive value only occurring when workers go idle. We can document this. E.g. a socket had almost reached keep-alive timeout but we just blocked on epoll_wait. So, nearly 2x keep-alive time will be required in idle conditions.

When worker goes idle, two problems can occur: 1. keepalived connections might stay open for request_timeout / 2. That can be significantly longer than keep-alive timeout. 2. When next request comes up after idle period, all keepalived sockets must be cleaned up first before we can start responding to requests. This will increase TTFB. After this change: 1. keepalived sockets are cleaned up with frequency of *at least* keepalive timeout. 2. TTFB is minimally affected. 3. Worst case: worker goes idle with keepalived connection that was almost hitting timeout . That connection will be kept alive for ~2x keepalive timeout.

ankush · 2024-10-30T09:06:58Z

gunicorn/workers/gthread.py

@@ -206,8 +206,13 @@ def run(self):

            # can we accept more connections?
            if self.nr_conns < self.worker_connections:
+                # Block for new events until worker times out or keepalive timeout.
+                select_timeout = self.timeout or 1.0
+                if self._keep:


I can hoist this change out of the while loop, but then folks running the default config (keepalive=2) won't see any noticeable difference.

Keeping it inside the loop allows making timeout adaptive to the presence of keepalived sockets.

ankush · 2024-11-09T07:41:12Z

@benoitc kind reminder to review this, small change. :)

ankush commented Oct 30, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(gthread): Use request timeout / 2 for epoll timeout #3319

perf(gthread): Use request timeout / 2 for epoll timeout #3319

ankush commented Oct 29, 2024 •

edited

Loading

pajod commented Oct 29, 2024

ankush commented Oct 30, 2024 •

edited

Loading

ankush commented Oct 30, 2024 •

edited

Loading

ankush Oct 30, 2024 •

edited

Loading

ankush commented Nov 9, 2024 •

edited

Loading

perf(gthread): Use request timeout / 2 for epoll timeout #3319

Are you sure you want to change the base?

perf(gthread): Use request timeout / 2 for epoll timeout #3319

Conversation

ankush commented Oct 29, 2024 • edited Loading

pajod commented Oct 29, 2024

ankush commented Oct 30, 2024 • edited Loading

ankush commented Oct 30, 2024 • edited Loading

ankush Oct 30, 2024 • edited Loading

Choose a reason for hiding this comment

ankush commented Nov 9, 2024 • edited Loading

ankush commented Oct 29, 2024 •

edited

Loading

ankush commented Oct 30, 2024 •

edited

Loading

ankush commented Oct 30, 2024 •

edited

Loading

ankush Oct 30, 2024 •

edited

Loading

ankush commented Nov 9, 2024 •

edited

Loading