The runner has received a shutdown signal. #7188
61 comments · 43 replies
-
@JakubMosakowski we cannot do any investigation without additional info.
-
The interesting part is that it doesn't seem to be related to any of our changes. I created a branch that reverts the last X commits (back to the point in history where our builds were smooth), and the builds are still not passing.
-
We are also seeing this after upgrading our self-hosted runners from 20.04 to 22.04, with no other seemingly related changes. Do the 22.04 runners have more conservative limits even when self-hosted?
-
The same happens to us in a private repo. We didn't make any significant changes to our workflows.
-
Hi @ihor-panasiuk95, please send me links to workflow runs with both positive and negative results.
-
@erik-bershel will you be able to view them, given that they are in a private repo?
-
@ihor-panasiuk95 it's not a problem. There is no need to check what is going on in your private repository in the first step. I want to check the load on the agents and compare successful and failed jobs. If that information is not enough, then we will discuss repro steps. For example: https://github.com/owner/repo/actions/runs/runID or https://github.com/erik-bershel/erik-tests/actions/runs/3680567148.
-
@erik-bershel …
-
I find that this issue only occurs when using …
-
We are also seeing these errors regularly. Link to one of our most recent runs: https://github.com/rstudio/connect/actions/runs/3912304431/jobs/6687076068 The output from the job (a …
-
I've been seeing similar issues where I either get …
-
I just recently began experiencing this issue; I have never experienced it before. Here's the error I receive: …
Here's a link to one of our recent runs: …
-
We are seeing this issue consistently on pr/branch workflows at the step run with configure-aws-credentials on …
-
Hey @chrisui! …
-
As previously stated, this issue was becoming more prevalent for us. Downgrading the runner image to … A couple of links: …
-
I also recently have the same issue in my workflow; jobs keep failing for no apparent reason. I've changed the runner OS from ubuntu-latest to ubuntu-20, but no luck.
-
I'm also suddenly experiencing this issue. I can no longer build my application. What are the enforced RAM limits? I cannot find any information on this. Until now I was using ubuntu-latest, without playing around with other images, but no luck so far. Also, not sure how that would make any sense. Or is there a form of soft rate limit for public repositories where workflow runs get limited after exceeding a threshold?
-
I too am suddenly experiencing this issue when running automated tests, even without any changes to the tests. I haven't tested exhaustively, but it seems Ubuntu and Windows images are failing for me while macOS might be working, which suggests a memory issue, since macOS images have 14 GB of RAM as opposed to 7 (see here). I agree it would be helpful to know if the memory allotment or other VM specs have changed recently, or if there is some other rate-limiting for public repos. For reference, my tests started failing about two weeks ago.
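Where memory exhaustion is suspected, a quick sanity check is to log the runner's actual memory and disk at the start of the job and compare across images. A minimal sketch, assuming a matrix over the hosted images (the job and step names are illustrative, not from this thread):

```yaml
jobs:
  diagnose:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - name: Log available memory and disk
        shell: bash
        run: |
          # free exists on Linux, vm_stat on macOS, systeminfo on Windows
          free -h || vm_stat || systeminfo
          df -h
```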
-
I'm also suddenly experiencing this issue; I can no longer build my application reliably. We have tried changing runners and modifying the pipeline so that some commands don't execute in parallel, but nothing seems to fix the root cause.
-
Hi …
-
I had similar issues occurring on both ubuntu-latest and self-hosted EC2 instances running Amazon Linux 2023. In the end I managed to investigate the issue by benchmarking Gradle. In my case, improper initialization / lifecycle management of Testcontainers drained all the resources. If you have a similar issue, I recommend profiling / benchmarking your build.
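For anyone who wants to try the same approach, Gradle's built-in `--profile` flag is a low-effort way to see where time and resources go. The steps below are only a sketch under that assumption, not the exact setup described above:

```yaml
      - name: Run the build with profiling enabled
        run: |
          # Writes an HTML report under build/reports/profile/
          ./gradlew build --profile

      - name: Upload the profile report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: gradle-profile
          path: build/reports/profile/
```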
-
Also experiencing the same issues with ubuntu-latest and ubuntu-20.04. Usually takes 3-6 attempts to get a build.
2023-07-27T12:16:55.9472717Z > Task :apptv:dexBuilderTsnAndroidtvQaRelease
2023-07-27T12:18:31.3463035Z > Task :apptv:mergeExtDexTsnAndroidtvQaRelease
2023-07-27T12:18:41.7472315Z > Task :apptv:mergeTsnAndroidtvQaReleaseJavaResource
2023-07-27T12:18:59.2463021Z > Task :apptv:mergeDexTsnAndroidtvQaRelease
2023-07-27T12:19:00.2463262Z > Task :apptv:buildTsnAndroidtvQaReleasePreBundle
2023-07-27T12:19:01.4471900Z > Task :apptv:compileTsnAndroidtvQaReleaseArtProfile
2023-07-27T12:19:02.6652087Z > Task :apptv:packageTsnAndroidtvQaReleaseBundle
2023-07-27T12:19:11.4704449Z ##[error]The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
2023-07-27T12:19:11.6687468Z Cleaning up orphan processes
2023-07-27T12:19:11.7638576Z Terminate orphan process: pid (2017) (java)
-
We could fix this problem by increasing the size of Ubuntu's swapfile: https://stackoverflow.com/a/76921482/1185087
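For reference, extra swap can be added on a hosted Ubuntu runner with plain shell at the start of the job. This is a generic sketch based on standard Linux tooling (the file name and size are arbitrary), not the exact steps from the linked answer:

```yaml
      - name: Add extra swap space
        run: |
          sudo fallocate -l 8G /extra-swapfile
          sudo chmod 600 /extra-swapfile
          sudo mkswap /extra-swapfile
          sudo swapon /extra-swapfile
          free -h   # confirm the new swap is active
```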
-
I recently started experiencing this problem in one of my repos. I tried limiting the number of jobs cargo uses to build my project, but it seems that resource usage skyrockets at link time, so limiting cargo jobs doesn't help. Normally I'd use mold or bump up the size of the swapfile to mitigate this, but I can't do that on Windows!
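For completeness, capping cargo's parallelism and trimming debug info (one of the few link-memory knobs that also works on Windows) only needs a couple of environment variables in the workflow. The values below are illustrative, and as noted above they may not be enough on their own:

```yaml
      - name: Build with limited parallelism and less debug info
        env:
          CARGO_BUILD_JOBS: 2        # caps parallel compilation (not the final link step)
          CARGO_PROFILE_DEV_DEBUG: 0 # dev builds carry full debug info by default; dropping it reduces linker memory
        run: cargo build
```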
-
I have a kinda related question regarding the job "status": is there a way to get a job/workflow result flag that states explicitly that the runner received a shutdown signal while the job was running? Looking at https://docs.github.com/en/rest/actions/workflow-jobs?apiVersion=2022-11-28 gives me a bunch of API information like …
Context: we use custom runners on Google spot VM instances, and once such an instance is taken away during a workflow run, we want to trigger a new workflow to run the job again on a new runner. FYI @erik-bershel
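As far as I know there is no dedicated "runner shutdown" flag in the REST API; the job only ends up with a `conclusion` such as `failure` or `cancelled`. One way to approximate the desired re-trigger is a small follow-up workflow that reacts to the completed run, inspects the job conclusions, and re-runs the failed jobs. A rough sketch — the monitored workflow name "build" and the retry policy are assumptions, not something from this thread:

```yaml
name: retry-on-preempt
on:
  workflow_run:
    workflows: ["build"]   # assumed name of the workflow running on the spot VMs
    types: [completed]

permissions:
  actions: write           # needed for "gh run rerun" with the default GITHUB_TOKEN

jobs:
  check-and-retry:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    steps:
      - name: Show job conclusions of the failed run
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh api "repos/${{ github.repository }}/actions/runs/${{ github.event.workflow_run.id }}/jobs" \
            --jq '.jobs[] | {name, status, conclusion}'
      - name: Re-run the failed jobs
        env:
          GH_TOKEN: ${{ github.token }}
        run: gh run rerun ${{ github.event.workflow_run.id }} --failed
```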
-
We've started hitting this issue as well within the last couple of weeks. Always the same action, always the same intermittent problem; after re-running, the failed test works more often than not. I wouldn't expect resource exhaustion, as we're running on … Here's an example of failure and an example of success: … As of this post it's been ~24 hours since we've been able to get a reliable build out. Thanks, and happy to provide more info if it helps.
-
Is there a fix for this? We are seeing this as well after applying some of the recommendations here: https://developer.android.com/build/optimize-your-build Our …
We started encountering this when running unit tests with a standard runner ( …
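Since several of the Gradle reports in this thread point at memory pressure on the 7 GB hosted runners, one commonly suggested mitigation is to cap the Gradle daemon heap and worker count for the CI run. A sketch with illustrative values (they are not tuned recommendations, and the same properties can equally go into gradle.properties):

```yaml
      - name: Run unit tests with capped Gradle memory
        run: |
          ./gradlew test \
            -Dorg.gradle.jvmargs="-Xmx3g -XX:MaxMetaspaceSize=512m" \
            -Dorg.gradle.workers.max=2
```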
-
In our case, we solved this error when we understood that we had created multiple Ansible Automation Platform installations concurrently reading and writing to the same Postgres database. We believe that the "The running ansible process received a shutdown signal." error was linked to this concurrency of processes against the same database; it could be related to some timeouts of calls. We stopped all the services on those machines and kept only the original process running. After stopping the services, we saw no interruptions in AAP project updates and project launches.
-
I had the same issue starting with the recent update of the Ubuntu 24.04 LTS runner image. My hunch was also that I was exhausting resources, but the logs were not helpful. The runner is using Makefile with the …
-
Description
Since yesterday, our GitHub Actions builds started to randomly fail (we didn't change anything in our configuration). The error is not very precise, unfortunately.
The process is stopped at random stages of the build (but always after at least 15 minutes or so). Even if the build passes, it takes much longer than before (a clean build went from ~25 min to ~35 min).
Sometimes before the shutdown signal, there is also such log:
Idle daemon unexpectedly exit. This should not happen.
Workflow passes normally on builds that are shorter (for example those from cache).
Platforms affected
Runner images affected
Image version and build link
Image: ubuntu-22.04
Version: 20221127.1
Current runner version: '2.299.1'
Unfortunately, it happens on a private repo.
Is it regression?
No
Expected behavior
Job should pass
Actual behavior
Job fails
Repro steps
Looks similar to: #6680