Handle failure cases of report gc #4014

Merged 3 commits on Oct 21, 2024

@austb (Contributor) commented Oct 17, 2024

Fixes #4013

@austb austb force-pushed the gh-4013/main/handle-report-gc-failures branch from 6780c59 to 5e1d0f5 Compare October 17, 2024 00:11
@austb austb changed the base branch from main to 7.x October 17, 2024 15:18
@austb austb force-pushed the gh-4013/main/handle-report-gc-failures branch 4 times, most recently from 2b63c31 to 39cf59d Compare October 17, 2024 16:08
@austb (Contributor, Author) commented Oct 17, 2024

The Beaker acceptance tests passed; the Jenkins instance only encountered an error after they had finished.

@austb austb removed the don't merge label Oct 17, 2024
@austb (Contributor, Author) commented Oct 17, 2024

Tagged PuppetDB for the release, so this can be merged anytime now.

Postgres folds unquoted table names to lowercase, so have this
function return lowercase names from the start.
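
A minimal sketch of the idea in the commit message above, written in Python for illustration only (it is not the PuppetDB source). The function name `partition_table_name` and the `<base>_<YYYYMMDD>z` naming scheme are assumptions made for this example.

```python
from datetime import date


def partition_table_name(base_table: str, day: date) -> str:
    """Build a partition table name that is lowercase from the start.

    Hypothetical naming scheme: <base>_<YYYYMMDD>z. Because Postgres folds
    unquoted identifiers to lowercase, a name built with an uppercase "Z"
    would not compare equal to the name the catalog reports back.
    """
    return f"{base_table}_{day:%Y%m%d}Z".lower()
```

Returning the lowercase form everywhere means that names computed by the application can be compared directly with names read from the catalog.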
@austb austb force-pushed the gh-4013/main/handle-report-gc-failures branch 2 times, most recently from fa72125 to 3adda41 Compare October 18, 2024 15:59
@austb austb marked this pull request as ready for review October 18, 2024 16:35
@austb austb requested review from a team as code owners October 18, 2024 16:35
@austb austb force-pushed the gh-4013/main/handle-report-gc-failures branch from 3adda41 to 16d3cdc Compare October 21, 2024 22:00
Detaching a partition takes two transactions. Once the first
transaction succeeds, the partition is "pending". The second
transaction needs an ACCESS EXCLUSIVE lock on the partition and
can therefore sometimes fail.

When this happens, subsequent GCs will fail because only one pending
partition detachment is allowed. To handle this, catch the SQLException
and finalize the pending detach operation. If the pending partition was
not the partition we are trying to remove, retry the detach operation
that failed.
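
The recovery flow described above can be sketched as follows. This is a Python illustration under stated assumptions, not the PR's code: `DetachError` stands in for the SQLException, and the `db` object with its `detach_concurrently`, `pending_detach`, and `finalize_detach` methods is a hypothetical interface. In PostgreSQL 14 and later these would roughly correspond to `ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY`, a catalog lookup for the partition marked as pending, and `ALTER TABLE ... DETACH PARTITION ... FINALIZE`.

```python
class DetachError(Exception):
    """Stand-in for the SQLException raised when a detach cannot proceed."""


def detach_with_recovery(db, parent, child):
    """Detach `child`, finalizing any earlier pending detach first if needed."""
    try:
        db.detach_concurrently(parent, child)
        return
    except DetachError:
        pending = db.pending_detach(parent)
        if pending is None:
            raise  # failure was not caused by a pending detach
        db.finalize_detach(parent, pending)
        if pending != child:
            # A different partition was blocking us; retry the original detach.
            db.detach_concurrently(parent, child)


class FakeDb:
    """In-memory fake (hypothetical) that allows one pending detach at a time."""

    def __init__(self, attached, pending=None):
        self.attached = set(attached)
        self.pending = pending
        self.log = []

    def detach_concurrently(self, parent, child):
        self.log.append(("detach", child))
        if self.pending is not None:
            raise DetachError("another partition is already pending detach")
        self.attached.discard(child)

    def pending_detach(self, parent):
        return self.pending

    def finalize_detach(self, parent, child):
        self.log.append(("finalize", child))
        self.attached.discard(child)
        self.pending = None
```

With the fake, a GC run that finds `reports_a` stuck in the pending state while trying to remove `reports_b` records a failed detach, a finalize of `reports_a`, and then a successful retry for `reports_b`.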

If a partition is fully detached, but fails to be dropped during its GC
operation, subsequent GC operations will not see that partition at all.
It will be stranded and PuppetDB will never remove it.

During GC, search for stranded partitions that need to be removed and
add them to the list of partitions that need to be dropped.

There is no structural way to tell the difference between a
non-partitioned table and a detached partition table. This PR uses a
regular expression, which means that PuppetDB cannot have any
non-partitioned tables matching the regular expressions used to identify
stranded partitions.
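
A minimal sketch of the regular-expression approach described above, in Python for illustration only. The pattern, the `reports` and `resource_events` base names, and the `find_stranded` function are assumptions for this example; the actual expressions used by the PR are not shown in this conversation.

```python
import re

# Hypothetical pattern: a partitioned base table name, an underscore,
# an eight-digit date, and a trailing lowercase "z".
STRANDED_PATTERN = re.compile(r"(reports|resource_events)_\d{8}z")


def find_stranded(all_tables, attached_partitions):
    """Return tables that look like partitions but are no longer attached.

    There is no structural marker on a detached partition, so the only
    signal used here is the table name. Any ordinary table whose name
    matches the pattern would be misidentified as stranded.
    """
    attached = set(attached_partitions)
    return sorted(
        name
        for name in all_tables
        if STRANDED_PATTERN.fullmatch(name) and name not in attached
    )
```

The result could then be appended to the list of partitions that GC is going to drop. The name-only check is also why no non-partitioned table may ever be given a name that matches the pattern.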
@austb austb force-pushed the gh-4013/main/handle-report-gc-failures branch from 16d3cdc to 7d86914 Compare October 21, 2024 22:04
@rbrw rbrw merged commit 18fa7cc into puppetlabs:7.x Oct 21, 2024
10 checks passed
Linked issue: PuppetDB garbage collection may not always remove partitions