
Fix JENKINS-56740 - AWS CLI Option for artifacts over 5Gb with S3 Storage Class select option (STANDARD, STANDARD_IA). #467

Open
wants to merge 6 commits into master

Conversation

@salmira commented Feb 27, 2024

Add AWS configuration options:

  1. Add an AWS CLI checkbox option as a fix or workaround for JENKINS-56740
  2. Add a custom S3 Storage Class select option (STANDARD_IA) for AWS CLI.

With this change, the plugin's artifact upload method depends on the Jenkins setting at Jenkins > Settings > AWS.
The default AWS API method fails to upload artifacts bigger than 5 GB.

AWS CLI Option "Use AWS CLI for files upload to S3"

Workaround or fix for JENKINS-56740.
The AWS CLI option is a workaround for the API transfer error on files bigger than 5 GB.
AWS CLI mode uses AWS S3 Command Line Interface (CLI) commands to upload artifacts to AWS S3.

The default AWS API fails to upload artifacts bigger than 5 GB; MaxSizeAllowed is 5368709120 bytes (5 GiB).
See JENKINS-56740 - the plugin fails to upload artifacts over 5 GB with errors like:

ERROR: Step ‘Archive the artifacts’ failed ..., response: 400 Bad Request,
EntityTooLarge: Your proposed upload exceeds the maximum allowed size

AWS option "Use AWS CLI for files upload to S3" switches the default AWS API mode to AWS CLI mode.

Add AWS S3 Storage Class selection option

Warning: Works for AWS CLI option only.
Allows selecting the AWS S3 Storage Class for uploaded artifacts.
The default S3 Storage Class is S3 Standard (STANDARD).
This option allows selecting S3 Storage Class Standard-IA (STANDARD_IA).
Useful for cost saving.
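
To confirm which class an uploaded artifact actually landed in, a quick check (a sketch; the bucket and key names are hypothetical, and S3 omits the field for STANDARD objects):

 aws s3api head-object --bucket example-bucket \
  --key jobs/Artifact-Plugin-Test/49/artifacts/rand.dat \
  --query StorageClass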

Testing done

Tested on a local Jenkins test installation and on our development Jenkins.
Added a Jenkins job to test the plugin:

Test 1. PASSED. Big file upload fails (as expected) with the default AWS API

Uploading a big file (6 GB) fails with the default AWS API configuration, with errors as expected per JENKINS-56740.

Jenkins job Artifact-Plugin-Test, build #47_AWS-API_STANDARD
Console Output:

...
+ + ls -laR
tee -a build_47.log
.:
total 6291668
drwxr-xr-x  2 jenkins jenkins       4096 Feb 27 20:53 .
drwxr-xr-x 51 jenkins jenkins       4096 Feb  9 15:38 ..
-rw-r--r--  1 jenkins jenkins         77 Feb 27 20:53 build_47.log
-rw-r--r--  1 jenkins jenkins 6442450944 Feb 27 20:53 rand_20240227-205300.dat
New run name is '#47_AWS-API_STANDARD'
Archiving artifacts
AWS Storage Class selected option: STANDARD
AWS option selected:: Use AWS API...
ERROR: Step ‘Archive the artifacts’ failed: Failed to upload /var/lib/jenkins/workspace/Artifact-Plugin-Test/rand_20240227-205300.dat to https://XXX-build-artifacts.s3-accelerate.amazonaws.com/jobs/Artifact-Plugin-Test/47/artifacts/rand_20240227-205300.dat?…, response: 400 Bad Request, body: 
EntityTooLargeYour proposed upload exceeds the maximum allowed size6442450944536870912097CMGW5283HFF06FfriyoCjx5RJYxUiTjvFQIs2lAyShpRX7YrS9CWD425ANzDQk5sIwRqsCXgffXD3615sIW8vE02g=
Finished: FAILURE

AWS S3 has no artifacts for this build.

Test 2. PASSED. Big file upload with the AWS CLI option checked and STANDARD Storage Class selected:

Jenkins job Artifact-Plugin-Test, build #48_AWS-CLI_STANDARD
Console Output:

...
+ ls -laR
+ tee -a build_48.log
.:
total 6291636
drwxr-xr-x  2 jenkins jenkins       4096 Feb 27 20:56 .
drwxr-xr-x 51 jenkins jenkins       4096 Feb  9 15:38 ..
-rw-r--r--  1 jenkins jenkins         77 Feb 27 20:56 build_48.log
-rw-r--r--  1 jenkins jenkins 6442450944 Feb 27 20:57 rand_20240227-205642.dat
New run name is '#48_AWS-CLI_STANDARD'
Archiving artifacts
AWS Storage Class selected option: STANDARD
AWS option selected: Use AWS CLI...
Copy rand_20240227-205642.dat to s3://XXX-build-artifacts/jobs/Artifact-Plugin-Test/48/artifacts/rand_20240227-205642.dat
[Artifact-Plugin-Test] $ aws s3 cp --quiet --no-guess-mime-type rand_20240227-205642.dat s3://XXX-build-artifacts/jobs/Artifact-Plugin-Test/48/artifacts/rand_20240227-205642.dat --storage-class STANDARD
Copy build_48.log to s3://XXX-build-artifacts/jobs/Artifact-Plugin-Test/48/artifacts/build_48.log
[Artifact-Plugin-Test] $ aws s3 cp --quiet --no-guess-mime-type build_48.log s3://XXX-build-artifacts/jobs/Artifact-Plugin-Test/48/artifacts/build_48.log --storage-class STANDARD
Uploaded 2 artifact(s) to https://XXX-build-artifacts.s3.amazonaws.com/jobs/Artifact-Plugin-Test/48/artifacts/
Finished: SUCCESS

AWS S3 has artifacts for this build, with the Standard Storage Class on the file objects.

Test 3. PASSED. Big file upload with the AWS CLI option checked and STANDARD_IA Storage Class selected:

Jenkins job Artifact-Plugin-Test, build #49_AWS-CLI_STANDARD_IA
Console Output:

...
+ + ls -laR
tee -a build_49.log
.:
total 6291656
drwxr-xr-x  2 jenkins jenkins       4096 Feb 27 21:05 .
drwxr-xr-x 51 jenkins jenkins       4096 Feb  9 15:38 ..
-rw-r--r--  1 jenkins jenkins         77 Feb 27 21:05 build_49.log
-rw-r--r--  1 jenkins jenkins 6442450944 Feb 27 21:05 rand_20240227-210508.dat
New run name is '#49_AWS-CLI_STANDARD_IA'
Archiving artifacts
AWS Storage Class selected option: STANDARD_IA
AWS option selected: Use AWS CLI...
Copy rand_20240227-210508.dat to s3://XXX-build-artifacts/jobs/Artifact-Plugin-Test/49/artifacts/rand_20240227-210508.dat
[Artifact-Plugin-Test] $ aws s3 cp --quiet --no-guess-mime-type rand_20240227-210508.dat s3://XXX-build-artifacts/jobs/Artifact-Plugin-Test/49/artifacts/rand_20240227-210508.dat --storage-class STANDARD_IA
Copy build_49.log to s3://XXX-build-artifacts/jobs/Artifact-Plugin-Test/49/artifacts/build_49.log
[Artifact-Plugin-Test] $ aws s3 cp --quiet --no-guess-mime-type build_49.log s3://XXX-build-artifacts/jobs/Artifact-Plugin-Test/49/artifacts/build_49.log --storage-class STANDARD_IA
Uploaded 2 artifact(s) to https://XXX-build-artifacts.s3.amazonaws.com/jobs/Artifact-Plugin-Test/49/artifacts/
Finished: SUCCESS

AWS S3 has artifacts for this build, with the Standard-IA Storage Class on the file objects.

tetiana.tvardovska added 4 commits February 27, 2024 21:10
Option to use AWS CLI for big artifacts (over 5Gb) upload
to AWS S3 with default Storage Class (STANDARD).

CLI - Command Line Interface.
Fix for JENKINS-56740 https://issues.jenkins.io/browse/JENKINS-56740

Build note:

1. Some tests fail, so you may run with tests skipped: -DskipTests

2. Build with the Maven option -Dchangelist=plugin-version,
for example:

 mvn clean package -Dchangelist=master-tag.patch.version1.0

or with the current date, the latest tag, and skipping tests:

 mvn clean package -DskipTests \
 -Dchangelist=$(git tag -l --sort=creatordate | tail -n 1).patch2.0.$(date +%Y%m%d-%H%M%S)

Signed-off-by: tetiana.tvardovska <tetiana.tvardovska@globallogic.com>
Change-Id: Ic0e35c2009afe88802bfb0a0476ede7311aec056
STANDARD_IA S3 Storage Class was required for AWS cost-saving.

Build command sample:
 mvn clean package -DskipTests \
  -Dchangelist=$(git tag -l --sort=creatordate | tail -n 1).patch2.1.$(git rev-list --count HEAD).$(date +%Y%m%d-%H%M%S)

Signed-off-by: tetiana.tvardovska <tetiana.tvardovska@globallogic.com>
Change-Id: Ia92e5731549d7161b6bdfa1a27f68a5acd72d70a
1. Add option to select a custom AWS S3 Storage Class for uploaded objects:
- STANDARD (default)
- STANDARD_IA

Note:
 a) SUPPORTED ONLY by AWS CLI mode.
 The selected custom Storage Class is applied by AWS CLI mode only!

 b) NOT SUPPORTED by AWS API mode.
 The default AWS API mode ignores any selected custom AWS S3 Storage Class -
 it will always upload files with the STANDARD Storage Class.

2. Updated Readme with build and test notes.

3. Add Maven property patch.version

Notes:
- Jenkins (2.443) or higher required
- Build with tests skipped and the Maven option -Dchangelist=plugin-version
For example:

 mvn clean package -DskipTests \
 -Dchangelist=$(git tag -l --sort=creatordate | tail -n 1).patch2.2.$(git rev-list --count HEAD).$(date +%Y%m%d-%H%M%S)

Or with patch.version:

 mvn clean package -DskipTests \
 -Dchangelist=$(git rev-parse --abbrev-ref HEAD)_v\${patch.version}-$( \
 git rev-list --count HEAD).$(git rev-parse --short HEAD)$(git diff --quiet || echo .diff).$(date +%Y%m%d-%H%M%S)

Signed-off-by: tetiana.tvardovska <tetiana.tvardovska@globallogic.com>
Change-Id: Ibeb1fbf74edba428ea0f0332f6cba12fdeb0a049
Add notes to Readme.md about new AWS options and other changes.

Signed-off-by: tetiana.tvardovska <tetiana.tvardovska@globallogic.com>
Change-Id: Ice71386e51411665cb68588c25f7b81f01fccd02
@salmira salmira requested a review from a team as a code owner February 27, 2024 21:32
@salmira (Author) commented Feb 27, 2024

For information: I did not manage to get the Maven build tests to pass for the original plugin code.
Thus, I could not run my changes through the Maven build tests either.
I may need help configuring an environment for the Maven build tests.

@jglick (Member) commented Feb 27, 2024

Compare #88, which if it works (I have not reviewed it in detail) seems more attractive because it does not rely on a particular executable being present on the agent.

@jglick (Member) commented Feb 27, 2024

help in configuring environment for Maven build tests

This plugin is tricky. (@cloudbees also runs CI for trunk and selected PRs against a real AWS account.) I use a ~/.mavenrc that sets AWS_PROFILE, AWS_REGION, S3_BUCKET, and S3_DIR, after using aws sso login.
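
For illustration, a sketch of that setup (the values are placeholders, not taken from the source):

 # ~/.mavenrc - environment for the plugin's integration tests
 export AWS_PROFILE=my-sso-profile   # profile configured via 'aws sso login'
 export AWS_REGION=us-east-1
 export S3_BUCKET=my-test-bucket     # bucket the tests may write to
 export S3_DIR=artifact-manager-tests/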

tetiana.tvardovska added 2 commits February 29, 2024 12:37
Fixes for failed ci.jenkins.io / SpotBugs checks
https://github.com/jenkinsci/artifact-manager-s3-plugin/pull/467/checks?check_run_id=22052415895
for Review Request jenkinsci#467

Signed-off-by: tetiana.tvardovska <tetiana.tvardovska@globallogic.com>
Change-Id: Ie446e4b334b71959905b82da92be0eda08465a85
Changes for AWS CLI and Storage Class options:
1. Update README.md - add description with images.
2. Update README.md images for the related options:
bucket-settings and custom-s3-service-configuration.
3. Update config validation messages.
4. Update Help texts.

Signed-off-by: tetiana.tvardovska <tetiana.tvardovska@globallogic.com>
Change-Id: I63cd59fa01425ad01b48f50cf5a2e6410449d603
@salmira (Author) commented Feb 29, 2024

Hi @jglick ,

Compare #88, which if it works (I have not reviewed it in detail) seems more attractive because it does not rely on a particular executable being present on the agent.

Sometimes it is better to have a not-quite-ideal working workaround that relies on a particular executable being present on the agent than to avoid using this plugin at all... or to patch each new version of it to keep the environment working...

Besides big file uploads, we also need artifacts to be uploaded with the S3 Storage Class Standard-IA (STANDARD_IA) for cost saving.
But I have not yet found a way to make the AWS API set a non-standard S3 Storage Class on objects at upload time with the libraries in use... The only solutions I have found change the object's Storage Class after upload... With such a solution, users still have to pay a month's fee at the Standard rate for all uploaded artifacts anyway... I am still researching it...
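
For reference, the "change after upload" approach mentioned here amounts to copying the object onto itself with a new class (a sketch with hypothetical names; note that copy-object is itself limited to 5 GB sources, so larger objects need a multipart copy):

 aws s3api copy-object --bucket example-bucket \
  --key jobs/my-job/1/artifacts/big.dat \
  --copy-source example-bucket/jobs/my-job/1/artifacts/big.dat \
  --storage-class STANDARD_IA --metadata-directive COPY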

By the way, should I split 'big files upload' and 'S3 Storage Class' issues into separate pull requests?...

Anyway, I will check #88 and related #141.

But, unfortunately, neither #88 nor #141 would meet our needs for big files and Standard-IA Storage Class...

@kuisathaverat (Contributor) commented Feb 29, 2024

IMHO uploading/downloading an artifact of 5GB size is wrong. There are many ways to upload it while keeping it easy to download, for example splitting it into volumes of a fixed size. Using the command line to upload an artifact when it is above some size looks wrong too: any error that happens in the CLI would not be controlled and properly reported. So I will not block this PR, but I think it will cause more harm than benefit. Making things complicated does not have good endings.

@jglick (Member) commented Feb 29, 2024

I have not yet found a solution for how to make the AWS API set a non-standard S3 Storage Class for objects

Huh, this is surprising. One possible issue: we are still using the v1 SDK. It may be that these features and more are covered by the v2 SDK (see e.g. #351), which would need to get packaged analogously to https://github.com/jenkinsci/aws-java-sdk-plugin so that various plugins can migrate to it.

Please be aware that the intent of this plugin is not to support every S3 feature, it is to make it convenient for people to use stock Pipeline code with steps like stash or archiveArtifacts without needing to worry about controller performance or disk space. I think handling “large” files falls within scope of the plugin; I am not familiar enough with storage classes to comment on that. My recollection is that you can configure your bucket to automatically move blobs to Glacier over time, which may be helpful.
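
For reference, such a transition can be set up with a bucket lifecycle rule (a sketch; the bucket name, prefix, and schedule are placeholders):

 aws s3api put-bucket-lifecycle-configuration --bucket example-bucket \
  --lifecycle-configuration '{"Rules":[{"ID":"archive-artifacts","Status":"Enabled",
   "Filter":{"Prefix":"jobs/"},"Transitions":[{"Days":30,"StorageClass":"GLACIER"}]}]}'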

Since I do not really have time to maintain this plugin I have asked some colleagues to help review and test this and some other important-looking PRs, but it may take some time.

any error that happens in the CLI would not be controlled and properly reported

Well, I think this is handled already. Yes, the extra complexity is a downside. I do not have a strong opinion either way for now.

@kuisathaverat (Contributor) commented

BTW, if you're uploading a 5GB artifact to Jenkins, you should think about uploading it to an OCI registry or directly to a service designed to store those kinds of artifacts. Jenkins is not an OCI registry and it should not be one.

@salmira (Author) commented Mar 4, 2024

@kuisathaverat ,

IMHO upload/download an artifact of 5GB size is wrong.

It could be wrong, but my project started using such a scheme several years ago. It takes Jenkins only about 20 minutes to upload artifacts from build nodes to AWS S3 in our environment. A similarly reasonable time is taken to retrieve artifacts for other jobs. So, why not use the Artifact Manager S3 Plugin to upload build results to S3 even if the result files are pretty big? As long as system performance is acceptable, why not use it?...

Using the command line to upload an artifact when it is above some size looks wrong too: any error that happens in the CLI would not be controlled and properly reported.

The provided code logs (reports) all errors that may happen in the CLI. Errors are reported to the console log and can be addressed effectively.

BTW, if you're uploading a 5GB artifact to Jenkins, you should think about uploading it to an OCI registry or directly to a service designed to store those kinds of artifacts. Jenkins is not an OCI registry and it should not be one.

Strange... Artifacts are stored in the AWS S3 service, thanks to this plugin, not on the Jenkins server or build nodes.
AWS S3 is designed to store these and other kinds of artifacts, and it has worked fine for over 5 years.
We are not uploading 5GB artifacts to Jenkins. Such artifacts are created during builds on build agents. Then, thanks to this plugin, the artifacts are seamlessly and simply uploaded to Amazon S3 without any additional actions. Why should we complicate this with other special procedures?

@kuisathaverat (Contributor) commented Mar 4, 2024

Strange... Artifacts are stored in the AWS S3 service, thanks to this plugin, not on the Jenkins server or build nodes.
AWS S3 is designed to store these and other kinds of artifacts, and it has worked fine for over 5 years.
We are not uploading 5GB artifacts to Jenkins. Such artifacts are created during builds on build agents. Then, thanks to this plugin, the artifacts are seamlessly and simply uploaded to Amazon S3 without any additional actions.

I know it; I developed part of this plugin, its tests, and the documentation.

Why should we complicate it with other special procedures?

Simply having two ways to upload the files makes maintenance more complex. Introducing a CLI wrapper implies that any change in the behavior of the CLI will break the plugin, and the CLI could be any version of it, so you have a new world of possible issues.

It takes Jenkins just about 20 minutes to upload artifacts from build nodes to AWS S3 in our environment.

20 minutes uploading a file is 20 minutes of hoping not to hit any network issue.

@salmira (Author) commented Mar 4, 2024

@jglick,

Huh, this is surprising. One possible issue: we are still using the v1 SDK. ...

Maybe... I am a newbie in Jenkins and AWS programming, so I could have missed something. I will check it.

Please be aware that the intent of this plugin is not to support every S3 feature

Sure! We just wanted to reduce our costs when using Amazon S3 with archiveArtifacts.
By default, objects are uploaded by archiveArtifacts to Amazon S3 with the S3 Standard Storage Class. And immediately, the user's Amazon account is charged for the size of all uploaded objects for the whole month! Even if an object is moved to another Storage Class right after creation (by bucket configuration), the account is still charged at the Standard price for the whole month. Moreover, it will then also be charged for the other Storage Class.

Thus, we considered creating objects from the very beginning with the desired S3 Standard-IA Storage Class (Standard-Infrequent Access), as the most convenient type of storage in our case.

Nevertheless, I may add all the other Storage Classes to this option. Maybe someone will consider another Storage Class more suitable... And it would enhance the plugin's functionality.

By the way, I have found a workaround for how to PUT objects with the required Storage Class. I am going to add the update soon.
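
One plausible mechanism for this (an assumption on my part, not necessarily the workaround meant above): S3 honors an x-amz-storage-class header on PUT, so a presigned URL generated with that header included in the signature lets the uploader set the class at creation time:

 # Hypothetical example: PRESIGNED_URL must have been generated with the
 # x-amz-storage-class header signed, otherwise S3 rejects the signature.
 curl -H 'x-amz-storage-class: STANDARD_IA' \
  --upload-file big.dat "$PRESIGNED_URL"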
