Modified S3 uploads to be done in parallel #114

Open · wants to merge 2 commits into master

Conversation

beryl-factset

I've modified a few of the S3 functions to use parallelStream() instead of a for loop; this should improve upload times drastically. I saw a drop from 26 minutes to 9 minutes for 10,000 5 KB files.
Jenkins issue: JENKINS-61936 (https://issues.jenkins-ci.org/browse/JENKINS-61936)
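
A rough sketch of the kind of change described, with hypothetical method and variable names rather than the plugin's actual code: the sequential loop over artifacts becomes a parallelStream() so the individual uploads overlap.

```java
import java.util.Map;

// Hypothetical sketch: uploadFile() and the artifactUrls map are illustrative
// stand-ins, not the plugin's actual API.
public class ParallelUploadSketch {

    // Before: artifacts are uploaded one at a time.
    static void uploadSequentially(Map<String, String> artifactUrls) {
        for (Map.Entry<String, String> entry : artifactUrls.entrySet()) {
            uploadFile(entry.getKey(), entry.getValue());
        }
    }

    // After: uploads run concurrently on the common ForkJoinPool.
    static void uploadInParallel(Map<String, String> artifactUrls) {
        artifactUrls.entrySet().parallelStream()
                .forEach(entry -> uploadFile(entry.getKey(), entry.getValue()));
    }

    private static void uploadFile(String artifact, String url) {
        // Placeholder for the actual HTTP PUT to a presigned S3 URL.
        System.out.printf("uploading %s to %s%n", artifact, url);
    }
}
```

Since parallel streams run on the common ForkJoinPool, the speedup comes from overlapping the per-file network latency; the gain depends on the pool size and available bandwidth rather than on CPU work.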

changed artifactUrls to use ConcurrentHashMap
@jglick jglick self-requested a review May 6, 2020 20:35
swapped \n to %n per error
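
A hedged sketch of what those two commits imply (class and method names below are made up for illustration, not the plugin's real code): a map written to from a parallel stream must be thread-safe, and %n is the portable newline expected in Java format strings in place of \n.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: mapArtifactsToUrls() and toUploadUrl() are illustrative,
// not the plugin's actual methods.
public class ConcurrentUrlMapSketch {

    static Map<String, String> mapArtifactsToUrls(Map<String, String> artifacts) {
        // A plain HashMap is not safe for concurrent writes; ConcurrentHashMap is,
        // so it can be populated safely from a parallel stream.
        Map<String, String> artifactUrls = new ConcurrentHashMap<>();
        artifacts.entrySet().parallelStream().forEach(entry ->
                artifactUrls.put(entry.getKey(), toUploadUrl(entry.getValue())));
        return artifactUrls;
    }

    private static String toUploadUrl(String path) {
        // Placeholder for generating a presigned S3 upload URL.
        return "https://example-bucket.s3.amazonaws.com/" + path;
    }

    public static void main(String[] args) {
        Map<String, String> urls = mapArtifactsToUrls(Map.of("report.txt", "job/1/report.txt"));
        // %n is the platform-independent newline in format strings,
        // which is what the "swapped \n to %n" commit message refers to.
        System.out.printf("mapped %d artifact(s)%n", urls.size());
    }
}
```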

// Map artifacts to urls for upload
for (Map.Entry<String, String> entry : artifacts.entrySet()) {

Spacing issue here.

@hutson

hutson commented May 7, 2020

One point to make: all our testing has been with regard to upload performance. We have not investigated what it would take to improve download performance of artifacts hosted on S3.

@jglick if I'm reading this plugin correctly, it does not handle artifact downloads. I would assume, therefore, that artifact download (specifically, downloading a zip of all artifacts) is handled by Jenkins core. That may be something our team will need to look into in the future. Our use case involves uploading ~5,000 files from one host, and then pulling a zip (using the zip archive link generated by Jenkins) of those files down onto another host for deployment. We've been experiencing performance issues on both sides of the process.

@jglick
Member

jglick commented May 7, 2020

> Our use case involves uploading ~5,000 files from one host, and then pulling a zip (using the zip archive link generated by Jenkins) of those files down onto another host for deployment.

Do not do that. *zip* URLs are not optimized and will force the Jenkins master to retrieve every file from S3, ZIP them up, then send that in a giant HTTP response. Not what you want. Much better to zip up all files inside the workspace where they are generated and archive that one file. The resulting artifact/… URL from the Jenkins master will just be a redirect to S3.

See https://jenkins.io/jep/202 for background.
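
For illustration only, a minimal Declarative Pipeline sketch of the approach described above (the stage name and paths are hypothetical): zip the files inside the workspace where they are generated, then archive just that one zip so its artifact/… URL is a redirect to S3.

```groovy
pipeline {
    agent any
    stages {
        stage('Package and archive') {
            steps {
                // Zip the generated files inside the workspace...
                sh 'zip -r output.zip build/output'
                // ...and archive only the single zip, so the resulting artifact
                // link redirects to S3 instead of going through the *zip* URL
                // assembled on the Jenkins master.
                archiveArtifacts artifacts: 'output.zip'
            }
        }
    }
}
```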

@beryl-factset
Author

@jglick so what are my next actions?

@jglick
Member

jglick commented May 7, 2020

My next actions are to review, and perhaps rewrite a bit, once I have time to work on this plugin—it is on the back burner. But I do not recommend using this plugin with a large number of artifact files to begin with. If you use a handful of large artifacts it should behave better without modification. (The overhead of lots of little artifacts with external URLs is more with Jenkins core and other JEP-202 plugins than with this plugin per se.)

I should say that download performance of a single file would be fine, so uploading thousands (assuming this patch) but only actually using one or two would not be a problem. But if your use case is similar to @hutson’s then you should certainly switch to archiving a tarball or zip.

@beryl-factset
Author

Okay, thanks! Please let me know if you need anything from me.

@hutson

hutson commented May 15, 2020

> See https://jenkins.io/jep/202 for background.

Thank you for the reference.

> Not what you want.

I definitely concur. We are working on adopting the practice of pre-zipping files before archiving, but given the breadth of our user base, we know it's going to take time.

> My next actions are to review, and perhaps rewrite a bit, once I have time to work on this plugin

Thank you for your time. 👍
