DontReview Garnitin/add gke load testing/v1 #2225

Draft
wants to merge 15 commits into
base: master
from

Conversation

gargnitingoogle
Collaborator

Description

Link to the issue in case of a bug fix.

NA

Testing details

  1. Manual - NA
  2. Unit tests - NA
  3. Integration tests - NA

codecov bot commented Jul 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.47%. Comparing base (5ae03ff) to head (743af55).
Report is 1 commit behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2225   +/-   ##
=======================================
  Coverage   77.47%   77.47%           
=======================================
  Files         110      110           
  Lines       15705    15705           
=======================================
  Hits        12167    12167           
  Misses       3016     3016           
  Partials      522      522           
Flag Coverage Δ
unittests 77.47% <ø> (ø)


fi

function validateMachineConfig() {
echo "Validiting input parameters ..."
Collaborator Author


Typo: "Validiting" should be "Validating".

local cluster_name=${1}
local zone=${2}
local node_pool=${3}
if [ $(gcloud container node-pools list --cluster=${cluster_name} --zone=${zone} | grep -ow ${node_pool} | wc -l) -gt 0 ] ; then

Simplify this with grep -q instead of counting matches with wc -l.
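
A minimal sketch of what the grep -q version could look like. The function name nodePoolExists is hypothetical, and the --format="value(name)" flag is an assumption added here so that grep sees one pool name per line; the original pipeline matches against the default table output instead.

```shell
# Hypothetical helper (name not from the PR): succeeds if the node pool exists.
nodePoolExists() {
  local cluster_name=${1}
  local zone=${2}
  local node_pool=${3}
  # value(name) is assumed to print one node-pool name per line.
  # grep -q prints nothing and exits 0 on the first match; -x requires the
  # whole line to match and -F treats the name as a fixed string, not a regex.
  gcloud container node-pools list --cluster="${cluster_name}" --zone="${zone}" \
      --format="value(name)" | grep -qxF -- "${node_pool}"
}
```

The call site could then read: if nodePoolExists ${cluster_name} ${zone} ${node_pool} ; then. Matching the whole line also avoids treating a pool named ssd as present when only ssd-pool exists.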

if ClusterExists ${cluster_name} ; then
gcloud container clusters update ${cluster_name} --location=${zone} --workload-pool=${project_id}.svc.id.goog
else
# gcloud container --project "${project_id}" clusters create ${cluster_name} --zone "${zone}" --cluster-version "${cluster_version}" --workload-pool=${project_id}.svc.id.goog

Remove the commented-out code.

echo "Enabling/disabling csi add-on ..."
# By default, disable the managed csi driver.
if ${useCustomCsiDriver}; then
# gcloud -q container clusters update ${cluster_name} \

Re-enable this command.

gcloud container clusters get-credentials ${cluster_name} --location=${zone}
kubectl create namespace ${appnamespace}
kubectl create serviceaccount ${ksa} --namespace ${appnamespace}
for workload_bucket in ${buckets} ; do

Add code to populate the list of buckets; their source is still to be decided.

}

# validateMachineConfig ${machine_type} ${num_nodes} ${num_ssd}
# installDependencies

Re-enable all of these disabled steps.

def get_cpu(pod_name: str, start: str, end: str) -> Tuple[float, float]:
# for some reason, the mash filter does not always work, so we fetch all the metrics for all the pods and filter later.
result = subprocess.run(["mash", "--namespace=cloud_prod", "--output=csv",
f"Query(Fetch(Raw('cloud.kubernetes.K8sContainer', 'kubernetes.io/container/cpu/core_usage_time'), {{'project': '927584127901'}})| Window(Rate('10m'))| GroupBy(['pod_name', 'container_name'], Max()), TimeInterval('{start}', '{end}'), '5s')"],

Restore the original project number here. Substitute the real one in code at runtime, then revert it once the run is done.
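
One possible shape for the runtime substitution, as a sketch only. Both function names are hypothetical, and it assumes the project number appears in the file as 'project': '<digits>', as in the query above. Rendering into a separate copy is a design choice made here, not something the PR does: it leaves the checked-in file untouched, so there is nothing to revert afterwards.

```shell
# Hypothetical helper: look up the numeric project number for a project ID.
getProjectNumber() {
  local project_id=${1}
  gcloud projects describe "${project_id}" --format="value(projectNumber)"
}

# Hypothetical helper: print the file with the hard-coded project number
# replaced by the given one. The file itself is not modified.
renderWithProjectNumber() {
  local file=${1}
  local project_number=${2}
  sed -E "s/('project': )'[0-9]+'/\1'${project_number}'/" "${file}"
}
```

A caller could write the rendered output to a temporary file and run that copy, deleting it when the run is done.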

function updateMachineTypeInPodConfigs() {
for file in ${gke_testing_dir}/examples/fio/loading-test/values.yaml ${gke_testing_dir}/examples/dlio/unet3d-loading-test/values.yaml ; do
test -f ${file}
sed -i -E "s/nodeType: [0-9a-z_-]+$/nodeType: ${machine_type}/g" ${file}

Add code to revert this change once the run is done.

for file in ${gke_testing_dir}/examples/fio/loading-test/values.yaml ${gke_testing_dir}/examples/dlio/unet3d-loading-test/values.yaml ; do
test -f ${file}
# sed -i -E "s/mountOptions: [0-9a-zA-Z,\:\"-_]+$/mountOptions: \"${gcsfuse_mount_options}\"/g" ${file}
sed -i -E "s/mountOptions:[ \t]*\"?[a-zA-Z0-9,:_-]+\"? *$/mountOptions: \"${gcsfuse_mount_options}\"/g" "${file}"

Add code to revert this change once the run is done.
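
A sketch of one way to undo the in-place sed edits to the values.yaml files. The helper names and the .orig suffix are hypothetical; nothing like this exists in the PR yet.

```shell
# Hypothetical helper: save a copy of the file before it is edited in place.
backupFile() {
  local file=${1}
  # Keep only the first backup, so repeated edits still restore the original.
  test -f "${file}.orig" || cp -p -- "${file}" "${file}.orig"
}

# Hypothetical helper: put the saved copy back, if there is one.
restoreFile() {
  local file=${1}
  # mv restores the content and removes the backup in one step.
  if test -f "${file}.orig"; then
    mv -f -- "${file}.orig" "${file}"
  fi
}
```

One way to wire this in would be to call backupFile on each file just before its sed -i line and to register the matching restoreFile calls in a trap on EXIT, so the files are also reverted when the script fails midway.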

@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch from 2958a3c to ad97f82 Compare July 29, 2024 10:47
    return utc_timestamp_string

def standard_timestamp(timestamp: str) -> str:
    return timestamp.split('.')[0].replace('T', ' ') + " UTC"

Insert a newline after this line.

@@ -0,0 +1,425 @@
#!/bin/bash


Add a header and a help option at the top.
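
A minimal sketch of a help option, assuming the script takes its flags as plain positional arguments. The function names and the usage text are placeholders; the real header and option list would have to come from the script itself.

```shell
# Placeholder usage text (hypothetical); replace with the script's real options.
printHelp() {
  echo "Usage: ${0##*/} [-h|--help]"
  echo "Runs the GKE load-testing workflow."
}

# Hypothetical helper: prints the help text and returns 0 if -h or --help
# appears among the arguments; returns 1 otherwise.
handleHelpFlag() {
  local arg
  for arg in "$@"; do
    case "${arg}" in
      -h|--help)
        printHelp
        return 0
        ;;
    esac
  done
  return 1
}
```

Near the top of the script this could be used as: if handleHelpFlag "$@" ; then exit 0 ; fi, so that asking for help never triggers any cluster changes.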

@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 7 times, most recently from ff7ee55 to 478921c Compare August 6, 2024 09:53
@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 9 times, most recently from 1e4353a to b6a0e76 Compare August 16, 2024 14:45
@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 9 times, most recently from b62b517 to d008a84 Compare August 22, 2024 11:53
@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 6 times, most recently from 80d5046 to ba27f43 Compare October 25, 2024 04:27
@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 3 times, most recently from 0735b33 to b4ae245 Compare November 6, 2024 06:17
@gargnitingoogle gargnitingoogle force-pushed the garnitin/add-gke-load-testing/v1 branch 5 times, most recently from 12704ce to d2952ab Compare November 13, 2024 10:27
Utilities:
1. Append the given tabular data to the gsheet with the given
   ID and worksheet name.
2. Return the URL for a gsheet given its ID.
3. Add unit tests for the above utilities.
add generic utility to append to a gsheet

Add utility for gsheet

improve gsheet utility

export fio output to gsheet

encapsulate cpu/memory calculation in fio

disable repeat operations for quick testing

add dlio output export to gsheet

fix a bug in dlio output parsing

fix a column-name in fio csv output

Revert "disable repeat operations for quick testing"

This reverts commit 04bf834.

add log of successful addition to gsheet

clean-up code changes

added some error-checking

wip

wip

fix calls to download_gcs_objects

support key-file on gcs in gsheet

put back cpu/memory metrics

fix couple of logs

fix couple of help messages

put back accidentally deleted command
Adds a row with "ERROR" for all values
rather than crashing while printing to the
CSV file.
Purposes:
* Consistent behavior across all machines
* Monitoring API has faster runtime than mash.
* Monitoring API is supported on GCE VM too.
Labels
None yet
Projects
None yet

1 participant