How to calculate
In this stage, we use the pipeline as the only data source.
Deployment frequency (DF): The number of successful deployments to the target environment within the given date range.
Now let's tear it apart:
- Successful deployment: A successful record in the given target stage. Note that the build itself could have failed, but as long as the stage is successful, it counts as a successful deployment.
- Target environment: The name of a stage which users want to measure.
- Date range: A time period which users want to measure.
Please note that although the name is "frequency", we only count the number of deployments, because calculating a frequency requires both a count and a time length; once we have the count, we can derive the frequency for any time length.
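For example, the walkthrough below counts 4 deployments between Oct 15 and Oct 28; over that 14-day range the same count can be read as 4/14 ≈ 0.29 deployments per day, or 2 per week, depending on the time length chosen.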
DF = number of successful deployments to the target environment within the given date range
Let's assume we have a pipeline like above, the larger the build number, the newer it is. Each row represents a build record, and each column represents a stage.
And assume we want to calculate the deployment frequency (DF) for the UAT environment from Oct 15 to Oct 28. Based on the definition, the calculation will be:
Successful deployments = D1, D2, D3, D4
Therefore: DF = 4
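To make the counting rule concrete, here is a minimal Python sketch. The `StageRecord` shape and its field names (`stage`, `state`, `finished_at`, `commit_ids`) are illustrative assumptions, not the tool's actual data model:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class StageRecord:
    stage: str                 # e.g. "uat" (illustrative field names)
    state: str                 # "passed", "failed", "aborted", ...
    finished_at: datetime
    commit_ids: list = field(default_factory=list)  # commits newly shipped by this build

def deployment_frequency(records, target_stage, start, end):
    """Count successful deployments to the target stage within the date range.

    Failed deployments and "empty builds" (builds with no new commits) are
    excluded, as described above.
    """
    return sum(
        1
        for r in records
        if r.stage == target_stage
        and r.state == "passed"
        and start <= r.finished_at <= end
        and r.commit_ids           # skip empty builds
    )
```

Applied to the example above, only the four successful UAT records D1, D2, D3 and D4 pass the filter, so DF = 4.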
Q: Should we include failure deployments when calculating DF?
A: No. DF is supposed to measure the frequency at which we ship value to the target environment. A failed deployment does not ship value to the target environment and hence should not be counted.
Q: Should we include "empty builds" (i.e. the builds which include no new commits) when calculating DF?
A: No. Empty builds need to be excluded.
In this stage, we use the pipeline as the only data source.
Mean lead time for changes (MLT): The average amount of time it takes to go from code commit to code running in the target environment, within the given date range.
More details:
- Lead time: The time between a code commit and the first deployment that contains this commit in the target environment
- Target environment: The name of a stage which users want to measure.
- Date range: A time period which users want to measure.
- ct#n: The time at which commit #n happens
- dt#n: The time of successful deployment #n
- LT(d#n): Lead time of deployment #n
MLT = ((dt#1 - ct#2) + (dt#1 - ct#3) + (dt#1 - ct#4) + (dt#1 - ct#5) + (dt#2 - ct#6)) / 5 = x (hours)
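A minimal Python sketch of the averaging above, assuming we already know the finish time of each successful deployment and the timestamps of the commits it contains (the input shape is an illustrative assumption, and the latest algorithm in the system may weigh things differently):

```python
from datetime import datetime

def mean_lead_time_hours(deployments):
    """deployments: list of (deploy_time, [commit_time, ...]) pairs for the
    successful deployments to the target stage within the date range.

    The lead time of each commit is (deployment time - commit time); MLT is
    the mean over all commits, so deployments that ship larger batches
    contribute more terms to the average.
    """
    lead_times = [
        (deploy_time - commit_time).total_seconds() / 3600.0   # in hours
        for deploy_time, commit_times in deployments
        for commit_time in commit_times
    ]
    return sum(lead_times) / len(lead_times) if lead_times else 0.0
```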
Q: Is there a better way to calculate MLT than taking the mean value across multiple deployments? Shouldn't we consider the weight of different deployments due to their varying batch sizes?
A: Correct. Batch size needs to be considered. Refer to the latest algorithm to see how it works in our system.
Q: How should "delayed deployments" be handled? Do "delayed deployments" have a negative impact on the MLT value?
A: When computing the MLT value, commits are selected based on the time range in which the deployment occurs, not on when the build was triggered; i.e. the lead time of a deployment is computed when the deployment occurs. So no, delayed deployments do not negatively impact the MLT value.
In this stage, we use the pipeline as the only data source.
Change failure rate (CFR): The percentage of deployment failures in the target environment within the given date range.
More details:
- Deployment failure: A failed record in the given target stage.
- Target environment: The name of a stage which users want to measure.
- Date range: A time period which users want to measure.
CFR = (number of failed deployments to the target environment within the given date range) / (total number of deployments to the target environment within the given date range)
Let's assume we have a pipeline like above, the larger the build number, the newer it is. Each row represents a build record, and each column represents a stage.
And assume we want to calculate the change failure rate (CFR) for the UAT environment from Oct 15 to Oct 28. Based on the definition, the calculation will be:
Total of 2 failures: F0, F1 (the aborted build doesn't count)
CFR = 2/6 = 33.33%
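As a rough Python sketch of the formula and the aborted-build rule (the input shape and the worked numbers are illustrative assumptions):

```python
def change_failure_rate(stage_states):
    """stage_states: the final state ("passed", "failed", "aborted", ...) of
    each run of the target stage within the date range.

    Aborted builds are excluded from both the numerator and the denominator.
    """
    counted = [s for s in stage_states if s in ("passed", "failed")]
    if not counted:
        return 0.0
    return sum(1 for s in counted if s == "failed") / len(counted)

# Illustrative only: 2 failed and 4 passed runs, plus 1 aborted run that is ignored
print(change_failure_rate(["failed", "passed", "failed", "passed",
                           "passed", "passed", "aborted"]))  # 2/6 ≈ 0.3333
```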
Q: What if the deployment was successful, but some failure happened after the deployment?
A: For errors that happen after the deployment, it's difficult to determine whether the error was introduced by the released code or by something else, e.g. an infrastructure failure, a pre-existing bug, etc. Therefore, this kind of case requires some manual work. In this stage, we want to use as little manual work as possible, so we've chosen the pipeline as the only data source. But in the future, in order to make the statistics more accurate, some user input or JIRA/GitHub issue integration could be considered.
Q: Is an "aborted build" a failure build? Should it be included in CFR?
A: No. Since for each failure we would also want to calculate its mean time to restore value. An aborted build doesn't necessarily result in a failure to which the team wants to apply a fix.
Q: When calculating CFR, besides the executions that failed at the target environment stage, should we also count the ones that failed at preceding stages? For example, failed at the build stage.
A: No. It doesn't seem reasonable to count an execution as a failure when it did not even attempt to deploy to the target environment. On the other hand, it is also technically cumbersome to identify the order of different stages.
In this stage, we use the pipeline as the only data source.
Mean time to restore (MTTR): The average amount of time it takes to restore from deployment failures in the target environment within the given date range.
More details:
- Time to restore: The time from a failed record in the given stage to the next successful record in the same stage.
- Target environment: The name of a stage which users want to measure.
- Date range: A time period which users want to measure.
MTTR = total restore time / number of failures for which a restore time is measured
Let's assume we have a pipeline like above, the larger the build number, the newer it is. Each row represents a build record, and each column represents a stage.
And assume we want to calculate the mean time to restore (MTTR) for the UAT environment from Oct 15 to Oct 28. Based on the definition, the calculation will be:
F0 time to restore, TTR0 = D2 - F0
F1 time to restore, TTR1 = D4 - F1
F2 time to restore, TTR2 = (omitted, since F1 was not yet fixed when F2 occurred; see the note on consecutive failures below)
MTTR = ( TTR0 + TTR1 ) / 2
Q: For consecutive failures, do we measure the TTR for all of them, or for the first failure only?
A: For consecutive failures, we measure the TTR value for the first failure only.
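A minimal Python sketch of the formula and the consecutive-failure rule, assuming a chronological list of (state, finished time) pairs for the target stage (the input shape is an illustrative assumption):

```python
from datetime import datetime

def mean_time_to_restore_hours(stage_runs):
    """stage_runs: chronological list of (state, finished_at) pairs for the
    target stage within the date range, where state is "passed" or "failed".

    Each restore time runs from a failed record to the next successful record.
    Consecutive failures are measured once, from the first failure of the
    streak, and a failure with no later successful record is not measured.
    """
    restore_times = []
    streak_started_at = None   # time of the first failure in the current unresolved streak
    for state, finished_at in stage_runs:
        if state == "failed":
            if streak_started_at is None:
                streak_started_at = finished_at
        elif state == "passed" and streak_started_at is not None:
            restore_times.append((finished_at - streak_started_at).total_seconds() / 3600.0)
            streak_started_at = None
    return sum(restore_times) / len(restore_times) if restore_times else 0.0
```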