Deployment Frequency

Background Information

Definition

Deployment Frequency is the DORA corollary to Batch Size in Lean Theory. More specifically, deployment frequency measures the reciprocal of batch size - increasing deployment frequency reduces batch size.

Batch size measures inventory - work in process, a form of waste.

A closer definition for batch size would be “pending change sets”:

  • Pending, to designate that it is waiting to be deployed to production
  • Change set, a deployable unit, since individual commits may be too fine-grained to be considered deployable on their own.

However, pending change sets are more complex to identify and measure directly, and ultimately DORA’s definition of deployment frequency is a suitable proxy (see the sketch after the list below):

  • If a team is practising Continuous Delivery, there should be close to a one-to-one correspondence between commits and change sets.
  • If a team is deploying multiple times per day, then by definition the team is delivering in small batches.
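
To make the reciprocal relationship concrete, here is a minimal sketch (Python, with made-up numbers; not WayFinder’s implementation): over a fixed window, average batch size is roughly the number of change sets produced divided by the number of deployments, so deploying more often shrinks each batch.

    # Illustrative only: assume the team produces 20 change sets in a week.
    change_sets_per_week = 20

    def average_batch_size(deployments_per_week):
        """Average number of change sets shipped per deployment."""
        return change_sets_per_week / deployments_per_week

    print(average_batch_size(2))   # 10.0 change sets per deploy - large batches
    print(average_batch_size(20))  # 1.0 change set per deploy - small batches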

Classification

WayFinder calculates a classification tag for deployment frequency to allow you to compare your performance with wider industry performance.

Classification tags are based on industry standards and DORA findings.

  • Elite: On demand (multiple deploys per day)
  • High: Between once per day and once per week
  • Medium: Between once per week and once per month
  • Low: Between once per month and once every six months
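
A minimal sketch of how the bands above can be applied to an observed deployment rate (Python; the thresholds are taken from the list above and the function is illustrative, not WayFinder’s implementation):

    def classify_deployment_frequency(deploys, period_days):
        """Map an observed deployment rate onto the DORA classification bands."""
        per_day = deploys / period_days
        if per_day >= 1:          # at least daily - treated here as "on demand"
            return "elite"
        if per_day >= 1 / 7:      # between once per day and once per week
            return "high"
        if per_day >= 1 / 30:     # between once per week and once per month
            return "medium"
        return "low"              # once per month to once every six months

    print(classify_deployment_frequency(deploys=10, period_days=7))   # elite
    print(classify_deployment_frequency(deploys=2, period_days=30))   # medium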

Outcomes and Risks

Measurement

  • Deployment frequency counts any successful deployment to production, with no additional success criteria applied. This is the most optimistic measurement method.
    • For teams following true Trunk Based Development, this acts as an incentive to deploy often.
  • For teams performing staged rollouts (e.g. canary releasing), deployments are counted for any percentage of production traffic.
  • Weekly deployment frequency is calculated using the median number of days per week with at least one successful deployment; a median of 3 or more days is treated as deploying daily (see the sketch after this list).
  • Successful deployments are tracked against your deployment pipeline by measuring successful execution of its production deployment stages.
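
A sketch of that calculation (Python; it assumes a flat list of successful production deployment timestamps and is not WayFinder’s exact implementation):

    from datetime import datetime
    from statistics import median

    def median_deploy_days_per_week(deploy_times):
        """Median number of distinct days per ISO week with at least one successful deployment.

        Weeks with zero deployments are not represented here; include them as 0
        if your observation window requires it.
        """
        days_per_week = {}
        for t in deploy_times:
            year, week, _ = t.isocalendar()
            days_per_week.setdefault((year, week), set()).add(t.date())
        return median(len(days) for days in days_per_week.values())

    # Three example weeks with 3, 3 and 1 deployment days respectively.
    deploys = [datetime(2024, 5, d, 9, 0) for d in (6, 7, 8, 13, 15, 16, 20)]
    m = median_deploy_days_per_week(deploys)
    print(m, "-> daily cadence" if m >= 3 else "-> less than daily")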

Data Sources

Deployment frequency requires data from your deployment pipeline to be calculated.

If your deployment process requires manual steps, include placeholder steps within your automated deployment pipeline to track when deployments occur.
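
For example, a placeholder step can do nothing more than record a timestamped deployment event for later collection. The sketch below appends to a local JSON-lines file; the file name and record shape are illustrative assumptions, not a documented WayFinder interface.

    import json
    from datetime import datetime, timezone

    def record_manual_deployment(service, version, log_path="deployments.jsonl"):
        """Append a deployment record so manual deploys are still visible to tooling."""
        event = {
            "service": service,
            "version": version,
            "deployed_at": datetime.now(timezone.utc).isoformat(),
            "environment": "production",
        }
        with open(log_path, "a") as f:
            f.write(json.dumps(event) + "\n")

    # Call this from the placeholder pipeline step once the manual deploy is done.
    record_manual_deployment("payments-api", "1.4.2")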

Worked Examples

Scheduled pipeline re-deploys changes

Context: The Time Team have a pipeline that runs every day at 0900. Sometimes, the pipeline fails due to flaky tests, meaning the team must re-run the pipeline manually.

[Figure: Scheduled Daily Pipeline]

At time t:

  • t=0
    • A commit triggers the pipeline - Commit 1
    • The pipeline takes 4 hours to complete and is successful
  • t=24
    • The source has not been updated - it is still at commit 1
    • A time based schedule triggers the pipeline
    • The pipeline takes 4 hours and deploys to production, but the pipeline then fails due to flaky post-deployment verification
  • t=29
    • The source has not been updated - it is still at commit 1
    • A developer manually re-triggers the pipeline
    • The pipeline takes 4 hours and is successful
  • t=48
    • The source has not been updated - it is still at commit 1
    • A time based schedule triggers the pipeline
    • The pipeline takes 4 hours and is successful

Outcome: In the example above, the same code has been deployed to production 4 times. All of the pipeline executions count towards deployment frequency (note that the second run is included even though the pipeline failed overall - the deployment to production was performed, and the failure occurred afterwards).

  • 4 deployments were performed
  • Deployment Frequency (based on the data available) is 4 / week
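
The counting rule applied above can be sketched as follows (Python; the record shape is illustrative): count every pipeline run whose production deploy step succeeded, even when a later verification step failed.

    # Each record: (hour the run started, deploy-to-production step succeeded, whole pipeline succeeded)
    time_team_runs = [
        (0,  True, True),    # commit 1 triggers the pipeline
        (24, True, False),   # scheduled run: deploy succeeded, post-deployment verification flaked
        (29, True, True),    # manual re-run
        (48, True, True),    # scheduled run
    ]

    deployments = sum(1 for _, deploy_ok, _pipeline_ok in time_team_runs if deploy_ok)
    print(deployments)  # 4 -> deployment frequency of 4 per week for this window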

Pipeline ‘deploys’ to a container registry which then triggers further actions

Context: Team Kraken have a two stage pipeline:

  • The build phase builds and packages source into a container, publishing it to a container registry
  • The deployment phase is triggered by updates to the container registry; it pulls the image and deploys it to the production Kubernetes cluster.

In this scenario, the build pipeline should not be used to capture deployment data. Deployments should be tracked from successful deploy tasks in the deployment phase (see the sketch after the figure below).

[Figure: Container Deployment Process]
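
A sketch of that filtering rule (Python; the event fields are illustrative): only successful deploy tasks from the deployment phase count, while build-phase pushes to the registry are ignored.

    # Illustrative pipeline events for Team Kraken's two-stage setup.
    events = [
        {"phase": "build",  "task": "push-image-to-registry", "status": "success"},
        {"phase": "deploy", "task": "apply-to-production",    "status": "success"},
        {"phase": "build",  "task": "push-image-to-registry", "status": "success"},
        {"phase": "deploy", "task": "apply-to-production",    "status": "failed"},
    ]

    deployments = [
        e for e in events
        if e["phase"] == "deploy" and e["status"] == "success"
    ]
    print(len(deployments))  # 1 -> only the successful production deploy task counts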

An application supports elasticity through auto-scaling

Context: Team Elastic manage a product that needs to handle highly unpredictable usage. As a result they have settled on a solution using autoscaling to dynamically scale to meet demand.

[Figure: Auto Scaling Groups]

Following a marketing event, the product experiences a large spike in traffic and an additional seven instances are provisioned to handle the load.

Outcome: Deployment frequency is not affected - the additional instances provisioned for auto scaling do not contribute towards deployment frequency.
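
In terms of the event stream, scaling activity is simply excluded before counting. A minimal sketch (Python; the event fields are illustrative):

    events = [
        {"type": "pipeline_deploy", "status": "success"},    # a real production deployment
        {"type": "autoscale_out",   "instances_added": 7},   # traffic spike after the marketing event
        {"type": "autoscale_in",    "instances_removed": 7},
    ]

    deployments = [
        e for e in events
        if e["type"] == "pipeline_deploy" and e.get("status") == "success"
    ]
    print(len(deployments))  # 1 -> autoscaling activity does not change the count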

FAQ

Q: Does deployment frequency include failed deployments?

A: No. If the deployment fails and the code does not reach production, and therefore does not reach users, it does not contribute to deployment frequency. For example, if the pipeline fails in the deploy-to-production stage because an acceptance test fails and the deployment is rolled back, the new code has not reached end users and therefore does not increment the deployment frequency count.

From a lean perspective, a failed deployment means inventory (a change set) remains work in progress (WIP), which increases the batch size. This is what deployment frequency is intended to measure.

Q: Why does re-releasing changes count as multiple deployments? Doesn’t this encourage bad practice?

A: Deployments are calculated based on successful pipeline deployments to production. However, during measurement we cannot guarantee that deployments based on the same source hash are idempotent - other side effects may be observed: the infrastructure may have been modified externally, operating system upgrades may have been performed, or code dependencies may have been updated. All of these examples contribute to deployment frequency: deploying the same code following an OS upgrade reduces the size of your change and should contribute to your calculations.

This does not prevent attempts to ‘game’ the metric: repeated deployments of an identical environment could artificially increase your deployment frequency. In our opinion this risk is an acceptable trade-off - exercising your deployment pipeline should not have negative consequences. This is also why we do not consider each metric in isolation. Repeated deployments do nothing to improve your lead time, nor your time to restore. Whilst they could artificially reduce your change failure rate, they would not materially reduce the number or impact of your incidents.

Q: How does compute elasticity (autoscaling) affect deployment frequency?

A: Deployments are calculated based on successful pipeline deployments to production; we do not count auto-scaling events as individual deployments. Autoscaling is an architectural technique that supports operational efficiency, scalability and cost optimisation.

Autoscaling is good practice that we encourage, but it does not contribute to reducing batch size, which is the goal of measuring deployment frequency.