DevOpsMetrics

A project to experiment with high performing DevOps metrics. A demo website displaying the metrics can be viewed at https://devopsmetrics-prod-eu-web.azurewebsites.net.

All four of these metrics are based on production environments, where value is delivered to our end users:

  • Lead time for changes: the time from committing a change to deploying it to production
  • Deployment frequency: how often we deploy to production
  • Mean time to restore (MTTR): how quickly we can restore production after an outage or degradation
  • Change failure rate: after a production deployment, was it successful, or did we need to deploy a fix or roll back?

High performing metrics (Chart from page 18 of https://services.google.com/fh/files/misc/state-of-devops-2019.pdf)

More information about high performing DevOps metrics can be found in this blog post: https://samlearnsazure.blog/2020/04/30/high-performing-devops-metrics/

The current solution:

We currently have all four metrics implemented and undergoing a pilot.

  • Deployment Frequency, in both Azure DevOps and GitHub:

    • How does it work? We look at the number of successful pipeline runs.
    • Assumptions/things we can't currently measure:
      • The build is multi-stage, and leads to a deployment in a production environment.
      • We only look at a single branch (usually the master branch), hence we ignore feature branches (as these probably aren't deploying to production)
    • Current limitations: Only one build/run/branch can be specified. (A sketch of this calculation appears after this list.)
  • Lead time for changes, in both Azure DevOps and GitHub:

    • How does it work? We look at the number of successful pipeline runs and match them with pull requests.
    • Assumptions/things we can't currently measure:
      • We currently count the pull request and deployment durations, averaging them over the time period to create the lead time for changes metric.
      • We start measuring at the first commit on a branch. Development time before that is variable, depends on the task, and doesn't help with this measurement.
      • We assume we are following a Git Flow process, creating feature branches and merging back to the master branch, which is deployed to production on the completion of pull requests.
      • We assume that the user requires pull requests to merge work into the master branch - we look at all work that is not on this master branch - hence we currently only support one master branch.
    • Current limitations: Only one repo and master branch can be specified. (See the lead time sketch after this list.)
  • Time to restore service, in Azure

    • How does it work? We set up Azure Monitor alerts on our resources - for example, on our web service we have alerts for HTTP 500 and HTTP 403 errors, as well as CPU and RAM monitoring. If any of these alerts is triggered, we capture the alert in an Azure Function and save it to Azure Table Storage, where we can aggregate the records and measure the duration of the outage. When the alert is later resolved, it flows through the same workflow to save the resolution and record the restoration of service.
    • Assumptions/things we can't currently measure:
      • Our project is hosted in Azure
      • The production environment is contained in a single resource group
      • There are appropriate alerts set up on each of the resources, each with action groups to save the alert to Azure Storage
    • Current limitations:
      • Only one production resource group can be specified
      • If there is a catastrophic resource group failure (e.g. the resource group is deleted), there is a high chance that some or all of the alerts will also be deleted. (See the time to restore sketch after this list.)
  • Change failure rate, in Azure DevOps and GitHub

    • How does it work? We look at builds and let the user indicate whether each production deployment was a success or a failure. By default (currently), a build is considered a failure. (We plan to change the default to success later.)
    • Assumptions/things we can't currently measure:
      • The build is multi-stage, and leads to a deployment in a production environment.
      • We only look at a single branch (usually the master branch), hence we ignore feature branches (as these probably aren't deploying to production)
      • The user has reviewed the build/deployment and confirmed that the production deployment was successful
    • Current limitations: Only one build/run can be specified. (See the change failure rate sketch after this list.)
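
To make these computations concrete, the four sketches below show how each metric could be derived in C#. All type, property, and method names are illustrative assumptions for this README, not the project's actual API. First, deployment frequency: count the successful runs on the target branch over a reporting window.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative shape for a pipeline run; not the project's actual type.
public class BuildRun
{
    public DateTime CompletedOn { get; set; }
    public string Branch { get; set; }
    public bool Succeeded { get; set; }
}

public static class DeploymentFrequencySketch
{
    // Average number of successful deployments per day on one branch
    // over the reporting window.
    public static double DeploymentsPerDay(IEnumerable<BuildRun> runs, string branch, int windowDays)
    {
        DateTime cutoff = DateTime.UtcNow.AddDays(-windowDays);
        int deployments = runs.Count(r =>
            r.Succeeded && r.Branch == branch && r.CompletedOn >= cutoff);
        return (double)deployments / windowDays;
    }
}
```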
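Next, lead time for changes. Matching the description above, each pull request contributes its duration (first commit to merge) plus the duration of the deployment that shipped it, and the metric is the average over the period. The PullRequestInfo shape is an assumption:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative shape for a completed pull request matched to its deployment.
public class PullRequestInfo
{
    public DateTime FirstCommitOn { get; set; }
    public DateTime MergedOn { get; set; }
    public TimeSpan DeploymentDuration { get; set; }
}

public static class LeadTimeSketch
{
    // Average of (pull request duration + deployment duration) over the period.
    public static TimeSpan AverageLeadTime(IReadOnlyCollection<PullRequestInfo> pullRequests)
    {
        if (pullRequests.Count == 0) return TimeSpan.Zero;
        double averageHours = pullRequests.Average(pr =>
            ((pr.MergedOn - pr.FirstCommitOn) + pr.DeploymentDuration).TotalHours);
        return TimeSpan.FromHours(averageHours);
    }
}
```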
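For time to restore service, the workflow described above saves an alert record when Azure Monitor fires and marks it resolved when the alert clears; the metric averages the resolved durations. The AlertRecord schema is an assumption about what the Azure Function might store:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative shape for an alert row saved to Azure Table Storage;
// the field names are assumptions, not the project's schema.
public class AlertRecord
{
    public string AlertId { get; set; }
    public DateTime RaisedOn { get; set; }
    public DateTime? ResolvedOn { get; set; } // null while the outage is ongoing
}

public static class TimeToRestoreSketch
{
    // Mean duration of resolved alerts; still-open alerts are excluded.
    public static TimeSpan MeanTimeToRestore(IEnumerable<AlertRecord> alerts)
    {
        var resolved = alerts.Where(a => a.ResolvedOn.HasValue).ToList();
        if (resolved.Count == 0) return TimeSpan.Zero;
        double averageMinutes = resolved.Average(a =>
            (a.ResolvedOn.Value - a.RaisedOn).TotalMinutes);
        return TimeSpan.FromMinutes(averageMinutes);
    }
}
```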
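Finally, change failure rate: the share of production deployments that the user marked as failed, reflecting the manual success/failure indication described above. DeploymentVerdict is again an assumed shape:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative shape: a production deployment plus the user's verdict.
public class DeploymentVerdict
{
    public DateTime CompletedOn { get; set; }
    public bool UserMarkedSuccessful { get; set; } // false (failure) is the current default
}

public static class ChangeFailureRateSketch
{
    // Fraction of deployments in the period that were marked as failures.
    public static double ChangeFailureRate(IReadOnlyCollection<DeploymentVerdict> deployments)
    {
        if (deployments.Count == 0) return 0.0;
        return (double)deployments.Count(d => !d.UserMarkedSuccessful) / deployments.Count;
    }
}
```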

Architecture

Uses .NET Core 3.1 and MSTest. A GitHub Actions workflow runs the CI/CD process.


Currently the CI/CD process:

  1. Builds the code
  2. Runs the unit tests
  3. Deploys the web service to a single/prod Azure web app (https://devopsmetrics-prod-eu-service.azurewebsites.net)
  4. Deploys the demo website to a single/prod Azure web app (https://devopsmetrics-prod-eu-web.azurewebsites.net)

Dependabot runs daily to check for dependency upgrades; it automatically creates a pull request and approves/closes it if all of the tests pass successfully.
