A curated list of Site Reliability and Production Engineering resources.
Updated Jun 26, 2021
A curated list of Site Reliability and Production Engineering Tools
Serverless chaos monkey for AWS (runs on AWS Lambda)
Probabilistic Risk Analysis Tool (fault tree analysis, event tree analysis, etc.)
A curated list of awesome Site Reliability and Production Engineering resources.
GOV.UK PaaS - Cloud Foundry
The Chaos Toolkit core library
A collection of SRE tools
An opinionated list of attributes and policies that need to be met in order to establish a stable software system.
A Terraform provider for Concourse
GSP is a container platform and curated suite of components helping government deploy, run, observe and secure their services
The k6 documentation website.
A collection of templates ported from the SRE Workbook
Terraform configuration to manage a Prometheus server running on AWS.
A Go application for generating billing data from Cloud Foundry events
A service broker to provide Aiven Elasticsearch and InfluxDB services to Cloud Foundry users
Administration tool for GOV.UK PaaS
Bootstrap a VPC with BOSH and Concourse to run PaaS
A Concourse resource for creating and updating Grafana annotations
Code for the paper "Deep Cox Mixtures for Survival Regression", Machine Learning for Healthcare Conference 2021
A Cloud Foundry-compatible route service that imposes an IP safelist
A small, underdocumented Puppet module for hardening Ubuntu systems.
Technical documentation for GOV.UK PaaS