Description
Site Reliability Engineer - 6-month contract - outside IR35 - Flexible Rate
Join an exciting B2B subscription software startup that has great financial backing and a fast-growing team. You'll be the first SRE hire, helping to implement standards and set the direction for the SRE function within the company.
Responsibilities:
- Contribute to the development, support and scaling of a high-throughput, high-availability, public-facing multi-tenanted application in a fast-growing Series-A-funded startup environment
- Work closely with Back End engineers to improve monitoring, diagnose and resolve performance issues, and extend platform architecture
- Own system stability and performance, aided by metrics and alerts that you will take primary responsibility for maintaining
- Assist with diagnosis of networking-related integration problems
- Work with support and Back End engineers to quickly resolve production incidents, and take the lead in writing customer-facing incident reports
Our stack:
- AWS (CloudFront, ALB, ECS, DynamoDB, RDS) via Terraform; Java
- Prometheus, Grafana, PagerDuty
- Soon: TimescaleDB, Varnish
Experience:
- Has worked at the production end of high-traffic, public-facing websites and multi-tenanted SaaS applications
- Has worked extensively with monitoring tools (Datadog, ELK, Grafana, etc.)
- Has configured at least two different reverse proxies (Nginx, HAProxy, etc.)
- Has a deep understanding of AWS, including cost optimisation
On-call expected (1 week out of every 4)