Sr. Site Reliability Automation Engineer
BNY Mellon
United States, Florida, Lake Mary
Jun 24, 2026

We're seeking a future team member for the role of Sr. Site Reliability Automation Engineer to join our Technology team. This role is located in Lake Mary, FL and Pittsburgh, PA.

In this role, you'll make an impact in the following ways:

Build Observability & Monitoring
* Design and implement end-to-end observability (logs, metrics, traces) across distributed systems
* Integrate and optimize tools such as AppDynamics, Dynatrace, Grafana, and Splunk
* Develop dashboards, alerts, and telemetry frameworks to provide real-time visibility
* Identify gaps in monitoring and drive adoption of best practices

Drive Automation & Reduce Toil
* Identify repetitive operational work and automate it using code and tooling
* Build self-healing and auto-remediation solutions
* Enable scalable, reliable processes through automation and engineering rigor
* Improve operational efficiency across production environments

Support Production & Incident Triage
* Troubleshoot and resolve complex production issues across distributed systems
* Participate in incident management, triage, and root cause analysis
* Improve monitoring and automation based on recurring incident patterns
* Collaborate with support and engineering teams to improve system stability

Improve Reliability & Performance
* Define and measure service health using SLIs/SLOs and key performance metrics
* Identify system bottlenecks and reliability risks
* Contribute to performance optimization and capacity planning
* Provide input into system architecture to improve resilience and scalability

To be successful in this role, we're seeking the following:
* 3-6 years of experience in Site Reliability Engineering or Software Engineering
* Strong programming background in Java (preferred) or another modern language
* Experience with at least one observability platform: AppDynamics, Dynatrace, Grafana, or Splunk
* Hands-on experience supporting and troubleshooting production systems
* Strong analytical and problem-solving skills
* Ability to identify inefficiencies and drive automation

Preferred Qualifications
* Experience with distributed systems or microservices architectures
* Familiarity with CI/CD pipelines and DevOps practices
* Exposure to cloud platforms and/or Kubernetes
* Scripting experience (Python, Bash, etc.) for automation
* Knowledge of SRE concepts such as observability, incident management, and reliability engineering