Job Description
We are looking for a highly motivated Site Reliability Engineer (SRE) to enhance the stability, performance, and scalability of our ServiceNow SaaS platform. This hybrid role blends automation, operations, and process improvement with a focus on building reliable, customer-centric systems. As part of a global SRE community, you’ll collaborate with engineers, developers, and stakeholders to implement SRE best practices, support incident response, and continuously improve service quality.
Whether your background is in development, infrastructure, or systems administration, if you’re passionate about reliability, performance, and delivering measurable impact, we’d love to hear from you.
Key Responsibilities
- Optimize and automate operational tasks to improve the availability, reliability, and scalability of the ServiceNow SaaS platform.
- Design and build observability solutions (metrics, logging, tracing, dashboards) to enable real-time system performance monitoring.
- Respond to and lead incident resolution efforts for ServiceNow services and, occasionally, for on-premises Linux-based infrastructure.
- Participate in a global on-call rotation, ensuring timely incident remediation (with time off in lieu for on-call participation).
- Maintain and improve documentation and system dependency mapping to support operational clarity and knowledge sharing.
- Identify and remediate technical debt that negatively impacts system reliability, efficiency, or client satisfaction.
- Contribute to architecture reviews, operational tooling, and process optimization to advance SRE capabilities.
- Provide feedback on policies and operational processes to foster a culture of continuous improvement and service excellence.
Required Skills & Qualifications
- 7+ years of experience in software development, systems administration, or infrastructure engineering roles.
- Strong programming or scripting proficiency, preferably in Python (or similar languages).
- Demonstrated troubleshooting expertise across ServiceNow and Linux-based environments.
- Excellent communication and collaboration skills, with the ability to work cross-functionally and globally.
- Proven ability to manage and resolve high-impact, time-sensitive technical issues.
- Commitment to continuous learning, operational improvement, and delivering a reliable user experience.
Preferred Qualifications
- Hands-on experience in ServiceNow development or administration (training available if needed).
- Familiarity with SRE principles: automation, toil reduction, incident management, capacity planning, and performance monitoring.
- Prior experience in enterprise-scale SRE, DevOps, or production support environments.
- Exposure to IT Service Management (ITSM), SaaS architectures, and enterprise toolchains.