Site Reliability Engineer
BMC is looking for a SaaS Site Reliability Engineer (SRE) to operate our SaaS services and ensure their availability. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our customers' systems and services.
Here is how, through this exciting role, YOU will contribute to BMC's and your own success:
· Ensuring service availability and performance of the SaaS platform.
· Handling incidents and errors on the platform and resolving them.
· Managing upgrades of future releases and patches to the Production environment.
· Creating and maintaining Production runbooks.
· Performing assigned daily, weekly, and monthly operations maintenance duties.
· Supporting platform maintenance and testing initiatives.
· Developing processes, documentation, and automation.
· Monitoring system performance and resolving incidents promptly to ensure uptime and reliability.
· Developing and implementing strategies for proactive monitoring and alerting.
To ensure you’re set up for success, you will bring the following skillset & experience:
· 2-4 years of experience working with SaaS production environments.
· Experience working and collaborating with Software Delivery teams.
· Excellent written and verbal communication skills.
· Strong analytical and problem-solving skills.
· Proven experience in supporting mission-critical applications on a global scale preferred.
· Experience supporting SaaS critical infrastructure.
· UNIX/Linux experience and system administration knowledge.
· Experience with cloud platforms (e.g., AWS, GCP) and container orchestration (e.g., Kubernetes) is an advantage.
· Education: Academic degree in engineering (IT, Telecom), at bachelor's level or higher; however, equivalent work experience may be substituted for educational requirements.