We are looking for an enthusiastic and motivated Site Reliability Engineer (SRE) to join our growing team. In this role, you will learn about and contribute to the stability, performance, and scalability of our critical systems. We place a strong emphasis on security in all aspects of our operations. You will work closely with other teams to maintain and improve our infrastructure, monitor services, and respond to incidents. This is an excellent opportunity to develop your skills in a dynamic and supportive environment.
<Responsibilities>
1. Assist in Maintaining and Optimizing Infrastructure: Support teams in the day-to-day maintenance and optimization of our infrastructure components.
2. Monitor Services and Address Issues: Monitor system health and service performance, and assist in troubleshooting and resolving issues in a timely manner.
3. Track Resource Usage and System Status: Help monitor various resource indicators and the overall status of the system, contributing to optimization efforts.
4. Support System Stability and Incident Response: Assist in maintaining system stability and participate in incident response procedures under guidance.
5. Contribute to Preventing System Failures: Work with the team to implement measures that help avoid system failures and service interruptions.
6. Collaborate with Other Teams: Work alongside other teams to continuously learn about and contribute to improving system architecture and service quality.
7. Support System Maintenance and Deployment Processes: Assist in the execution of established processes for system maintenance, deployment, and upgrades.
8. Learn and Apply SRE Best Practices: Actively learn and apply SRE principles and best practices in daily tasks.