Functions:
We are seeking a motivated and knowledgeable Cloud Operations Engineer (L1/L2 Support) with a strong foundation in AWS services and Kubernetes management. The ideal candidate will have hands-on experience with AWS components such as EKS, EC2, RDS, IAM, and CloudWatch, along with basic Linux administration skills.
This role is critical in maintaining the health, performance, and security of our cloud infrastructure, providing first- and second-level support, and ensuring seamless operations.
Key Responsibilities:
AWS Infrastructure Support:
Monitor and manage AWS services, including EKS, EC2, RDS, IAM, VPC, IPsec VPN, ECS, CloudWatch, and other services.
Respond to and resolve incidents, service requests, and alerts in a timely manner.
Kubernetes Management:
Assist in managing EKS clusters, ensuring their availability and performance.
Perform basic troubleshooting and maintenance tasks on Kubernetes clusters.
Knowledge of Rancher is an added advantage.
System Monitoring and Incident Response:
Use monitoring tools to oversee the health and performance of cloud infrastructure.
Analyze and respond to system alerts, resolving issues to minimize downtime.
Linux Administration:
Perform basic Linux system administration tasks, including user management, file permissions, and system updates.
Troubleshoot and resolve basic Linux-related issues that impact cloud operations.
Support and Maintenance:
Provide L1/L2 support, addressing tickets and escalating complex issues to higher-level support when necessary.
Perform regular system maintenance, updates, and patching to ensure security and compliance.
Infrastructure Automation:
Automate cloud operations using infrastructure-as-code tools such as CloudFormation and Terraform.
Develop and maintain CI/CD pipelines to streamline deployment processes and enhance efficiency.
Documentation and Reporting:
Maintain detailed documentation of incidents, troubleshooting steps, and resolutions.
Generate and present regular reports on system performance, incidents, and resolution timelines.
Collaboration and Communication:
Work closely with other IT and development teams to coordinate and implement cloud solutions.
Communicate effectively with stakeholders regarding the status of issues, incidents, and system health.
Qualifications:
Kubernetes and Terraform experience is mandatory.
Proficiency in AWS services, including EKS, EC2, RDS, IAM, and CloudWatch.
Basic knowledge of Kubernetes and experience managing EKS clusters.
Fundamental understanding of Linux administration and basic troubleshooting.
1-3 years of experience in a cloud operations or similar support role.
Experience with monitoring and incident management tools.
AWS Certified Cloud Practitioner or AWS Certified Solutions Architect Associate preferred.
Strong analytical and problem-solving skills.
Excellent verbal and written communication skills.
Ability to work in a fast-paced environment and manage multiple tasks simultaneously.
Willingness to work rotating shifts and take part in an on-call rotation.