SSUSA Job #957: Site Reliability Engineers (SRE) or DevOps Engineers
Job Description
Site Reliability Engineer or DevOps Engineer
One of our Media Communications companies in NYC is seeking a Site Reliability Engineer (SRE) or DevOps Engineer to join an experienced technical team responsible for the security, reliability, and performance of applications hosted in an AWS cloud environment. You will be responsible for all aspects of our builds and AWS deployment environments, including scaling, provisioning, monitoring, and automation. You should have significant CI/CD experience and strong AWS system administration skills. Java and Spring Boot development experience is a plus.
RESPONSIBILITIES:
Maintain and improve the architecture of current cloud services, and help the engineering team design and build new services with deployment, availability, and scalability in mind
Manage our cloud infrastructure (compute instances, databases, etc.)
Improve real-time monitoring of services to allow the team to address and resolve issues proactively
Debug and troubleshoot live production issues, which will occasionally occur off-hours
Develop procedures to ensure team members follow best practices for server configuration, database configuration, etc.
Work closely with developers to find ways to automate and improve existing processes
Work with the Technical Support team to triage/catalog issue reports
Document and monitor processes and performance
EXPERIENCE:
Minimum of 5 years of experience in a similar role
Experience with and knowledge of cloud management, deployment, and distributed systems architecture on Amazon Web Services (a selected list of the AWS technologies we use: CloudFormation, VPC, EC2, Elastic Beanstalk, DynamoDB, RDS, S3, Lambda, Step Functions)
Effortless fluency in one or more programming languages (most of the existing deployment codebase is in Python)
Experience with Red Hat Linux
Extensive, in-the-trenches experience maintaining live production systems
Knowledge of best practices for software development and deployment architecture
Experience working with data security standards, vulnerability scanning, identity management, and other security best practices
Deployment automation platform experience, particularly Ansible
Experience with real-time streaming media systems
Experience with the documentation and processes surrounding third-party security reviews, and with the audit processes and requirements for SOC 2, ISO 27001, and similar data protection certifications
SEND YOUR RESUME TO CLIFF@SSUSA.COM
MENTION JOB 957 IN THE SUBJECT BOX
Job Location
Stamford, CT area
Position Type
Permanent
Salary Range
TBD