SSUSA Job #957: Site Reliability Engineers (SRE) or DevOps Engineers

Job Description

Site Reliability Engineer or DevOps Engineer

 

One of our media communications companies in NYC is seeking a Site Reliability Engineer (SRE) or DevOps Engineer to join an experienced technical team responsible for the security, reliability, and performance of applications hosted in an AWS cloud environment. You will be responsible for all aspects of our builds and AWS deployment environments, including scaling, provisioning, monitoring, and automation. You should have significant CI/CD experience and strong AWS system administration skills. Java and Spring Boot development experience is a plus.

RESPONSIBILITIES:

Maintain and improve the architecture of current cloud services, and help the engineering team design and build new services with deployment, availability, and scalability in mind

Manage our cloud infrastructure (compute instances, databases, etc.)

Improve real-time monitoring of services so the team can address and resolve issues proactively

Debug and troubleshoot live production issues, which will occasionally occur outside business hours

Develop procedures to ensure team members follow best practices for server configuration, database configuration, etc.

Work closely with developers to find ways to automate and improve existing processes

Work with the Technical Support team to triage and catalog issue reports

Document and monitor processes and performance

 

EXPERIENCE:

Minimum of 5 years of experience in a similar role

Experience with and knowledge of cloud management, deployment, and distributed systems architecture on Amazon Web Services (a selected list of the AWS technologies we use: CloudFormation, VPC, EC2, Elastic Beanstalk, DynamoDB, RDS, S3, Lambda, Step Functions)

Effortless fluency in one or more programming languages (most of the existing deployment codebase is in Python)

Experience with Red Hat Linux

Extensive, in-the-trenches experience maintaining live production systems

Knowledge of best practices for software development and deployment architecture

Experience working with data security standards, vulnerability scanning, identity management, and other security best practices

Experience with deployment automation platforms, particularly Ansible

Experience with real-time streaming media systems

Experience with the documentation and processes surrounding third-party security reviews, and with the audit processes and requirements for SOC 2, ISO 27001, and similar data protection certifications

SEND YOUR RESUME TO CLIFF@SSUSA.COM 

MENTION JOB 957 IN THE SUBJECT BOX 

Job Location
Stamford, CT area

Position Type
Permanent