Site Reliability Engineer

Spartan Technologies, Inc. - Atlanta, GA

Cox Automotive - Inventory Solutions (AiM, CentralDispatch, Deal Shield, Manheim, Ready Logistics, RMS Automotive)

This Software Engineer will be part of the Site Reliability Engineering (SRE) team. The SRE team is an innovative team devoted to providing automated solutions and services for Cox Automotive to measure, evaluate, and plan for visible, reliable application delivery and maintenance. As a member of the SRE team, you will work with development teams to help create the automated pipelines and solutions required for continuous delivery in an Agile DevOps environment. The tools and use cases are diverse, and our challenge is to increase development velocity by optimizing various parts of the pipeline while increasing application stability. This is an opportunity to create automation, monitoring, and pipelines that improve deployment and response times across the board. We are looking for engineers who are passionate about infrastructure as code and continuous deployment to build scalable and highly reliable applications.

If you love figuring out how all the pieces fit together, and if automation and building tools to monitor and manage your applications sound interesting to you, we want to talk to you.

What you will do:

Automate anything and everything! (Infrastructure build-out, testing, deploying, monitoring, etc.)

Design and assist in the authoring of software tools that reliably manage application delivery

Design and assist in the setup and maintenance of application monitoring and alerting

Engage with Development/Capability Teams to ensure best practices are implemented

Improve predictability and reliability of software releases, workflows, and operating software

Reduce application deployment windows by leading the company toward a Continuous Deployment environment

Reduce mean time to recovery (MTTR) by helping to troubleshoot, monitor, alert, and automate recovery

The skills we require:

Python, Ruby, Go, or another systems programming language (moderate skill required)

Experience with configuration management systems (Octopus, Chef, Puppet)

Experience rolling out redundant, mission-critical applications in a highly available production environment

Experience with version control systems (Git or SVN)

Experience with cloud computing platforms (AWS, Kubernetes, Heroku, etc.)

Experience with continuous integration tools (Jenkins, CircleCI, etc.) and Artifactory (or Nexus)

Excellent written communication, problem-solving, and process management skills

Desire to work in a fast-paced, evolving, growing, dynamic environment

The skills we prefer:

Linux systems engineering expertise; VMware and VirtualBox experience

Experience supporting Ruby or Java applications

Experience supporting database server infrastructure (MySQL, Postgres, etc.)

Networking knowledge

Experience with HashiCorp tools (Vagrant, Terraform, Packer, etc.) and Linux containers (Docker, rkt)

Experience with Java build tools such as Ant, Maven, Gant, or Gradle

Experience with agile development, continuous integration and automated testing

Experience with dashboarding and monitoring



Posted On: Friday, September 17, 2021


