Site Reliability Engineer

Alpharetta, GA 30005

Posted: 08/28/2019
Employment Type: Contract To Hire
Job Category: Engineering
Job Number: 11045

Our client is a data and technology company focused on innovation, growth, and collaboration. The fast-paced, team-driven environment provides the opportunity to work as a key contributor on high-priority initiatives by developing new products, solutions, and platforms, and by supporting technology operations while maintaining the highest standards of quality.

Our client is looking for a Site Reliability Engineer to join their team in Alpharetta, GA!

Here’s what you’ll be doing:
  • Engaging in and improving the whole lifecycle of software development services, from inception and design through deployment, operation, and refinement
  • Supporting services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
  • Working closely with development and operations teams to build highly available, cost-effective systems with extremely high uptime metrics
  • Working with teams across the organization to ensure the reliability of core services while keeping an eye on capacity and performance
  • Maintaining services once they are live by measuring and monitoring availability, latency, and overall system health in a 24/7 environment
  • Participating in on-call support at any time for multiple core platforms globally
  • Scaling systems sustainably through mechanisms like automation, and evolving systems by pushing for changes that improve reliability and velocity
  • Practicing sustainable incident response and blameless postmortems
  • Influencing and creating new designs, architecture, standards, and methods for large-scale systems
  • Binding and orchestrating the system infrastructure with the application layer to enable high availability, clustering, load balancing, and integration
  • Providing technical guidance or support for the development or troubleshooting of systems
  • Establishing end-to-end monitoring and alerting on all critical aspects to ensure SLOs, SLIs, and SLAs are met, and to get proactive notifications of possible issues for all systems
  • Developing automated solutions to address potential problems before they result in a service interruption, and demonstrating a passion for automation, including CI/CD automation
  • Establishing performance baselines and capacity thresholds, correlating events, and defining monitoring/alerting criteria

Here’s what our ideal candidate has:
  • Bachelor of Science degree in Computer Science, Engineering, or equivalent relevant experience
  • Expertise in designing, analyzing and troubleshooting large-scale distributed systems
  • Ability to debug and optimize code and automate routine tasks
  • Overall 6+ years of experience in one or more of the following:
    • Experience building JavaEE applications using build tools like Maven/ANT, Subversion, JIRA, Jenkins, Bitbucket, and Chef
    • Experience with continuous integration tools (Jenkins, SonarQube, JIRA, Nexus, Confluence, Git/Bitbucket, Maven, Gradle, Rundeck) is a plus
    • Experience creating automation using Chef, Puppet or another SCM tool
    • Experience with Docker and container scheduler services such as ECS or Kubernetes is desirable
    • Experience working with Nginx, Tomcat, HAProxy, Redis, Elasticsearch, MongoDB, RabbitMQ, Kafka, and ZooKeeper
    • Experience as SCM/release engineer, or in a position with similar skill sets and responsibilities (Software Engineer, Systems Engineer, Systems Administrator)
    • Experience performing source code control management with Subversion/Git, including branching, merging, tagging, etc.
    • Experience in configuring and administering JavaEE application servers (Tomcat, WebSphere, WebLogic, etc.)
    • Experience with scripting languages such as Python, Perl, and Unix shells (bash, ksh)
    • Experience in configuring, building, and supporting apps and operations in a public cloud environment (AWS, Azure, GCP)
    • Experience with monitoring and logging tools (Elasticsearch, ELK, AppDynamics, Splunk, etc.)
  • Knowledge of Agile / Scrum methodologies and principles
  • Excellent written and verbal communication skills with the ability to communicate with team members at various levels, including business leaders

Benefits: Our IT consultants enjoy a wide array of benefits, including medical, dental, 401(k), life insurance, and much more.
