Site Reliability Engineer

California  ‐ Onsite

Description

About the role:
We are looking for a highly skilled engineer to join our Site Reliability Engineering team, with functional knowledge in all areas of technology operations and site reliability and a particular emphasis on monitoring and trending. The ideal candidate will fulfil the critical role of ensuring our systems are healthy, monitored, and designed to scale. The successful candidate should have hands-on experience in a web-scale role with an emphasis on software-as-a-service. Candidates should also have experience designing, planning, implementing, tuning, and operating technology including application servers, virtual machine and container management, large-scale monitoring and trending techniques, microservice architectures, clustering technology, configuration management, and creative scaling techniques.

About the team:
As a senior member of our Application Operations team, you will join a team of dedicated, intelligent, fast-paced engineers. You'll work in a cutting-edge hybrid cloud environment that will power our company's impressive growth. You will bring a data-driven approach to monitoring, trending, and telemetry. We are smart, innovative, and ambitious, and we are looking for like-minded people to join us.

Job Duties:

  • Drive architecture principles, operability guidelines and progressive scaling techniques within the platforms
  • Help develop and maintain processes, tools, and documentation in support of all components
  • Participate in the evaluation of new software, automation, and infrastructure solutions
  • Collaborate with architects, developers, data engineers, and infrastructure engineers on designing scalable and highly available platforms
  • Ensure proper security, monitoring, alerting, and reporting for the application platform
  • Troubleshoot and resolve production issues
  • Help drive the capacity planning process

Qualifications:

  • Experience in application design and deployment with a high-volume, customer-facing website
  • Experience with large-scale Linux production environments, preferably as part of an online service provider environment
  • Strong sense of ownership of projects and tasks assigned
  • Strong interpersonal and communication skills
  • Ability to solve problems quickly and automate processes
  • Hands-on experience with release, deployment, and environment management
  • Ability to write code in at least one language (e.g. Python, Perl, Ruby, Java, JavaScript)
  • Experience with application virtualization and containerization technologies (Docker, Kubernetes, Mesos, CoreOS/rkt)
  • Hands-on experience with infrastructure-as-code tools and concepts (e.g. Salt, Puppet, Chef, Ansible)
  • Experience with big data systems and distributed systems
  • Working knowledge of advanced open source web, database, and OS server configuration (Linux, Nginx, Tomcat, MongoDB, Elasticsearch (ELK), ZooKeeper, Redis)
  • Experience with cloud computing platforms and hybrid cloud environments (VMware vSphere, AWS EC2 and abstracted PaaS solution family)
  • Ability to manage competing priorities in a complex environment

Desired Qualifications:

  • At least 3 years of experience working as an SRE/Application engineer
  • At least 3 years of coding experience
  • At least 5 years of experience working in a fast-paced senior engineering role
  • Bachelor's degree or equivalent

Start date: n.a.
From: NextGen Global Resources
Published at: 18.09.2016
Project ID: 1205166
Contract type: Freelance