Site Reliability Engineer

San Francisco, CA 94102

Posted: 08/08/2018
Industry: IT Operations
Job Number: 8436
Pay Rate: Not Specified

Our client is looking for a self-motivated SENIOR SITE RELIABILITY ENGINEER to join its growing team. You will be embedded in the Technical Operations organization and will work closely with the Engineering Development teams, Product Architecture, and Program Management. As a Site Reliability Engineer, you are passionate about seamless uptime. You delight in building tools to automate routine tasks and constantly seek new ways to improve system performance. If you also want to join a fast-growing startup, work with friendly people, and play with some cool tech, you've found the right place!

What you'll be doing:
  • Our core technology is Java on Linux, with open-source technology throughout the stack. The Java engine runs and stores all data in RAM for very high performance, while staying safe with transaction logs and automatic recovery. The office runs mostly Macs, with a few Windows holdouts; you decide which works best for you.
  • Utilize your skills in automation, replication and scaling to manage the production cloud in our worldwide data centers
  • Write scripts in Ruby, Python, Perl, etc. to build custom tools for automation, replication and scaling
  • Build tools to monitor and provide metrics on our systems
  • Perform Linux system administration (DNS, NFS, RPM, Apache, RAID, etc.)
  • Extend the existing automation we have in place and make things even easier to use.
  • Support Product Development Teams
  • Lead release deployments and participate in revising software design to scale and protect against failures
  • Ensure compliance with various best practices.
  • Adhere to compliance standards in the development and operations spaces as guided by security.
  • Participate in on-call rotation
  • Work in a customer-facing production environment

More about you:
  • B.S. in Computer Science and 3+ years of relevant experience, OR 10+ years of equivalent experience supporting production platforms, using the following skills:
  • Automation using tools such as Chef and Rundeck
  • Programming in any of the following: Ruby, Python, Perl, Bash
  • Multi-data-center management, replication, and scaling.
  • Middleware software such as HAProxy, Consul, Terraform or equivalent architectures
  • Java applications including JVM performance and tuning
  • Metrics and monitoring - writing custom tools and familiarity with open-source options.
  • Linux administration - DNS, NFS, RPM, Apache, RAID, etc.

Technologies you'll work with:
  • Chef
  • CentOS
  • Red Hat
  • VMware
  • EMC
  • Isilon
  • Pure Storage
  • Dell - both virtual and bare metal
  • HP - both virtual and bare metal
  • Splunk
  • New Relic
  • Rundeck
  • Kubernetes

Bonus points:
  • MySQL - replication, backups, some light querying
  • Networking - switches, routers, firewalls, VPN
  • Amazon EC2, EFS and related AWS technologies
  • Google Cloud
  • Taking a bare metal server/hardware to a fully functional app server