Site Reliability Engineer



As a Site Reliability Engineer (SRE) at Circonus, you will be responsible for keeping Circonus SaaS and on-premises customers up and running, as well as improving the automation, scalability, and performance of systems. This is an unparalleled opportunity to grow on a small, collaborative, and friendly team with established leadership in the field of SRE.


Responsibilities



    • Work in the office or remotely, or both (but not at the same time)

    • Install, upgrade, and manage systems powering Circonus SaaS infrastructure

    • Install, upgrade, and manage systems powering customer infrastructure running Circonus software

    • Troubleshoot availability and performance issues

    • Diagnose production issues and perform front-line remediation

    • Communicate with management and customers regarding aberrant system behavior

    • Influence software and architecture design based on system and architecture observations related to performance and reliability

    • Participate in an on-call schedule




Requirements



    • Linux (RHEL, CentOS, Ubuntu)

    • Experience working with cloud service providers such as AWS, Azure, or GCP

    • Chef or similar automation system

    • HAProxy, PostgreSQL, Apache or similar technologies

    • Strong networking knowledge: firewalls, TCP & UDP, DNS, SSL/TLS

    • Strong understanding of monitoring principles

    • Familiarity with using REST and REST-like APIs for operations tasks

    • UNIX troubleshooting skills: tcpdump, strace, bpftrace, etc.

    • Fluency in one or more of the Git, Subversion or Mercurial version control systems




Preferred Experience


    • Experience with Docker, Kubernetes, and containers



Circonus offers a powerful machine data intelligence platform to handle the world's most demanding use cases. From mission-critical IT infrastructure to data-intensive IoT applications, Circonus works with any tech and at any scale. Circonus uses advanced data science and patented technology to ingest and analyze machine data to deliver unmatched clarity, insights, and performance. From real-time alerts and fault detection to ML-based predictive analytics, Circonus helps companies optimize operations and deliver exceptional user experiences with confidence.



We enjoy a global reach, but our customers primarily cluster on the East Coast, in California, and, to a lesser degree, in Europe. Our success stems from our industry-leading offering and our obsession with customer satisfaction.



Culturally, we operate like a startup: small, agile teams with quick decisions and short, iterative cycle times. We relish our core values of respect, integrity, value, and growth, among others.



All of our positions include a discretionary PTO policy, health insurance, gym reimbursement, a generous 401(k), the opportunity for a bonus and more.


