About the job
This role primarily involves leading and working with a cohesive team of engineers, developers, and support staff to build world-class systems and the tools needed to maintain and continuously improve them. The position calls for someone with a drive to innovate, automate, and use measurements and statistics to guide improvement. Members of the current team come from strong teams and firms, with backgrounds in areas including software development, networking, UNIX internals, and large-scale systems administration.
Responsibilities:
Define the process, set the framework, and drive end-to-end improvement
Ensure the reliability, availability, and performance of applications
Own the automation of repetitive tasks and the resolution of systemic issues
Identify and deliver engineering solutions for issues based on root cause analysis
Own incident management and resolution
Lead by evangelizing the SRE mindset to other teams
Provide support and ensure applications are production-ready
Qualifications:
Strong background in computer science: data structures, algorithms, distributed systems
Fully proficient and hands-on in at least one modern structured programming language
Proficient with a range of current software development tools and practices (testing, source control, build systems, CI/CD, etc.)
Track record of leading a team of expert engineers working on world-class systems at a high-quality company. Experience building and managing highly reliable, large-scale systems.
Basic working understanding of TCP/IP networking, LAN and WAN, and Linux networking
Bonus experiences:
SQL and database experience.
Web UI experience (e.g., JavaScript, CSS, and similar web technologies).
A curious mindset and a passion for learning, adapting to changing requirements and technology, and inventing new approaches to hard problems.