Site Reliability Engineer

Remuneration:	cost-to-company
Location:	Johannesburg
Job level:	Senior
Type:	Permanent
Company:	THE SKILLS MINE (PTY) LTD

Requirements:

A bachelor’s degree in computer science, engineering or related field, or equivalent experience
Proficient in one or more programming languages, such as Python, Go, Java, or C++
Proficient in one or more scripting languages, such as Bash, Perl, or Ruby
Proficient in one or more cloud platforms, such as AWS, Azure, or GCP
Proficient in one or more UNIX-like operating systems
Proficient in one or more configuration management and deployment tools, such as Ansible, Chef, Puppet, or Terraform
Proficient in one or more monitoring and alerting tools, such as Prometheus, Grafana, Datadog, or Splunk
Proficient in one or more container and orchestration tools, such as Docker or Kubernetes
Proficient in one or more web servers and proxies, such as Apache, Nginx, or Envoy
Proficient in one or more databases and data stores, such as MySQL, PostgreSQL, MongoDB, or Redis
Proficient in one or more version control and collaboration tools, such as Git
Knowledgeable in the concepts and principles of site reliability engineering, such as SLIs, SLOs, error budgets, incident management, postmortems, and blameless culture
Knowledgeable in the concepts and principles of software engineering, such as design patterns, code quality, testing, debugging, and documentation
Knowledgeable in the concepts and principles of performance engineering, such as profiling, benchmarking, load testing, and capacity planning
Knowledgeable in the concepts and principles of distributed computing, such as concurrency, parallelism, synchronization, and consensus

Responsibilities:

Assisting with the resources that support engineering services and keeping them operational, including continuous integration systems, software deployment, basic troubleshooting of code, and the creation and management of software repositories
Ensuring servers are patched promptly against security exploits, managing secure access to servers and repositories for partners and internal staff, and ensuring secure interconnection between systems
Ensuring servers are configured in a documented and repeatable way
Ensuring system and server architecture is appropriate to the requirements of projects, easily maintainable in the long term, and provides appropriate levels of redundancy
Providing timeous uptime assurance, and supporting issue investigation and recovery procedures

Skills:

AWS
C++
Docker
Java
MongoDB
MySQL
Perl
PostgreSQL
Python
Ruby

Posted on 23 Aug 16:51, Closing date 22 Sep