Site Reliability Engineer | Lisbon

  • Hybrid
    • Lisbon, Lisboa, Portugal
  • Tech

Job description

SRE is a team in the Engineering organisation that applies the well-known SRE mindset to Unbabel's "You Build It, You Run It" approach. Our mission is to use a software engineering mindset to deliver and maintain the set of services that form the backbone on which the other Engineering teams build and run their own services.

Responsibilities:

  • Develop the platform that supports our business services, improving the systems that handle provisioning/automation, deployment, monitoring, and related functions

  • Collaborate with the other engineering teams to improve the usability of those systems and advise on best practices for building scalable and reliable applications

  • Work with different open-source/third-party and cloud-native technologies

  • We are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction

Job requirements

Must haves:

  • Excellent English verbal and written communication skills

  • Engineering Degree or equivalent

  • Knowledge of SRE best practices

  • SRE experience developing, running and debugging multiple distributed systems at scale, in production

  • Ability to collaborate cross-functionally and effectively with diverse, fast-paced teams

  • Experience managing Kubernetes clusters and serverless environments

  • Solid knowledge of Linux, container internals, and computer networking

  • Experience working with monitoring/observability systems (e.g. Prometheus, Thanos, Grafana)

  • Experience in shell scripting and strong software development skills with one of the following: Python, Go, Java, Rust, Ruby, C/C++, or any similar programming language

  • Experience with any major cloud provider, preferably AWS

  • Hands-on experience using technologies like Docker, Kubernetes, Nginx, or Apache

Nice to have:

  • Experience in supporting or mentoring other engineers

  • Experience in creating response plans for infrastructure failures and disaster failovers

  • Experience with technologies like Terraform, ArgoCD, GitLab CI/CD, and HashiCorp Vault

Full-time, Permanent
