Site Reliability Engineer / DevOps Senior

2025


Working hours

Full-time

Location

Anywhere

Workplace

Remote

Contract type

Permanent contract

What is Cobre and what do we do?


Cobre is Latin America’s leading instant B2B payments platform. We solve the region’s most complex money movement challenges by building advanced financial infrastructure that enables companies to move money faster, safer, and more efficiently.


We enable instant business payments—local or international, direct or via API—all from a single platform.

Built for fintechs, PSPs, banks, and finance teams that demand speed, control, and efficiency. From real-time payments to automated treasury, we turn complex financial processes into simple experiences.

Cobre is the first platform in Colombia to enable companies to pay both banked and unbanked beneficiaries within the same payment cycle and through a single interface.


We are building the enterprise payments infrastructure of Latin America!


What we are looking for:


Cobre's Infrastructure team and its SREs take on the daily challenges that raise the technical level of our products. We enjoy every project and task, give 100%, and learn from each other. Our main goal is to keep our systems reliable. To achieve this, we collaborate with other teams to find the most effective solutions, maintain highly reliable processes, adopt the necessary security measures, and optimize time and cost in every decision.

What would you be doing:

  • Support teams in defining the infrastructure that will support the solution architecture.

  • Support all infrastructure (AWS services and Kubernetes clusters) and company products. Foster a culture of zero-downtime deployments.

  • Assist with troubleshooting application issues and incidents related to infrastructure services.

  • Review code instrumentation with development teams and ensure the necessary monitoring dashboards are created.

  • Document and maintain runbooks and procedures, and automate as much as possible.

  • Perform periodic load and scalability testing to establish baselines, detect drift, and inform capacity planning.

  • Design and implement peak readiness reviews for anticipated high-volume times.

  • Lead weekly operational state reviews covering performance trends, anomalies, errors and other availability events with SREs, product owners, and development teams.

  • Plan and execute periodic Disaster Recovery exercises including both tabletop and simulated failures (fault injection).

  • Promote SRE culture across the organization to publicize the value of SRE, and mentor and train other engineers in proactive reliability decision-making and planning.

What do you need:

  • At least 3 years of proven experience as an SRE or DevOps engineer, with a strong focus on highly available and scalable environments, cloud infrastructure, observability, and incident management.

  • In-depth technical knowledge of microservices architecture and cloud platforms (e.g., AWS, Kubernetes), along with proficiency in Infrastructure as Code (IaC) tools (e.g., Terraform).

  • Passion for automating routine processes (e.g., scripting in Python or Bash).

  • Strong understanding of monitoring, logging, and alerting tools (e.g., New Relic, Datadog, CloudWatch…), with a track record of improving system reliability and performance.

  • Proven experience troubleshooting, mitigating, and resolving issues in a distributed system.

  • Ability to define and execute the SRE strategy, aligning it with company goals and driving the adoption of SRE practices across multiple teams.

  • Resilience in facing challenges and promoting a fail-fast, learn-fast culture that embraces innovation and experimentation.

  • Exceptional communication skills to effectively convey complex technical concepts to both technical and non-technical stakeholders.

  • Ability to actively listen and understand diverse team and stakeholder needs, demonstrating empathy in decision-making and conflict resolution.

CobreJobs is a Cobre platform designed to connect talent with opportunities in our ecosystem. The information provided will be treated confidentially and in accordance with applicable data protection laws.

© Cobre. All rights reserved.
