Kubernetes Site Reliability Engineer

London, New York

Job Description

Citadel is seeking a Kubernetes Site Reliability Engineer (SRE) to join our central Compute team, which is evolving the compute platform for the firm. In this role, you will be instrumental in shaping and executing our containerization strategy, optimizing resource utilization, and ensuring robust disaster recovery and business continuity. The ideal candidate has experience working in high-performance environments and has worked on Kubernetes internals. The Kubernetes engineer at Citadel will work with application teams and fellow infrastructure teams to design solutions and troubleshoot issues.
Key Responsibilities:
This role is hands-on, requiring direct interaction with platform users to understand their requirements. The ideal candidate will translate these requirements into effective solutions and then build and configure them. We prioritize self-documenting, version-controlled code and configurations over traditional wiki pages and written documentation.
  • Design and drive standards for compute usage, encompassing virtual machines and containers. 
  • Establish processes to ensure applications are host-neutral and deployable across various data center and office environments. 
  • Support firmwide disaster recovery and business continuity initiatives. 
  • Leverage data to inform strategies and decision making. 
Required Qualifications:  
  • Bachelor's degree in computer science or a related technical discipline, or equivalent experience.
  • Experience building and running production Kubernetes clusters.
  • Deep understanding of Linux and its network stack.
  • Experience with observability techniques including logs, metrics, traces, and profiles.  
  • Experience deploying and managing services on the Google Cloud Platform (GCP).  
  • Experience writing production-grade code in Go, Python or Rust.  
  • Experience developing with Git, issue tracking, code reviews and CI/CD pipelines. 
Preferred Qualifications:
  • Proficiency in infrastructure provisioning/management tools (e.g. Ansible, Puppet, Terraform, Packer). 
  • Experience with ArgoCD, Helm, and eBPF.
  • Advanced knowledge of TCP/IP networking, architecture, and core technologies (such as DNS, DHCP, HTTP, Routing, VPN). 
  • Ability to manage and implement large scale infrastructure projects. 

In accordance with New York City’s Pay Transparency Law, the base salary range for this role is $105,000 to $300,000. Base salary does not include other forms of compensation or benefits.

About Citadel

Citadel is one of the world’s leading alternative investment managers. We manage capital on behalf of many of the world’s preeminent private, public and nonprofit institutions. We seek the highest and best use of investor capital in order to deliver market-leading results and contribute to broader economic growth. For over 30 years, Citadel has cultivated a culture of learning and collaboration among some of the most talented and accomplished investment professionals, researchers and engineers in the world. Our colleagues are empowered to test their ideas and develop commercial solutions that accelerate their growth and drive real impact.