Member of Technical Staff - Site Reliability Engineer

Endor Labs

Software Engineering, IT
Palo Alto, CA, USA
USD 170k-220k / year
Posted on May 20, 2025

About Us

Endor Labs is building the Application Security platform for the software development revolution. Modern software is complex and dependency-rich, making it increasingly difficult to pinpoint the risks that truly matter. AI code generation with LLMs exacerbates this problem, making it easy to produce large amounts of code quickly. Endor Labs solves this challenge by building a call graph of your entire software estate—enabling teams to clearly identify, prioritize, and fix critical risks faster.

Trusted by companies that are one or one hundred years old, Endor Labs secures code whether it was written by humans or AI—be it 40-year-old C++ code or cutting-edge Bazel monorepos. Endor Labs was founded by serial entrepreneurs Varun Badhwar and Dimitri Stiliadis and is backed by leading VC firms such as Dell Technologies Capital, Lightspeed, and Sierra Ventures.

What You’ll Do

As a Site Reliability Engineer at Endor Labs, you’ll play a pivotal role in shaping the reliability, performance, and scalability of our systems. You’ll partner with engineering teams across the company to define and implement best practices that improve operational excellence, reduce incidents, and foster a culture of accountability and continuous improvement.

  • Lead the definition and rollout of SRE practices across engineering, including SLAs, SLOs, and error budgets
  • Design and build monitoring, alerting, and observability frameworks that empower teams to own the reliability of their services
  • Establish incident response protocols and lead post-incident reviews to drive learning and remediation
  • Collaborate with product and platform teams to improve system architecture with reliability and performance in mind
  • Advocate for automation of deployments, scaling, and failover procedures across services
  • Create tooling and dashboards that give teams visibility into system health, latency, and error rates
  • Foster a blameless culture and partner closely with engineering leadership to drive a proactive approach to reliability
  • Champion operational readiness for new services before they go to production
  • Mentor engineers and help scale reliability thinking across the organization

What We’re Looking For

If you’re passionate about building reliable, secure systems and empowering teams to own what they build—and if you’re excited by the idea of shaping how SRE is practiced at a fast-growing early-stage company—we’d love to talk to you.

  • 8+ years of software engineering or infrastructure experience, with 3+ years in an SRE or DevOps capacity
  • Strong experience designing and scaling production systems in cloud-native environments
  • Proficiency with observability tooling such as Prometheus, Grafana, Datadog, OpenTelemetry, etc.
  • Experience setting and managing SLAs/SLOs and driving improvements in reliability metrics
  • Proficiency in programming or scripting languages such as Go or Python
  • Experience with container orchestration (Kubernetes, Helm) and infrastructure-as-code (Terraform, Pulumi, etc.)
  • Familiarity with CI/CD pipelines and deployment strategies
  • Exceptional communication skills and a collaborative mindset—able to influence and educate across teams
  • A mindset of ownership, humility, and learning

At Endor Labs, we:

  • Go to extraordinary lengths to distinguish ourselves through world-class work
  • Prioritize quality over speed, and speed over scope
  • Desire to work with deeply kind, mission-driven people
  • Strive to make the complex simple
  • Use first principles to debate ideas, test assumptions, and make decisions
  • Seek the truth by putting data above opinions
  • Assume good intent and give tactical feedback to help each other get better
  • Hold no ego—when our customers win, we all win

Compensation:

For candidates who receive an offer for this position, the compensation range is expected to be between $170,000 and $220,000 per year.