Manager of Engineering, SRE
Company: Platform Science
Location: San Diego
Posted on: February 2, 2025
Job Description:
Who We Are
At Platform Science, we're working to connect everything that moves.

Founded in 2015, we are an open IoT platform that partners with
innovative fleets, application developers, vehicle manufacturers,
and equipment providers in the transportation industry to deliver
revolutionary solutions to supply chain professionals across the
globe.

Our employees are an engaging, diverse group of people who believe
in the power of great ideas. We hire people with different
experiences and perspectives to build a company culture that fuels
growth through innovation.

We value thoughtful actions and empathy for others. We approach
challenges with resiliency and creativity, while encouraging
transparency because, no matter our backgrounds or
responsibilities, we are one team.

About the Role
The Site Reliability Engineering (SRE) Manager will lead a
high-performing team that ensures system reliability, scalability,
and efficiency while championing SRE principles across the
organization. This role involves coaching the team, promoting best
practices, and enabling development teams to deliver observable,
maintainable, and production-ready applications. The SRE Manager
oversees multiple projects, requests, and initiatives while
maintaining clear communication and keeping the team aligned and
productive.

Essential Responsibilities
- Recruit, train, and mentor a team of Site Reliability Engineers
to deliver operational excellence.
- Foster a culture of innovation, collaboration, and adherence to
SRE principles like SLOs, error budgets, and production
readiness.
- Standardize and train development teams on observability tools
such as Prometheus, Grafana, and Datadog.
- Enhance developer and release workflows using CI/CD best
practices, GitOps methodologies, and tools like Jenkins, ArgoCD,
and Docker.
- Drive application and system resilience through chaos
engineering, load testing, and automation.
- Collaborate with teams to define SLIs and SLOs and to manage
error budgets.
- Manage on-call rotation schedules, optimize alerting processes,
and ensure 24/7 production application support.
- Serve as the escalation point for incident resolution,
providing guidance and technical expertise.
- Build tools, dashboards, and processes to improve incident
response, production health, and system reliability.
- Conduct quarterly "State of the Service" reviews to assess
performance, sustainability, and risks.
- Track and prioritize multiple initiatives while ensuring the
team stays focused and aligned with organizational goals.
- Maintain detailed documentation on team projects, requests,
policies, and best practices.
- Communicate effectively across teams, departments, and
stakeholders to ensure alignment and a clear understanding of SRE
initiatives.
- Evangelize SRE practices across the organization and ensure
consistent adoption of reliability-focused processes.

Education and Experience
- 5+ years of experience in software engineering or SRE
roles.
- 2+ years in a leadership or management position.
- Proven expertise with Kubernetes, ArgoCD, AWS, Prometheus,
Grafana, Datadog, FluentD, Jenkins, and Docker.
- Strong knowledge of CI/CD and GitOps practices.
- Excellent verbal and written communication skills.
- Demonstrated ability to track and prioritize multiple projects,
requests, and initiatives effectively.
- Bachelor's degree in Computer Science, Engineering, or
equivalent experience.

Platform Science Benefits Highlights
The company offers various benefits to regular, full-time employees
including:
- Medical, dental, and vision insurance
- Short-term and long-term disability insurance
- AD&D and life insurance
- 401k plan
- Paid vacation, sick leave and holidays
- Six weeks of paid parental leave

For more information, please see the brochure for regular,
full-time employees.

This is an exempt role. Our job titles for each posting may span
more than one job level. The estimated base salary for this role is
between $134,550 and $200,000. The range displayed on each job
posting reflects the minimum and maximum target range for new hire
base salaries across all US locations. Compensation packages are
based on many factors unique to each candidate, including but not
limited to skill set, work experience, relevant training and
certifications, business needs, market demands, and specific
geographical location. The base pay range is subject to change and
may be modified in the future. This role may also be eligible for
bonus, equity, and benefits.

Beware job scams! Our recruiters use @ emails only. We don't
interview via text/message. We don't ask for software downloads
(except Zoom) or sensitive info (like SSN/bank). Suspect fraud?
Report it to law enforcement & peopleops@platformscience.com.