
Manager of Global Solution Architecture, Customer Platform Engineering - EST (Remote)
Weights & Biases
Posted 4/16/2025

Job Location
Job Summary
Weights & Biases is seeking a highly skilled and experienced SA/SRE (Solutions Architect / Site Reliability Engineering) Manager to lead customer-managed deployments, improve on-premise deployments, streamline upgrades, and build scalable systems and processes through automation. The role requires a strong technical background, leadership capabilities, and a deep understanding of deployment automation and reliability engineering. The ideal candidate will have 7+ years of experience in SRE, DevOps, or Solutions Architecture roles, with at least 2 years in a managerial or leadership capacity. They should also be proficient in infrastructure as code (Terraform, Ansible, or similar tools), CI/CD automation, and monitoring, logging, and alerting tools. The position offers a competitive salary, benefits, and career growth opportunities, as well as a remote-first work culture with in-office flexibility in San Francisco.
Job Description
Responsibilities:
- Lead and manage a team of SA engineers focused on supporting and scaling customer-managed and on-premise deployments.
- Design, implement, and enhance deployment architectures to improve reliability, scalability, and security.
- Develop and optimize upgrade processes to minimize downtime and operational risk.
- Build and maintain automation frameworks to streamline deployment, monitoring, and incident management.
- Collaborate closely with product and engineering teams to enhance software deliverability and maintainability for on-premise environments.
- Establish and enforce best practices for configuration management, infrastructure as code (IaC), and CI/CD pipelines.
- Lead incident response and root cause analysis for critical production issues, ensuring continuous improvement and proactive problem prevention.
- Drive a culture of operational excellence, automation, and continuous improvement across the organization.
- Demonstrate customer empathy and communicate with customer stakeholders in a timely manner.
Requirements:
- 7+ years of experience in SRE, DevOps, or Solutions Architecture roles, with at least 2 years in a managerial or leadership capacity.
- Strong background in managing on-premise and customer-managed deployments at scale.
- Proficiency in infrastructure as code (Terraform, Ansible, or similar tools) and CI/CD automation.
- Experience with Kubernetes, Docker, and cloud/on-prem hybrid architectures.
- Expertise in monitoring, logging, and alerting tools (Prometheus, Grafana, ELK, etc.).
- Strong scripting and programming skills (Python, Go, Bash, etc.).
- Experience with security and compliance considerations in enterprise software deployments.
- Excellent communication and stakeholder management skills, with the ability to influence technical and business decisions.
- Experience working in SaaS and enterprise environments is a plus.
Why Join Us?
- Opportunity to drive large-scale transformation in enterprise software deployment and automation.
- Work with cutting-edge technology and a team of talented engineers.
- Competitive salary, benefits, and career growth opportunities.
Our Benefits:
- 🏝️ Flexible time off
- 🩺 Medical, dental, and vision coverage for employees and their families
- 🏠 Remote first culture with in-office flexibility in San Francisco
- 💵 Home office budget with a new high-powered laptop
- 🥇 Truly competitive salary and equity
- 🚼 12 weeks of Parental leave (U.S. specific)
- 📈 401(k) (U.S. specific)
- Supplemental benefits may be available depending on your location