Principal Site Reliability Engineering Lead

Location: Oakland, California

Requisition ID # 165427-en_US

Our IT professionals are at the enterprise's core, leveraging modern technology to deliver safe and reliable energy to our customers. We use AI, the cloud, data science, and the latest tools and programming languages to solve hard, interesting problems and tackle challenges like the ever-growing threat of climate change, wildfires, and breaches of cyber security. Join us and experience the satisfaction of being a technology enabler for a company that leads the industry in innovation.

  • Entry, Mid, Senior, Executive
  • Full-Time

Success Profile

What makes a successful Team Member at PG&E? Check out the top traits we’re looking for and see if you have the right mix.

  • Adaptable
  • Collaborative
  • Creative
  • Curious
  • Results-driven
  • Thoughtful

Benefits

PG&E is proud to provide a comprehensive benefits program to help you take care of your physical, emotional and financial health. In addition to the offerings below, you can expect inclusive programs in areas such as performance recognition, training and employee development, mentoring and more.

  • Paid Time Off

    Vacation, Sick Hours, Holidays, Family Leave

  • Employee Resource Groups

    16 ERGs at the core of our DEIB culture that support employee development and foster business relationships

  • Professional Development

    Leadership and Employee Development Courses, LinkedIn Learning, Mentoring Program and up to $8,000 for Tuition Reimbursement

  • Healthcare

    Low-Cost Medical, Dental, Life/Accident/Disability Insurance and Free Vision

  • Healthcare & Dependent Care FSA

    Pre-tax employee-funded accounts that cover certain out-of-pocket medical and dependent care expenses

  • Retirement Plans

    401(k) Matching up to 8% AND Cash Balance Pension (no cost to you)

Job Details

Requisition ID # 165427 

Job Category: Information Technology 

Job Level: Manager/Principal

Business Unit: Information Technology

Work Type: Hybrid

Job Location: Oakland

Department Overview

The Data Solutions Architecture Team at Pacific Gas & Electric Company is responsible for driving long-term, enterprise-wide data solutions, target state architecture, and overall excellence with the application of data, analytics and information to critical business challenges and opportunities. This team is chartered to develop the strategy, roadmap, and accompanying standards that will enable better use of data and information and to develop analytics maturity at PG&E.

Position Summary

The Digital Utility runs on data and information. At PG&E, we have many teams building data products that need support, and our operations teams are on the hook for ensuring reliability and support across all data products. The Principal Site Reliability Engineering Lead fills a critical role in empowering our operations teams to do their best work.

The Principal Site Reliability Engineering Lead will drive our operations strategy in Data Analytics & Insights (DA&I), working with operations teams to implement best practices, mentor junior engineers, drive automation, and build a continuously improving operations practice. You will work with operations management and operations engineers to create scalable DevOps practices for key data platforms at DA&I, notably Palantir Foundry, Snowflake, and Informatica. You will also get hands-on with operational problems and build out operations tooling for the team.

We strive for a team that will make a difference in the new PG&E. As Site Reliability Engineering Lead, you will have a direct impact on the day-to-day delivery of data solutions and on the safety of California. You will collaborate with other technical leaders and Executive Leadership to help reshape a first-class operations team that delivers high levels of reliability for the data products we and our customers rely on the most. You will work closely with supportive Operations management, a talented team in need of your guidance, and an organization looking to you to support its key products.

The Principal Site Reliability Engineering Lead will report to the Senior Manager of Data Solutions Architecture in the Data Analytics & Insights department of Information Technology, and work closely with the Data Ecosystem Operations team.


PG&E is providing the salary range that the company in good faith believes it might pay for this position at the time of the job posting. This compensation range is specific to the locality of the job. The actual salary paid to an individual will be based on multiple factors, including, but not limited to, specific skills, education, licenses or certifications, experience, market value, geographic location, and internal equity. We would not anticipate that the individual hired into this role would land at or near the top half of the range described below, but the decision will be dependent on the facts and circumstances of each case.


A reasonable salary range is:

Bay Area Minimum: $155,000.00

Bay Area Maximum: $265,000.00

Job Responsibilities

  • Technical Support and Collaboration: Provide applications engineering support to product teams. Collaborate with product teams, support teams, and customers on shared goals, cross-team projects, and new initiatives.
  • Continuous Improvement and Reliability Practices: Strive for continuous improvement in processes and reliability practices. Develop and evolve improved operations workflows.
  • Leadership and Mentoring: Show teams how to improve quality and eliminate waste by implementing improvements with them.
  • Hands-on Troubleshooting: Join the Operations team on-call and help with escalated issues or issues that require your additional experience and steady hand.
  • Operations Tooling: Build tools for improved operational workflows in collaboration with, and while leading, members of the Operations team.
  • Efficiency: Identify wasteful processes and procedures. Work with teams to streamline and automate tasks.
  • Performance Monitoring and Improvement: Monitor, measure, and enhance the performance and state-awareness of systems. Identify and drive improvements in infrastructure and system reliability, performance, and monitoring.
  • Root Cause Analysis and Investigation: Lead investigations into repetitive damage and failure rates, utilizing root cause analysis techniques. Implement corrective and preventive actions based on findings.
  • Reliability and Capital Planning: Participate in annual and long-term reliability planning, ensuring alignment with operational objectives. Contribute to the development and execution of life cycle asset management processes.
  • Architecture: Own the Information Architecture and related Technical Architecture for the Operations sub-domain of the Data & Information Architecture domain.
  • Technology Life Cycle: Develop and execute strategies to introduce new capabilities needed, evolve and mature existing capabilities, and retire capabilities at their end of life.
  • Documentation and Governance: Develop and maintain architectural guidance documents and artifacts, practices and procedures, and governance to support the above.
  • Strategic Planning: Support technology strategy, planning, and roadmapping activities across IT and at the enterprise level.
  • Data Analysis and Predictive Modeling: Perform statistical data analysis. Utilize data insights for capacity planning, demand forecasting, and identifying performance bottlenecks.

Qualifications

Minimum:

  • Bachelor’s degree in Computer Science or a job-related discipline, or equivalent experience
  • 7 years of relevant work experience in Information Technology, Data Management, Business Intelligence, and Analytics, to include experience in both IT and line of business departments


Desired:

  • Experience working directly with line of business stakeholders demonstrating job-related skills.
  • 5 or more years of experience with Site Reliability Engineering/DevOps practices.
  • Experience with analytics and data management principles such as: data acquisition and modeling, data warehousing, business intelligence, metadata management, master data management, advanced analytics and data science, “big data” techniques, public/hybrid/private cloud data management and analytics services, data security, and data and analytics governance.
  • Ability to achieve a deep understanding of line of business strategies, priorities, needs, and current capabilities.
  • Ability to work collaboratively to engage and influence business and IT stakeholders, senior leadership and external partners.
  • Customer management and negotiation skills that enable the ability to mediate opposing viewpoints and articulate the advantages of a preferred solution.
  • Excellent written and oral communication skills across all levels; ability to communicate complex technical concepts to leaders, business sponsors and stakeholders in clear, concise language that inspires confidence and earns trust.
  • Strong leadership skills in the technology and operations domain and a high level of drive, initiative and assertiveness.
  • Extensive experience with SRE/DevOps practices and tooling.
  • At least 3 years of experience developing operations automation tools in Python or another high-level scripting language commonly used on Unix systems.
  • Familiarity with at least two of: Scaled Agile, Scrum development methodology, DevOps/DevSecOps, LEAN, Six Sigma, or ITIL practices.
  • Experience with any of the following: Data Architecture, Airflow, Palantir Foundry, Informatica, Spark, Snowflake, Teradata, and other database and BI technologies, data access languages such as SQL, SAS, R, Python, Scala, etc.
  • Experience working in the Utility Industry and a working knowledge of Utility concepts and challenges a plus.

PG&E combines an established company’s stability with the autonomy of a startup. I enjoy high levels of trust and openness among my coworkers in a dynamic environment where I’m included in important decision-making discussions. As our company evolves, I look forward to career growth opportunities ahead.

Jonathan A. Solutions Architect, Expert
Products & Enterprise Platforms
