Principal Data Scientist
Requisition ID # 145150
Job Category: Accounting / Finance
Job Level: Manager/Principal
Business Unit: Information Technology
Work Type: Hybrid
Job Location: Oakland
The Data Science & Decision Science Department is both a “Delivery” team that is a sophisticated practitioner of data science and a “Center of Excellence” team that supports other practitioners in an enterprise-wide Hub & Spoke analytics adoption model.
As a Delivery team, this Department uses industry-leading data science and change management practices to drive PG&E’s transition to the sustainable grid of the future. The Department works cross-functionally across the company to enable data-driven decisions through analytics and to improve relevant business processes. Deployed to some of PG&E’s highest-priority arenas, the Department does not specialize in a traditional utility domain, such as asset management or program administration, but instead specializes in extracting useful insights from disparate data sets and facilitating actions informed by these insights.
As a Center of Excellence team, this Department listens to the needs of practitioners across the company, along with emerging industry practices, and builds standards, processes, tools, knowledge and best practices that meet the current and future needs of the enterprise.
This team works on a wide variety of difficult problems, offering constant opportunity to explore and learn. Current and past engagements include:
- Creating wildfire risk models that are used by regulators and the utility to prioritize asset management
- Developing computer vision models that improve, accelerate, and automate asset inspection processes
- Predicting electric distribution equipment failure before it occurs, allowing for proactive maintenance
- Forming the analytical framework behind PG&E’s Transmission Public Safety Power Shutoff
- Optimizing non-wires alternative resource portfolios, like the Oakland Clean Energy Initiative, including location and resource adequacy considerations
- Analyzing customer demographic, program participation, and SmartMeter interval data to build program-targeted propensity models, e.g., for customer-owned distributed energy resource technologies
- Identifying and investigating anomalous customer natural gas usage to resolve dangerous customer-side leaks
ABOUT THE ROLE
As the first Principal Machine Learning Engineer (MLE) at PG&E, you will build our model deployment and continuous improvement practice. You will architect, build, and maintain the foundation and standard technologies for machine learning model and algorithm serving, monitoring, and maintenance for the company. You will work with senior leadership to set deployment standards as well as replicable, reliable, and scalable technologies for the IT department and the Data Science and Decision Products teams. Designing the initial machine learning infrastructure is an important contribution, and you will also build the first iterations of that infrastructure hands-on. You will also coach junior machine learning engineers and collaborate with data engineers and data scientists in developing data and digital products. You care about our environment, climate change, clean energy, and sustainable prosperity for Californians. If developing a new machine learning infrastructure for artificial intelligence algorithms that empower the next generation of energy systems excites you, this is your job.
This position reports to the Director, Enterprise Decision Science/Data Science & Analytics Products.
This position also matrix reports (i.e. ‘dotted line’ reporting) to the Sr Manager, Data Solutions Architecture.
PG&E is providing the salary range that the company in good faith believes it might pay for this position at the time of the job posting. This compensation range is specific to the locality of the job. The actual salary paid to an individual will be based on multiple factors, including, but not limited to, specific skills, education, licenses or certifications, experience, market value, geographic location, and internal equity. We would not anticipate that the individual hired into this role would land at or near the top half of the range described below, but the decision will be dependent on the facts and circumstances of each case.
A reasonable salary range is:
Bay Area Minimum: $151,000.00
Bay Area Mid-point: $204,000.00
Bay Area Maximum: $257,000.00
California Minimum: $143,000.00
California Mid-point: $194,000.00
California Maximum: $244,000.00
This position is hybrid, working from your remote office and your assigned work location based on business need. The assigned work location will be within the PG&E Service Territory.
WE ARE EXCITED TO MEET YOU BECAUSE YOU:
- Have a bachelor’s degree in computer science, machine learning engineering, an engineering field, or equivalent work experience in an engineering field focusing on machine learning & advanced analytics modeling/algorithm development.
- Have worked for 10 years as an MLE, software engineer, or data engineer, or in an equivalent field; OR have a master’s degree with 8 years of relevant experience; OR have a doctorate with 5 years of relevant experience.
- Have led efforts to scope and implement major programs for 5 years.
ARE TECHNICALLY EXCELLENT
- Have hands-on experience in:
  - Object-oriented data management platforms such as Palantir (if not, you are eager to learn).
  - PySpark, Apache Spark, Scala, Hadoop, AWS S3, or an equivalent big data processing framework.
  - CI/CD/DevOps tools (preferably SparkML, AWS Guru & SageMaker) and architectures for pipeline development, deployment, testing, and model monitoring.
  - AWS services, Google Cloud Platform, Microsoft Azure, or an equivalent cloud computing platform.
  - Database design fundamentals.
  - Code review protocols and code quality controls for several languages (Python, Java, etc.), and error and log monitoring and handling.
- Have experience coaching teams to write production-level code, including unit testing, integration testing, and modularized functions.
ARE RESPONSIBLE FOR:
- Building and maintaining tools that monitor model drift, data drift, code quality, and system health.
- Performance tuning and optimizing machine learning pipeline deployment.
- Testing protocols, practices and technologies.
- Building enterprise-wide standards for model serving/deployment, monitoring, and maintenance systems.
- Coaching software developers, IT professionals, data scientists, and data engineers in preparing machine learning models for production and maintenance.
- Are a critical thinker.
- Have hands-on experience with technology planning and road mapping.
- Demonstrate strong organizational skills to plan and prioritize workload.
- Collaborate in cross-functional teams using Agile practices.
- Demonstrate excellent interpersonal, verbal, and written communication skills.
- Can break down ambiguous problems.
WE ARE REALLY EXCITED TO MEET YOU IF YOU:
- Have worked with the Palantir Foundry platform and have experience developing CI/CD/DevOps pipelines with Foundry as part of your architecture.
- Have built out a team or been an early member of a growing team, providing leadership and direction.
YOU WILL ENJOY WORKING AT PG&E DATA & ANALYTICS BECAUSE WE:
- Enjoy work that has a real, tangible impact for California, with a commitment to our bottom line of serving People, the Planet, and California’s Prosperity.
- Care deeply about stopping catastrophic wildfires and delivering carbon-neutral energy for California.
- Will provide affordable energy and deliver excellent customer experiences every day.
- Embrace diversity of backgrounds, skills, and thought in tackling our collective climate change challenges.
- Find projects and roles that provide you autonomy, mastery, and purpose.
- Embrace empathic feedback and a continuous improvement culture.
- Have smart and empathic leadership who care about their employees and encourage new ideas.
- Act with integrity and do what is best for our customers.
- Offer competitive compensation and a comprehensive benefit package.