Senior Data Engineer
Requisition ID # 165716
Job Category: Information Technology
Job Level: Individual Contributor
Business Unit: Operations - Other
Work Type: Hybrid
Job Location: Oakland
Position Summary
PG&E is seeking a Senior Data Engineer to lead the development of a modern, cloud-native data platform that supports vegetation management operations, regulatory compliance, and advanced analytics. This role is central to transforming fragmented legacy systems into a unified, audit-ready Snowflake Data Lakehouse, enabling scalable, secure, and transparent data access across the organization.
The engineer will design and implement robust ELT pipelines using Informatica as the primary tool, alongside a suite of AWS services including Step Functions, Fargate, Lambda, DynamoDB, and S3. The role spans structured, semi-structured, and unstructured data domains, integrating sources such as Salesforce (OneVM), SAP, SharePoint, ArcGIS, and remote sensing imagery (LiDAR, orthoimagery, surface reflectance, etc.).
PG&E is providing the salary range that the company in good faith believes it might pay for this position at the time of the job posting. This compensation range is specific to the locality of the job. The actual salary paid to an individual will be based on multiple factors, including, but not limited to, particular skills, education, licenses or certifications, experience, market value, geographic location, collective bargaining agreements, and internal equity. Although we estimate the successful candidate hired into this role will be placed towards the middle or entry point of the range, the decision will be made on a case-by-case basis related to these factors.
A reasonable salary range is:
Bay Area Minimum: $122,000
Bay Area Maximum: $194,000
and/or
California Minimum: $116,000
California Maximum: $184,000
This position is also eligible to participate in PG&E’s discretionary incentive compensation programs.
This position is hybrid. You will work from your remote office and your assigned location based on business needs. The headquarters location is Oakland, CA.
Job Responsibilities
• Design, build, and maintain scalable data pipelines using tools such as Informatica, AWS services, and Snowflake to support ingestion, transformation, and curation of structured, semi-structured, and unstructured data.
• Collaborate with cross-functional teams—including data scientists, analysts, and business stakeholders—to understand data requirements and deliver high-quality, analytics-ready datasets.
• Implement and optimize data lakehouse architectures, ensuring efficient data flow across Bronze, Silver, and Gold layers in Snowflake.
• Support the deployment of machine learning models by enabling feature pipelines, model input/output data flows, and integration with platforms like SageMaker or Foundry.
• Apply software engineering best practices such as version control, CI/CD, unit testing, and infrastructure-as-code to ensure reliable and maintainable data workflows.
• Monitor and troubleshoot data pipelines, ensuring data quality, lineage, and governance standards are met across all environments.
• Mentor junior engineers and contribute to architectural decisions, code reviews, and knowledge sharing within the team.
• Communicate technical concepts and project updates clearly to both technical and non-technical audiences.
Qualifications
Minimum:
- 5 years of hands-on experience in data engineering, including:
  - Designing and building scalable ETL/ELT pipelines for structured, semi-structured, and unstructured data.
  - Working with cloud data platforms such as Snowflake, Databricks, or Amazon Redshift.
  - Using data integration tools such as Informatica or AWS Glue.
  - Developing in distributed data processing frameworks such as Apache Spark.
- Proficiency in SQL and at least one programming language such as Python, Scala, or Java.
- Ability to collaborate in cross-functional teams and communicate technical concepts to non-technical stakeholders.
- Strong understanding of software engineering best practices, including:
  - Unit testing and test-driven development (TDD)
  - Continuous Integration/Continuous Deployment (CI/CD)
  - Source control using tools such as Git and platforms such as GitHub
- Experience with cloud services (preferably AWS), including S3, Lambda, Step Functions, Fargate, DynamoDB, SNS/SQS, and CloudWatch.
- Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related technical field; or equivalent practical experience.
Desired:
- Experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation).
- Experience deploying and operationalizing machine learning models in production environments.
- Familiarity with business intelligence tools such as Power BI, Tableau, or Foundry for data visualization and reporting.
- Experience working with geospatial data and tools such as ArcGIS.
- Experience with data cataloging, metadata management, and data lineage tools such as Informatica and Collibra.