Reports To:
Assistant Manager, Solution Architect
Responsible For:
- Building, maintaining, and optimizing scalable data pipelines.
- Ensuring data accuracy and availability across systems.
- Implementing data governance and security best practices.
- Supporting data-driven decision making across the business.
Overall Purpose of Job:
The Data Engineer will be responsible for designing, building, and optimizing data infrastructure, enabling seamless access to high-quality data across the organization. This role supports advanced data-driven initiatives across all subsidiaries, including the upstream, trading, and clean energy arms, by ensuring reliable and secure data pipelines. The Data Engineer will collaborate with data scientists, analysts, and business stakeholders to enable insights and business intelligence.
Key Responsibilities:
Data Infrastructure & Pipeline Development:
- Design, build, and maintain efficient and scalable data pipelines.
- Automate data ingestion processes from multiple sources, ensuring timely delivery.
- Collaborate with data scientists and analysts to ensure data pipelines are robust and support their needs for modeling and analysis.
- Continuously optimize existing data pipelines to improve performance, scalability, and reliability.
- Ensure smooth operation of ETL processes and resolve any data-related issues promptly.
Data Storage and Architecture:
- Implement best practices for data storage using modern data warehouse solutions (e.g., Snowflake, Redshift, BigQuery).
- Design and manage both SQL and NoSQL databases for different business applications.
- Maintain and evolve data architecture to support business expansion needs.
Data Management and Governance:
- Ensure data integrity, accuracy, and security through rigorous data management practices.
- Implement data governance frameworks, ensuring compliance with internal policies and external regulations.
- Establish data monitoring and auditing procedures to track data usage and identify potential risks or inconsistencies.
- Ensure data privacy and security protocols are maintained, adhering to NDPA and other relevant data protection regulations.
Collaboration:
- Work closely with data scientists, analysts, Business Information Coordinators, and other stakeholders to gather requirements and implement effective data solutions.
- Collaborate with IT and business units to align data strategies with organizational goals.
- Collaborate with solution architects to optimize the cloud data infrastructure.
Key Performance Indicators (KPIs):
- Data Pipeline Uptime: Ensure 99.5% uptime for critical data pipelines.
- Data Governance Compliance: Achieve 100% compliance with governance and security policies, including audit trails, access controls, and regulatory adherence.
- Data Processing Efficiency: Improve pipeline processing speed by an additional 20% through continuous optimization, automation, and system upgrades.
- Data Storage Optimization: Reduce storage costs by 15% through effective data management.
- Data Quality Metrics: Achieve 95% data accuracy across all systems.
Person Specification:
- Bachelor’s degree in Computer Science, Information Systems, or a related field.
- 3-5 years of experience as a Data Engineer or in a similar role.
- Proficiency in SQL, Python, and cloud data platforms (AWS, Azure, and Google Cloud).
- Strong understanding of relational and NoSQL databases (e.g., PostgreSQL, MongoDB).
- Familiarity with data modeling, data warehousing, and Big Data processing (e.g., Hadoop, Spark).
- Knowledge of containerization (e.g., Docker, Kubernetes) is a plus.
- Proficient in Microsoft Office Suite (including Word and Excel).
- Professional qualifications such as ITIL (Information Technology Infrastructure Library).
Required Competencies:
- Problem-solving and Critical Thinking: Ability to analyze data, identify trends, and recommend actionable solutions.
- Ability to communicate technical concepts effectively to both technical and non-technical stakeholders.
- Experience in the energy sector, or working with drone or IoT datasets, is a plus.
- Familiarity with distributed systems and cloud-based data platforms.
- Strong collaboration and team-building skills to work cross-functionally.
- Experience working in Agile or DevOps environments.