Description
We are looking for an experienced, self-driven Senior Data Engineer to design, implement, and optimize the data pipeline on our cloud-based Big Data platform. Initially, this will be an outsourced, external position.
Main responsibilities:
- Design, build, optimize, and maintain data pipelines, from ingestion through processing to generating the metrics and data required by the business operations team, product managers, and data scientists.
- Clarify BI requirements with the business operations team and product managers. Identify the required data sources in collaboration with our development team.
- Model our data warehouse based on the source systems and business requirements, and develop business reports.
- Optimize our Hadoop/Spark-based processing framework and batch jobs for increased processing throughput, reliability, and accuracy.
Desired qualifications and skills:
- Strong experience in ETL, including simplifying and optimizing complex SQL.
- Hands-on experience with Hadoop systems, in particular Spark and Hive.
- Experience in system and performance optimization.
- Experience building analytics reports and dashboards for business stakeholders.
- Strong SQL skills and familiarity with the Java and Scala programming languages.
- Hands-on experience with Linux/Unix environments and the Java VM.
- Master's or bachelor's degree in computer science, information systems, or equivalent.
What we consider as an advantage:
- Experience with any BI reporting tool.
- Experience in building real-time data pipelines is a big plus.
- Knowledge of frameworks and methodologies related to data administration (such as metadata and data quality).
- Knowledge or experience of large-scale batch data processing, streaming, and machine learning.
This is an English-speaking role.