Description
Senior Data Engineer - Spark/Hadoop consultant
Our client is looking for an experienced, self-driven Senior Data Engineer to design, implement and optimize the data pipelines on their cloud-based Big Data platform.
Main responsibilities:
- Design, build, optimize and maintain data pipelines, from ingestion through processing to generating the metrics and data required by the business operations team, product managers, and data scientists.
- Crystallize BI requirements with the business operations team and product managers. Identify the required data sources in collaboration with our development team.
- Model our data warehouse based on the source systems and business requirements, and develop business reports.
- Optimize our Hadoop/Spark-based processing framework and batch jobs for increased processing throughput, reliability, and accuracy.
Requirements:
- Strong experience in ETL, including simplifying and optimizing complex SQL.
- Hands-on experience with Hadoop systems, in particular Spark and Hive.
- Experience in system and performance optimization.
- Experience building analytics reports/dashboards for business stakeholders.
- Strong in SQL and familiar with the Java and Scala programming languages.
- Hands-on experience with Linux/Unix environments and the Java VM.
- Master's or bachelor's degree in computer science, information systems, or equivalent.
Required skills/personal characteristics:
- Teamwork skills as well as the ability to work independently. Results-oriented and solution-driven.
- Fluent English for daily work, good communication and documentation skills.
- Ability to adapt to a multicultural and international environment.
- Proven innovative skills and out-of-the-box thinking: ability to go beyond the state of the art.
Experience in any of the following fields is an advantage:
- Experience with any BI reporting tool.
- Experience in building real-time data pipelines is a big plus.
- Knowledge of data administration frameworks and methodologies (such as metadata, data quality, etc.).
- Knowledge or experience of large-scale batch data processing, streaming, and machine learning.