BIG Data Solutions Architect, Project offers

BIG Data Solutions Architect

Job type: on-site
Start: 06.2017
Duration: 6 months
From: Outvise, SL
Place: Jakarta
Date: 05/19/2017
Country: Indonesia


Project description:

We are looking for a Big Data Engineer who will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them. You will also be responsible for integrating them with the architecture used across our clients.

In this role, you will lead a team of data engineers and data scientists to:

Assess the existing big data ecosystem: architecture, data sourcing, and the systems integration required to monetise the data
Understand the specific requirements for evolving the existing big data ecosystem, and perform gap analysis and roadmap definition
Implement specific big data use cases by sourcing internal and external data, building data layers, developing analytics algorithms, models, and visualisations, and overseeing the implementation of third-party tools to monetise data
Specifically:

Select and integrate any Big Data tools and frameworks required to provide the requested capabilities
Gather and process raw data at scale (including writing scripts, web scraping, calling APIs, writing SQL queries, etc.)
Process unstructured data into a form suitable for analysis and then execute analysis
Support business decisions with ad hoc analysis as needed
Work closely with client engineering team to integrate your innovations and algorithms into production systems
Professional background

Experience processing large amounts of structured and unstructured data. MapReduce experience is a plus
Proficient understanding of distributed computing principles
Experience managing a Hadoop cluster, with all included services
Proficiency with Hadoop v2, MapReduce, HDFS
Experience building stream-processing systems, using solutions such as Storm or Spark Streaming
Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
Experience with Spark
Experience with integration of data from multiple data sources
Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
Knowledge of various ETL techniques and frameworks, such as Flume
Experience with various messaging systems, such as Kafka or RabbitMQ
Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O
Good understanding of Lambda Architecture, along with its advantages and drawbacks
Experience with any of Cloudera, MapR, or Hortonworks