Data Architect - Remote - Cloud/PySpark/Java or Scala

GB ‐ Remote

Description

A Data Architect with cloud (ideally GCP) and PySpark experience is required for a 6-month contract with a leading financial services organisation based in London. You will architect, design, estimate, develop and deploy cutting-edge software products and services that leverage large-scale data ingestion, processing, storage and querying, with in-stream and batch analytics across cloud and on-premise environments.

THIS ROLE IS FULLY REMOTE AND INSIDE IR35

Experience:

  • Extensive experience with data-related technologies, including knowledge of Big Data architecture patterns and cloud services (AWS/Azure/GCP)
  • GCP experience is desirable (BigQuery, Pub/Sub, Spanner)
  • Experience delivering end-to-end Big Data solutions on-premise and/or in the cloud
  • Knowledge of the pros and cons of various database technologies, such as relational, NoSQL, MPP and columnar databases
  • Expertise in the Hadoop ecosystem with one or more distributions, such as Cloudera and cloud-specific distributions
  • Proficiency in Java and Scala programming languages
  • Python experience
  • Expertise in one or more NoSQL databases (MongoDB, Cassandra, HBase, DynamoDB, Bigtable, etc.)
  • Experience with one or more big data ingestion tools (Sqoop, Flume, NiFi, etc.) and distributed messaging and ingestion frameworks (Kafka, Pulsar, Pub/Sub, etc.)
  • Expertise with at least one distributed data processing framework, e.g. Spark (Core, Streaming, SQL), Storm or Flink (a minimal PySpark sketch follows this list)
  • Knowledge of flexible, scalable data models addressing a wide variety of consumption patterns, including random and sequential access, together with the necessary optimisations such as bucketing, aggregation and sharding
  • Knowledge of performance tuning, optimisation and scaling of solutions from a storage/processing standpoint
  • Experience building DevOps pipelines for data solutions, including automated testing
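
For illustration only (not part of the original posting): a minimal PySpark sketch of the kind of batch and streaming processing described above. All paths, the Kafka broker address, the topic name and the column names are hypothetical, and the streaming read assumes the spark-sql-kafka connector is on the classpath.

    # Hypothetical sketch: batch aggregation plus a structured-streaming read from Kafka.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("ingestion-sketch").getOrCreate()

    # Batch: read a (hypothetical) Parquet dataset, aggregate, write the result back out.
    trades = spark.read.parquet("s3://example-bucket/trades/")
    daily_totals = (
        trades.groupBy("trade_date", "instrument_id")
              .agg(F.sum("notional").alias("total_notional"))
    )
    daily_totals.write.mode("overwrite").parquet("s3://example-bucket/daily_totals/")

    # Streaming: consume a hypothetical Kafka topic and append raw payloads to a Parquet sink.
    events = (
        spark.readStream.format("kafka")
             .option("kafka.bootstrap.servers", "broker:9092")
             .option("subscribe", "trade-events")
             .load()
    )
    (
        events.selectExpr("CAST(value AS STRING) AS payload")
              .writeStream.format("parquet")
              .option("path", "s3://example-bucket/raw_events/")
              .option("checkpointLocation", "s3://example-bucket/checkpoints/raw_events/")
              .start()
    )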

Desirable:

  • Knowledge of containerisation, orchestration and Kubernetes
  • An understanding of how to set up Big Data cluster security (authorisation/authentication, security for data at rest and data in transit)
  • A basic understanding of how to set up and manage monitoring and alerting for Big Data clusters
  • Experience with orchestration tools such as Oozie, Airflow, Control-M or similar (an illustrative Airflow sketch follows this list)
  • Experience with MPP-style query engines like Impala, Presto, Athena, etc.
  • Knowledge of multi-dimensional modelling, such as star schema, snowflake schema, and normalised and de-normalised models
  • Exposure to data governance, catalog, lineage and associated tools would be an added advantage
  • A certification in one or more cloud platforms or big data technologies
  • Active participation in the data engineering thought community (e.g. blogs, keynote sessions, POVs/POCs, hackathons)
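
For illustration only (not part of the original posting): a minimal Airflow DAG sketch of the style of orchestration mentioned above. The DAG id, schedule and task callables are hypothetical placeholders.

    # Hypothetical sketch of a daily orchestration DAG (Airflow 2.x style).
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # placeholder: land raw data from a source system
        pass

    def transform():
        # placeholder: trigger a Spark job or SQL transformation
        pass

    with DAG(
        dag_id="daily_ingestion",          # hypothetical DAG id
        start_date=datetime(2021, 4, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        extract_task >> transform_task     # extract runs before transform
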
Start date: ASAP
Duration: 6 months
From: Strike IT Services
Published at: 11.04.2021
Project ID: 2087811
Contract type: Freelance