
Davide Imperati

available

Last update: 03.11.2023

Data Engineering Consultant - Cloud

Company: Betamile Ltd
Graduation: PhD
Languages: German (Full Professional) | English (Full Professional) | Italian (Native or Bilingual)

Keywords

Google Cloud Application Programming Interfaces (APIs) Artificial Intelligence Test Automation Cloud Computing Continuous Integration Kubernetes Extreme Programming Apache Kafka Microservices

Skills

Agile, Airflow, DynamoDB, S3, AWS, Apache Beam, Hadoop, Kafka, Kafka Streams, NiFi, API Gateway, API, APIs, architectural patterns, AI, artificial intelligence, neural networks, Test Automation, Automated Testing, big data, BigQuery, BigTable, Bitbucket, cloud services, Cloud, Cloud Storage, CloudWatch, Computational Statistics, cyber-security, continuous delivery, CI/CD, continuous integration, Core Data, Dask, Logging, data representation, data retention, DataFlow, databases, Datadog, DevOps, disaster recovery, django, Docker, Elastic Beanstalk, Elastic Search, XP, extreme programming, FluentD, git, Github, Google Cloud, Google Cloud platform, gcp, Graph Database, Computer Science, Jira, Java, Jenkins, Jupyter Notebook, Kibana, Kubernetes, k8s, Load Testing, logical data models, Logstash, low latency, Matlab, Machine Learning, meta-data management, microservice architecture, microservices, Azure Cloud, NLTK, Neo4j, New Relic, nginx, Numpy, object oriented programming, OOP, Pandas, physical data model, Pyspark, Python, R, reference data, reinforcement learning, RDF, Route53, SparQL, Scala, Scipy, scikit, scrum, Semantic web, versioning systems, SOLID principles, Spring, svn, support vector machines, high availability, technical debt, TensorFlow, Terraform, TDD, Trello, OWL, Wiki

Project history

03/2021 - 05/2022
Product Owner/Product Manager
RTL+ (Media and Publishers, >10.000 employees)

Delivered smart search capabilities on the new RTL+ cross-purpose platform featuring news, sport, movies, video on demand, music, podcasts, audiobooks, and magazines.
Details of the project are still confidential; for information in the public domain, refer to the link below.
Ref: “How Germany’s RTL+ Aims to Compete With Netflix” (https://www.hollywoodreporter.com/business/digital/rtl-plus-netflix-streaming-1235042977/)

04/2020 - 10/2020
Consultant - Core Data Engineering Lead - Neuron Program
Vodafone (Telecommunications, >10.000 employees)

Delivered the core of the migration of Vodafone's Big Data platform to Google Cloud (team of 15, fully remote).
The platform serves all European markets and handles several terabytes of data per day, with a retention of roughly 2-3 petabytes of rolling data.
Rebuilt the capabilities of the Core Data Engineering squad for the migration of the big data platform to Google Cloud after the impact of the IR35 reform. Delivered the migration under tight time and budget constraints with only minor delay, despite the serious constraints posed by Covid-19.
Initial challenges: The team was impacted by the IR35-related change of policies; the project suffered loss of knowledge, delays, high technical debt, and missing documentation.
Benefits: The team was reinforced, technical debt was assessed and its impact mitigated, and a reduction in scope was agreed with the stakeholders to fit timelines and budget. The project was delivered with only minor delay despite serious technical, budgetary, and environmental constraints.
Ref: "Vodafone calls for transformative insights, Google Cloud answers"
(https://cloud.google.com/blog/topics/customers/vodafone-calls-for-digital-transformation-with-the-help-of-google-cloud)
Technologies: Stakeholder engagement, Java (EE), Scala, Python, Pyspark, Github, Jenkins, Jira,
CI/CD, TDD/BDD, DevOps, Test Automation, Load/Stress Test, Cost optimization, Google Cloud platform,
multiple services including DataFlow (Apache Beam), Composer (Airflow), DataProc, Cloud Storage,
BigQuery, BigTable, Spanner, internal microservice architecture based on Kubernetes, Docker,
Terraform.

07/2019 - 02/2020
Consultant - Quant Research
Lloyds Banking Group (Banks and financial services, >10.000 employees)

Revamped the automated trade surveillance platform to meet the criteria set by the auditor (team of 6, co-located).
Mediated between stakeholders to reach agreement on a standardized approach across different asset classes.
Mediated between stakeholders and developers to ensure delivery met requirements.
Defined templates for efficient and standardized implementation of all analytics.
Implemented a set of critical high-end analytics using NLP, ML, and advanced quant methods.
Initial challenges: A review from the regulator was pending. The project suffered from a disconnect between stakeholders, compliance requirements, and developers. The platform was legacy. The development team suffered a high attrition rate and the resulting loss of knowledge. Documentation was partial.
Benefits: Passed the audit (significant cost reduction). Provided meaningful alerts (67% spam reduction for downstream teams). The platform was consolidated and made extensible.
Asset Classes: FX spot/options, rates futures/bonds/swaps, repo, bespoke OTC.
Technologies: Stakeholder engagement, Java (EE), Python, Pandas, NLTK, Scipy, Numpy, Pyspark, Dask, Bitbucket, Jenkins, Jira, CI/CD, TDD, DevOps, Risk Scenarios, Automated Testing, Load Testing.

04/2019 - 06/2019
Interim Director of Product
EMY Design (Architecture and civil engineering, < 10 employees)

Managed the start-up of the company from ground zero to the first viable product, with particular focus on e-commerce exposure and click-through rate optimization.

01/2019 - 04/2019
Consultant - Data Scientist
News Uk - The Times (Media and Publishers, 5000-10.000 employees)

Delivered "Project James", a reinforcement-learning AI for direct marketing optimization.
News UK won a Google-sponsored innovation grant aimed at delivering an advanced solution to real marketing problems. Attrition of the initial investigator created the conditions for reassigning the task. The intervention required assessing the partially implemented project, baselining the approach, rebuilding the reinforcement learning core using state-of-the-art tools, and tuning and delivering a production-viable tool within the scheduled time frame.
Challenges: Time pressure for delivery. A partially implemented platform with partial documentation. A full research project with no previous case study to leverage for comparison.
Benefits: "JAMES has revolutionised churn further, and advisors informed by readers interests
underpin an award winning contact centre"
(https://www.inma.org/practice-detail.cfm?zyear=2019&id=6FDAA177-1BE0-4886-B527-985CEF5D16A2).
Technologies: Python, pandas, scipy, numpy, TensorFlow, github, jenkins, jira, GitOps, CI/CD,
DevOps, Kubernetes, Docker, Terraform, Microservice Architecture

07/2018 - 12/2018
Consultant - Data Scientist
News Uk - The Times (Media and Publishers, 5000-10.000 employees)

Delivered the propensity model and API (team of 5, co-located).
The client wanted to improve the conversion rate on the digital platform and deliver a personalized user experience. We therefore piloted an online propensity model. The model follows each user of The Times Digital in real time and predicts the best opportunity for calls to action, e.g. subscriptions, cross-sale, up-sale.
Challenges: The model had to work at high throughput (1,000+ predictions/sec) and low latency (<250 ms maximum response time).
Benefits: It increased subscriptions and cross-sales by 5% and 9%, respectively. Piloted the deployment of high-throughput APIs in News UK's brand-new k8s cluster.
"Best Ever Growth for The Times & The Sunday Times Thanks to Usable Data Science"
(https://www.inma.org/practice-detail.cfm?zyear=2019&id=6FDAA177-1BE0-4886-B527-985CEF5D16A2)
Technologies: Stakeholder management, python, pandas, nltk, scipy, numpy, API, django, nginx,
docker, Kubernetes (k8s), Terraform, Microservice Architecture, TensorFlow, github, jenkins, jira,
CI/CD, DevOps, New Relic.

03/2017 - 08/2018
Vicepresident
JP Morgan Chase (Banks and financial services, >10.000 employees)

Managed the delivery of the Cloud Logging and Monitoring Platform (team of 20 across 3 sites).
In the framework of public cloud adoption, JPMC needed a standardized, large-scale logging and monitoring system to meet cyber-security requirements for all applications in the public cloud.
Davide joined the team after the PoC of the platform. He reviewed the architecture and implementation, then scaled the platform to handle 5 TB of data a day (approximately 5 billion messages, with a peak of 1.3 billion during the first hour of trading).
Challenges: A very new project under hard constraints in terms of data protection, and thus limited availability of approved cloud services. Very challenging requirements in terms of SLO/SLA, high availability, disaster recovery, and sustained recovery.
Benefits: The platform allowed monitoring of an initial set of 5 mission-critical applications in the public cloud (AWS). It pioneered new technologies, produced a number of architectural patterns new to JPMC, and demonstrated its ability to scale up to a higher number of monitored applications at the push of a button.
Technologies: Leadership, AWS (API Gateway, Route53, S3, DynamoDB, Kinesis, Elastic Beanstalk, Lambda, ELB, IAM, CloudWatch, CloudTrail, etc.), Boto, Terraform, FluentD, Kafka, Kafka Streams (replaced by Kinesis after SOC3), Kinesis Firehose, NiFi, Elastic Search, Logstash, Kibana, Java (EE), Python, Bitbucket, Jenkins, Jira, CI/CD, TDD, BDD, DevOps, Hera (JPMC Terraform-based API), Automated Testing, Load Testing, microservice architecture, Docker, Kubernetes (k8s), Datadog. L1 and L3 support during rollout and production, respectively.

03/2016 - 02/2017
Vicepresident
JP Morgan Chase (Banks and financial services, >10.000 employees)

Set the basis for standardized regulatory reporting across all businesses (regulatory driven; team of 4).
Due to regulatory change, the company was required to produce reporting aggregated across all lines of business (LoB). This required the standardization of thousands of terms used for reporting ('loan' has a different meaning in retail than in derivatives). We created controlled vocabularies, and devised and automated the procedures for metadata management. Served the dictionaries and the reference data through a constellation of REST API-based microservices. Promoted numerous educational interventions across the organization.
Challenges: High exposure to the regulators. A very large number of unlisted terms needed attention. A serious need to mediate between high-ranking stakeholders (senior executives and managing directors).
Benefits: We mitigated regulatory risk and provided tools to gain insight into corporate dynamics.
Asset Classes: FX spot/options, rates futures/bonds/swaps, derivatives, OTC.
Technologies: Java (EE), Spring, Python, RDF, OWL, SparQL, Semantic Web standards, Ontologies, Semantic Wiki, Knowledge graphs, Graph Database, Neo4j, BigQuery (Blazegraph), ISO 20022, Bitbucket, Jenkins, Jira, CI/CD, TDD, BDD, DevOps, Docker, Microservices.

11/2014 - 02/2016
Vicepresident
JP Morgan Chase (Banks and financial services, >10.000 employees)

Developed the meta-analytics of the Corporate and Investment Bank (CIB) division.
As part of the digital transformation initiative, JPMC aimed at labelling and scoring all data repositories and all software products owned by the line of business. We defined the data quality metrics and formal ontologies for data representation of logical data models (LDM), scanned the metadata of all databases to infer the physical data models (PDM), and linked them through heuristics. The results were manually refined by Information Architects.
Challenges: Very broad collections of heterogeneous data. Data quality was not always high. Some data stewards were only partially cooperative with the process.
Benefits: The semi-automated approach increased the productivity of the Information Architects by a factor of 4.7x.
Technologies: Java, Spring, Python, RDF, OWL, Semantic Web standards, Ontologies, Knowledge graphs, Graph Database, BigQuery, ISO 11179, Bitbucket, Jenkins, Jira, CI/CD, TDD, DevOps.

Local Availability

Open to travel worldwide