Big Data Analytics, Data Science, Machine Learning Engineering, Spark, Time Series

Graduation: Ph.D. Mathematics / M.Sc. Mathematics (TU Berlin)

Languages: Chinese (Full Professional) | German (Native or Bilingual) | English (Full Professional) | Spanish (Full Professional)

Keywords

Big Data Analytics, Time Series Analysis, Business Intelligence, SQL / NoSQL, TensorFlow, Machine Learning (Python / Pandas / Scikit), Data Science, Spark SQL and Analytics, Python & Java, AWS & Google Cloud

Skills

Big Data Analytics & Business Intelligence (Spark, SQL, NoSQL, Excel, various visualisation tools, Jupyter Notebooks)
Data Science, Time Series, Machine Learning Engineering (Python, Java, Pandas, Scikit Learn, TensorFlow, PyTorch)
Complete planning and implementation of Data Science projects (problem formulation, goal setting, technical communication, implementation, evaluation, monitoring and maintenance)
Implementation of complex ETL jobs and data transformations (batch and online processing), real-time datastream processing
Other skills (incomplete list): C++, JavaScript, Flutter, Git, Docker, Apache Kafka, Apache Hive, Hadoop MapReduce, Bash, Python Flask, REST API, Scrum, Amazon Web Services, Google Cloud, Redis, MySQL, PostgreSQL, MongoDB, Apache Cassandra

Project history

01/2015 - 06/2016

Data Scientist, Data Engineering

Adrule GmbH (Marketing, PR and Design, < 10 employees)

Conceptual design and implementation of a data warehouse infrastructure in the Google Cloud. Creation of ETL jobs to make unstructured data easy to analyse. Analysis and visualization of historical business data.

Technologies used: Python Machine Learning Stack (Pandas, Scikit Learn), Google Cloud, Google BigQuery, MySQL, Apache Spark.

03/2013 - 10/2014

Machine Learning Engineer

MBR Targeting (Internet and Information Technology, 10-50 employees)

Design, coordination and implementation of a real-time bidding system for online marketing to predict click and conversion probabilities. For this purpose, various specialised real-time machine learning algorithms with different objectives were developed and put into live operation.

Technologies used: Apache Hadoop, Apache Hive, Apache Spark, Redis, PostgreSQL, Python Machine Learning Stack consisting of Pandas and Scikit-Learn, plus C++ (to accelerate individual parts of the Python programs).

05/2012 - 08/2014

Data Scientist

Hitfox GmbH (Internet and Information Technology, 50-250 employees)

Planning and implementation of a business intelligence reporting pipeline, including forecasts for the following days of the week.

Technologies used: Python Machine Learning Stack (Pandas, Scikit Learn), AWS Redshift, AWS, MySQL.

Local Availability

Open to travel worldwide

Location: 100% remote (on-site work possible only occasionally, under certain circumstances)

Other

Services offered
Planning, coordination and implementation of Big Data Analytics projects:
Communication, goal formulation, data acquisition and preparation (ETL), assurance of solid data quality, reliable operation of the analysis processes, and visualisation and reporting for a confident and fast assessment of analysis results. Vendor-independent advice on selecting the software and tools with the best cost-benefit ratio for the client.
Data Science / Machine Learning Engineering
Complete implementation of the so-called data science process, which consists of the following steps:

Problem identification (requires the involvement of the responsible stakeholders)
Unambiguous goal formulation and planning (requires the involvement of the responsible stakeholders)
Data acquisition, preparation, transformation and cleansing
Modelling and/or identification of a suitable prediction algorithm, objective performance evaluation
Implementation in live operation / deployment
Monitoring and maintenance

Machine Learning Engineering, i.e. development of customised prediction algorithms based on unstructured data (e.g. text, images, time series).