David Hillmann

Lauterstein

available

Last update: 11.04.2024

Data Scientist

Company: DLT-Innovation GmbH

Graduation: Economics

Languages: German (Native or Bilingual) | English (Full Professional) | French (Elementary) | Czech (Native or Bilingual)

Keywords

Data Science | Consultant | Machine learning | Data Scientist | Python | R Shiny | R | Coding | Predictive Analysis | Data Classification

Attachments

Coursera-LLM-Certificate_261123.pdf

David-Hillmann-Project-History-2020-03_100424.pdf

DavidHillmannResume-220124_100424.pdf

davidhillmann-Azure-DS-Certificate_110424.pdf

Skills

Many years of extensive experience as a Data Science Consultant

External consultant in global operating enterprises
Method consulting
Programming of prototypes
Programming of Data Science / Machine Learning Pipelines
Projects in international, English speaking teams

Many years of broad experience with machine learning tools and their application in R and Python

Supervised Learning: Deep theoretical knowledge of classification and regression methods and long experience in practical use
- Decision Trees, Random Forests, Gradient Boosted Trees, Support Vector Machines, Deep Learning (ANN, CNN), logistic and linear regression, Ridge Regression, Lasso, Linear Discriminant Analysis, kNN
Unsupervised Learning: Extensive experience in applying clustering and pattern recognition to unstructured data
- DBSCAN, OPTICS, k-means, hierarchical clustering
Rule-based Learning: Considerable experience with algorithms that learn patterns and derive rules from unstructured data
- Apriori, Eclat, FP-growth
Dimensionality reduction methods: Experienced in applying
- Principal Component Analysis, Linear Discriminant Analysis, Non-negative Matrix Factorization

Good knowledge of Deep Learning concepts and their application with TensorFlow and Keras

Artificial Neural Networks
ConvNets
RNNs, LSTMs, GRUs

Deep knowledge in statistical methods and modeling

Econometric models
Time series models (ARMA, ETS)

Many years of experience in developing data science pipelines, for proof of concept and implementation in production systems

Data export and merging from different sources (SAP HANA, MSSQL)
Data preparation, cleaning and pre-processing for final analysis
Data post-processing of final results and insights (automated reporting, visualization, interactive applications like Shiny or Tableau)

Project history

06/2018 – 11/2018 Inventory Health Forecasting (DAX Enterprise, Industrial sector), DLT
Goals & Questions:

Which materials, products or product groups have an increased risk of becoming 'unhealthy' (remaining in inventory too long)?
What are the main drivers of the risk of becoming unhealthy?
Is it possible to forecast inventory KPIs of individual products?
Can we achieve a satisfactory forecast accuracy?

Role:

Self-reliant work on proof of concept, coding and consulting on choosing the most powerful and suitable models
Collaboration with internal IT colleagues
Knowledge transfer on methods and workflow

Methods/Approach:

Testing different models and selecting the best performing one with cross-validation: Decision Trees, Gradient Boosted Trees, Support Vector Machines and Neural Networks
Implementing an automated method for hyperparameter optimization with Bayesian optimization and grid search techniques
Developing a memory-efficient machine learning pipeline, including automated optimization techniques, so that it can be run by non-technical users
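
The model selection step described above can be illustrated with a minimal sketch. It is not the project code: it assumes a toy two-class data set, uses a k-nearest-neighbour classifier as a stand-in for the tree and neural network models named above, and covers only the grid search part (no Bayesian optimization). All function names and the labels "healthy" and "at_risk" are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(data, k, n_folds=5):
    """Mean accuracy of kNN with the given k, estimated by n-fold cross-validation."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    correct = 0
    for i, held_out in enumerate(folds):
        # Train on all folds except the held-out one.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        correct += sum(knn_predict(train, x, k) == y for x, y in held_out)
    return correct / len(data)


def grid_search(data, grid, n_folds=5):
    """Score every candidate in the grid and return the best one with all scores."""
    scores = {k: cv_accuracy(data, k, n_folds) for k in grid}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: two Gaussian clusters standing in for product features.
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), "healthy") for _ in range(50)]
data += [((rng.gauss(3, 1), rng.gauss(3, 1)), "at_risk") for _ in range(50)]
rng.shuffle(data)

best_k, scores = grid_search(data, grid=[1, 3, 5, 7])
print(best_k, scores)
```

In practice a library routine such as the grid search utilities in sklearn or mlr would replace the hand-written loops; the sketch only shows the principle of scoring every candidate on held-out folds.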

Tools: R (packages: mlr, caret, xgboost, rpart), Python (libraries: sklearn), MSSQL, SAP HANA
Results:

Inventory risk KPIs can be forecast with excellent accuracy by tree ensemble models one and three months in advance
Individual driving factors can be extracted from the applied models for each product forecast (avoiding the black-box problem)
Delivered an automated machine learning pipeline, implemented in a production system as an early warning system for products at risk

03/2018 – 10/2018 Inventory Management (DAX Enterprise, Industrial sector), DLT
Goals & Questions:

Analysis of decisions taken by inventory management planners on products that remain in stock for too long
Are there correlations between certain decisions, like fire-selling products or allocating them to alternative uses, and product features or additional factors? Do patterns exist that could be exploited in predictive models?
Is it possible to build a predictive model that accurately predicts decisions?

Role:

Self-reliant work on proof of concept, coding and consulting on choosing the most powerful and suitable models
Collaboration with internal IT colleagues
Knowledge transfer on methods and workflow

Methods/Approach:

Testing different models and selecting the best performing one with cross-validation: Artificial Neural Networks, Random Forests, Gradient Boosted Trees, Support Vector Machines
Implementing an automated method for hyperparameter optimization with Bayesian optimization and grid search techniques
Developing a memory-efficient machine learning pipeline, including automated optimization techniques, so that it can be run by non-technical users

Tools: R (packages: mlr, caret, xgboost), Python (libraries: TensorFlow with Keras, sklearn), MSSQL, SAP HANA
Results (on-going project):

A selection of ensemble models performs well in predicting decisions
Machine learning pipeline is being implemented in a production system
Automated classification models are provided to planners to help them with recommended decisions and support their decision making

07/2016 – 11/2018 Growth Finder, Recommender System follow-up project (DAX Enterprise, Industrial sector), DLT
Goals & Questions:

Can recommendations be further improved by more advanced machine learning methods?
What performance can be achieved with different models, from simple and interpretable to very advanced and rather black-box models?
Is it possible to improve model performance by adding additional features to more advanced models?
Are better-performing models black boxes, or can explanation/interpretation methods be applied?
Generate ready-to-use R packages applicable to different business scopes

Role:

Consulting on method selection and coding embedded in a data science team
Close interaction with peers and communication with stakeholders and internal IT staff to share knowledge and enable them to use the methods
Coaching of internal IT colleagues to enable them to adopt the methods in practice and transfer knowledge to peers

Methods/Approach:

Testing of several machine learning algorithms to generate recommendations: Decision Trees, Gradient Boosted Trees, Convolutional Neural Networks
Evaluating tested models in cross-validation settings and implementing them in the existing algorithmic package
Developing an automated pipeline spanning data import from databases, pre-processing, modeling and post-processing, delivering prepared recommendations and, where possible, explanation layers
Developing memory-efficient functions for all pre- and post-processing steps of each implemented method and bundling them in an R package
Further development of an algorithmic R package, based on the S4 framework from recommenderlab, to implement ensemble methods and neural networks

Tools: R, TensorFlow with Keras (R), recommenderlab (R), xgboost (R), rpart (R), Python, SAP HANA, R Markdown
Results (project on-going):

Provided workflow for generating product recommendations enables training with large amounts of data, which improves performance of most models significantly
The developed R packages and the prepared workflow are used by internal IT department to supply recommendations to business units
Successful trials and positive feedback from large business units and operating regional departments

07/2017 – 03/2018 Process Mining (DAX Enterprise, Industrial sector), DLT
Goals & Questions:

Analysis of delivery reliability (DR) in order to discover potential reasons for delayed delivery on grouped product level
Is it possible to identify recurrent patterns which are associated with a low DR?
Evaluation of potential improvement of the DR and other KPIs
Which factors are associated with a drop in DR in certain periods?

Role:

Support an internal Data Science team to set up a data mining pipeline (on-site and remote) and enhance existing methods
Collaboration with peers and communication and presentation to business stakeholders from supply chain management

Methods/Approach:

Select appropriate machine learning methods to discover rules
Cluster rules into problems with similar context to ease analysis by business domain experts. Evaluation of different clustering algorithms and distance measures, e.g. DBSCAN, k-means, hierarchical clustering
Connect existing pipeline to databases to automate data import and export of results
Test additional features in modeling to improve derived rules: feature selection/engineering
Application of causal inference algorithms (e.g. causal trees) to extract and evaluate the importance of time-dependent factors
Enhance an interactive platform (R Shiny) to ease its application by business experts with non-technical backgrounds
Proper documentation of the whole workflow

Tools: R, arules (R), dbscan (R), SAP HANA, SQL, R Shiny, R Markdown, Jupyter
Results (on-going):

Provide an R package that offers an easy-to-use pipeline to discover rules in an automated way
Provide an enhanced version of an interactive platform to be used by non-technical users
Successful test of the platform in one business unit
Evaluation of the results in other business units (on-going)

03/2015 – 06/2016 Recommender System (DAX Enterprise, Industrial sector), STAT-UP
Goals & Questions:

Segmentation/clustering of customers according to volume of sales, potentials and purchasing behaviour
Proof of concept: Is it possible to derive patterns from purchasing behaviour from historical data and to utilize them to create product recommendations?
If possible: How good is the accuracy of recommendations and how high are the potentials in terms of sales and contribution margins?

Role:

Support as external consultant for conceptual work, method selection and coding (on-site with remote periods)
Close collaboration with peers from Data Science, Machine Learning and IT
Frequent communication of results to peers and internal management consulting colleagues in order to spread the concept in the enterprise

Methods/Approach:

Clustering methods to group customers on customer features, sales and purchasing patterns
Selection of appropriate algorithms to discover patterns of frequently purchased products (itemset mining) and to derive rules (association rule mining)
Post-processing and interpretation of results for stakeholders
Create interactive platform in Shiny to visually explore and explain purchasing patterns, rules and recommendations
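
The itemset mining and rule derivation steps above can be sketched as follows. This is an illustrative pure-Python version of the Apriori idea, not the project implementation (which used the arules package in R). The basket contents, thresholds and function names are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support} for all itemsets meeting min_support (Apriori style)."""
    n = len(transactions)
    result = {}
    candidates = {frozenset([item]) for t in transactions for item in t}
    size = 1
    while candidates and size <= max_size:
        # Count in how many transactions each candidate occurs.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        result.update(level)
        # Join frequent sets into larger candidates, then prune every candidate
        # that has an infrequent subset (downward closure of support).
        size += 1
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return result


def association_rules(itemsets, min_confidence):
    """Derive rules lhs -> rhs with confidence = support(lhs and rhs) / support(lhs)."""
    rules = []
    for itemset, support in itemsets.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = support / itemsets[lhs]
                if confidence >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), support, confidence))
    return rules


# Hypothetical purchase baskets, one set of products per customer order.
baskets = [
    {"valve", "gasket"},
    {"valve", "gasket", "pump"},
    {"valve", "pump"},
    {"gasket"},
    {"valve", "gasket"},
]

itemsets = frequent_itemsets(baskets, min_support=0.4)
rules = association_rules(itemsets, min_confidence=0.7)
for lhs, rhs, support, confidence in rules:
    print(sorted(lhs), "->", sorted(rhs), round(support, 2), round(confidence, 2))
```

A rule such as "pump -> valve" then reads as a recommendation candidate: customers who bought the left-hand side frequently also bought the right-hand side.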

Tools: R, arules + arulesViz (R), ggplot2, SAP HANA, SQL, R Shiny, R Markdown
Results:

High potential as measured by expected contribution margins if a fraction of the recommendations is realized
Test and validation of derived rules and recommendations in practice by the sales force of several business units
Application of the developed interactive platform to
- Discover interesting patterns and rules
- Rationalize and explain recommendations
- Prioritize recommendations according to expected success and contribution margin potential
- Present and spread the idea within the enterprise in a workshop

06/2014 – 12/2017 Price Outlier Analysis (DAX Enterprise, Industrial sector), STAT-UP
Goals & Questions:

Analysis of statistical outliers to determine pricing opportunities for marketing and sales force
Which factors have a significant influence on pricing?
Is it possible to determine statistical price outliers in an automated way?
How high is the potential in terms of contribution margins if you could avoid prices that the model identified to be too low?

Role:

External support for development of proof of concept, coding and consulting on choosing most powerful and suitable methods
Collaboration with internal peers and permanent communication with and presentations to business stakeholders

Methods/Approach:

Develop several linear pricing models with interaction effects on a granular level to get benchmarks and discover pricing opportunities
Interpretation and visualization of model estimates in business context
Development of an interactive platform for reporting and monitoring in marketing & sales

Tools: R, R Shiny, R Markdown, SAP HANA, SQL, ggplot2
Results:

Applied in several business units to capture customers' willingness to pay based on the pricing insights
Provision of an interactive tool for marketing & sales staff

05/2014 – 11/2014 Benchmark analysis for sales forecasts (BASF SE), STAT-UP
Goals & Questions:

Model-based estimation of sales forecast accuracies in order to set up benchmarks of methods in use (Demand Planning follow-up project)
Can we identify factors that significantly improve or deteriorate forecasting accuracies?

Role:

Self-reliant development of proof of concept, coding and consulting on choosing most suitable method
Communication and presentation of results to peers at BASF

Methods/Approach:

Literature research to find the most suitable method for our setting
Test and compare appropriate methods like PCA and linear regression
Test methodology on enterprise data

Tools: R, ggplot2
Results:

Forecast accuracies can be reliably estimated with a few extracted factors, which enables the creation of useful benchmarks
Benchmarks can be applied and compared across methods, business units and planners to identify over- and underperformers in an automated way and discover room for improvement

04/2014 Hierarchical Forecasting (BASF SE), STAT-UP
Support in a project aimed at generating and visualizing hierarchical sales forecasts.
Role: Support of data science colleagues in coding for reports and visualization
Tools: R, ggplot2, Excel
11/2013 - 03/2014 Demand Planning (BASF SE), STAT-UP
Goals & Questions:

Can sales quantities be forecasted on product level with satisfactory accuracy?
With the resulting forecasts, is it possible to reduce storage costs and improve just-in-time production goals?
With implementation of modern state-of-the-art methods, can forecasting accuracy be significantly improved compared to methods already in use?

Role:

External support for development of proof of concept, coding and consulting on choosing most powerful and suitable methods
Collaboration with internal technical colleagues and permanent communication with business stakeholders

Methods/Approach:

Estimation of time series models with historical sales data
Implementation and validation of statistical forecasts with several time series modeling approaches (ARMA, ETS, Croston) and development of an R package in order to automate sales forecasts
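
Of the approaches named above, Croston's method is the least familiar, so a minimal sketch may help. It is an illustration in Python rather than the project code (which relied on the forecast package in R), and it assumes one common initialization: the first non-zero demand and the number of periods until it occurs. The function name and the smoothing parameter default are hypothetical.

```python
def croston(demand, alpha=0.1):
    """Croston forecast of demand per period for an intermittent demand series.

    Non-zero demand sizes and the intervals between them are smoothed
    separately with simple exponential smoothing; the forecast is their ratio.
    """
    size = None        # smoothed size of non-zero demands
    interval = None    # smoothed number of periods between non-zero demands
    periods_since = 1  # periods elapsed since the last non-zero demand
    for d in demand:
        if d > 0:
            if size is None:
                # Initialize with the first observed demand and its waiting time.
                size, interval = float(d), float(periods_since)
            else:
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0  # no demand observed at all
    return size / interval


# Hypothetical sales history: 6 units sold every third month.
history = [0, 0, 6, 0, 0, 6, 0, 0, 6]
print(croston(history))
```

For the regular series in the example the forecast equals the average demand per period; with irregular gaps the smoothing parameter controls how quickly the estimate adapts.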

Tools: R, forecast (R package), Excel
Results:

Provide a ready-to-use R package for automated sales forecasting and visualizations
Provide a methodology for forecast accuracy benchmarking that enables comparisons of methods over time and across business units and reveals room for improvement

2013    Demographic study on data from Switzerland (CIELO-MATH AG), STAT-UP
Consulting on methods and analysis of socio-demographic factors.
Methods/Approach: Hypothesis testing, econometric models, causal inference with instrumental variables, visualization of aggregated data
Tools: R, ggplot2
2012    Portfolio evaluation with Probabilistic Utility Models, STAT-UP
Collaboration on a research project with Prof. Walter Krämer, Technische Universität Dortmund
2012 - 2014    Clinical study of cancer patients (Klinik, Düsseldorf), STAT-UP
Support of a statistical evaluation of different surgical methods, conducting statistical tests and assessment of early diagnosis and treatment effects.
Methods/Approach: survival analysis, statistical hypothesis tests, use of parametric and non-parametric methods, logistic regression, ROC, AUC
Tools: SPSS, R, Excel
2012 - 2013    Study on behalf of a German HR association: Bundesverband der Personalmanager, STAT-UP
Provide support for an analysis of potential determinants of job satisfaction of personnel managers.
Methods/Approach: econometric modeling, causal inference
Tools: R, SPSS
2012 – 2013   Study of graduate students (Landkreise Altötting und Mühldorf a. Inn), STAT-UP
Create concept for questionnaire, finalize a ready to implement questionnaire, implement in an online tool. Support colleagues with data analysis and interpretation and in writing a final report for a public local institution.
Methods/Approach: analysis of correlations, statistical hypothesis testing and regression analysis
Tools: R
2012    Small aircraft market research (Diamond Aircraft Industries GmbH), STAT-UP
Support in a project for generating market and sales forecasts for products of Diamond Aircraft Industries and of its relevant competitors on the basis of internal sales data and sector databases. Provide interpretation and derive recommended actions for the sales force.
Methods/Approach: time series forecasting
Tools: R, Excel
2012    Study for modeling sales in medical technology, STAT-UP
Provide support for a study analyzing the effect of economic factors and cyclical leading indicators on demand for certain medical implants, using time series models that allow for epidemiological and demographic factors.
Methods/Approach: econometric modeling, time series models
Tools: R

Local Availability

Open to travel worldwide

available from 22. April 2019, 100%
