01/01/2024 updated

**** ******** ****
100 % available

Big Data Solutions Architect & Machine Learning Engineer

London, United Kingdom
Worldwide
M. Sc. Computer Science, RWTH Aachen University

Java (Programming Language), JavaScript, Amazon Web Services, Amazon S3, Android (Software), Apache HTTP Server, Google App Engine, Atlassian Confluence, Atlassian Jira, HTML5, Big Data, Cluster Analysis, Databases, CouchDB, D3.js, Django, ECMAScript (C Programming Language Family), Elasticsearch, Google Analytics, Apache Hadoop, MapReduce, Apache HBase, Information Retrieval, jQuery, Python (Programming Language), PostgreSQL, Linux Servers, Machine Learning, Memcached, MySQL, Natural Language Processing, Node.js, NumPy, Redis, Cloudera, Solution Architecture, Subversion, Salesforce Tableau, Jetty, Front End (Software Engineering), Feature Engineering, Data Science, React.js, Express.js, Apache Spark, Internet of Things (IoT), Backend, Git, pandas, Salesforce Heroku, scikit-learn, Slack, Cassandra, AWS Glue, Apache Kafka, Apache NiFi, Elastic Kibana, Docker
Backend: Python, Django, Java, Jersey, Jetty, JavaScript ES6, Node.js, Express, Android, Cloudera/Hortonworks Stack, Apache NiFi, Apache Kafka, Apache Spark
Databases: MySQL, PostgreSQL, Apache HBase, Apache Hadoop, MongoDB, CouchDB, Elasticsearch, ELK Stack, Cassandra, Memcached, Redis
Frontend: HTML5, React, JavaScript ES6, jQuery, D3.js, Android, Kibana, Tableau, Amazon QuickSight
Deployment: Docker, AWS ecosystem, EC2, S3, Google App Engine, Heroku, Apache HTTP Server, Linux servers
Data Science: NumPy, scikit-learn, SciPy, pandas, machine learning, feature engineering, natural language processing, clustering
Miscellaneous: Git, SVN, Jira, Confluence, Slack, Google Analytics, IoT

Languages

German: Native speaker
English: Fluent

Project history

Big Data Solution Architect & Information Retrieval Specialist

DAX 30 corporation (Ludwigshafen am Rhein)

Automotive & Vehicle Manufacturing

>10,000 employees

Conception, architecture and development of a scalable big data solution (an R&D data lake use case) for mass indexing of file contents, using bleeding-edge natural language processing and machine learning algorithms on the Cloudera Hadoop stack (HDP and HDF), Palantir Foundry and a Kubernetes deployment in Microsoft Azure.

Technologies used
  • NiFi and MiNiFi for ETL
  • Apache Spark Processing (Java, Scala & Python)
  • Apache HBase
  • Elasticsearch Stack
  • Django Backend
  • React Frontend
Features include, among others
  • Raw text extraction from various file types
  • Language-dependent indexing
  • Clustering approaches (among others Latent Dirichlet Allocation, Latent Semantic Indexing, doc2vec)
  • Parsing of unstructured data into structured data
  • Named Entity Recognition (among others for chemical entities) using neural networks
  • Entity Linking (Distant Knowledge)
  • Molecular Substructure Search
