Egzon Syka available

Egzon Syka

Data Scientist - NLP

  • Freelancer in 8008 Zürich
  • Graduation: University of Bern, Master of Computer Science (Specialization in Data Science)
  • Hourly-/Daily rates: not provided
  • Languages: English (Full Professional)
  • Last update: 15.04.2020
CV - Egzon Syka

Data scientist with experience in building new systems from scratch.
Expertise in natural language processing (NLP).

  • Machine Learning (Python: scikit-learn, NumPy, pandas, spaCy, Keras, CoreNLP, NLTK)
  • NLP: spaCy, Gensim, Keras, NLTK, CoreNLP, Hugging Face (BERT-based models)
  • REST APIs: Flask (Flask-RESTful, Swagger/OpenAPI)
  • MongoDB and Neo4j databases
  • Data annotation tools: Prodigy and Dataturks
  • Tools: Git, Docker, CI/CD with GitHub Actions or Travis CI, Heroku
  • Big Data platforms: Hadoop (HDFS and MapReduce), Spark (Spark Streaming and SparkML) with Java/Scala/Python, Kafka
  • R and MATLAB programming
  • Java, SQL, JDBC, XML, JSON
  • OOP, TDD using JUnit and Mockito, DbC
  • 08/2018 - Present

  • SuisseCo GmbH
  • Data Scientist
  • Resume Parsing and Analysis Based on NLP and Machine Learning
    • Hybrid approach based on content and layout techniques
    • Extract and categorize the resume information into specific fields
      • Personal details, education, work experience, projects, skills, etc.
      • Layout feature extraction
      • Content feature extraction: fuzzy matching, Named Entity Recognition (NER), POS tagging, topic modeling, Word2Vec
      • Rule-based grammar for information extraction (IE)
    • Custom NER models using Prodigy annotation tool with Active Learning
    • spaCy NER models for every section of content: personal info, education, work experience, etc.
    • Search indexing using Skill2Vec
    • Target: converting semi-structured data from PDFs to structured JSON files
    • Ranking score that describes how well a candidate fits, based on education, skills, and experience

  • 10/2017 - 07/2018

  • Data Scientist
  • Recognizing User's Activity for the case of Public Transportation (Master Thesis Project) 

    The goal of this project is to design, build and evaluate prediction models for recognising human activities in the context of fine-grained transportation mode detection. 
    The project involves collecting data from mobile device sensors, such as the accelerometer and GPS, and extracting meaningful features from the raw signals in the statistical, time, and frequency domains. The extracted features are used to train a supervised classifier that recognises the transportation mode of new data samples.
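
The feature extraction step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `extract_features`, the chosen features, the 50 Hz sampling rate, and the synthetic signal are all assumptions made for illustration. It computes one feature of each kind (statistical, time domain, frequency domain) for a single window of accelerometer readings, using only the Python standard library.

```python
import cmath
import math
import statistics


def extract_features(window, sample_rate_hz):
    """Return a few illustrative features for one window of sensor samples."""
    n = len(window)

    # Statistical domain: mean and (population) standard deviation.
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)

    # Remove the constant offset (e.g. gravity) before the next two features.
    centered = [x - mean for x in window]

    # Time domain: fraction of consecutive sample pairs that change sign.
    crossings = sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)
    zero_crossing_rate = crossings / (n - 1)

    # Frequency domain: naive DFT over the positive bins, keeping the
    # frequency of the bin with the largest magnitude.
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    dominant_hz = best_bin * sample_rate_hz / n

    return {
        "mean": mean,
        "std": std,
        "zero_crossing_rate": zero_crossing_rate,
        "dominant_hz": dominant_hz,
    }


# Hypothetical input: one second of a 2 Hz oscillation sampled at 50 Hz,
# offset by 9.8 to mimic gravity on one accelerometer axis.
signal = [9.8 + math.sin(2 * math.pi * 2.0 * i / 50) for i in range(50)]
features = extract_features(signal, sample_rate_hz=50)
```

In a setup like the one described, a feature dictionary of this kind would be computed per window and per sensor axis, and the resulting vectors passed to a supervised classifier together with the transportation mode labels.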

    Study projects
    University of Bern/University of Fribourg
    • Building Hadoop MapReduce applications to analyze large Twitter datasets 
    • Live Twitter analysis using Apache Spark Streaming + Kafka 
    • Online course on Hadoop Streaming: forum log processing using Hadoop MapReduce
    • Building an ML model for predicting heart disease in patients based on a biomedical dataset from Zurich, Basel and Lugano hospitals (Best model award)
    • Building an ML model for digit recognition on the MNIST dataset (Best model award)
    • Building an ML model for signature verification