Data Scientist
- 0 references
- on request
- 73111 Lauterstein
- Worldwide
- de | cs | en
- 03.05.2024
Brief introduction
Qualifications
Project & Professional Experience
6/2018 – 3/2019
Description of activities
• Testing different models and selecting the best-performing one with cross-validation: Decision Trees, Gradient Boosted Trees, Support Vector Machines and Neural Networks
• Implementing an automated method for hyperparameter optimization using Bayesian optimization and grid search techniques
• Developing a memory-efficient machine learning pipeline, including automated optimization techniques, so that it can be run by non-technical users
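The model selection step described above can be illustrated with a minimal sketch. It is not the project's code: it assumes a toy one-dimensional dataset, a hand-written k-nearest-neighbour regressor, and a hypothetical candidate grid, and it shows only plain grid search scored by k-fold cross-validation (Bayesian optimization is not covered here). All function names are illustrative.

```python
import random
from statistics import mean

def knn_predict(train, x, k):
    """Predict y for x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)

def cv_mse(data, k, n_folds=5):
    """Mean squared error of a k-NN regressor, averaged over n_folds folds."""
    fold_errors = []
    for fold in range(n_folds):
        # Every n_folds-th point (offset by fold) forms the held-out fold.
        test = data[fold::n_folds]
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        fold_errors.append(mean((knn_predict(train, x, k) - y) ** 2 for x, y in test))
    return mean(fold_errors)

def grid_search(data, grid, n_folds=5):
    """Return (best_k, scores), where scores maps each candidate k to its CV error."""
    scores = {k: cv_mse(data, k, n_folds) for k in grid}
    return min(scores, key=scores.get), scores

# Synthetic data (an assumption for illustration): noisy quadratic curve.
rng = random.Random(0)
data = [(i / 10, (i / 10) ** 2 + rng.gauss(0, 1.0)) for i in range(100)]
rng.shuffle(data)

best_k, scores = grid_search(data, grid=[1, 3, 5, 9, 15])
print("best k:", best_k)
```

The same loop structure carries over to other model families: only the prediction function and the hyperparameter grid change, while the fold splitting and scoring stay the same.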
SAP HANA, Python
7/2016 – 3/2019
Description of activities
Goals & Questions:
• Can recommendations be further improved by more advanced machine learning methods?
• What performance can be achieved with different models, from simple and interpretable to very advanced and rather black-box models?
• Is it possible to improve model performance by adding further features to more advanced models?
• Are more performant models black boxes, or can explanation/interpretation methods be applied?
• Generate ready-to-use R packages that can be applied to different business scopes
Role:
• Consulting on method selection and coding, embedded in a data science team
• Close interaction with peers and communication with stakeholders and internal IT staff to share knowledge and enable them to use the methods
• Coaching of internal IT colleagues to enable them to adopt the methods in practice and transfer knowledge to peers
Methods/Approach:
• Testing several machine learning algorithms to generate recommendations: Decision Trees, Gradient Boosted Trees, Convolutional Neural Networks
• Evaluating the tested models with cross-validation and implementing them in the existing algorithmic package
• Developing an automated pipeline covering data import from databases, pre-processing, modeling and post-processing, delivering prepared recommendations and, where possible, explanation layers
• Developing memory-efficient functions for all pre- and post-processing steps of each implemented method and implementing them in an R package
• Further development of an algorithmic R package based on the S4 framework from recommenderlab to implement ensemble methods and neural networks
Tools: R, TensorFlow with Keras (R), recommenderlab (R), xgboost (R), rpart (R), Python, SAP HANA, R Markdown
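As an illustration of the ensemble idea mentioned above, the following sketch blends item scores from two recommenders into one top-N list. It is an assumption-laden example, not the recommenderlab-based implementation: the score dictionaries, the weights, and the function names `min_max_scale` and `ensemble_top_n` are hypothetical, and min-max scaling is just one possible way to make scores from different models comparable.

```python
def min_max_scale(scores):
    """Rescale a dict of item scores to [0, 1] so models become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 0.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}

def ensemble_top_n(model_scores, weights, n=3):
    """Blend per-model item scores with the given weights and return the top n items."""
    combined = {}
    for scores, w in zip(model_scores, weights):
        for item, s in min_max_scale(scores).items():
            combined[item] = combined.get(item, 0.0) + w * s
    # Sort by blended score (descending); ties are broken by item name.
    return sorted(combined, key=lambda item: (-combined[item], item))[:n]

# Hypothetical scores for one customer from two different models.
tree_scores = {"A": 0.9, "B": 0.2, "C": 0.5, "D": 0.1}
nn_scores = {"A": 2.0, "B": 8.0, "C": 6.0, "D": 1.0}

top_items = ensemble_top_n([tree_scores, nn_scores], weights=[0.5, 0.5], n=2)
print(top_items)
```

Item C ranks first here because both models rate it reasonably well, whereas A and B are each favoured by only one model.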
SAP HANA, Python
3/2015 – 6/2016
Description of activities
Role:
• Support as external consultant for conceptual work, method selection and coding (on-site with remote periods)
• Close collaboration with peers from Data Science, Machine Learning and IT
• Frequent communication of results to peers and internal management consulting colleagues in order to spread the concept in the enterprise
Methods/Approach:
• Clustering methods to group customers by customer features, sales and purchasing patterns
• Selection of appropriate algorithms to discover patterns of frequently purchased products (itemset mining) and to derive rules (association rule mining)
• Post-processing and interpretation of results for stakeholders
• Creating an interactive platform in Shiny to visually explore and explain purchasing patterns, rules and recommendations
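The itemset and rule mining steps named above can be sketched as follows. This is a simplified, Apriori-style level-wise search written for illustration only: the transactions, thresholds, and function names are assumptions, and the project itself may have used different algorithms or library implementations.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for itemsets meeting min_support."""
    n = len(transactions)
    support = {}
    # Start with single items, then grow candidates one item at a time.
    candidates = {frozenset([item]) for t in transactions for item in t}
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        support.update(frequent)
        # Join frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == len(a) + 1}
    return support

def association_rules(support, min_confidence):
    """Derive rules lhs -> rhs with confidence = support(lhs and rhs) / support(lhs)."""
    rules = []
    for itemset, sup in support.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = sup / support[lhs]
                if confidence >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), sup, confidence))
    return rules

# Hypothetical purchase baskets; each transaction is a set of product names.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
support = frequent_itemsets(transactions, min_support=0.4)
rules = association_rules(support, min_confidence=0.7)
for lhs, rhs, sup, conf in rules:
    print(lhs, "->", rhs, f"support={sup:.2f}", f"confidence={conf:.2f}")
```

Looking up `support[lhs]` is safe because every subset of a frequent itemset is itself frequent, so it has already been recorded at an earlier level.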
SAP HANA, SAP HANA SQLScript, Python
Certificates
About me
Additional skills
- External consultant in globally operating enterprises
- Method consulting
- Programming of prototypes
- Programming of Data Science / Machine Learning Pipelines
- Projects in international, English-speaking teams
Many years of broad experience with machine learning tools and their application in R and Python
• Supervised Learning: Deep theoretical knowledge of classification and regression methods and long experience in practical use
o Decision Trees, Random Forests, Gradient Boosted Trees, Support Vector Machines, Deep Learning (ANN, CNN), logistic and linear regression methods, Ridge Regression, Lasso, Linear Discriminant Analysis, kNN
• Unsupervised Learning: Extensive experience in applying clustering/pattern recognition to unstructured data
o DBSCAN, OPTICS, k-means, hierarchical clustering
• Rule-based Learning: Considerable experience with algorithms for learning patterns and deriving rules from unstructured data
o Apriori, Eclat, FP-growth
• Dimensionality reduction methods: Experience in applying
o Principal Component Analysis, Linear Discriminant Analysis, Non-negative matrix factorization
Deep knowledge of statistical methods and modeling
• Econometric models
• Time series models (ARMA, ETS)
Many years of experience in developing data science pipelines, for proof of concept and for implementation in production systems
• Data export and merging from different sources (HANA, MSSQL)
• Data preparation, cleaning and pre-processing for final analysis
• Post-processing of final results and insights (automated reporting, visualization, interactive applications such as Shiny or Tableau)
Personal details
- German (native)
- Czech (native)
- English (fluent)
- European Union
- Switzerland