
Big Data, Python, Data Vault, BI, DWH

Last online 7 days ago
  • On request
  • 12207 Berlin
  • DACH region
  • ur  |  en  |  de
  • 05.11.2024

Introduction

I support companies with Project Management, Data Management, Cloud Engineering, and Cloud Architecture roles for data and pipeline initiatives on AWS/Databricks, as well as with Data Modeling, Data Standards, and Data Governance.

Qualifications

  • Amazon Web Services (AWS) - 3 yrs
  • Data Vault 2.0
  • Python - 2 yrs
  • Big Data
  • Cloud Services - 2 yrs
  • Cloud Computing - 2 yrs
  • Cloud Specialist - 2 yrs
  • Data Warehousing - 5 yrs
  • Databricks - 2 yrs
  • Data Management - 2 yrs
  • dbt
  • ETL - 4 yrs
  • MicroStrategy - 2 yrs
  • Oracle Database - 2 yrs
  • Project Management (IT) - 2 yrs
  • SAP BusinessObjects (BO) - 4 yrs
  • SqlDBM

Project & Professional Experience

Project Data Manager, AWS, S3, Databricks
Chemical industry, Berlin
4/2022 – ongoing (2 years, 8 months)
Chemical industry

Description

AWS Cloud Adoption and Infrastructure Consulting
Advised chemical industry clients on AWS cloud adoption, data management, and infrastructure architecture, establishing a robust data platform. Proposed architecture solutions, evaluating AWS building blocks to align with client needs, while actively contributing to project planning and team communications.

Project and Incident Management
• Coordinated with Cyber Security and Group Admins for secure implementations and incident management.
• Managed IT demands and change requests in ServiceNow, enhancing efficiency.
• Handled GitHub team permissions, ensuring secure project access.
• Facilitated Databricks GitHub app requests, managing repos for various environments.

Data Lake Management and AWS Setup
• S3 Data Collection: Led sessions to define data categories for S3, aligning with company classifications.
• AWS Transfer Family with Lambda: Implemented Transfer Family with Lambda for custom authentication, enhancing data security (see the sketch after this list).
• IAM Policies: Consulted on IAM policy setup for secure, cross-account S3 access.
• Redshift Setup: Managed Redshift cluster setup, consulting on keys and Copy Commands for optimization.
• Liquibase: Advised on Liquibase adoption for database versioning in CICD.
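A minimal sketch of the Lambda-backed custom identity provider pattern used with AWS Transfer Family, as referenced in the list above; the credential check, role ARN, and bucket are placeholders, not the client's actual configuration.

```python
import json
import os

# Hypothetical setup: in a real deployment the credentials would come from
# Secrets Manager or a directory service, not from environment variables.
EXPECTED_PASSWORD = os.environ.get("SFTP_PASSWORD", "change-me")
ROLE_ARN = os.environ.get("SFTP_ROLE_ARN", "arn:aws:iam::123456789012:role/sftp-access")
BUCKET = os.environ.get("SFTP_BUCKET", "example-data-lake-bucket")

def lambda_handler(event, context):
    """Custom identity provider for AWS Transfer Family (password authentication)."""
    username = event.get("username", "")
    password = event.get("password", "")

    # Returning an empty response tells Transfer Family that authentication failed.
    if not username or password != EXPECTED_PASSWORD:
        return {}

    # On success, return the IAM role and a logical home directory that maps
    # the user into a per-user prefix of the S3 bucket.
    return {
        "Role": ROLE_ARN,
        "HomeDirectoryType": "LOGICAL",
        "HomeDirectoryDetails": json.dumps(
            [{"Entry": "/", "Target": f"/{BUCKET}/{username}"}]
        ),
    }
```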

Architecture and Security Management
• AWS Account Documentation: Documented purpose, environment classification, IAM policies, S3 access, service principals, GitHub OIDC integration, and VPC connections to on-prem via Transit Gateway.
• Subnet and CIDR Blocks: Designed and documented subnet layouts for network security.
• Firewall Management: Controlled firewall openings/closings per compliance standards.
• Resource Monitoring: Monitored AWS resources, optimizing costs.
• Databricks Access: Configured SCIM passthrough and meta-IAM roles for identity management.
• GitHub OIDC Integration: Established IAM roles for GitHub-to-AWS access via OIDC.

Databricks Data Engineering and Data Transfer
• Unity Catalog: Deployed with Terraform for data governance.
• Delta Tables and Workflows: Created Delta Tables and workflows for streamlined data processing (see the sketch after this list).
• Python Notebooks: Built notebooks for analysis, including data cleanup and public datasets (ELWIS, PEGELONLINE WSV, DWD).
• Data Transfer and Monitoring: Developed Shell/SFTP scripts and managed BOXI exports over SFTP for data archiving.
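As an illustration of the notebook work above, a minimal PySpark sketch assumed to run in a Databricks notebook where `spark` is predefined; the bucket path, column names, and table name are placeholders, not the client's actual objects.

```python
from pyspark.sql import functions as F

# Hypothetical source: a public water-level CSV landed in S3 by the ingestion jobs.
raw_path = "s3://example-raw-bucket/pegelonline/water_levels.csv"

df = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv(raw_path)
)

# Basic cleanup: drop duplicates and rows without a measurement timestamp.
clean = (
    df.dropDuplicates()
      .filter(F.col("timestamp").isNotNull())
      .withColumn("ingested_at", F.current_timestamp())
)

# Persist as a managed Delta table governed by Unity Catalog.
clean.write.format("delta").mode("overwrite").saveAsTable("main.hydrology.water_levels")
```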

Outcome: Enabled cloud adoption for data and analytics needs (including CI/CD), increased data acquisition, and provided flexible data pipelines for reporting use cases.

Tools: Project Data Management, AWS, Architecture, Python, RDS, Redshift, DMS, S3, Lambda, Transfer Family, Databricks, Spark, Subnets/CIDR allocation, IAM Roles, IAM Policies, Bucket Policies, Microsoft Graph API, Power BI, SAP BI (BOXI)

Skills used

Amazon Web Services (AWS), Cloud Services, Cloud Computing, Cloud Specialist, Databricks, Data Management, IT Consulting, Project Management (IT), SAP BusinessObjects (BO)

Data Modeler, Group Data Office
Insurance industry, Munich
2/2022 – 9/2024 (2 years, 8 months)
Insurance

Description

Database design and structures following Data Vault design, 3NF, and Data Marts (consumption layer).

Engaged as a Data Vault 2.0 Modeler with the Group Data Office of an insurance and reinsurance business, working on the group-wide implementation of Commercial Insurance (PLC) and Reinsurance data collection, sourced from Operating Entities across various geographical locations.

• Updating and enhancing the Group Commercial Common/Unified Data Model (per Data Vault 2.0).
• Updating Data Standards for the Data Model.
• Enhancing Global Definitions (Global Business Glossary)
• Developing business Data Examples for Operating Entities.
• Writing Mapping Guidelines.
• Unifying CAT Risk 3NF Data Model into group Commercial DV 2.0 Data Model.
• Participation in Enterprise Ontology clarification Workshops
• Business examples and presentation slide decks for various scopes in the model: Reinsurance (FAC/Treaty/Quota Share), Incident/Claim classifications, Policy and Object Terms (L&Ds), and IIP (International Insurance Programs), to name a few.
• Data Mapping and Validation Example preparations, show-casing bi-directional mapping between group Commercial DV 2.0 Data Model and CAT Risk and Cyber 3NF Data Models.
• Writing detailed Mapping Guideline for Data Deliveries into Group Commercial Common Ingestion DV2.0 Data Model.
• Led solutioning in Data Model Workshops.
• Unifying Claims Data Model into group Commercial DV 2.0 Data Model.
• Participation in Workshops with Data Architects from other Business units.
• Member of Community of Practitioners on Business Intelligence, Architecture and Data Modeling.

Outcome: Enabled group data reporting for portfolio steering, based on common Data Standards and the Global Business Glossary.

Tools: SqlDBM, PostgreSQL, Azure Synapse Analytics Pool DB SQL, GitHub, Confluence, Informatica Axon (GBG), SharePoint, Excel, PowerPoint, Enterprise Ontology

Skills used

Azure Synapse Analytics, Data Vault, Data Architecture, Data Modeling, PostgreSQL

Data Vault Consultant
Insurance industry, Munich
10/2021 – 1/2022 (4 months)
Insurance

Description

Engaged as a Data Vault 2.0 developer for the implementation of IFRS 17 accounting standard requirements and adaptations at an insurance company in Munich. As an Integration Layer developer, I was mainly responsible for:

• Overseeing operations in Integration Layer
• Analysing and Developing new Raw Vault and Business Vault hubs, links and satellites
• Ensuring relationships between entities loaded from different source systems.
• Integrating new file based data source for Top Adjustments for month end closings.
• Working with Effectivity Satellites
• Met requirements for ledger-specific postings (GAAP codes) and automated reversals.
• Provided Loading Templates for Premiums (Policy), Claims and Cash transactions.
• Analysing Data Quality errors
• Generating SQL packages and carrying out deployments.
• Preparing Visual Data Vault diagrams.
• Documentation in Confluence
• Exchanging with other IL Developers, Engineering Manager and Product Managers.
• Raising pull requests in GitHub and merging them to master after a successful review.
• Creating test cases in Tricentis Tosca.
• Working in agile 10-day sprints, creating user stories and allocating story points.
• Participating in Feature Grooming sessions.
• Using Jenkins to manage and schedule Runtime jobs

Tools: SQL Developer, Oracle, GitHub, Tricentis Tosca, Eclipse, Jenkins

Skills used

Oracle Database, SQL Development, Eclipse

Project Manager GDPR Data Lake
Real Estate Platform, Berlin
5/2021 – 10/2021 (6 months)
Real estate

Description

Worked as Project Manager for the GDPR implementation in the Data Lake storage at a Real Estate platform company.

• Responsible for coordination between tech team and legal.
• Engagement with external DPO, to clarify GDPR requirements.
• Overseeing team planning.
• Stakeholder management.
• Communication with data consumers and producers.

Tools: Miro, Confluence, Jira, project documentation

Skills used

Data Protection, Project Management

Python Data Migration Engineer
Fashion eCommerce, Berlin
1/2021 – 7/2021 (7 months)
Fashion eCommerce

Description

Worked as a Data Engineer for a fashion eCommerce shop on a migration project, moving product data from the community edition of the Product Information Management system (PIM, Akeneo) to the new Enterprise version.
The solution was developed in Python to pull data from the old PIM via APIs, join and transform it, and publish it into the new PIM via APIs.

• Performed Data Mapping between old Data Structures and new Data Structures.
• Finalized Data Transformation requirements analysis.
• Wrote data processing and transformation modules.
• Wrote module responsible for interacting with APIs.
• Implemented API authentication.
• Managed mapping and transformation rules in JSON files.

Python libraries:
requests, multiprocessing, json, logging

Technologies: Python 3.7, PyCharm, Akeneo PIM, RESTful APIs, JSON, CSV
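Given the libraries listed above, a rough sketch of the pull-transform-publish flow might look like the following; the hosts, endpoint paths, credentials, and the `identifier` field are illustrative assumptions rather than the actual project code.

```python
import json
import requests

OLD_PIM = "https://old-pim.example.com"   # placeholder hosts, not the real systems
NEW_PIM = "https://new-pim.example.com"

def get_token(base_url: str, client_id: str, secret: str, user: str, password: str) -> str:
    """Fetch an API token; the endpoint path follows the Akeneo convention but is an assumption here."""
    resp = requests.post(
        f"{base_url}/api/oauth/v1/token",
        auth=(client_id, secret),
        json={"grant_type": "password", "username": user, "password": password},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

def transform(product: dict, rules: dict) -> dict:
    """Rename attributes according to mapping rules loaded from a JSON file."""
    return {rules.get(key, key): value for key, value in product.items()}

def migrate(old_token: str, new_token: str, rules: dict) -> None:
    """Pull products from the old PIM, transform them, and publish them to the new PIM."""
    old_headers = {"Authorization": f"Bearer {old_token}"}
    new_headers = {"Authorization": f"Bearer {new_token}", "Content-Type": "application/json"}

    # Pagination is omitted for brevity; the real job walked all pages.
    items = requests.get(
        f"{OLD_PIM}/api/rest/v1/products", headers=old_headers, timeout=60
    ).json().get("_embedded", {}).get("items", [])

    for product in items:
        payload = transform(product, rules)
        requests.patch(
            f"{NEW_PIM}/api/rest/v1/products/{payload['identifier']}",
            headers=new_headers,
            data=json.dumps(payload),
            timeout=60,
        ).raise_for_status()
```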

Skills used

API Development, Python, JSON, Representational State Transfer (REST), Product/Assortment Development

Middleware Data Integration Specialist
Telecom industry, Berlin
7/2020 – 9/2021 (1 year, 3 months)
Telecommunications

Description

Developed and maintained middleware RESTful APIs for integration use cases, such as:

• between SAP and Salesforce
• between DocuSign and Salesforce
• between microservices, legacy ERPs, and internal systems that maintain parts information
• GCP

Worked with JSON, XML, and IDoc (SAP) formats to develop integrations and gateways. Analysed source SAP data structures and data models, Salesforce, and real-time microservices. Performed data mapping between SAP, Salesforce, and the microservices.
Implemented HMAC request signing (sketched below).
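A small Python illustration of the HMAC request-signing idea (the middleware itself was IBM App Connect); header names, message layout, and the shared-secret handling are assumptions for the sketch.

```python
import hashlib
import hmac
import json
import time

SHARED_SECRET = b"example-shared-secret"  # placeholder; exchanged out of band in practice

def sign_request(body: dict) -> dict:
    """Return headers carrying an HMAC-SHA256 signature over timestamp + payload."""
    payload = json.dumps(body, separators=(",", ":"), sort_keys=True)
    timestamp = str(int(time.time()))
    message = f"{timestamp}.{payload}".encode("utf-8")
    signature = hmac.new(SHARED_SECRET, message, hashlib.sha256).hexdigest()
    return {"X-Timestamp": timestamp, "X-Signature": signature}

def verify_request(body: dict, headers: dict, max_age_seconds: int = 300) -> bool:
    """Receiver-side check: recompute the signature and compare in constant time."""
    payload = json.dumps(body, separators=(",", ":"), sort_keys=True)
    message = f"{headers['X-Timestamp']}.{payload}".encode("utf-8")
    expected = hmac.new(SHARED_SECRET, message, hashlib.sha256).hexdigest()
    fresh = time.time() - int(headers["X-Timestamp"]) <= max_age_seconds
    return fresh and hmac.compare_digest(expected, headers["X-Signature"])
```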

• SQL development
• Stored procedure development
• Use of Transaction Management
• Exception Handling.

Regularly prepared Swagger specifications, Test Evidence and UAT documentation.

Technologies: MSSQL Server 2014 (SQL/T-SQL), Middleware (IBM App Connect Studio), JSON/XML, CSV, Swagger, GCP, RESTful APIs

Skills used

Transact-SQL, Microsoft SQL Server (MS SQL), API Development, XML, JSON

DWH/Data Vault Developer
Payment Provider, Berlin
5/2020 – 7/2020 (3 months)
Financial services

Description

As a Data Vault developer, I was responsible for enriching Raw Vault and Business Vault with new Satellites containing SAP Bookings data.

• Analysis of existing Data Vault Model and Data Loading routines.
• Performed Extension in existing Data Vault Model.
• Creation of new Links and Satellites in Raw and Business Vault.
• Working with Events Data Processing in Data Vault.

Created stored procedures for the daily loading of the new data source. Enabled data extraction through BCP and PowerShell.

Skills used

Data Warehousing, Transact-SQL, Data Vault, Microsoft SQL Server (MS SQL)

Python Data Engineer
Consulting firm, Heidelberg
11/2019 – 2/2020 (4 months)
IT & Development

Description

Fast-paced, intensive development of multiple data integration modules in Python on Linux in Docker.

These modules enabled data integration and data provisioning for a new web-based software product:

• Integration with NiFi over the nifi-api. This component works with JSON retrieved from the NiFi REST API to traverse the NiFi flow, process groups, and processors.
• Metadata handling component. Handles data type conversions for the target MySQL database, based on metadata derived from Big Data AVRO primitive data types.
• Component to download large CSV files over REST APIs as streams (using requests iter_content) in parallel threads.
• Loading of CSV files into MySQL using the LOAD DATA INFILE command over SQLAlchemy (+pymysql); see the sketch after this list.
• Data integration job: a main Python job that combines the other components.
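A condensed sketch of the CSV streaming and LOAD DATA INFILE steps referenced in the list above; the connection string, table name, and CSV layout are placeholders.

```python
import requests
from sqlalchemy import create_engine, text

def download_csv(url: str, target_path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a large CSV to disk without loading it fully into memory."""
    with requests.get(url, stream=True, timeout=120) as resp:
        resp.raise_for_status()
        with open(target_path, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                fh.write(chunk)
    return target_path

def load_csv(csv_path: str, table: str) -> None:
    """Bulk-load the CSV into MySQL via LOAD DATA LOCAL INFILE."""
    # local_infile must be enabled on both client and server for this to work.
    engine = create_engine(
        "mysql+pymysql://user:password@localhost/example_db",
        connect_args={"local_infile": True},
    )
    # Path and table come from trusted job configuration in this sketch.
    stmt = text(
        f"LOAD DATA LOCAL INFILE '{csv_path}' INTO TABLE {table} "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n' IGNORE 1 LINES"
    )
    with engine.begin() as conn:
        conn.execute(stmt)

# Example wiring, mirroring the main job that ties the components together:
# load_csv(download_csv("https://example.com/export.csv", "/tmp/export.csv"), "staging_export")
```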

Python libraries:
requests, SQLAlchemy, multiprocessing, pandas, dotenv, logging

Technologies: Python 3.7, Linux, Docker, PyCharm, Liquibase, PuTTY, RealVNC, Citrix.

Skills used

ETL, MySQL, Python

Data Vault and Big Data Consultant and Developer
Insurance company, Cologne
4/2019 – 7/2019 (4 months)
Insurance

Description

Data Vault 2.0-based Data Lake development and consulting with Hadoop Cloudera (CDH) and Amazon stacks, using Python.

Hadoop Cloudera (CDH):
- GDPR-compliant HDFS Data Lake using the AVRO file format.
- Hive/Impala-based Data Vault entities & Information Mart.

Amazon S3 and Redshift:
- S3-based Data Lake and external Athena/Redshift tables.
- Redshift-based Data Vault and virtualised Information Mart.

Pre-computed hash keys materialised as AVRO files in the lake (see the sketch below).

Technologies: Python 3.7, AWS, S3, Redshift, DMS, SQS, Cloudera, Avro, Hive, Impala
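A small illustration of how Data Vault hash keys can be pre-computed and materialised as an AVRO file, as mentioned above; it uses hashlib and fastavro, and the hub name, columns, and sample keys are invented for the example.

```python
import hashlib
from fastavro import parse_schema, writer

def hash_key(*business_keys: str) -> str:
    """Data Vault style hash key: normalised business keys joined with a delimiter, then hashed."""
    normalised = "||".join(str(k).strip().upper() for k in business_keys)
    return hashlib.md5(normalised.encode("utf-8")).hexdigest().upper()

schema = parse_schema({
    "name": "hub_policy",
    "type": "record",
    "fields": [
        {"name": "policy_hk", "type": "string"},
        {"name": "policy_number", "type": "string"},
        {"name": "record_source", "type": "string"},
    ],
})

records = [
    {"policy_hk": hash_key(pn), "policy_number": pn, "record_source": "SOURCE_SYS"}
    for pn in ["P-1001", "P-1002", "P-1003"]
]

# Materialise locally; in the project the file would then be pushed to HDFS or S3.
with open("hub_policy.avro", "wb") as out:
    writer(out, schema, records)
```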

Skills used

Apache Hadoop, Big Data, Python, Amazon Web Services (AWS)

Middleware Specialist
Telecom solution provider, Berlin
9/2018 – 8/2019 (1 year)
Telecommunications

Description

● REST APIs and gateways with JSON, XML, IDocs, and JavaScript.
● Data structure / model analysis between SAP, Salesforce and real-time micro-services and respective data mappings.
● Development in MS SQL Server 2014 SSMS.
● SQL development and stored procedure development with Transaction Management and Exception Handling.
● Test Evidence and UAT documentation.

Technologies: MSSQL Server 2014, Middleware, JSON/XML, CSV

Skills used

Microsoft SQL Server (MS SQL), IDoc, JSON, Representational State Transfer (REST)

MicroStrategy Developer
Retail, Ruhr area
8/2017 – 3/2018 (8 months)
Wholesale

Description

Part of FE team, responsible for implementation of MicroStrategy Use Cases for the retail business.

Responsibilities include:

● Business Validation of Requirements, with RE & Arch. team.
● Solution Concept Workshops, with Arch. & Business teams.
● Implementation of MicroStrategy Use Cases (package 2 & 3).
● Liaising between Backend and Frontend teams.

Extensive Development Experience with MSTR Documents.
Use of Panel Stacks, Selectors, Grids and Graph components.
Use of Multiple Datasets.

Extensive experience with Visual Insights and OLAP Metrics.

Datasets with Level and Derived Metrics.

Technical feats include:
● Use of Transaction Services.
● Mapping of Attributes (IDs, Forms).
● Parent-Child relationships & Hierarchies.
● Use of multiple Datasets, based on multiple Data Marts.

Advanced Topics include:
● Setting up MicroStrategy Job Prioritisations
● iCube Optimisation & Incremental Refresh reports.

Operational tasks include bi-weekly deployments.

Skills used

MicroStrategy, Data Warehousing, Oracle Applications

Head of Data Technology
Crosslend GmbH, Berlin
9/2015 – 2/2017 (1 year, 6 months)
FinTech, Consumer Lending

Description

I was responsible for leading the BI and analytics function of the company, as a member of the management team. Close cooperation with other heads, team leads, and C-level executives. Vendor management (MicroStrategy). Streamlined many data acquisition, processing, and KPI calculation challenges (e.g. payment processing). Built visualizations and dashboards, together with math-intensive calculations for returns and portfolio performance.

• Responsible for leading the BI and Analytics function of Crosslend.
• Close collaboration with Executives, Marketing, Operations, Finance, Product, Engineering and DevOps.
• Enabled self-service BI, rolled out MicroStrategy.
• Investor fact sheets and pitch decks.
• Financial metrics, IRRs, annualized net returns (unadjusted) and default curves.
• Marketing performance dashboards and reports (per channel).
• Customer insights for Operations and the CC team.
• Payment processing and overdue-related KPIs.
• Visualizations, simulations and correlations.
• Successful closing of audits (positive opinion).

Skills used

MicroStrategy, Data Warehousing, Business Intelligence (BI), MySQL, ETL

Data Warehouse Architect
Kreditech Holdings SSL, Hamburg
7/2014 – 8/2015 (1 year, 2 months)
FinTech, Consumer Lending

Description

I was responsible for the Data Warehouse architecture and for managing the company's relationship with Exasol (service provider). I trained and hired DWH engineers, built the overall DWH architecture and infrastructure, integrated unstructured NoSQL data (MongoDB), modeled the company's core business tables, wrote a finite state machine (for IFRS-based classification), and contributed to the successful closing of audits (a prerequisite for Series B funding).

• Responsible for the Data Warehouse architecture and the Data Engineering team.
• Managing the Data Warehouse technology infrastructure and service providers.
• Data modelling of the company's core Revenue and Accounting fact tables.
• Marketing data mart, with performance data at campaign and keyword level (hierarchy).
• Finite state machine (for IFRS-based classification) and payment waterfall calculations (see the sketch after this list).
• Data historisation design concepts.
• Integration of unstructured NoSQL data (MongoDB).
• Successful closing of audits and Series B funding.
• Tech stack: Exasol, MongoDB, Postgres, Pentaho Kettle, Python and Lua.
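A toy illustration of the finite-state-machine item in the list above (not the actual IFRS logic); the states, events, and transitions are invented for the example.

```python
# Illustrative only: states and transition rules are invented for the sketch.
TRANSITIONS = {
    ("CURRENT", "missed_payment"): "OVERDUE",
    ("OVERDUE", "payment_received"): "CURRENT",
    ("OVERDUE", "90_days_past_due"): "DEFAULT",
    ("CURRENT", "final_payment"): "PAID_OFF",
    ("OVERDUE", "final_payment"): "PAID_OFF",
}

def classify(events, start="CURRENT"):
    """Replay loan events through the FSM and return the state history."""
    state, history = start, [start]
    for event in events:
        state = TRANSITIONS.get((state, event), state)  # unknown events keep the current state
        history.append(state)
    return history

# Example: a loan that slips into arrears and then recovers.
print(classify(["missed_payment", "payment_received", "final_payment"]))
# ['CURRENT', 'OVERDUE', 'CURRENT', 'PAID_OFF']
```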

Skills used

Online Analytical Processing, Data Warehousing, Open Source, PostgreSQL, MongoDB, ETL, Database Development, Lua Scripting, Python

Senior Manager BI
Zalando SE, Berlin
8/2012 – 6/2014 (1 year, 11 months)
E-Commerce

Description

I was part of the ERP/MIS team, responsible for the Customer Analytics pipeline, and carried out a wide set of responsibilities and functions. Handled aggregation requirements using Hadoop (Java Map/Reduce) on an Oracle, Exasol, and Pentaho Kettle technology stack. Led the Oracle DWH migration to new hardware, rewrote legacy ETLs, migrated IBM Unica CRM in-house, and managed freelancers.

• Responsible for the Customer pipeline within Zalando BI.
• Cohort trend analysis for customers (hyperlink removed).
• Analysis of website click log files using Hadoop (Java Map/Reduce).
• Design and development of Customer Survey data (Oracle PL/SQL).
• Interfacing an operational subset for forecast analysis (Exasol).
• Migration of IBM Unica CRM in-house; redesigning the CRM data model and simplifying ETLs.
• Leading the migration of the Oracle DB to new hardware, improving backup and recovery options.
• Tech stack: Hadoop, Oracle, Exasol, Pentaho Kettle, PostgreSQL and BusinessObjects

Skills used

Apache Hadoop, Data Warehousing, SAP BusinessObjects (BO), PostgreSQL, Oracle Database, ETL, CRM Consulting (general), Enterprise Resource Planning

Certificates

• SqlDBM Fundamentals, SqlDBM, 2024
• dbt Fundamentals, dbt Labs, 2024
• Certified Data Vault 2.0 Practitioner, 2018
• Oracle Certified Professional, Database 11g Administrator, 2010

Education

Computer Science
Bachelor's degree
4
Lahore, Pakistan

About me

Thank you for visiting. I work as a self-employed data manager, consultant, data modeler, and Databricks developer in the field of Business Intelligence and Data Warehousing. I offer experience with AWS Cloud, AWS architecture, Databricks, Redshift, Oracle, Exasol, and MicroStrategy, as well as with MSSQL, MySQL, Postgres, Python, and middleware. In the Big Data area I have delivered GDPR-compliant Data Lakes with Cloudera CDH (Avro, Hive/Impala) and AWS (S3, Athena/Redshift). I am always looking for interesting contacts and projects and would be pleased to hear from you.

Further skills

● AWS Cloud architecture, project data management, Databricks: 2+ years of experience
● Data modeling, Data Vault 2.0, Data Standards, Data Governance: 2+ years of experience
● Data Vault 2.0 certified.
● SqlDBM Fundamentals certified.
● dbt Fundamentals certified.
● Oracle certified (11g DBA).
● MicroStrategy, Oracle, Exasol, Python
Currently I am expanding my knowledge of Databricks.

Personal details

Languages
  • English (fluent)
  • German (fluent)
  • Urdu (native)
Willingness to travel
DACH region
Work permit
  • European Union
Profile views
6460
Age
41
Professional experience
19 years and 9 months (since 02/2005)
Project leadership
2 years
