
Data Architect | Big Data Architect | DevOps Engineer | Cloud Engineer

  • €90–120/hour
  • 85716 Unterschleißheim
  • Travel radius (up to 200 km)
  • fa  |  en  |  de
  • 01.07.2024

Brief introduction

Senior Big Data engineer with 10 years of experience in management and architecture design, including five years managing mission-critical Big Data systems in the financial sector (data engineering and systems engineering).

Reference excerpt (1)

"H. is a very honest and sympathic persion. Working with him was always a pleasure. Nice, friendly and really passionate in what he is doing."
Architect / Technical team lead DATA Ops (permanent position)
Thomas Riedel

Qualifications

  • Ansible (5 years)
  • Apache Hadoop (5 years)
  • Apache Spark
  • Automation Anywhere
  • Big Data (5 years)
  • Data Warehousing (5 years)
  • Databricks
  • Google Cloud
  • PostgreSQL (5 years)
  • Terraform

Project & professional experience

Airflow SME
Deutsche Bank AG, Frankfurt
2/2023 – 6/2023 (5 months)
Banking

Description of activities

- Design and implementation of a large-scale, highly available, fault-tolerant Apache Airflow setup. The implementation runs Airflow with different executor types both on bare-metal/virtual machines and on OpenShift (Kubernetes). Containers are built from source using Buildah, based on the ubi9-micro image; Podman and systemd units run the containers and services on bare-metal/virtual-machine nodes. The setup also covers disaster recovery scenarios, and Helm charts were used for the OpenShift deployment.
- Highly available, disaster-recoverable and load-balanced Postgres setup using PgBouncer, HAProxy and Patroni, with backup and recovery via Barman, on both bare metal and Kubernetes.
- Highly available Redis setup using Sentinel.
- S3-compatible storage for Airflow remote logging.
- Monitoring of the complete stack with Prometheus.
- Support for the migration of workflows from Control-M, Automic and Tivoli WS to Airflow (a sketch of such a migrated DAG follows this list).
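
As an illustration of the last point, here is a minimal sketch of the kind of DAG such a migration produces. The DAG id, schedule, owner and task commands are hypothetical stand-ins; the point is only the Airflow idioms (retries, cron scheduling, explicit task ordering) that take over the role of a Control-M job definition.

    # Hypothetical DAG sketch; all names and the schedule are illustrative,
    # not taken from the project.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    default_args = {
        "owner": "dataops",                   # hypothetical owner
        "retries": 3,                         # retry transient worker failures
        "retry_delay": timedelta(minutes=5),
    }

    with DAG(
        dag_id="nightly_batch_example",       # hypothetical DAG id
        start_date=datetime(2023, 1, 1),
        schedule_interval="0 2 * * *",        # nightly run, like a Control-M job window
        catchup=False,
        default_args=default_args,
    ) as dag:
        extract = BashOperator(task_id="extract", bash_command="echo extract")
        load = BashOperator(task_id="load", bash_command="echo load")
        # explicit task ordering replaces the scheduler's job dependencies
        extract >> load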

Skills used

Architecture visualization, Cloud (general), Docker, OpenShift, PostgreSQL, System architecture

Cassandra SME
Deutsche Post AG, Köln
1/2023 – 3/2023 (3 months)
Financial services

Description of activities

- Design and implementation of Cassandra clusters in three different environments. The production cluster uses a multi-datacenter setup (primary and secondary clusters) to safeguard against datacenter disaster scenarios; clusters were tuned for their workload (see the client sketch after this list).
- Reaper was set up to perform Cassandra repair jobs.
- Prometheus was installed, and dashboards were configured to show metrics collected from Cassandra nodes and applications.
- The complete setup was automated with Ansible.
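
To illustrate the multi-datacenter design, here is a minimal sketch of a DC-aware client and a replicated keyspace definition. The contact point, datacenter names and replication factors are illustrative assumptions, not values from the engagement.

    # Hypothetical multi-DC client sketch using the DataStax Python driver.
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.policies import DCAwareRoundRobinPolicy

    # route requests to the local (primary) datacenter by default
    profile = ExecutionProfile(
        load_balancing_policy=DCAwareRoundRobinPolicy(local_dc="dc_primary")
    )
    cluster = Cluster(
        ["cassandra-node1.example.com"],      # hypothetical contact point
        execution_profiles={EXEC_PROFILE_DEFAULT: profile},
    )
    session = cluster.connect()

    # NetworkTopologyStrategy keeps full replicas in both datacenters, so the
    # secondary DC can take over in a datacenter-disaster scenario
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS app_ks
        WITH replication = {
            'class': 'NetworkTopologyStrategy',
            'dc_primary': 3,
            'dc_secondary': 3
        }
    """)
    cluster.shutdown()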

Skills used

Design (general), Red Hat Enterprise Linux (RHEL), Systems engineering, Architecture (general)

Architect / Technical team lead DATA Ops (permanent position)
PAYBACK GmbH, München
10/2017 – 12/2022 (5 years, 3 months)
IT & development

Description of activities

- Leading a cross-functional team including Ops and Dev members
- Part of the core team for the DWH (Oracle, Hadoop) migration to GCP
- Design and setup of GCP projects via Terraform (GCS, BigQuery, Dataproc, Dataflow, Pub/Sub, Cloud Composer, Cloud Build, etc.)
- Evaluation of the Databricks platform, including Delta Live Tables, job orchestration, Databricks SQL, cluster policies, dashboards, and monitoring and alerting
- Design and implementation of PII data pseudonymization for the cloud migration
- Installation, configuration and maintenance of the MapR ecosystem, fully automated with Ansible and Puppet (MapR core, Spark, Hive, Hue, Livy, Objectstore, NFS and monitoring packs)
- Setup and configuration of the MapR monitoring solution (OpenTSDB, Elasticsearch, collectd, Fluentd, Kibana, Grafana)
- Major upgrades and patching of MapR clusters, fully automated with Ansible
- MapR cross-cluster data replication and cross-cluster data access setup
- Containerization of MapR client/edge nodes via Ansible automation (OpenShift, Podman, Buildah, ansible-bender)
- Datacenter migration planning and execution, including migration of all MapR clusters to the new DC
- Data restructuring to improve application (Spark, Hive) performance
- Application tuning (SQL performance tuning) for Spark and Hive
- Spark Streaming application development to replace Apache Flume (see the streaming sketch after this list)
- Apache Airflow installation, configuration, updates and maintenance on OpenShift; application orchestration with Airflow
- JupyterHub installation and configuration, including custom per-user configuration
- Installation and configuration of DataHub/OpenMetadata (data catalog), including in-house development of a data lineage generator for different data sources (Oracle, Hive, SAS)
- Installation, configuration and maintenance of Cassandra, Kafka, Postgres and Redis
- PoCs on different distributed SQL query engines (Apache Drill, Trino, Starburst)
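
As a sketch of the Flume-replacement pattern mentioned above, the following minimal PySpark Structured Streaming job reads events from Kafka and lands them as files, the classic Flume sink role. The broker, topic and paths are hypothetical, Kafka as the source is an assumption, and the spark-sql-kafka connector package is assumed to be on the classpath.

    # Hypothetical Flume-replacement sketch: Kafka in, Parquet files out.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("flume_replacement_sketch").getOrCreate()

    # read the raw event stream that Flume used to tail
    events = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")   # hypothetical broker
        .option("subscribe", "app-logs")                     # hypothetical topic
        .load()
        .select(col("value").cast("string").alias("line"))
    )

    # land the stream on the distributed filesystem, as the Flume sinks did
    query = (
        events.writeStream
        .format("parquet")
        .option("path", "/data/landing/app-logs")            # hypothetical path
        .option("checkpointLocation", "/data/checkpoints/app-logs")
        .start()
    )
    query.awaitTermination()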

Skills used

Design (general), Container management, Data science, Apache Hadoop, Apache Spark, Big Data, Data warehousing, Machine learning, Backup/recovery, KVM (Kernel-based Virtual Machine), Kubernetes, Architecture (general)

Team lead IT Operations (permanent position)
Xyrality GmbH, Hamburg
12/2011 – 12/2016 (5 years, 1 month)
Gaming

Description of activities

- Delegating duties and tasks within the IT department
- Performing regular IT audits to discover areas of weakness and fortify them
- OS installation, configuration, troubleshooting and tuning for Red Hat Linux 6 and 7, with security hardening via SELinux
- Configuration and administration of an HA Postgres cluster with backup and recovery validation (see the role-check sketch after this list)
- Installation and configuration of load balancing and caching (Varnish) and web servers (Apache, Nginx)
- Setup and configuration of monitoring tools (Checkmk, Graphite, Grafana); log management and analysis (Elasticsearch, Kibana, Fluentd)
- Configuration and troubleshooting of various network components (Juniper and Cisco), and implementation of policy rules, DMZ and cross-VLAN communication
- User management and authentication via LDAP and RADIUS
- Infrastructure automation with Ansible
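
To illustrate the HA Postgres administration above, here is a minimal sketch of a primary/standby role check of the kind a Checkmk-style monitor could run against such a cluster. Host names, database and credentials are illustrative assumptions.

    # Hypothetical primary/standby health check for an HA Postgres pair.
    import psycopg2

    NODES = ["pg-node1.example.com", "pg-node2.example.com"]  # hypothetical hosts

    def role_of(host: str) -> str:
        """Return 'primary' or 'standby' based on the node's recovery state."""
        conn = psycopg2.connect(host=host, dbname="postgres",
                                user="monitor", password="secret")
        try:
            with conn.cursor() as cur:
                # a standby replays WAL and reports in-recovery = true
                cur.execute("SELECT pg_is_in_recovery()")
                in_recovery = cur.fetchone()[0]
            return "standby" if in_recovery else "primary"
        finally:
            conn.close()

    if __name__ == "__main__":
        for node in NODES:
            print(f"{node}: {role_of(node)}")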

Skills used

Architecture visualization, PostgreSQL, Apache HTTP Server, Load balancing, Nginx, Ansible, VLAN (Virtual Local Area Network), VPN, Hardware virtualization, Design thinking, Team building, Automation engineering (general)

Certificates

  • Google Cloud Platform: Associate Cloud Engineer (Google, 2022)
  • Databricks: Certified Data Engineer Associate (Databricks, 2022)
  • Red Hat: Certified System Administrator (Red Hat, 2019)

Additional skills

  • MapR, HPE Ezmeral Data Fabric, Hadoop ecosystem, Apache Spark, Apache Hive, Apache Airflow, Spark Streaming, Flume, data lineage, SAS, Big Data, Apache Drill, Dremio, Trino, Cassandra, Kafka
  • GCP, BigQuery, Cloud Composer, Pub/Sub, Dataproc, Cloud Functions
  • Docker, Kubernetes, Podman, OpenShift, Buildah, virtualization
  • Terraform, Ansible
  • Postgres, Patroni, HAProxy, Barman, PgBouncer, Redis
  • Red Hat Linux 7/8/9, CentOS, SELinux, Debian, OpenStack
  • Varnish, Nginx
  • Juniper, Cisco, VLAN, MRTG
  • User management, LDAP
  • Prometheus, New Relic, OpenTSDB, Icinga, Nagios, Checkmk, Grafana

Personal data

Languages
  • Persian (native)
  • English (fluent)
  • German (basic)
Willingness to travel
Radius (up to 200 km)
Work permit
  • European Union
  • Switzerland
  • United States of America
Home office
preferred
Profile views
708
Age
39
Professional experience
17 years and 9 months (since 03/2007)
Project management
10 years
