Saeid Shahriari

Hello, I’m Saeid

Data Engineer & MLOps · Brussels
Building reliable batch and streaming data pipelines.


About Me

I am a Data Engineer based in Brussels, with a diverse engineering background spanning over a decade. I am currently finalizing my MSc in Applied Computer Science while completing an intensive Data Engineering Masterclass focused on AWS, Big Data, and Modern Data Architectures.

My background in maintaining critical medical systems taught me the value of reliability, compliance (FDA/GDPR), and zero-downtime operations. Today, I apply that same discipline to build robust data pipelines. I specialize in Real-time Streaming (Kafka/Flink), Cloud Infrastructure, and Data Security.

Open to work

Skills

Technologies I use to build reliable data pipelines.

Languages

Python (Pandas, PySpark, PyFlink), SQL (PostgreSQL, ClickHouse), Java, Bash.

Orchestration & Transformation

Apache Airflow, Docker, Kubernetes.

Streaming & Processing

Apache Kafka, Apache Flink, Spark Streaming, Debezium (CDC).

Cloud & Platforms

AWS (Glue, Athena, Lambda), GCP (BigQuery).

Monitoring & Visualization

Grafana (dashboards/alerts), Tableau, Power BI.

Data Platforms & Lakehouse

Databricks (Delta Lake, DLT), Snowflake, dbt, Apache Spark.

What I Build

Systems I design, build, and monitor — from raw source data to production-ready pipelines.

Data Pipelines

Building reliable batch and streaming pipelines that move data from sources to storage, using Kafka and Flink.

Data Warehousing

Designing clean data models to ensure data is organized, fast, and ready for analytical queries.

Observability

Setting up Grafana dashboards and alerts to catch issues early and keep pipelines running smoothly around the clock.

Data Quality

Implementing automated tests to ensure data is accurate, consistent, and trustworthy.

Data Visualization

Creating clear Tableau dashboards that help stakeholders see trends and make decisions.

Experience & Training

Combining engineering discipline with modern data skills.

Training & Projects

Data Engineering Masterclass – Trainee

Dr Mohammad Fozouni · Nov 2024 – Present · Remote

Intensive, project-based Data Engineering & MLOps bootcamp. Working hands-on with PostgreSQL, ClickHouse, Apache Kafka, Apache Spark, Apache Flink, Delta Lake, Docker, Kubernetes, Apache Airflow, MLflow, Jenkins and AWS.

Capstone projects include:
– Real-time fraud detection pipeline (Kafka, Spark, PostgreSQL, MySQL, Redis, Grafana, AWS)
– End-to-end ML model deployment with monitoring and security on the cloud
– Secure Rust-based message broker (CipherMQ), built as a product-focused project

Professional Experience

Data Visualization & Analyst (Intern)

Orange Business · Jul 2025 – Aug 2025 · Brussels

Worked with Google Cloud Platform (GCP) and Power BI to visualize complex datasets. Supported ETL processes and created clear technical reports to help business stakeholders make decisions.

Senior Technical Service Engineer

Sina Parto Jam · Apr 2019 – Dec 2021 · Tehran

Maintained complex medical imaging systems with 99% uptime. Performed root-cause analysis on system logs and ensured strict compliance with FDA and local healthcare regulations.

Radiology Technician

Bu Ali Sina Hospital · Aug 2014 – Feb 2017 · Hamedan

Operated high-tech radiology equipment in a fast-paced clinical environment, delivering accurate results while following strict safety protocols.

Education

Current and selected academic work relevant to my transition into data engineering.

MSc in Applied Computer Science

Vrije Universiteit Brussel (VUB) · 2024 – 2026 · Brussels, Belgium

Graduate program focused on Data Engineering, Machine Learning, Big Data, and Distributed Systems. Relevant coursework includes Big Data Processing, AI Techniques, Advanced Databases, Cloud & Distributed Systems, and Advanced IT Networks.

Selected Projects

Projects that best represent my work in streaming systems, analytics engineering, privacy, and MLOps.

Streaming Data Platform

Real-Time Food Delivery Platform

Production-style CDC pipeline for a food-delivery platform: PostgreSQL changes flow into Kafka, Flink SQL materializes streaming views, and online outputs land in OpenSearch and Redis. Built with Docker Compose, repeatable connector registration, and verification scripts.

Kafka · Flink SQL · PostgreSQL · Debezium · Redis · OpenSearch
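As a rough illustration of what "materializing a view" from CDC events means, here is a minimal Python sketch. It is not the project's code (the project uses Flink SQL for this step). It assumes Debezium's standard change-event fields `op`, `before`, and `after`, already unwrapped from any `payload` envelope; the `apply_change` helper, the `id` key, and the sample order rows are hypothetical.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_change(view: Dict[Any, Row], event: Dict[str, Any], key: str = "id") -> None:
    """Apply one Debezium-style change event to an in-memory keyed view."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # create, snapshot read, update: the new row state is in "after"
        row = event["after"]
        view[row[key]] = row
    elif op == "d":
        # delete: only "before" is populated
        view.pop(event["before"][key], None)
    else:
        # other operation codes (e.g. truncate) are out of scope for this sketch
        raise ValueError(f"unsupported op: {op!r}")


# Hypothetical change stream for an "orders" table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "placed"}},
    {"op": "u", "before": {"id": 1, "status": "placed"},
     "after": {"id": 1, "status": "delivered"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "placed"}},
    {"op": "d", "before": {"id": 2, "status": "placed"}, "after": None},
]

orders: Dict[Any, Row] = {}
for e in events:
    apply_change(orders, e)

print(orders)  # {1: {'id': 1, 'status': 'delivered'}}
```

Replaying the same events in order always yields the same view, which is what makes it safe to rebuild online outputs from the change log.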
Privacy Engineering

PostgreSQL Anonymizer Streaming

Privacy-by-design demo built fully inside PostgreSQL. Combines dynamic masking, trigger-based anonymized streaming into a sanitized schema, and static anonymization for safe data exports in GDPR-conscious environments.

PostgreSQL · Triggers · HMAC Tokenization · Docker · Data Privacy
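The idea behind HMAC tokenization can be sketched in a few lines of Python. The project itself runs inside PostgreSQL, so this only illustrates the technique, not the project's implementation. The `tokenize` helper, the normalization step, the 16-character truncation, and the demo secret are all assumptions.

```python
import hashlib
import hmac


def tokenize(value: str, secret: bytes, length: int = 16) -> str:
    """Return a deterministic, keyed token for a sensitive value."""
    # Normalize so that trivially different spellings map to the same token.
    normalized = value.strip().lower().encode("utf-8")
    digest = hmac.new(secret, normalized, hashlib.sha256).hexdigest()
    # Truncation is a readability choice for this sketch; shorter tokens
    # raise the collision probability.
    return digest[:length]


SECRET = b"demo-only-secret"  # hypothetical; a real key belongs in a secret store

a = tokenize("Alice@example.com", SECRET)
b = tokenize(" alice@example.com ", SECRET)
c = tokenize("bob@example.com", SECRET)

print(a == b)  # True: same person, same token, so joins still work
print(a == c)  # False: different inputs give different tokens
```

Because the token depends on a secret key, someone holding only the sanitized data cannot recompute tokens from guessed inputs, which is the main advantage over a plain hash.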
Analytics Engineering

ELT Pipeline with dbt, Snowflake & Airflow

Analytics engineering workflow using Snowflake sample data, dbt models and tests, and Airflow via Astronomer Cosmos. Produces a clean fact table with automated data-quality checks and orchestration around transformation dependencies.

dbt · Snowflake · Apache Airflow · Astronomer Cosmos · SQL
MLOps / Model Serving

Real-Time Fraud Detection API

FastAPI-based fraud scoring service with health checks, feedback capture, monitoring endpoints, Prometheus, Grafana, and MLflow integration. Designed as a practical starting point for near real-time model operations.

FastAPI · Prometheus · Grafana · MLflow · Docker · Monitoring
Streaming Fraud Pipeline

Fraud Detection Pipeline

Containerized fraud-detection pipeline where a producer generates synthetic transactions, Spark Structured Streaming computes features in real time, PostgreSQL stores scores, and Airflow orchestrates drift detection and retraining tasks.

Kafka · Spark Structured Streaming · PostgreSQL · Airflow · Java
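The description does not say which drift metric the Airflow task uses. As one common option, the sketch below computes the Population Stability Index (PSI) between a baseline sample and a recent sample in plain Python. The choice of PSI, the `psi` helper, the bin count, the smoothing constant, and the sample data are all assumptions made for illustration.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width when the baseline is constant

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Additive smoothing keeps empty bins from producing log(0).
        return [(c + 0.5) / (len(values) + 0.5 * bins) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]       # feature values seen at training time
shifted = [0.5 + i / 200 for i in range(100)]  # recent values pushed upward

print(psi(baseline, baseline))       # identical samples: no drift
print(psi(baseline, shifted) > 0.2)  # 0.2 is a commonly cited rule-of-thumb threshold
```

A scheduled task could compute this per feature and trigger retraining when the value crosses a chosen threshold; the right threshold depends on the data and should be tuned rather than taken from this sketch.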

Certifications

Selected courses and certificates that support my data engineering path.

Data Engineering: From Local Development to Server Deployment

Data Engineering School · Issued Mar 2025

CI/CD, PostgreSQL, Git & GitHub, Kubernetes, Apache Airflow, Docker, AWS, Apache Kafka, ETL.

Hands-On Essentials: Data Warehousing Workshop

Snowflake · Issued Jun 2025

Data warehousing fundamentals and best practices on the Snowflake platform.

Introduction to dbt

DataCamp · Issued Jun 2025

Analytics engineering with dbt for modular SQL transformations and testing.

Data Ingestion with Delta Lake

Databricks · Issued Apr 2025

Building reliable ingestion pipelines with Delta Lake and the Lakehouse architecture.

Data Management and Governance

Databricks · Issued Apr 2025

Data governance, access control and quality on top of the Databricks Lakehouse.

Foundational Cloud Practitioner

Maktabkhooneh · Issued Mar 2025

Core cloud concepts, security and cost-awareness for modern cloud platforms.

Machine Learning

Coursera · Issued Jun 2021

Classical ML algorithms, logistic regression, neural networks and model evaluation.

Advanced Python Programming

Maktabkhooneh · Issued Apr 2021

Advanced Python for data processing, scripting and backend development.

Selected Writing

28 Nov · Soft skills that really matter in tech

27 Nov · Data Engineering for Solution Architecture

04 Nov · ETL → ELT in 2025: What changed, what’s now, and what actually works

Get in Touch with Saeid

Email: saeidshahriari1@gmail.com

© 2026 Saeid Shahriari. All Rights Reserved