Open to Opportunities

Harish Rao
Yadagiri

|

Turning raw data into actionable intelligence through scalable pipelines, ML models, and compelling visualizations.

0 GitHub Repos
0 Featured Projects
0 Technologies
Scroll to explore

About Me

Harish Rao Yadagiri

I'm a Data Engineer & Analytics Professional with a Master's in Business Analytics and Information Systems from the University of South Florida. My passion lies in building scalable data infrastructure and leveraging machine learning to uncover actionable insights.

I specialize in designing end-to-end data pipelines, developing predictive models, and creating compelling data visualizations that bridge the gap between raw data and strategic decision-making. From ETL orchestration to real-time streaming and AI-powered applications, I bring a holistic approach to every data challenge.

Data Pipelines

Scalable ETL/ELT workflows with orchestration

Cloud Architecture

Azure, AWS — cloud-native data solutions

Machine Learning

Predictive models & AI-powered applications

data_pipeline.py
from pipeline import DataPipeline

pipeline = DataPipeline(
  source="api.weather_data",
  transform=[
    "clean",
    "normalize",
    "feature_engineer"
  ],
  destination="snowflake.analytics_db",
  schedule="@daily"
)

pipeline.run()  # ✓ 11 cities processed

What I Bring

Data Pipeline Engineering

Design and build scalable ETL/ELT workflows that process millions of records. From API ingestion to warehouse loading with automated quality checks.

Airflow · Spark · dbt · Docker

Cloud Architecture

Architect cloud-native data solutions on Azure and AWS. Build data lakes, set up streaming pipelines, and deploy containerized analytics services.

Azure · AWS · Snowflake · Firebase

ML & AI Applications

Develop predictive models and AI-powered applications using Gemini API, scikit-learn, and TensorFlow. From churn prediction to conversational AI chatbots.

Gemini API · Scikit-learn · TensorFlow

Analytics & Visualization

Transform raw data into compelling stories with interactive dashboards and geospatial analytics. Enable data-driven decision-making at every level.

Tableau · Power BI · Streamlit · Python

Skills & Technologies

Languages & Frameworks

🐍 Python
🗄️ SQL
C#
📘 TypeScript
⚛️ React
Next.js

Data Engineering

Apache Spark
🌬️ Apache Airflow
📡 Apache Kafka
🔧 dbt
🔄 ETL/ELT
🐳 Docker

Cloud & Databases

☁️ Microsoft Azure
🟠 AWS
❄️ Snowflake
🐘 PostgreSQL
🍃 MongoDB
🔥 Firebase

ML, AI & Visualization

🧠 Scikit-learn
🤖 TensorFlow
💎 Gemini API
📊 Tableau
📈 Power BI
🎯 Streamlit

Projects

Climate Data Pipeline

End-to-end ETL pipeline processing climate data across 11 US cities — ingesting hourly weather data from Open-Meteo API, transforming into a dimensional model with dbt, and serving analytics via a live Streamlit dashboard.

Python ETL Streamlit Data Pipeline

Ask AI Buffett

AI chatbot channeling Warren Buffett's value investing wisdom. Powered by Google Gemini 2.5 Flash with live yfinance market data, token-optimized context management, and a premium Streamlit UI.

Python Gemini API AI/LLM Finance

Predictive Modeling — Bank Churn

Built 5+ classification models (Logistic Regression, SVM, Random Forest, XGBoost) with feature engineering to predict customer churn and optimize bank marketing retention strategies.

Python ML Predictive Analytics Jupyter

Bank Marketing Data Mining

Mined 45,000+ records from Portuguese bank direct marketing campaigns. Applied decision trees, logistic regression, and ensemble methods to predict term deposit subscriptions with optimized recall.

Python ML Data Mining Jupyter

Property Value Analysis

Analyzed 7 years of property data (2015–2022) across St. Petersburg neighborhoods using geospatial analytics and interactive Tableau dashboards to identify value trends for stakeholders.

Data Viz Analytics Geospatial

Credit Score Predictor

Engineered 28+ features from financial behavior data to build ensemble classification models (Random Forest, Gradient Boosting) predicting credit score tiers with optimized accuracy.

Python ML Classification

GitHub Activity

~/harishraoyadagiri — git log
GitHub Contribution Graph
21 repos
6 featured
15+ technologies
View Profile →

Education & Experience

2025 — Present

Data Engineer

Tiruven Inc

Building and optimizing scalable data pipelines, designing cloud-native analytics architectures, and driving data-driven decision-making across the organization.

Data Engineering Cloud ETL
2023 — 2025

MS, Business Analytics & Information Systems

University of South Florida

Advanced coursework in data analytics, machine learning, cloud computing, and business intelligence. Built end-to-end data pipelines and AI applications.

Data Analytics Machine Learning Cloud Computing
2023 — 2025

Graduate Assistant

University of South Florida

Supported academic research and teaching in analytics courses, assisted students with data analysis and visualization projects.

Teaching Research Analytics
2020 — 2023

Project Engineer

Wipro Limited

Worked on enterprise-level technology solutions, contributing to data-driven projects and gaining experience with large-scale system implementations.

Enterprise Tech Project Management Data Solutions
2016 — 2020

BTech, Electronics & Communication Engineering

JNTU Hyderabad

Strong foundation in engineering principles, signal processing, and embedded systems. Developed analytical and problem-solving skills applied to data engineering.

ECE Signal Processing Engineering

Get in Touch

I'm always open to discussing data engineering roles, analytics projects, or collaboration opportunities. Let's build something amazing together.