Data Scientist | Analyst | Researcher
From data to decisions — powering progress through intelligent insights.
Nairobi, Kenya | bonsoul24@gmail.com | +254 700 015600
Overview
I solve problems with data, applying machine learning, visualization, and statistical modeling across business, public health, logistics, and research to drive innovation and impact.
- Data Analysis & Visualization: Build interactive dashboards, extract insights, and produce reports using Excel, Tableau, SQL, and Power BI.
- Data Science & Engineering: Develop and deploy machine learning models, including predictive, NLP, and recommendation systems, and design ETL/ELT pipelines for efficient data processing.
- Statistical Modeling: Apply regression, time-series forecasting, and survival analysis to public health projects.
- Research: Currently working on Multilevel Structural Equation Modeling (MSEM) to support evidence-based decision-making in health and development research.
Skills
- Technical: Python · R · SQL · C++ · Power BI · Tableau · TensorFlow · PyTorch · Scikit-learn · Keras · PostgreSQL · Azure AI Studio · Kobo Collect · Docker · Superset · Databricks
- Analytical: Statistical Modeling · Forecasting · Hypothesis Testing · Data Visualization · A/B Testing · Time Series · Predictive Analytics · Data Modelling
- Soft Skills: Problem Solving · Critical Thinking · Business Acumen · Communication · Teamwork · Leadership · Presentation · Data Storytelling
Projects and Case Studies
1. Statistical Modeling & Health Analytics
Cholera Surveillance & Forecasting (Kenya)
Conducted time series analysis on cholera case data across Kenyan counties, integrating environmental and demographic variables. Built an interactive dashboard and a seasonal ARIMA model in R to forecast future outbreaks, improving preparedness and reducing regional response time by up to 30%.
HIV/AIDS Transmission Risk Modeling
Investigated HIV/AIDS transmission patterns using national health datasets. Employed logistic regression and chi-square tests to analyze the relationship between socio-demographic factors (e.g., education levels, marital status) and infection risk, revealing statistically significant disparities across counties. Results informed targeted community-based interventions.
Quality of Pediatric Clinical Assessment – Nyeri, Kenya
Assessed the quality of clinical assessment provided to sick children aged 2–59 months in primary health facilities in Nyeri County, covering both the clinical care delivered and caregiver satisfaction, and the relationship between the two. Using structured facility assessments and caregiver interviews, applied descriptive statistics, cross-tabulations, and logistic regression to evaluate quality of care and identify key factors influencing caregiver perceptions. Findings informed recommendations for improving pediatric service delivery and strengthening caregiver engagement in rural healthcare settings.
2. Machine Learning Projects
Malaria Detection Using CNNs
Built a deep learning model using Convolutional Neural Networks (CNNs) to classify infected vs. uninfected blood cells from microscope images. Applied image augmentation and performance tuning to achieve high accuracy for early disease diagnosis.
Cotton Disease Detection (PyTorch)
Developed a classification model using PyTorch to detect plant diseases from leaf images. Applied transfer learning with ResNet, improving accuracy by over 20% after optimizing preprocessing and hyperparameters.
Content-Based Recommendation System
Built a recommendation engine using Python to suggest products based on user behavior and preferences. Integrated cosine similarity and TF-IDF vectorization to personalize suggestions for e-commerce users.
3. Data Analysis & Dashboarding
NGO Reporting System & Power BI Dashboard
Designed and implemented a comprehensive PostgreSQL database to centralize program monitoring data for a local NGO. Built end-to-end ETL pipelines and connected the database to Power BI to automate reporting. Developed a dynamic dashboard to track KPIs, visualize program reach, and generate donor-ready impact reports — significantly improving reporting efficiency and transparency.
Customer Churn Analysis (Telco)
Analyzed telecom customer behavior using Pandas, NumPy, and Seaborn to identify churn drivers. Trained and tested models (Logistic Regression, Decision Trees) achieving 85% accuracy. Developed a real-time Power BI dashboard to monitor churn trends, enabling targeted customer retention strategies that reduced churn by 15%.
Sales Performance Dashboard
Built a dynamic sales dashboard in Power BI that tracked revenue, conversion rates, and sales rep performance across regions. Integrated data from Excel, SQL, and CRM systems. Delivered actionable insights that led to a 15% increase in regional sales performance within 3 months.
Experience
Freelance Data Scientist – Upwork | Remote | Jan 2024 – Present
Built ML models (classification, recommendation) with Python & Scikit-learn.
Developed automated pipelines using PostgreSQL and deployed dashboards in R.
ML Intern – Technohacks Education | Remote | Jun 2024 – Sept 2024
Preprocessed and structured data for TensorFlow/Keras models.
Tuned parameters and boosted model accuracy by 10%.
Data Analyst – Samburu Awareness Action Program | Samburu | Apr 2023 – Jan 2024
Designed centralized data systems with Kobo Collect + SQL.
Improved program outcomes by 25% via data-informed strategies.
Supply Chain Analyst Intern – Sendy Logistics | Nairobi | May 2022 – Sept 2022
Built Excel forecasts reducing excess inventory by 8%.
Maintained Tableau dashboards for real-time insights.
Education
BSc. Statistics, Computing & IT - Cooperative University of Kenya
ALX Data Science Program - Python · SQL · Data Viz · ML · Power BI
Udemy Certifications - Machine Learning, Analytics, Storytelling, Data Engineering
Leadership & Volunteering
Global Citizens Challenge (Team Lead): Led a diverse team in global sustainability challenges.
Men’s Book Club (Moderator): Managed sessions, facilitated critical discussions, promoted learning.