AI/ML Data Scientist specialising in NLP, predictive modelling, and automated analytics – with growing exposure to electronic fixed income trading and enterprise-grade data workflows.
I care about one thing: turning messy real-world data into models and tools that change decisions, not just slide decks.
- Predictive modelling & retention / churn
  - Supervised models (XGBoost, Neural Networks, logistic regression) for student and client retention, early-warning risk flags, and intervention design.
- Time-series, anomaly detection & operations
  - Isolation Forest, One-Class SVM, and rule-based QC to detect failures in sensor data and live systems, aligned to operational-risk thresholds.
- NLP & applied AI
  - LLMs, RAG, FinBERT, topic modelling and summarisation for finance and analytics use cases (research, monitoring, internal tooling).
- Markets & trading
  - Exposure to Nomura’s eFI Quant Rates desk: probability-of-fill models, ensemble architectures (logistic regression + trees/NN), and “risk radar” concepts for intraday risk buckets.
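
As a rough illustration of the anomaly-detection item above, here is a minimal sketch of combining rule-based QC with an Isolation Forest on a single sensor channel. The function name `flag_anomalies`, the `valid_range` limits, and the `contamination` setting are assumptions made for this example, not values taken from any of the projects; in practice the contamination rate would be tuned against the relevant operational-risk threshold.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def flag_anomalies(readings, valid_range=(0.0, 100.0), contamination=0.05, seed=0):
    """Return a boolean mask: True where a reading fails QC or looks anomalous.

    `valid_range` and `contamination` are illustrative assumptions.
    """
    x = np.asarray(readings, dtype=float)

    # Rule-based QC: missing values or readings outside the allowed range.
    qc_fail = np.isnan(x) | (x < valid_range[0]) | (x > valid_range[1])

    # Fit the model only on readings that passed QC, so gross sensor faults
    # do not distort what the model treats as normal behaviour.
    model_flag = np.zeros(x.shape, dtype=bool)
    clean = ~qc_fail
    if clean.sum() >= 2:
        forest = IsolationForest(contamination=contamination, random_state=seed)
        labels = forest.fit_predict(x[clean].reshape(-1, 1))  # -1 marks an anomaly
        model_flag[clean] = labels == -1

    # A reading is flagged if either the rules or the model object to it.
    return qc_fail | model_flag


# Synthetic example: 200 plausible readings plus three obvious QC failures.
rng = np.random.default_rng(0)
readings = np.concatenate([rng.normal(50, 2, 200), [150.0, -5.0, float("nan")]])
mask = flag_anomalies(readings)
print(f"{int(mask.sum())} of {mask.size} readings flagged")
```

Running the rules first keeps hard failures out of the training data, which is one plausible way to reduce false positives from the model itself.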
Languages & Data
- Python (pandas, NumPy, scikit-learn, XGBoost)
- SQL
- Jupyter / VS Code
ML & DL
- XGBoost, Gradient Boosting
- Neural Networks (TensorFlow / PyTorch basics, Keras)
- Clustering (K-Means, Hierarchical)
- Anomaly detection (Isolation Forest, One-Class SVM)
- Bayesian thinking for uncertainty & risk
AI & NLP
- LLM fine-tuning, LangChain / LangGraph
- RAG pipelines, NER, topic modelling
- Hugging Face Transformers, FinBERT, summarisation, agentic workflows
Cloud, BI & Dev
- AWS (S3, Athena)
- Power BI (Microsoft Certified)
- Docker, Git, GitHub
- Student / customer retention models
  End-to-end workflows: feature engineering, model comparison (baseline vs XGBoost vs NN), AUC/recall trade-offs, and tiered intervention strategies.
- Anomaly detection in sensor and operational data
  Reproducible notebooks that combine models + QC frameworks, aimed at reducing false positives and quantifying maintenance or risk savings.
- Customer segmentation & clustering
  EDA + clustering to define actionable segments, with a strong focus on interpretability and impact on downstream marketing or product decisions.
- Trading & execution prototypes (WIP)
  Notebooks exploring execution modelling, ensemble architectures for probability-of-fill, and risk dashboards inspired by work on an electronic rates desk.
Each repo aims to show the full chain: from business question → data prep → modelling → evaluation → recommendations.
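
To give a flavour of the modelling and evaluation steps in the retention projects, here is a minimal sketch that compares a logistic-regression baseline with a boosted-tree model, looks at the AUC/recall trade-off at two thresholds, and maps scores to intervention tiers. It runs on synthetic data and uses scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost to avoid an extra dependency; the tier cut-offs (0.7 and 0.4) and the helper names `evaluate` and `assign_tier` are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for a retention dataset (1 = churned).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    "baseline_logreg": LogisticRegression(max_iter=1000),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}


def evaluate(model):
    """Fit on the training split and report test AUC plus recall at two thresholds."""
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    return {
        "auc": roc_auc_score(y_test, proba),
        # Lowering the threshold catches more churners at the cost of more false alarms.
        "recall_at_0.5": recall_score(y_test, (proba >= 0.5).astype(int)),
        "recall_at_0.3": recall_score(y_test, (proba >= 0.3).astype(int)),
        "proba": proba,
    }


def assign_tier(p, high=0.7, medium=0.4):
    """Map a churn probability to an intervention tier (cut-offs are assumptions)."""
    if p >= high:
        return "high"
    if p >= medium:
        return "medium"
    return "low"


results = {name: evaluate(model) for name, model in models.items()}
best = max(results, key=lambda name: results[name]["auc"])
tiers = [assign_tier(p) for p in results[best]["proba"]]

for name, r in results.items():
    print(
        f"{name}: AUC={r['auc']:.3f} "
        f"recall@0.5={r['recall_at_0.5']:.3f} recall@0.3={r['recall_at_0.3']:.3f}"
    )
print(f"best model by AUC: {best}; high-risk cases: {tiers.count('high')}")
```

In a real project the thresholds and tier boundaries would come from the cost of an intervention versus the cost of a missed churner, not from fixed defaults like these.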
- Studying on the University of Cambridge PACE Data Science & AI programme (Level 7), sponsored by the Bank of England.
- Building a more opinionated portfolio around:
  - retention modelling,
  - anomaly detection,
  - trading and execution analytics,
  - and practical NLP for finance.
If you want to talk about applied ML in trading, risk, or operations, feel free to reach out (victoreigbefoh@outlook.com) or open an issue on any repo.