London Housing & Crime

Geospatial Analysis

This project merges over 1 million records from London’s housing market and metropolitan police crime data to construct a predictive Opportunity Index — a composite score that identifies neighborhoods where property values are likely underpriced relative to their safety trajectory. An Optuna-optimized XGBoost regressor achieves R² = 0.92 on held-out test data.

Key Features

1M+ records merged across housing transactions and crime incidents
Custom ‘Opportunity Index’ combining price trends, crime trajectories, and spatial features
Optuna hyperparameter optimization for XGBoost (R² = 0.92)
Temporal feature engineering: rolling crime rates, seasonal decomposition, price velocity
Spatial join pipeline linking crime hotspots to LSOA-level housing data
Actionable investment strategy output with ranked neighborhood recommendations

Data Pipeline

The pipeline ingests Land Registry price-paid data and Met Police crime records, performs spatial joins at the LSOA (Lower Layer Super Output Area) level, and engineers temporal features including rolling crime rates, seasonal decomposition, and price velocity indicators. Missing data is handled through spatial interpolation rather than simple imputation.
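The temporal features named above can be illustrated with a minimal pandas sketch. It assumes the spatial join has already produced one row per LSOA per month; the column names, the 3-month window, and the toy values are illustrative assumptions, not the project's actual schema or data.

```python
import pandas as pd

# Hypothetical monthly panel: one row per LSOA per month (toy values).
panel = pd.DataFrame({
    "lsoa": ["E01000001"] * 6 + ["E01000002"] * 6,
    "month": list(pd.date_range("2023-01-01", periods=6, freq="MS")) * 2,
    "crime_count": [10, 12, 8, 6, 7, 5, 3, 4, 4, 6, 8, 9],
    "median_price": [500_000, 505_000, 510_000, 512_000, 515_000, 520_000,
                     300_000, 300_000, 303_000, 306_000, 306_000, 309_000],
})
panel = panel.sort_values(["lsoa", "month"]).reset_index(drop=True)

by_lsoa = panel.groupby("lsoa")

# Rolling crime rate: 3-month mean of crime counts, computed within each LSOA
# so that one area's history never leaks into another's window.
panel["crime_rolling_3m"] = by_lsoa["crime_count"].transform(
    lambda s: s.rolling(window=3, min_periods=3).mean()
)

# Price velocity: fractional change in median price over the previous 3 months,
# again computed separately for each LSOA.
panel["price_velocity_3m"] = by_lsoa["median_price"].transform(
    lambda s: s / s.shift(3) - 1
)

print(panel[["lsoa", "month", "crime_rolling_3m", "price_velocity_3m"]])
```

Grouping before rolling is the important design choice here: a rolling window applied to the whole frame would mix the last months of one LSOA with the first months of the next.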

Opportunity Index

The Opportunity Index is a composite metric that identifies neighborhoods where crime rates are declining faster than property prices are rising. This gap represents an investment window — areas becoming safer but not yet repriced by the market. The index combines normalized crime trajectory, price momentum, transport accessibility, and green space proximity.
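One way such a composite could be assembled is sketched below. The text does not state how components are normalized or weighted, so the min-max scaling, the weights, the sign conventions (falling crime and slow price growth both raise the score), and all column names and values are assumptions made purely for illustration.

```python
import pandas as pd

# Hypothetical per-LSOA inputs (toy values, not project data).
areas = pd.DataFrame({
    "lsoa": ["A", "B", "C", "D"],
    "crime_change": [-0.20, -0.05, 0.10, -0.15],  # negative = crime falling
    "price_change": [0.02, 0.08, 0.03, 0.12],     # positive = prices rising
    "transport_access": [0.7, 0.9, 0.4, 0.6],
    "green_space": [0.5, 0.3, 0.8, 0.6],
}).set_index("lsoa")


def min_max(s: pd.Series) -> pd.Series:
    """Scale a series to the range [0, 1]."""
    return (s - s.min()) / (s.max() - s.min())


# Orient every component so that a higher value means more opportunity.
components = pd.DataFrame({
    "safety_gain": min_max(-areas["crime_change"]),     # faster decline scores higher
    "price_headroom": min_max(-areas["price_change"]),  # slower repricing scores higher
    "transport": min_max(areas["transport_access"]),
    "green": min_max(areas["green_space"]),
})

# Assumed weights that sum to 1; the real weighting is not given in the text.
weights = pd.Series({
    "safety_gain": 0.40,
    "price_headroom": 0.30,
    "transport": 0.15,
    "green": 0.15,
})

areas["opportunity_index"] = components.mul(weights, axis=1).sum(axis=1)
ranked = areas.sort_values("opportunity_index", ascending=False)
print(ranked["opportunity_index"])
```

In this toy data, area A ranks first because its crime is falling fastest while its prices have barely moved, which is the gap the index is meant to surface.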

Model Optimization

Optuna’s TPE (Tree-structured Parzen Estimator) sampler explores the XGBoost hyperparameter space across 200 trials, optimizing max_depth, learning_rate, subsample, colsample_bytree, and regularization terms. The final model achieves R² = 0.92 with strong generalization verified through temporal train-test splits (training on historical data, testing on recent transactions).

Tech Stack

Python
Scikit-learn
XGBoost
Optuna
Pandas