# Selected Publications

### Development and External Validation of Prediction Models for 10-Year Survival of Invasive Breast Cancer. Comparison with PREDICT and CancerMath

Purpose: To compare PREDICT and CancerMath, two widely used prognostic models for invasive breast cancer, taking into account their clinical utility. Furthermore, it is unclear whether these models could be improved.

Experimental Design: A dataset of 5729 women was used for model development. A Bayesian variable selection algorithm was implemented to stochastically search for important interaction terms among the predictors. The derived models were then compared in three independent datasets (n = 5534). We examined calibration and discrimination, and performed decision curve analysis.

Results: CancerMath demonstrated worse calibration performance compared to PREDICT in oestrogen receptor (ER)-positive and ER-negative tumours. The decline in discrimination performance was -4.27% (-6.39 to -2.03) and -3.21% (-5.9 to -0.48) for ER-positive and ER-negative tumours, respectively. Our new models matched the performance of PREDICT in terms of calibration and discrimination, but offered no improvement. Decision curve analysis showed predictions for all models were clinically useful for treatment decisions made at risk thresholds between 5% and 55% for ER-positive tumours and at thresholds of 15% to 60% for ER-negative tumours. Within these threshold ranges, CancerMath provided the lowest clinical utility amongst all the models.

Conclusions: Survival probabilities from PREDICT offer both improved accuracy and discrimination over CancerMath. Using PREDICT to make treatment decisions offers greater clinical utility than CancerMath over a range of risk thresholds. Our new models performed as well as PREDICT, but no better, suggesting that, in this setting, including further interaction terms offers no predictive benefit.

In Clinical Cancer Research, 2018

# Publications

Development and External Validation of Prediction Models for 10-Year Survival of Invasive Breast Cancer. Comparison with PREDICT and CancerMath. In Clinical Cancer Research, 2018.

Ventilatory limitation and dynamic hyperinflation during exercise testing in Cystic Fibrosis. In Pediatric Pulmonology, 52(1), 29-33, 2017.

Isocapnic hyperpnea with a portable device in Cystic Fibrosis: an agreement study between two different set-up modalities. In Journal of Clinical Monitoring and Computing, 29(5), 569-572, 2015.

Cystic fibrosis patients’ performance on Modified Shuttle Walk Test. In Journal of Cystic Fibrosis, 13(2), S91, 2014.

Exercise intensity during interactive video game. In Journal of Cystic Fibrosis, 13(2), S91, 2014.

Quantifying weight bearing activity in children and adolescents with cystic fibrosis. In Journal of Cystic Fibrosis, 13(2), S92, 2014.

Adherence to the administration of aerosolized promixin with the I-neb adaptive aerosol delivery (AAD) system, lung function and administration times in patients with cystic fibrosis (CF). In Journal of Cystic Fibrosis, 12(1), S105, 2013.

Exercise and sport habits in children and adolescents with cystic fibrosis. In European Respiratory Journal, 42, 2013.

Wiihabilitation. In Italian Journal of Physiotherapy, 2(1), 39-41, 2012.

# Teaching

• I have supervised Statistics IB, Lent 2019 (second-year undergraduate course from the Department of Pure Mathematics and Mathematical Statistics, University of Cambridge).

• This is the handbook for the course I teach as a Brilliant Club tutor. The Brilliant Club is a charity aiming to increase the number of pupils from under-represented backgrounds progressing to highly selective universities. They do this by mobilising PhD researchers to share their academic expertise with state schools.

The material is designed for Key Stage 4 pupils but can be easily adapted to Key Stage 5.

About the course: Virtually every decision is made in the face of uncertainty. In this course, I quantify uncertainty using probability theory. I then introduce the expected utility framework as a model of choice behaviour under uncertainty.

Cite this work as: Solon Karapanagiotis. Which bicycle lock should I buy? A journey to decision making under uncertainty, 2019.

# Recent Posts

Statistics · R · Random

### Naive classification beats deep-learning

Overview: Mitani and co-authors present a deep-learning algorithm trained with retinal images and participants’ clinical data from the UK Biobank to estimate blood-haemoglobin levels and predict the presence or absence of anaemia (Mitani et al. 2020). A major limitation of the study is the inadequate evaluation of the algorithm. I will show how a naïve classification (i.e. classify everybody as healthy) performs much better than their deep-learning approach, despite their model having an AUC of around 80%.

### Approximating Binomial with Poisson

It is usually taught in statistics classes that Binomial probabilities can be approximated by Poisson probabilities, which are generally easier to calculate. This approximation is valid “when $n$ is large and $np$ is small,” and rules of thumb are sometimes given. In this post I’ll walk through a simple proof showing that the Poisson distribution is really just the Binomial with $n$ (the number of trials) approaching infinity and $p$ (the probability of success in each trial) approaching zero.
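As a quick numerical illustration of that limit (not the proof from the post), the sketch below holds $np = \lambda$ fixed and compares the two probability mass functions as $n$ grows. The helper names `binom_pmf`, `poisson_pmf` and `max_abs_gap`, the choice $\lambda = 2$, and the cut-off `k_max = 20` are all assumptions made for this example.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); zero when k exceeds n."""
    if k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_abs_gap(n, lam, k_max=20):
    """Largest pointwise gap between the two pmfs over k = 0..k_max,
    with p chosen so that n * p = lam."""
    p = lam / n
    return max(
        abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1)
    )


if __name__ == "__main__":
    # Illustrative values only: lam = 2 is an arbitrary choice.
    for n in (10, 100, 1000, 10000):
        print(n, max_abs_gap(n, lam=2.0))
```

Running it should show the gap shrinking as $n$ increases with $np$ held constant, which is the behaviour the rule of thumb relies on.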

### Another solution to the 'The Hardest Logic Puzzle Ever' using probability

I present a solution to a modification of the “hardest logic puzzle ever” using probability theory. Background: “The hardest logic puzzle” was originally presented by Boolos (1996) and since then it has been amended several times in order to make it harder (see B. Rabern and Rabern 2008; Novozhilov 2012). The puzzle: Three gods A, B, and C are called, in some order, True, False, and Random. True always speaks truly, False always speaks falsely, but whether Random speaks truly or falsely is a completely random matter.

### Plastic waste and disease on coral reefs - Another misinterpretation of a statistical model

Recently, I came across this very interesting article published in Science about how plastic waste is associated with disease on coral reefs (J. B. Lamb et al. 2018). The main conclusions are: contact with plastic increases the probability of disease; the morphological structure of the reefs is associated with the probability of being in contact with plastic, with more complex ones being more likely to be affected by plastic;

### On statistical reporting in biomedical journals

Poor-quality statistical reporting in the biomedical literature is not uncommon. Here is another example by Cirio et al. (2016). The study itself is well planned, executed and reported. The aim was to assess whether heated and humidified high flow gases delivered through nasal cannula (HFNC) improve exercise performance in severe chronic obstructive pulmonary disease (COPD) patients. It all started when I saw their Fig. 1. Here is my attempt to reproduce it

# Contact

• [my two initials]921@cam.ac.uk
• MRC Biostatistics Unit, Cambridge Institute of Public Health, Forvie Site, Robinson Way, Cambridge Biomedical Campus, Cambridge CB2 0SR, UK