


Course outline

The course will explain statistical techniques for the evaluation of biomedical data. It provides an introduction to design aspects, methods of summarizing and presenting data, estimation, confidence intervals and hypothesis testing, including multivariable regression methods for the assessment of association. Although the course focusses somewhat on methods and examples from clinical research, it should be useful for experimental researchers as well. The emphasis will be on practical application and interpretation rather than theory.

An optional half-day introduction to the statistical software SPSS will be given for those not familiar with this program.

The next course is already fully booked.
If you register, you will be put on the waiting list.

  • Date: Dec 2-6, 2019
  • Time: 9:00-17:00 h
  • Optional half-day "Introduction to SPSS" on Friday Nov 29 at 13:00 h in Z4
  • Location: Netherlands Cancer Institute (NKI), Amsterdam
  • Language: English

More Information

General Information


Course format

All course days will be divided as follows

  • 9:00-10:30 h: session (Piet Borst Auditorium)
  • 10:30-12:30 h: practical (Z4 and Z5)
  • 13:30-15:00 h: session (Piet Borst Auditorium)
  • 15:00-17:00 h: practical (Z4 and Z5)

During the sessions, the basic concepts will be presented and illustrated with examples. During the computer practicals, you will work on data analysis exercises while faculty is present to assist you and answer any questions you might have. Please bring your own laptop for the practicals. If you do not have a laptop, we may be able to provide you with one or you may have to share one with another course participant.

Please be prepared to do some homework before and during the course, for example reading scientific papers or book chapters and discussing the analytic approaches and interpretations of results by others.

What will and what will not be covered?

We plan to cover the following topics.

  • Data transformations
  • Analysing numerical data (related groups, unrelated groups, more than 2 groups)
  • Analysing categorical data (two proportions, more than 2 categories)
  • Specific tests (Jonckheere-Terpstra test, Cochran-Armitage trend test)
  • Linear regression
  • Logistic regression
  • Time-to-event analysis (survival analysis, Kaplan Meier, regression models)
  • Clinical trials
  • Case-control studies

Due to the limited time, we will not be able to cover the following topics.

  • Diagnostic tools (Gold standard, sensitivity, specificity, true/false positive/negative, ROC curve, prediction)
  • Competing risks analysis
  • Adaptive clinical trial design and analysis
  • Multiple imputation techniques for missing data
  • Longitudinal data analysis (random effects, multilevel models)
  • Meta-analysis
  • Growth curve analysis
  • Enzyme kinetics curve analysis
  • Limiting or serial dilution assay analysis
  • Pharmacokinetic models
  • Statistical process control
  • Nonlinear and nonparametric regression (exponential decay, equilibrium binding)
  • Repeated measures ANOVA


The use of computers is essential during the practicals in order to perform statistical analyses of data sets provided by us. However, our institute does not have a computer classroom sufficiently large to fit this group. We therefore ask you to bring your own laptop computer to all practicals.

Statistical software

SPSS software (version 25.0) will be used to illustrate the statistical analysis of example data sets. You may of course use other software, but we may not be able to assist you with software-specific issues. If you want to use SPSS, you are expected to have one of the more recent SPSS versions (22+) installed on the computer you plan to bring to the practicals.
NKI employees can obtain SPSS for institute laptops via the I&A Service Desk (H1-1809). For private laptops, or for participants from outside the AVL-NKI: employees and students of most Dutch universities can buy SPSS for € 10.
For those not familiar with SPSS software, a half-day introduction will be offered. The introduction consists of an overview presentation and a computer practical, and covers the following topics.

  • Importing and exporting data
  • Combining data sets
  • Creating, recoding and transforming variables
  • Subsetting variables and observations
  • Labeling and documenting data
  • Sorting and splitting data
  • Simple descriptive analyses
  • Exporting results

Attending the SPSS introduction is optional, but if you do attend, please bring your own computer.

Who should attend the course?

Scientists with some limited previous training in statistics who now wish to understand statistical concepts more thoroughly in order to conduct their own statistical analyses or interpret the results of others.

How you will benefit

This course provides a practical introduction to a wide range of statistical methods. There will be plenty of opportunity for discussion with faculty on appropriate methods of analyzing data and help will be provided with interpreting results.

Course Material


All the course materials (slides, data sets, exercise sheets, suggested reading, etc.) are available for download, organized by session (S) and practical (P). Please note that the website will be updated regularly, and contents may change slightly. Handouts of the most recent version of the slides will be provided before each session, and exercise sheets will be provided before the practicals.

During sessions, the basic concepts will be presented and illustrated with examples. During the computer practicals, you will work on data analysis exercises while faculty is present to assist you and answer any questions you might have. At the end of each day, the exercise sheets with suggested answers will be posted on the website.

Sessions will be held in the Piet Borst Auditorium (PBA) and practicals in room Z4 (next to PBA). Please bring your own laptop for the SPSS introduction (if you attend) and the practicals. Please download all data sets from the website below (scroll down to the "Data sets" section) to your laptop beforehand. For those who indicated not having a laptop, we will provide one for the practicals.

We recommend preparing for the course by reading the papers accompanying some of the data sets, as well as papers or book chapters provided under "Supplementary Material".

Day 0

13:00-17:00 h: S0 Introduction to SPSS

Day 1

9:00-10:30 h: S1 Distributions, sampling and estimation

  • Introduction
  • Slides
  • Suggested reading
    Bland JM, Altman DG. Transforming data. BMJ 1996; 312:770. PDF
    Keene O. The log transformation is special. Statist Med 1995; 14:811-819. PDF

10:30-12:30 h: P1 Distributions, sampling and estimation

13:30-15:00 h: S2 Hypothesis testing

  • Slides
  • Suggested reading
    Bland JM, Altman DG. Absence of evidence is not evidence of absence. BMJ 1995; 311:485. PDF
    Bland JM, Altman DG. One and two sided tests of significance. BMJ 1994; 309:248. PDF
    Victor A et al. Judging a Plethora of p-Values. Dtsch Arztebl Int 2010; 107:50-56. PDF

15:00-17:00 h: P2 Hypothesis testing

  • Exercise sheet
  • Suggested answers
  • Suggested reading
    Du Prel JB et al. Confidence Interval or P-Value? Dtsch Arztebl Int 2009; 106:335-339. PDF

Day 2

9:00-10:30 h: S3 Analysis of categorical data

  • Slides
  • Suggested reading
    Bewick V et al. Statistics review 8: Qualitative data - tests of association. Critical Care 2004; 8:46-53. PDF
    Bewick V et al. Statistics review 10: Further nonparametric methods. Critical Care 2004; 8:196-199. PDF
    Petrie A, Sabin C. Medical Statistics at a Glance. Wiley-Blackwell, 3rd Edition, 2009. Book website. Pages 66-74.

10:30-12:30 h: P3 Analysis of categorical data

13:30-15:00 h: S4 Analysis of numerical data

  • Slides
  • Suggested reading
    Du Prel JB et al. Choosing Statistical Tests: Part 12 of a Series on Evaluation of Scientific Publications. Dtsch Arztebl Int 2010; 107(19):343-348. PDF
    Whitley E, Ball J. Statistics review 5: Comparison of means. Crit Care 2002; 6(5):424-428. PDF
    Whitley E, Ball J. Statistics review 6: Nonparametric methods. Crit Care 2002; 6(6):509-13. PDF
    Bewick V et al. Statistics review 9: one-way analysis of variance. Crit Care 2004; 8(2):130-6. PDF
    Bewick V et al. Statistics review 10: Further nonparametric methods. Crit Care 2004;8(3):196-9. PDF

15:00-17:00 h: P4 Analysis of numerical data

Day 3

9:00-10:30 h: S5 Correlation and simple linear regression

  • Slides
  • Suggested reading
    Schneider A et al. Linear regression analysis: Part 14 of a series on evaluation of scientific publications. Dtsch Arztebl Int 2010; 107(44):776-82. PDF
    Bewick V et al. Statistics review 7: Correlation and regression. Crit Care 2003; 7(6):451-59. PDF

10:30-12:30 h: P5 Correlation and simple linear regression

13:30-15:00 h: S6 Multiple linear regression

15:00-17:00 h: P6 Multiple linear regression

Day 4

9:00-10:30 h: S7 Logistic regression

  • Slides
  • Suggested reading
    Bewick V et al. Statistics review 14: Logistic regression. Critical Care 2005; 9:112-118. PDF
    Petrie A, Sabin C. Medical Statistics at a Glance. Wiley-Blackwell, 3rd Edition, 2009. Book website. Pages 88-91.

10:30-12:30 h: P7 Logistic regression

13:30-15:00 h: S8 Univariable survival analysis

  • Slides
  • Suggested reading
    Clark TG et al. Survival Analysis Part I: Basic concepts and first analyses. British Journal of Cancer 2003; 89:232-238. PDF
    Petrie A, Sabin C. Medical Statistics at a Glance. Wiley-Blackwell, 3rd Edition, 2009. Book website. Pages 133-135.

15:00-17:00 h: P8 Univariable survival analysis

Day 5

9:00-10:30 h: S9 Multivariable survival analysis

  • Slides
  • Suggested reading
    Bradburn MJ et al. Survival Analysis Part II: Multivariate data analysis - an introduction to concepts and methods. British Journal of Cancer 2003; 89:431-436. PDF
    Bradburn MJ et al. Survival Analysis Part III: Multivariate data analysis - choosing a model and assessing its adequacy and fit. British Journal of Cancer 2003; 89:605-611. PDF
    Clark TG et al. Survival Analysis Part IV: Further concepts and methods in survival analysis. British Journal of Cancer 2003; 89:781-786. PDF
    David Garson's website

10:30-12:30 h: P9 Multivariable survival analysis

13:30-15:00 h: S10 Summary

15:00-17:00 h: P10 Summary

Download all data sets here

Data sets:

  • Leukemia data SPSS (anderson.sav)
    The dataset includes remission time data for two groups of leukemia patients with 21 patients in each group.

    sex - gender
    rx - treatment (0=treatment, 1=placebo)
    logWBC - log white blood cell count, a well-known prognostic indicator of survival for leukemia patients
    lwbc3 - logWBC divided into low, medium and high values
    status - event indicator (1=relapse, 0=censored)
    survt - time to relapse or end of study
  • Antibody data SPSS (antibody.sav)
    Concentration of antibody to type II group B Streptococcus in 20 volunteers before and after immunisation.

    before - concentration before immunisation
    after - concentration after immunisation
  • Arm strength data SPSS (armstrength.sav)

    alcohol - lifetime alcohol intake (kg alcohol per kg bw)
    armstrength - strength of the deltoid muscle in non-dominant arm (kg)
  • AZT data SPSS (azt.sav)
    Response of serum antigen level to AZT in 20 AIDS patients (Makutch and Parks, 1998).

    id - subject identification number
    preazt - pre-treatment antigen level
    postazt - post-treatment antigen level
  • Blood pressure data SPSS (bloodpress.sav)
    Blood pressure data of 20 high blood pressure patients.

    BP - blood pressure
    age - age in years
    weight - weight in kg
    BSA - body surface area
    dur - duration of hypertension in years
    pulse - basal pulse in beats per minute
    stress - stress index
  • Blood pressure data SPSS (bp.sav)
    Diastolic blood pressure (mm Hg) measured on 4 subjects in a treatment group and 11 subjects in a control group.

    pressure - diastolic blood pressure of each subject
    group - treatment or control
  • Brain dominance data SPSS (braindom.sav)
    Study into how different kinds of brain dominance (left-brained, right-brained or integrative) affect the ability to recall information of various types for a sample of 24 subjects.

    score - score in recall test
    brain - type of brain dominance (left, right, both)
  • Broccoli data SPSS (brocolli.sav)
    92 children who did not like broccoli and 77 children who liked broccoli all took new BroccoYum pills for a week. Afterwards, 14 had switched from not liking broccoli to liking it, 3 had switched in the opposite direction, and the remaining children stayed the same.

    like_before - liked broccoli before taking BroccoYum (0=no, 1=yes)
    like_after - liked broccoli after taking BroccoYum (0=no, 1=yes)
    count - frequency of observation
  • 1-Bromopropane data SPSS (bromo.sav)
    US National Toxicology Program (NTP) long-term study on 1-bromopropane (1-BP), including 50 mice of each gender exposed to four levels of 1-BP (0, 125, 250, 500 ppm) and followed until death over two years (with no interim sacrifices).

    sex - gender (0=female, 1=male)
    dose - dose of 1-bromopropane in ppm
    event - event indicator (1=death, 0=censored)
    time - time to death or end of study (730 days for males, 729 for females)
    count - frequency of observation
  • Diabetes data SPSS (btgdiabet.sav)
    Comparison of urinary beta-thromboglobulin (beta-TG) excretion in 12 normal subjects and in 12 diabetic patients (Kirkwood 2003)

    group - diabetic or normal
    btg - beta-thromboglobulin excretion
  • Caffeine data SPSS (caffeine.sav)
    Cohort of consecutive pregnant women booking to deliver their baby at one hospital

    id - subject identification number
    caffearly - serum caffeine during early pregnancy (approximately 17 weeks gestation)
    cafflate - serum caffeine during late pregnancy (approximately 36 weeks gestation)
  • CMV data SPSS (cmv.sav)
    Formaldehyde and acetone fixations were compared in a study of the cytomegalovirus antigenemia assay (Perez et al., J Clin Microbiol 1995)

    AC_detected - detected by acetone (0=no, 1=yes)
    FA_detected - detected by formaldehyde (0=no, 1=yes)
    count - frequency of observation
  • Color data SPSS (color.sav)
    Eye color & hair color of 762 children from 2 geographical regions

    region - region (1, 2)
    eyes - eye color (1=blue, 2=green, 3=brown)
    hair - hair color (1=dark, 2=medium, 3=fair, 4=black, 5=red)
    count - frequency of observation
  • Drug toxicity data SPSS (drugtox.sav)
    Patients treated with 4 doses of drug & monitored for toxicity (Hoyle: Statistical strategies for small sample research 1999)

    dose - drug dose in mg
    tox - degree of toxicity (1=mild, 2=moderate, 3=severe, 4=drug death)
    count - frequency of observation
  • Esophagitis data SPSS (esophagitis_rt.sav)
    Contains data on esophagitis of lung patients treated with radiotherapy with/without additional or concurrent chemotherapy.
    Accompanying paper
  • Galactose data SPSS (galactose.sav)
    Measurements of galactose binding in three groups of patients (Weldon).

    group - patient group: Crohn's disease, ulcerative colitis, controls
  • Head injury data SPSS (HeadInjury.sav)

    interval - time between injury and surgery
    alcohol - indication whether a person was under the influence of alcohol (0=no, 1=yes)
    anaesth - indication whether a person had a general anaesthesia during the surgery (0=no, 1=yes)
    distance - distance to hospital
    outcome (0=patient died, 1=patient recovered)
    verbal_ad - verbal response on admission to hospital
  • Hormone data SPSS (hormone.sav)
    Results of two assay experiments for a certain hormone.

    reference - the results of the assay experiment using the old (reference) method
    test - the results of the assay experiment using the new (test) method
  • Levamisole colon cancer data SPSS (LevamisoleColonCaner.sav)
    Accompanying papers: Lin (1994) and Moertel et al. (1990)

    Id - patient number
    Study - identification of the study
    Treatment - treatment type (1=observation group; 2=levamisole; 3=fluorouracil+levamisole)
    Registration date
    Obstruction (0=no occurrence; 1=occurrence)
    Perforation (0=no occurrence; 1=occurrence)
    Adherence (0=no; 1=yes)
    Pos nodes - number of positive lymph nodes
    Progression date
    Progression status (0=no occurrence; 1=occurrence)
    End follow-up - date of last contact with patient or death
    End status - vital status (0=alive; 1=dead)
    Survival time - time to death or last contact with patient (days)
    Progression time - time to progression (days)
  • Life satisfaction data SPSS (lifesatisfaction.sav)

    LifSat - life satisfaction on a scale from 1 to 100 (higher value indicates higher satisfaction)
    Age - age (years)
    Education - years of education
    Gender - sex (1=female, 0=male)
    ChildSup - children's support received on a scale from 1 to 10 (higher value indicates higher support received)
    SpouSup - spouse support received on a scale from 1 to 10 (higher value indicates higher support received)
  • Mussel data SPSS (mussel.sav)
    Allele frequencies at the Lap locus in the mussel Mytilus trossulus on the Oregon coast (McDonald and Siebenaller, Evolution 1989). Samples were taken at four estuaries, both from inside the estuary and from the marine habitat outside it. There were 3 common alleles and a couple of rare alleles, here grouped into "94" and "non-94" alleles. The data give the number of 94 and non-94 alleles by location.

    location - location (Tillamook, Yaquina, Alsea, Umpqua)
    habitat - habitat (marine, estuarine)
    allele - type of allele (94, non-94)
    count - frequency of observation
  • Space-shuttle o-ring data SPSS (oring.sav)
    Data on temperature and O-ring failures from 24 previous space shuttle flights (Feynman: What Do You Care What Other People Think?, 1988)

    oring - number of o-ring failures
    temp - temperature at take-off (F)
  • Pancreas data SPSS (pancreas.sav)
    Effectiveness of three types of pancreatic supplements on fat absorption in 6 patients with steatorrhea in grams/day (van Belle 2004).

    subject - patient number
    type - form of supplements: none (control), tablet, capsule, enteric-coated tablet
    effectiveness - effectiveness of the supplement (grams/day)
  • Pudendal nerve terminal motor latency data SPSS (pudendal.sav)
    Five year follow-up of 8 patients receiving hyperbaric oxygen therapy for faecal incontinence (Bland and Altman, 2009).

    before - initial pudendal nerve terminal motor latency (ms)
    after - pudendal nerve terminal motor latency (ms) after 5 years
  • QoL data SPSS (QoL.sav)

    Post_Qol - quality of life after cosmetic surgery
    Base_QoL - quality of life before cosmetic surgery
    surgery (0=cosmetic surgery, 1=cosmetic surgery + meeting with psychologist)
    Reason - reason for surgery (0=physical reason, 1=change of appearance)
  • Serum data SPSS (serum.sav)
    Serum triglyceride concentration in cord blood for 282 babies (Bland and Altman, 1996)

    id - subject identification number
    serumtrigl - serum triglyceride concentration
  • Stress data SPSS (stress.sav)
    Experiment to investigate whether the drugs levorphanol and/or epinephrine reduce stress. Each treatment given to five animals and the cortical sterone level was measured (Kleinbaum et al. 1998).

    level - level of cortical sterone
    levor - presence or absence of levorphanol in treatment
    epine - presence or absence of epinephrine in treatment
  • Acute toxicity trial SPSS (trial_acutetox.sav)
    Contains toxicity data of the prostate cancer patients included in the radiotherapy for localized prostate cancer trial.

    studnr - patient study number
    maxarect - acute toxicity to rectum
    maxablad - acute toxicity to bladder
  • Radiotherapy for localized prostate cancer SPSS (trial_rt.sav)
    Contains clinical data of a randomized trial with prostate cancer patients receiving radiotherapy at 2 dose levels.
    Accompanying paper
  • Trismus data SPSS (trismus.sav)

    patnr - patient's identification number
    sexe - gender
    Trismus - post-treatment trismus (mouth opening < 35mm)
    difmouth - reduction in mouth opening (mm)
    difmouth_pct - relative change to baseline (mm)
    CM_mean - mean dose contralateral masseter
    IM_mean - mean dose ipsilateral masseter
  • Tumor volume data SPSS (tumorvolume.sav)
    Accompanying paper
    SPSS code for transformations
  • Melanoma data SPSS (usemelanoma.sav)

    Mortality - mortality rate due to malignant melanoma of the skin (number of deaths per 10 million people)
    Latitude - latitude of geographic center of a state (degrees north)
  • Wheeze data SPSS (wheeze.sav)
    Cross-sectional survey among 4,010 children aged 13-14 yrs in Brazil (Cassol et al., Jornal de Pediatria 2005)

    BMI - body mass index (1=underweight, 2=normal, 3=overweight, 4=obese)
    wheeze - wheezing after exercise (0=no, 1=yes)
    count - frequency of observation
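
As an illustration of how one of the data sets above might be analysed, the counts given in the description of the Broccoli data are enough to reconstruct the paired 2x2 table and to run McNemar's test, a common choice for paired binary data. This is only a minimal sketch in Python rather than SPSS; that McNemar's test is the analysis intended in the practicals is an assumption, and the helper names `mcnemar_exact` and `mcnemar_chi2` are hypothetical.

```python
from math import comb

# Counts taken from the description of the Broccoli data set above.
n_dislike_before = 92   # children who did not like broccoli before
n_like_before = 77      # children who liked broccoli before
no_to_yes = 14          # switched from not liking to liking
yes_to_no = 3           # switched from liking to not liking

# Paired 2x2 table, keyed by (like_before, like_after) with 0=no, 1=yes.
table = {
    (0, 0): n_dislike_before - no_to_yes,
    (0, 1): no_to_yes,
    (1, 0): yes_to_no,
    (1, 1): n_like_before - yes_to_no,
}


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c.

    Under the null hypothesis, the smaller discordant count follows a
    Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def mcnemar_chi2(b, c):
    """McNemar chi-square statistic (1 df, no continuity correction)."""
    return (b - c) ** 2 / (b + c)


p_exact = mcnemar_exact(table[(0, 1)], table[(1, 0)])
chi2 = mcnemar_chi2(table[(0, 1)], table[(1, 0)])
print(f"chi-square = {chi2:.2f}, exact two-sided p = {p_exact:.4f}")
```

Only the 17 discordant pairs (14 and 3) enter the test; the children whose preference did not change carry no information about a shift. With these counts the script prints a chi-square of 7.12 and an exact two-sided p-value of 0.0127.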
Supplementary Material

The supplementary material below may be a useful resource for further reading. The list will be updated before the start of the next course.

General overview

  • Glossary of statistical terms
  • UCLA Statistical computing website (very useful!) including an overview of when to do which test with SPSS code

Medical statistics

  • A Petrie & C Sabin: Medical statistics at a glance. Blackwell.  
  • JW Twisk: Inleiding in de toegepaste biostatistiek. Elsevier.  
  • AR Feinstein: Principles of medical statistics. Chapman Hall.  
  • G van Belle, LD Fisher, P Heagerty, T Lumley: Biostatistics - A methodology for the health sciences. Wiley Interscience.  
  • British Medical Journal: Statistics Notes
    A series of short articles on the use of statistics started in 1994 by the British Medical Journal. The full text of all but the first ten articles is available here.

Design and analysis of experiments

  • John H. McDonald: Handbook of Biological Statistics
  • John H. McDonald: Biological Data Analysis Course
  • Festing MF, Altman DG. Guidelines for the design and statistical analysis of experiments using laboratory animals. ILAR J 2002; 43(4): 244-58.
  • Haseman JK. Statistical issues in the design, analysis and interpretation of animal carcinogenicity studies. Environ Health Perspect 1984; 58: 385-92.
  • Fairweather WR, Bhattacharyya A, Ceuppens PP, Heimann G, Hothorn LA, Kodell RL, Lin KK, Mager H, Middleton BJ, Slob W, Soper KA, Stallard N, Ventre J, Wright J. Biostatistical methodology in carcinogenicity studies. Drug Infor J 1998; 32: 401-421.  
  • Festing MFW. Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA. ATLA 2001; 29: 427-446.  
  • Finney DJ. 1978. Statistical Method in Biological Assay. 3rd Ed. London: Charles Griffin & Company Ltd.  
  • Montgomery DC. 1997. Design and Analysis of Experiments. 4th Ed. New York: John Wiley & Sons.  
  • Mead R. 1988. The Design of Experiments. Cambridge: Cambridge University Press.  
  • Maxwell SE, Delaney HD. 1989. Designing experiments and analyzing data. Belmont CA: Wadsworth Publishing Company.

Analysis of survival data

  • Clark TG, Bradburn MJ, Love SB, Altman DG. Survival analysis part I: basic concepts and first analyses. Br J Cancer 2003; 89(2): 232-8. Full text
  • Bradburn MJ, Clark TG, Love SB, Altman DG. Survival analysis part II: multivariate data analysis - an introduction to concepts and methods. Br J Cancer 2003; 89(3): 431-6. Full text
  • Bradburn MJ, Clark TG, Love SB, Altman DG. Survival analysis Part III: multivariate data analysis - choosing a model and assessing its adequacy and fit. Br J Cancer 2003; 89(4): 605-11. Full text
  • Clark TG, Bradburn MJ, Love SB, Altman DG. Survival analysis part IV: further concepts and methods in survival analysis. Br J Cancer 2003; 89(5): 781-6. Full text

For links to online statistical calculators and other relevant websites, please click here.



You can register by filling in this form. The 2019 course is fully booked, so if you register now you are sure of a place in the fall of 2020. If a place becomes available earlier, I will contact you.
You have to bring your own laptop with SPSS installed for the practicals.

For questions, contact Patty Lagerweij or call 020-5126973.

Fee

The course is free of charge for employees of the NKI-AvL and for Ph.D. students of the OOA (Onderzoeksschool Oncologie Amsterdam). For all others, the fee is EUR 900 including the "Introduction to SPSS", course materials and coffee/tea. The fee does not include meals.




Michael Hauptmann

Michael Hauptmann received a Ph.D. in Statistics from the University of Dortmund (Germany) in 1999. In the same year, he joined the Biostatistics Branch of the Division of Cancer Epidemiology and Genetics of the National Cancer Institute in Bethesda, Maryland (USA), as a postdoctoral fellow, and became a tenure-track investigator in 2004. From 2006 to 2019, Dr. Hauptmann was a senior statistician and head of the Biostatistics group at the Netherlands Cancer Institute in Amsterdam (The Netherlands). Since 2019, Michael Hauptmann has been a professor of Biostatistics and Registry Research at the Brandenburg Medical School in Neuruppin, Germany. His research interests include statistical methods for the evaluation of health effects from medical radiation exposure, design and statistical analysis of clinical studies for predictive marker evaluation, prediction of risks for cancer and cardiovascular disease among cancer survivors, and innovative design and statistical analysis of animal studies in cancer research. Dr. Hauptmann is a Statistical Editor for the Journal of the National Cancer Institute and the Journal of Clinical Oncology, and is a member of the International Commission on Radiological Protection and of the Board of Advisors of the Dutch Cancer Society.

Katarzyna Jozwiak

Dr. Katarzyna Jozwiak obtained a Master's degree in Applied Mathematics from Delft University in 2008, and in Econometrics and Computer Science from the University of Zielona Góra, Poland, in 2009. As a graduate student in Applied Statistics at Utrecht University, she investigated optimal designs of trials with discrete-time survival endpoints and completed her PhD in 2013. After a brief period as a software developer at Utrecht University, Dr. Jozwiak joined the Netherlands Cancer Institute in Amsterdam, where she was a statistical consultant for clinicians and other researchers of the Institute and the Antoni van Leeuwenhoek hospital until June 2019. She is now a senior statistician at the Institute of Biostatistics and Registry Research at Brandenburg Medical School Theodor Fontane in Neuruppin, Germany.

John Zavrakidis

Mr John Zavrakidis obtained his Master's degree in Statistical Science for the Life and Behavioral Sciences from Leiden University in 2017. His master's thesis investigated the proper combination of multiple imputation and cross-validation in the calibration of Cox regression models. In the summer of 2017, Mr Zavrakidis joined the Netherlands Cancer Institute in Amsterdam, where he works as a junior researcher. His main task is to develop an infrastructure for optimal design and innovative statistical analysis of animal studies conducted at the NKI.

Sander Roberti

Mr Sander Roberti obtained a Master's degree in Mathematics from Radboud University Nijmegen in 2017. His master's thesis compared different statistical methods for analysing the treatment effect using clinical trials with multiple post-treatment measurements. In December 2017, he joined the Netherlands Cancer Institute as a PhD student. His project focuses on developing methods to assess the cancer risk from therapeutic radiation exposure, incorporating data on the spatial distribution of the radiation dose in the target organ.

Anna Morra

Ms Anna Morra obtained her Master's degree in Mathematics with a specialization in Statistical Sciences at the University of Leiden. As part of her studies, she did an internship within the Statistical Genetics group in the department of Medical Statistics and Bioinformatics of the Leiden University Medical Center. In March 2016, Ms Morra joined the group of Marjanka Schmidt as a research assistant. She is currently working on statistical analyses to investigate the association of genetic variants and pre-diagnostic risk factors with the survival of breast cancer patients, with particular interest in the associations within specific tumor and treatment subtypes. In September 2016, she started her PhD trajectory.

Daniele Giardiello

Mr Daniele Giardiello obtained his MSc in Biostatistics and Experimental Statistics at the University of Milan-Bicocca in 2012. After that, he worked as a statistician at the National Cancer Institute in Milan in the group of Dr. Luigi Mariani. In November 2016, Mr Giardiello started his PhD trajectory as a statistician in the group of Marjanka Schmidt, working on a project to develop and validate an online decision aid for physicians and patients to estimate the risk of contralateral breast cancer.

