# An Introduction to Medical Statistics Third Edition Martin Bland

Contents
Sections marked * contain material usually found only in postgraduate courses

1 Introduction 1

1.1 Statistics and medicine 1

1.2 Statistics and mathematics 2

1.3 Statistics and computing 2

1.4 The scope of this book 3

2 The design of experiments 5

2.1 Comparing treatments 5

2.2 Random allocation 7

2.3 * Methods of allocation without random numbers 11

2.4 Volunteer bias 13

2.5 Intention to treat 14

2.6 Cross-over designs 15

2.7 Selection of subjects for clinical trials 16

2.8 Response bias and placebos 17

2.9 Assessment bias and double blind studies 19

2.10 * Laboratory experiments 20

2.11 * Experimental units 21

2.12 * Consent in clinical trials 22

2M Multiple choice questions 1 to 6 24

2E Exercise: The ‘Know Your Midwife’ trial 25

3 Sampling and observational studies 26

3.1 Observational studies 26

3.2 Censuses 27

3.3 Sampling 27

3.4 Random sampling 29

3.5 Sampling in clinical and epidemiological studies 32

3.6 Cross-sectional studies 34

3.7 Cohort studies 36

3.8 Case-control studies 37

3.9 * Questionnaire bias in observational studies 40

3.10 * Ecological studies 42

3M Multiple choice questions 7 to 13 43
3E Exercise: Campylobacter jejuni infection 45
4 Summarizing data 47
4.1 Types of data 47
4.2 Frequency distributions 47
4.3 Histograms and other frequency graphs 50
4.4 Shapes of frequency distribution 54
4.5 Medians and quantiles 56
4.6 The mean 59
4.7 Variance, range and interquartile range 59
4.8 Standard deviation 62
4A Appendix: The divisor for the variance 63
4B Appendix: Formulae for the sum of squares 64
4M Multiple choice questions 14 to 19 65
4E Exercise: Mean and standard deviation 66
5 Presenting data 68
5.1 Rates and proportions 68
5.2 Significant figures 69
5.3 Presenting tables 71
5.4 Pie charts 72
5.5 Bar charts 73
5.6 Scatter diagrams 75
5.7 Line graphs and time series 77
5.9 Logarithmic scales 81
5A Appendix: Logarithms 82
5M Multiple choice questions 20 to 24 84
5E Exercise: Creating graphs 86
6 Probability 87
6.1 Probability 87
6.2 Properties of probability 88
6.3 Probability distributions and random variables 88
6.4 The Binomial distribution 89
6.5 Mean and variance 92
6.6 Properties of means and variances 93
6.7 * The Poisson distribution 95
6.8 * Conditional probability 96
6A Appendix: Permutations and combinations 97
6B Appendix: Expected value of a sum of squares 98
6M Multiple choice questions 25 to 31 100
6E Exercise: Probability and the life table 101
7 The Normal distribution 103
7.1 Probability for continuous variables 103
7.2 The Normal distribution 106

7.3 Properties of the Normal distribution 108

7.4 Variables which follow a Normal distribution 112

7.5 The Normal plot 114

7A Appendix: Chi-squared, t, and F 118

7M Multiple choice questions 32 to 37 120

7E Exercise: A Normal plot 121
8 Estimation

8.1 Sampling distributions

8.2 Standard error of a sample mean

8.3 Confidence intervals

8.4 Standard error and confidence interval for a proportion

8.5 The difference between two means

8.6 Comparison of two proportions

8.7 * Standard error of a sample standard deviation

8.8 * Confidence interval for a proportion when numbers are small

8.9 * Confidence interval for a median and other quantiles

8.10 What is the correct confidence interval?

8M Multiple choice questions 38 to 43

8E Exercise: Means of large samples
9 Significance tests 137
9.1 Testing a hypothesis 137
9.2 An example: The sign test 138
9.3 Principles of significance tests 139
9.4 Significance levels and types of error 140
9.5 One- and two-sided tests of significance 141
9.6 Significant, real and important 142
9.7 Comparing the means of large samples 143
9.8 Comparison of two proportions 145
9.9 * The power of a test 147
9.10 * Multiple significance tests 148
9.11 * Repeated significance tests and sequential analysis 151
9M Multiple choice questions 44 to 49 152
9E Exercise: Crohn’s disease and cornflakes 153

10 Comparing the means of small samples 156

10.1 The t distribution 156

10.2 The one-sample t method 159

10.3 The means of two independent samples 162

10.4 The use of transformations 164

10.5 Deviations from the assumptions of t methods 167
10.6 What is a large sample? 168

10.7 * Serial data 169

10.8 * Comparing two variances by the F test 171

10.9 * Comparing several means using analysis of variance 172

10.10 * Assumptions of the analysis of variance 175

10.11 * Comparison of means after analysis of variance 175

10.12 * Random effects in analysis of variance 177

10.13 * Units of analysis and cluster-randomized trials 179

10A Appendix: The ratio mean/standard error 181

10M Multiple choice questions 50 to 56 182

10E Exercise: The paired t method 183

11 Regression and correlation 185

11.1 Scatter diagrams 185

11.2 Regression 185

11.3 The method of least squares 187

11.4 * The regression of X on Y 190

11.5 The standard error of the regression coefficient 191

11.6 * Using the regression line for prediction 192

11.7 * Analysis of residuals 194

11.8 * Deviations from assumptions in regression 196

11.9 Correlation 197

11.10 Significance test and confidence interval for r 200

11.11 Uses of the correlation coefficient 202

11.12 * Using repeated observations 202

11.13 * Intraclass correlation 204

11A Appendix: The least squares estimates 205

11B Appendix: Variance about the regression line 205

11C Appendix: The standard error of b 206

11M Multiple choice questions 57 to 61 207

11E Exercise: Comparing two regression lines 208

12 Methods based on rank order 210

12.1 * Non-parametric methods 210

12.2 * The Mann-Whitney U test 211

12.3 * The Wilcoxon matched pairs test 217

12.4 * Spearman’s rank correlation coefficient, ρ 220

12.5 * Kendall’s rank correlation coefficient, τ 222

12.6 * Continuity corrections 225

12.7 * Parametric or non-parametric methods? 226

12M * Multiple choice questions 62 to 66 227

12E * Exercise: Application of rank methods 228

13 The analysis of cross-tabulations 230

13.1 The chi-squared test for association 230
13.2 Tests for 2 by 2 tables 233

13.3 The chi-squared test for small samples 234

13.4 Fisher’s exact test 236

13.5 Yates’ continuity correction for the 2 by 2 table 238

13.6 * The validity of Fisher’s and Yates’ methods 239

13.7 Odds and odds ratios 240

13.8 * The chi-squared test for trend 243

13.9 * Methods for matched samples 245

13.10 * The chi-squared goodness of fit test 248

13A Appendix: Why the chi-squared test works 249

13B Appendix: The formula for Fisher’s exact test 251

13C Appendix: Standard error for the log odds ratio 252

13M Multiple choice questions 67 to 73 253

13E Exercise: Admissions to hospital in a heatwave 255

14 Choosing the statistical method 257

14.1 * Method oriented and problem oriented teaching 257

14.2 * Types of data 257

14.3 * Comparing two groups 258

14.4 * One sample and paired samples 260

14.5 * Relationship between two variables 261

14M Multiple choice questions 74 to 80 263

14E * Exercise: Choosing a statistical method 265

15 Clinical measurement 268

15.1 Making measurements 268

15.2 * Repeatability and measurement error 269

15.3 * Comparing two methods of measurement 272

15.4 Sensitivity and specificity 275

15.5 Normal range or reference interval 279

15.6 * Survival data 281

15.7 * Computer aided diagnosis 288

15.8 * Number needed to treat 290

15M Multiple choice questions 81 to 86 291

15E Exercise: A reference interval 292

16 Mortality statistics and population structure 294

16.1 Mortality rates 294

16.2 Age standardization using the direct method 296

16.3 Age standardization by the indirect method 296

16.4 Demographic life tables 299

16.5 Vital statistics 302

16.6 The population pyramid 303

16M Multiple choice questions 87 to 92 305

16E Exercise: Deaths from volatile substance abuse 307
17 Multifactorial methods 308

17.1 * Multiple regression 308

17.2 * Significance tests and estimation in multiple regression 310

17.3 * Interaction in multiple regression 313

17.4 * Polynomial regression 314

17.5 * Assumptions of multiple regression 315

17.6 * Qualitative predictor variables 316

17.7 * Multi-way analysis of variance 318

17.8 * Logistic regression 321

17.9 * Survival data using Cox regression 324

17.10 * Stepwise regression 326

17.11 * Meta-analysis: Data from several studies 326

17.12 * Other multifactorial methods 330

17M * Multiple choice questions 93 to 97 330

17E * Exercise: A multiple regression analysis 333

18 Determination of sample size 335

18.1 * Estimation of a population mean 335

18.2 * Estimation of a population proportion 336

18.3 * Sample size for significance tests 336

18.4 * Comparison of two means 339

18.5 * Comparison of two proportions 341

18.6 * Detecting a correlation 343

18.7 * Accuracy of the estimated sample size 344

18.8 * Trials randomized in clusters 344

18M * Multiple choice questions 98 to 100 346

18E * Exercise: Estimation of sample sizes 347

19 Solutions to exercises 348

References 381

Index 391
