Department of Mathematics and Statistics

Cost-efficient design of observational studies


Contact person: Prof. Juha Karvanen

Efficient allocation of resources is desirable at all levels of society. Research is no exception: scientific studies, whether experimental or observational, may be very expensive to carry out.

The objective is to interpret the design problems encountered in real-life research work in the framework of Bayesian optimal design, derive guidelines for cost-efficiency, and carry out efficient analysis of the data collected according to the selected design. Here, study design refers to all decisions made about data collection in both observational and experimental setups.

Cost-efficient study design in epidemiology and genetics

In observational studies, the design decisions are related to the data collection itself. In survey sampling, the questions of sample size and sample stratification are fundamental. Unequal sampling probabilities often improve efficiency but also complicate the data analysis. In epidemiology, designs such as the case-control design and the case-cohort design are used to improve cost-efficiency. The basic principle behind these designs is to enrich the data with cases (e.g. death due to heart attack), which are relatively rare in the population to be studied. Compared to simple random samples, this leads to significantly smaller sample sizes. In two-stage or multi-stage designs, subsamples of individuals are selected for expensive or time-consuming measurements, such as genotype or biomarker specification or brain imaging, on the basis of variables measured at the first stage of the study. The multi-stage observational design resembles the batch sequential design of experiments, but there are also important differences when causality is considered.

Figure 2: In the case-cohort design, the expensive measurements are carried out only for a small subset of the cohort.
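
As a minimal illustration of the selection step in a case-cohort design, the sketch below draws a random subcohort from the full cohort and adds all cases to it. The function name case_cohort_sample, the cohort size of 10,000, the 3% case rate and the 10% subcohort fraction are hypothetical values chosen for the example only; they do not come from the studies described on this page.

```python
import random


def case_cohort_sample(is_case, subcohort_fraction, seed=0):
    """Return sorted indices of individuals selected for the expensive measurement.

    is_case: list of booleans, one per cohort member (True = case).
    subcohort_fraction: fraction of the full cohort drawn at random as the subcohort.
    """
    rng = random.Random(seed)
    n = len(is_case)
    subcohort_size = round(subcohort_fraction * n)
    # Random subcohort, drawn without regard to case status
    subcohort = set(rng.sample(range(n), subcohort_size))
    # All cases are included, whether or not they fall in the subcohort
    cases = {i for i, c in enumerate(is_case) if c}
    return sorted(subcohort | cases)


# Hypothetical cohort: 10,000 individuals, roughly 3% of whom are cases
rng = random.Random(1)
is_case = [rng.random() < 0.03 for _ in range(10_000)]

selected = case_cohort_sample(is_case, subcohort_fraction=0.10)
print(len(selected), "of", len(is_case), "individuals selected for measurement")
```

Under these assumptions, the expensive measurement is needed for at most the subcohort plus the cases, which is a small part of the whole cohort. Because cases are over-represented in the selected set, the analysis has to account for the selection mechanism.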

Selected publications:

J. Reinikainen, J. Karvanen, H. Tolonen, Optimal selection of individuals for repeated covariate measurements in follow-up studies. Statistical Methods in Medical Research, Volume 25, issue 6, pages 2420–2433, 2016.

J. Karvanen, S. Kulathinal, D. Gasbarra, Optimal designs to select individuals for genotyping conditional on observed binary or survival outcomes and non-genetic covariates. Computational Statistics & Data Analysis, Volume 53, pages 1782–1793, 2009.

S. Kulathinal, J. Karvanen, O. Saarela, K. Kuulasmaa, for the MORGAM Project, Case-cohort design in practice — experiences from The MORGAM Project. Epidemiologic Perspectives & Innovations 4:15, 2007.

Completed PhD thesis: Jaakko Reinikainen, Efficient design and modeling strategies for follow-up studies with time-varying covariates, Defense 2015-11-20

  • Article II (submitted)

Collaborators: Dr. Hanna Tolonen, Dr. Tommi Härkänen, Prof. Kari Kuulasmaa (National Institute for Health and Welfare), Dr. Olli Saarela (McGill University, Montreal) and Prof. Mikko Sillanpää (University of Oulu)

Related projects: Cost-efficient design of experiments, Non-participation in health examination surveys