## Abstract

This chapter explains how linear regression can be applied to model relationships in biological data. The example response variables include brain weight, size of fish populations, HDL (high-density lipoprotein) cholesterol level, and diabetes progression. The statistical software packages R and Stata are used to perform the analyses. The main tools the authors use to validate regression assumptions are plots involving standardized residuals and/or fitted values. The chapter then considers marginal model plots, which have wider application than residual plots. Examination of the residual plots shows whether the assumption of constant error variance is reasonable. The chapter discusses how transforming the variables can lead to a valid model. It also shows how to assess the extent of collinearity among the predictor variables.

Original language | English |
---|---|
Title of host publication | Biological Knowledge Discovery Handbook |
Subtitle of host publication | Preprocessing, Mining and Postprocessing of Biological Data |
Pages | 445-475 |
Number of pages | 31 |
ISBN (Electronic) | 9781118617151 |
DOIs | |
State | Published - 2014 |

### Bibliographical note

Publisher Copyright: © 2014 John Wiley & Sons, Inc.

## ASJC Scopus subject areas

- Computer Science (all)