A simple method for statistical analysis of intensity differences in microarray-derived gene expression data

Background
Microarray experiments offer a powerful way to make and compare large numbers of gene expression measurements, either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of those changes. In the absence of many replicates from which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic. Without such estimates, overly conservative significance thresholds must be enforced.

Results
A simple statistical method is presented for estimating variances from microarray control data without requiring multiple replicates. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Applied to a dataset related to the β-catenin pathway, the method identifies a larger number of biologically reasonable genes with altered expression than the ratio method does.

Conclusions
The difference-averaging method enables determination of variances as a function of signal intensities by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.
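
The abstract does not spell out the procedure, so the following is only a minimal sketch of one way a difference-averaging estimate could work, not the authors' implementation. It assumes two control measurements of the same sample, assumes that genes are grouped into equal-sized bins by mean intensity, and assumes a power-law form sd = c * intensity**alpha (with alpha expected between 0.5 and 1, per the scaling reported above). All function names, the binning scheme, and the synthetic data are hypothetical.

```python
import numpy as np


def simulate_controls(n_genes=5000, alpha=0.75, c=0.5, seed=0):
    """Synthetic control pair (illustration only, not the published data)."""
    rng = np.random.default_rng(seed)
    mu = 10 ** rng.uniform(2, 4, n_genes)   # true intensities, 1e2..1e4
    sd = c * mu ** alpha                    # assumed noise model
    return mu + rng.normal(0, sd), mu + rng.normal(0, sd)


def difference_averaged_sd(a, b, n_bins=20):
    """Estimate SD versus intensity from two control arrays.

    a, b: intensities of the same genes on two arrays of the same sample,
    so that a - b is assumed to reflect measurement noise only.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mean = (a + b) / 2.0
    diff = a - b
    # Sort genes by mean intensity and split into roughly equal-sized bins.
    bins = np.array_split(np.argsort(mean), n_bins)
    centers = np.array([mean[idx].mean() for idx in bins])
    # Var(a - b) = 2 * sigma^2 for independent, equal-variance measurements.
    sds = np.array([np.sqrt(np.mean(diff[idx] ** 2) / 2.0) for idx in bins])
    return centers, sds


def fit_power_law(centers, sds):
    """Fit sd = c * intensity**alpha by least squares in log-log space."""
    alpha, log_c = np.polyfit(np.log(centers), np.log(sds), 1)
    return alpha, np.exp(log_c)


def z_scores(control, treated, alpha, c):
    """Score each intensity difference against the fitted noise level."""
    control = np.asarray(control, dtype=float)
    treated = np.asarray(treated, dtype=float)
    sd = c * ((control + treated) / 2.0) ** alpha
    return (treated - control) / (np.sqrt(2.0) * sd)


if __name__ == "__main__":
    a, b = simulate_controls()
    centers, sds = difference_averaged_sd(a, b)
    alpha_hat, c_hat = fit_power_law(centers, sds)
    print(f"fitted alpha = {alpha_hat:.2f}, c = {c_hat:.2f}")
```

Because every gene contributes to the estimate for its intensity bin, the noise level is pooled across the whole dataset rather than taken from per-gene replicates. Scoring a difference against an intensity-dependent SD, as in the hypothetical `z_scores` helper, is one way such an estimate could replace a fixed fold-change (ratio) cutoff.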

Data and Resources

Field Value
accessLevel public
bureauCode {009:25}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_@id https://healthdata.gov/data.json
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier https://healthdata.gov/api/views/qxmy-uies
issued 2025-07-14
landingPage https://healthdata.gov/d/qxmy-uies
modified 2025-09-06
programCode {009:033}
publisher National Institutes of Health
resource-type Dataset
source_datajson_identifier true
source_hash da444972b8dd7889ba724d0b001a843c40e131d2c72bf3b2734b18778b6b1512
source_schema_version 1.1
theme {NIH}
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • AmeriGEO
  • AmeriGEOSS
  • CKAN
  • GEO
  • GEOSS
  • National
  • North America
  • United States
  • gene-expression
  • intensity-differences
  • microarray-data
  • nih
  • statistical-analysis
isopen False
license_id notspecified
license_title License not specified
maintainer NIH
maintainer_email info@nih.gov
metadata_created 2025-09-24T16:13:40.946612
metadata_modified 2025-09-24T16:13:40.946620
num_resources 1
num_tags 13
title A simple method for statistical analysis of intensity differences in microarray-derived gene expression data