Methods of Microarray Data Analysis II: Papers from CAMDA ’01

Microarray technology is a major experimental tool for functional genomic explorations.


Looks at interesting computational aspects of sequence-based biology, and sometimes takes an odd sideways view of problems, getting deep into the underlying mathematics.

A great complement to a more straightforward survey book. Runs through the algorithmics of restriction digests to microarrays, including, of course, sequencing by hybridization. Covers algorithmics of genomic comparisons. Great source book for a serious comp bio student. From DNA spotted on glass to enzymatic oligo arrays to ink jets, to microelectronic arrays.

Written early on when many different approaches were being explored. Great source for technical information. Written by the guy behind KEGG. An eclectic choice of topics, first databases, then sequence analysis basics, then network analysis. Had a few pages on comparing networks that I found useful. Very short, too short to explain how to do things, but describes what to do, issues to consider, and what results will look like.

The linked site has example R and BioConductor code. Lin and Kimberly F. In addition, four variants of these probes are printed on the array: Deleted bases are chosen at random from the center of the probe sequence.

Negative controls: Arabidopsis control spots can be used for spike-in experiments or as negative control spots. When Arabidopsis fragments are not used as a spike-in, we should see low signal intensities in the Arabidopsis control and the structural control spots. Probes with signal near or below the negative controls cannot be estimated reliably, even if the signal is greater than local background. Positive controls: Many of the positive control probes are printed multiple times across the slide.

These positive controls are determined empirically to be present in high abundance in many sample types.



Some of the positive probes are from genomic regions a. To gain some perspective on the spatial variability in each hybridization experiment, one can track the signal variation of each type of positive control that is arrayed multiple times across the slide. We expect these probes to have similar signal intensities across the array.
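The check on replicated positive controls can be sketched as a coefficient of variation per control. This is a minimal sketch: the control names, the intensities, and the 0.2 cutoff are illustrative assumptions, not values from the protocol.

```python
from statistics import mean, stdev

def control_cv(intensities):
    """Coefficient of variation (sample SD / mean) of one replicated control."""
    m = mean(intensities)
    if m == 0:
        raise ValueError("mean intensity is zero")
    return stdev(intensities) / m

def flag_variable_controls(controls, max_cv=0.2):
    """Return names of controls whose replicate spots vary more than max_cv.

    `controls` maps a control name to the intensities of its replicate spots.
    The default cutoff is an assumption chosen only for illustration.
    """
    return sorted(name for name, values in controls.items()
                  if len(values) > 1 and control_cv(values) > max_cv)

# Hypothetical replicate intensities for two positive controls.
controls = {
    "pos_ctrl_A": [1000.0, 1050.0, 980.0, 1020.0],
    "pos_ctrl_B": [900.0, 300.0, 1500.0, 700.0],
}
flagged = flag_variable_controls(controls)
```

A control that ends up in `flagged` suggests spatial effects on that slide that deserve a closer look before normalization.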

Data cleaning: Most image quantification programs flag spots that do not pass internal QC criteria; these spots should be removed from subsequent analysis. Probes that fall below pre-determined thresholds should be flagged for potential dismissal: Threshold for signal-to-noise ratio. The criterion for threshold determination should be customized. The expected distribution should be derived from the actual distribution of relevant parameters across the chips. Composite scores may also be useful 6. If it is desirable to have values for every probe, missing values may be imputed by a number of standard approaches: Singular value decomposition 2.
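The flag-and-threshold cleaning step might look like the following. It is a sketch under stated assumptions: the spot record fields (`fg`, `bg`, `bg_sd`, `flagged`), the definition of signal-to-noise as background-subtracted signal over background SD, and the cutoff of 3 are all hypothetical and should be replaced by criteria derived from the actual chips.

```python
def signal_to_noise(foreground, background, background_sd):
    """Background-subtracted signal divided by the background SD."""
    if background_sd <= 0:
        raise ValueError("background SD must be positive")
    return (foreground - background) / background_sd

def clean_spots(spots, min_snr=3.0):
    """Keep IDs of spots that are unflagged and meet the S/N threshold.

    min_snr=3.0 is an illustrative assumption, not a recommended value.
    """
    kept = []
    for spot in spots:
        if spot["flagged"]:          # flagged by the image quantification program
            continue
        snr = signal_to_noise(spot["fg"], spot["bg"], spot["bg_sd"])
        if snr >= min_snr:
            kept.append(spot["id"])
    return kept

# Hypothetical spot records.
spots = [
    {"id": "p1", "fg": 1200.0, "bg": 200.0, "bg_sd": 50.0, "flagged": False},
    {"id": "p2", "fg": 260.0, "bg": 200.0, "bg_sd": 50.0, "flagged": False},
    {"id": "p3", "fg": 5000.0, "bg": 200.0, "bg_sd": 50.0, "flagged": True},
]
kept = clean_spots(spots)
```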

The array is designed to tile CpG islands; hence, a probe with a missing value will often be flanked by probes with signal above the established threshold. The compression of ratios between 0 and 1 may be problematic for downstream data analysis. Log2 transformation is often used to transform signal intensities prior to expressing them as fold changes. Notations used for log2 transformation are as follows: Intra-slide normalization (adjustment for non-biological differences between the two channels) 1. Locally weighted loess regression (9, 10), where the values of A are binned and a linear polynomial is fit to the binned data.


L is smoothed at the boundaries of the bins so that the function is continuous in A. Robust linear regression (11, 12), where L is a linear polynomial in terms of A across multiple slides and replicates. S is a robust estimate of the scale, such as the median absolute deviation or the loess regression of the absolute mean-normalized log-ratio on A. A spatial plot of the M values can often reveal the need for intra-slide normalization as well as which normalization procedure to employ. MA-plots are 2-dimensional scatter plots of M against A.
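To make M, A, and the intensity-dependent correction concrete, here is a minimal sketch. It assumes the conventional definitions M = log2(R/G) and A = (1/2) log2(R·G) for the two channel intensities R and G, and it substitutes a simple tricube-weighted local linear fit for a full loess implementation; the `span` value is an arbitrary assumption.

```python
import math

def ma_values(red, green):
    """Return (M, A): M = log2(R/G), A = 0.5 * log2(R*G) for each spot."""
    M = [math.log2(r / g) for r, g in zip(red, green)]
    A = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return M, A

def local_linear_fit(A, M, a0, span=0.5):
    """Tricube-weighted linear fit of M on A, evaluated at a0."""
    k = min(len(A), max(2, math.ceil(span * len(A))))
    dists = sorted(abs(a - a0) for a in A)
    h = dists[k - 1] or 1e-12            # bandwidth: distance to k-th nearest point
    w = [(1 - min(abs(a - a0) / h, 1.0) ** 3) ** 3 for a in A]
    sw = sum(w)
    a_bar = sum(wi * a for wi, a in zip(w, A)) / sw
    m_bar = sum(wi * m for wi, m in zip(w, M)) / sw
    sxx = sum(wi * (a - a_bar) ** 2 for wi, a in zip(w, A))
    sxy = sum(wi * (a - a_bar) * (m - m_bar) for wi, a, m in zip(w, A, M))
    slope = sxy / sxx if sxx > 0 else 0.0
    return m_bar + slope * (a0 - a_bar)

def normalize_intensity_dependent(M, A, span=0.5):
    """Subtract the locally fitted trend L(A) from each M value."""
    return [m - local_linear_fit(A, M, a, span) for m, a in zip(M, A)]

# Synthetic example: a purely intensity-dependent dye bias, M = 0.2*A - 1.
A_demo = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0]
M_demo = [0.2 * a - 1.0 for a in A_demo]
M_norm = normalize_intensity_dependent(M_demo, A_demo)
```

After the correction, `M_norm` is essentially zero everywhere, which is what a flat MA-plot looks like. For real data, an established implementation of loess normalization is a safer choice than this sketch.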

There should be no discernible pattern relating M to A. Convert the quantitative values of M into a color intensity. Two common approaches are as follows: This color scheme is useful for detecting if dye abundance is correlated with spatial location. Set the color value to blue for the probe with the lowest M value, yellow for the probe with the largest M value, and a continuous color gradient for values in between. This color scheme is useful for detecting a correlation between the relative ranking of M-values and spot location. Plot the colors for each probe on a 2-dimensional plot where the x-y coordinate is associated with the location of the probe on the array.
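The blue-to-yellow mapping described above can be sketched as a linear interpolation in RGB space. The `(row, column, M)` input layout and the choice of a linear gradient are assumptions made for illustration; any plotting library can then draw the resulting colors at the probe coordinates.

```python
def m_to_color(m, m_min, m_max):
    """Map M linearly onto an RGB gradient from blue (lowest) to yellow (highest)."""
    t = 0.5 if m_max == m_min else (m - m_min) / (m_max - m_min)
    level = round(255 * t)
    return (level, level, 255 - level)

def spatial_colors(spots):
    """Map each (row, col, M) spot to an RGB color keyed by its array position."""
    values = [m for _, _, m in spots]
    lo, hi = min(values), max(values)
    return {(row, col): m_to_color(m, lo, hi) for row, col, m in spots}

# Hypothetical three-spot array.
grid = spatial_colors([(0, 0, -2.0), (0, 1, 0.0), (1, 0, 2.0)])
```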

Inter-slide normalization adjusts for non-biological effects between arrays. M-values should be scaled so that they have the same median absolute deviation across arrays. A transformation that brings the mean or median intensity of all the arrays to the same level. If a common reference sample has been labeled with one of the fluorescent dyes, e. The method for adjusting the other channel depends on the intra-slide normalization conducted, but it should be adjusted in a manner that does not alter the normalized log-ratio within the studied array.
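Scaling arrays to a common spread can be sketched as follows. The sketch assumes that the M-values of each array are already centered near zero by intra-slide normalization and that the target scale is the median of the per-array median absolute deviations (MADs); both choices are assumptions, and other targets are equally defensible.

```python
from statistics import median

def mad(values):
    """Median absolute deviation from the median."""
    center = median(values)
    return median([abs(v - center) for v in values])

def scale_to_common_mad(arrays):
    """Rescale each array's M-values so that all arrays share one MAD.

    `arrays` maps an array name to its list of M-values.
    """
    mads = {name: mad(values) for name, values in arrays.items()}
    if any(m == 0 for m in mads.values()):
        raise ValueError("an array has zero MAD and cannot be rescaled")
    target = median(list(mads.values()))
    return {name: [v * target / mads[name] for v in values]
            for name, values in arrays.items()}

# Hypothetical M-values for two arrays with different spreads.
arrays = {
    "array_1": [-1.0, -0.5, 0.0, 0.5, 1.0],
    "array_2": [-4.0, -2.0, 0.0, 2.0, 4.0],
}
scaled = scale_to_common_mad(arrays)
```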

If a reference sample is not used, then one can use quantile normalization to transform the M values. It is possible to correct for these effects using simple linear regression models (15). Most clustering methods will be overwhelmed by the noise in the data if the entire data set is used. A probe flagged in any chip should be removed from consideration.
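Quantile normalization forces every array to share the same empirical distribution. A minimal sketch, assuming equal-length arrays with no missing values and breaking ties by original order (a simplification; production implementations usually average tied ranks):

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of equal-length lists (one list per array)."""
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the i-th smallest value across arrays.
    reference = [sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]   # replace each value by the reference quantile
        normalized.append(out)
    return normalized

# Hypothetical M-values for two arrays of three probes each.
qn = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```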


Only probes with high variance across arrays should be considered, as probes with lower variance will not have the power necessary to distinguish between traits. Metrics: Most clustering procedures require the operator to select a distance function between the observed data points. Different metrics will likely produce different clusters. Euclidean, L2 (17): Very common and easy to understand, though hard to interpret in the setting of high-dimensional probe intensity data. Manhattan, L1 (18): Also common and easy to understand but difficult to interpret.

One minus absolute correlation (19): Highly correlated points will be closer than uncorrelated points. MI is a generalized measure of correlation since the distance is zero if and only if X and Y are statistically independent. Detecting regions of significantly differentiated methylation: A direct approach is to utilize a threshold value to make a call of significance e. One of the drawbacks of such a method is the inability to assess the methylation calls statistically.
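Three of the distance functions listed above (Euclidean, Manhattan, and one minus absolute correlation) can be written directly. This is a plain-Python sketch using Pearson correlation; the function names and the example vectors are illustrative.

```python
import math

def euclidean(x, y):
    """L2 distance between two probe profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """L1 distance between two probe profiles."""
    return sum(abs(a - b) for a, b in zip(x, y))

def one_minus_abs_correlation(x, y):
    """1 - |Pearson r|: near 0 for strongly (anti-)correlated profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - abs(sxy / math.sqrt(sxx * syy))

# Hypothetical probe profiles across three arrays.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
z = [3.0, 2.0, 1.0]
```

Here `x` and `y` are far apart in Euclidean terms but have a correlation distance of about zero, which illustrates why different metrics produce different clusters.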

This method also cannot incorporate information derived from neighboring probes (an expected trend of co-methylation in nearby genomic regions). A simple kernel smoothing function termed M-score is used to integrate probe-level information within a sliding window to portray regional methylation events. As an example, the M-score of each probe with respect to other probes within a 1-kb region of the genome bp upstream and bp downstream is calculated as follows: Probes are ranked according to their normalized log ratios.
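The idea of integrating neighboring probes in a sliding window can be illustrated with a generic window average. This is explicitly not the M-score formula, which is not reproduced in the text above; the uniform kernel and the 500 bp half-width are assumptions chosen only to show the mechanics.

```python
def window_scores(positions, log_ratios, half_width=500):
    """Mean normalized log ratio of all probes within half_width bp of each probe.

    A generic uniform-kernel smoother, used here as a stand-in for a
    window-based regional score. half_width=500 is an assumption.
    """
    scores = []
    for center in positions:
        window = [m for p, m in zip(positions, log_ratios)
                  if abs(p - center) <= half_width]
        scores.append(sum(window) / len(window))   # window always includes the probe itself
    return scores

# Hypothetical genomic positions (bp) and normalized log ratios.
regional = window_scores([100, 300, 2000], [1.0, 3.0, -2.0])
```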

Parametric tests for discovering differential methylation. Significance Analysis of Microarrays, SAM (23): A method that scores each probe intensity with respect to the change in intensity relative to the standard deviation of repeated measurements. Significance of probes with a score greater than a threshold is determined via a permutation test. Non-parametric tests for discovering differential methylation. Wilcoxon signed-rank (24): Alternative to the t-test for discovering loci that individually differentiate between two groups.

Peak detection (25): Model-based computational method for locating and testing peaks in landscape data generated using the M-score approach. The methods proposed in (20) can be easily adapted to the DMH protocol. It is, however, important to consider hyper- and hypo-methylation events independently.

Permutation tests: Often the data will not satisfy the theoretical assumptions of a given statistical test. Permutation tests allow one to estimate the empirical distribution of the test statistic. Compute the number of cases in which the test statistic from the random sample is less than the test statistic from the real data. False discovery rates and post-hoc p-value correction: For a given DMH experiment, a large number of comparisons is made, resulting in a high probability of false positives.
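A permutation test for a single probe can be sketched as below. The sketch assumes a two-group comparison with the absolute difference in means as the test statistic, and it reports the fraction of permutations that are at least as extreme as the observed value (the complement of the "less than" count described above), with the common add-one correction. The group values, `n_perm`, and `seed` are illustrative.

```python
import random
from statistics import mean

def permutation_pvalue(group_a, group_b, n_perm=2000, seed=0):
    """Empirical two-sided p-value for a difference in group means."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)              # random reassignment of group labels
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Hypothetical normalized log ratios for one probe in two sample groups.
tumor = [2.1, 2.4, 1.9, 2.2, 2.5, 2.0]
normal = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]
p_value = permutation_pvalue(tumor, normal)
```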

Therefore, the resultant p-values should be adjusted to correct for multiple testing. Bonferroni and other similar methods: Too conservative due to correlation of test statistics. Promoter CpG islands near tumor suppressor genes, transcription factors, or genes shown to be methylated in other tumor types are validated using a qualitative followed by a quantitative method.
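The contrast between Bonferroni correction and false discovery rate control can be sketched with the Benjamini-Hochberg step-up procedure. Benjamini-Hochberg is used here as one common FDR method; the text does not state which FDR procedure is intended, so treat that choice as an assumption.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: p * n, capped at 1."""
    n = len(pvalues)
    return [min(1.0, p * n) for p in pvalues]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical per-probe p-values.
raw = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
```

Every Benjamini-Hochberg value is at most the corresponding Bonferroni value, which reflects the less conservative nature of FDR control.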

Non-degenerate primers are designed against the bisulfite-converted DNA sequence to amplify the region in or around the probes identified as being differentially methylated in DMH analysis. It is important that the amplified region should contain at least one restriction site for an enzyme which has a CG-dinucleotide in its recognition site e. As bisulfite-converted DNA will be used, it is important to adjust the PCR primers so that the amplified regions will be between to bp for optimal analysis of fresh-frozen samples and to bp for archival materials. If longer regions are desired, multiple primer sets can be designed to extend the interrogation area.
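Whether a candidate amplicon keeps a CG-containing restriction site after bisulfite conversion can be checked in silico. This sketch models the top strand after conversion and PCR, treating all CpGs as either fully methylated or fully unmethylated. The example amplicon is made up, and the BstUI (CGCG) and TaqI (TCGA) recognition sequences are given only as examples of CG-containing sites; the enzyme actually used in the protocol is not named in the text above.

```python
def bisulfite_convert(seq, methylated_cpg=True):
    """Top-strand sequence after bisulfite conversion and PCR.

    Cytosines outside CpG become T. Cytosines in CpG stay C only when
    methylated_cpg is True (all CpGs are assumed to share one state).
    """
    seq = seq.upper()
    converted = []
    for i, base in enumerate(seq):
        if base == "C":
            in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            converted.append("C" if in_cpg and methylated_cpg else "T")
        else:
            converted.append(base)
    return "".join(converted)

def count_sites(seq, site):
    """Count (possibly overlapping) occurrences of a recognition sequence."""
    count, start = 0, seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count

amplicon = "ATCGCGTACCGATTCGAAG"                     # hypothetical sequence
methylated = bisulfite_convert(amplicon, methylated_cpg=True)
unmethylated = bisulfite_convert(amplicon, methylated_cpg=False)
```

In the unmethylated product every CG is lost, so no CG-containing site can be cut, while the methylated product retains (and for TaqI can even gain) sites.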

PCR conditions will vary according to the optimal annealing temperature of the primers being used, but a typical amplification program is as follows: Appropriate restriction enzyme 5 units. Both tubes are then incubated at the appropriate temperature for 1 hour, and samples from both tubes will be run out side-by-side on a 1.

Presence of MW band(s) corresponding to the size of the restricted fragment(s) in the restricted product lane is indicative of hypermethylation in the interrogated region, as sodium bisulfite will abrogate any potential restriction sites if the CG sites are unmethylated. Often, DMH analysis is performed on a subset of samples to identify regions of interest to be followed up in a large cohort of samples to derive statistical power. Neither an additional band having a much lower intensity than the expected product, nor unused primer bands interfere with the MassARRAY assay.