Faculty Director

Scott A. Ness, Ph.D.


KUGR Personnel

Gavin Pickett, Ph.D.

Marilee Morgan, M.S.

Tel: (505) 272-5564
Room: CRF 118

Keck-UNM Genomics Resource


A Primer for Using Microarrays in Biomedical Applications

This information is designed to help researchers at UNM learn about microarrays and decide which experiments they can do. It may also be helpful when writing grant applications. More detailed protocols and information are available from the KUGR facility.

Background, Facilities and Instrumentation

The Approaches

Getting Grants to do Microarray Experiments

Strategic and Experimental Issues

Data Analysis

Example Experimental Details
 


Background, Facilities and Instrumentation

Affymetrix GeneChip

The Affymetrix GeneChip package was purchased in late 1999 with funds from HSC, SOM and CRTC. It includes the following components:

  • Affymetrix GeneChip automated fluidics wash station

  • Affymetrix GeneChip hybridization oven.

  • Hewlett Packard GeneArray Scanner

  • LIMS Data Server

  • Several Windows NT computer workstations

  • Affymetrix GeneChip and Data Mining Tool Software packages.

Custom microarrays

In July, 2000, a $1M grant from the W.M. Keck Foundation funded the establishment of a custom glass slide microarray facility. The facility is completely operational and includes the following:

  • High-throughput and automated systems for EST clone storage, handling and preparation.

  • Robotic liquid handling workstations.

  • High-throughput facilities for PCR amplification, gel electrophoresis and purification of EST clone inserts.

  • Automated glass slide microarray printers.

  • Commercial glass slide microarray scanners.

  • Computers and software for data analysis.

The KUGR facility has already printed yeast genome arrays containing more than 13,000 spots per slide (duplicates of >6,500 yeast genes), representing more than 99% of the theoretical yeast genes. We can amplify human cDNAs from our collection of 50,000 EST clones on a per gene basis, and add those to a collection of "core" spots containing common housekeeping and control genes.

In addition, in collaboration with Sandia National Labs, the facility is participating in the development of a next generation multi-spectral microarray scanner, capable of analyzing 12 or more probes simultaneously on one array. In collaboration with the UNM/Albuquerque High Performance Computing Center, the facility is helping to implement novel software applications for the analysis of microarray/genomics data using massively parallel supercomputers. In collaboration with the UNM SOM Biocomputing Core Facility, we are helping to develop a generalized database suitable for storing and facilitating the analysis of complex microarray data sets.



The Approaches

Affymetrix GeneChip arrays

The Affymetrix system is fully functional and available for gene expression analysis of complex genomes. GeneChips containing specific probes for up to 60,000 human genes are currently available commercially from Affymetrix. Similar GeneChips specific for mouse, rat and yeast gene sets are also available. The Affymetrix system offers many advantages, including built-in hybridization and quality controls and simple software analysis tools. In general, results obtained with the Affymetrix system are of very high quality and are highly reproducible. This is the preferred way to start microarray analyses for researchers working with human or other mammalian systems. The disadvantage is the fairly high cost. Currently, Affymetrix GeneChips cost $400 each for UNM users, and five chips are required to analyze expression of all 60,000 human EST clones with a single probe. The newer U133A arrays from Affymetrix contain probes for approximately 19,000 genes, including most of the known, named and well-characterized genes. A second chip (U133B) contains probe sets for another 19,000 hypothetical genes and EST clones. Most researchers will want to start with the U133A chip. A typical experiment using up to 10 such chips will cost approximately $4,000 for the GeneChips plus up to $2,000 for additional supplies.
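For budgeting, the chip arithmetic above can be written out as a short calculation. This is only a sketch: the $400 price and the five-chip full set come from the text, while the function name and its parameters are made up for illustration, and supplies are not included.

```python
CHIP_PRICE_USD = 400      # price per GeneChip for UNM users (from the text)
CHIPS_PER_FULL_SET = 5    # chips needed to cover all 60,000 EST clones

def chip_cost(n_probes, chips_per_probe=1, replicates=1,
              chip_price=CHIP_PRICE_USD):
    """Return (number of chips, chip cost in USD) for an experiment.

    Hypothetical helper: counts one chip per probe, per chip type,
    per replicate. Supplies are not included.
    """
    n_chips = n_probes * chips_per_probe * replicates
    return n_chips, n_chips * chip_price

# Ten probes, each hybridized to a single U133A chip:
print(chip_cost(n_probes=10))                                      # (10, 4000)

# One probe hybridized to the full five-chip set:
print(chip_cost(n_probes=1, chips_per_probe=CHIPS_PER_FULL_SET))   # (5, 2000)
```

Adding replicates multiplies the chip count directly, so it is worth running the numbers before settling on an experimental design.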

Custom EST glass slide arrays

The KUGR facility currently has a collection of approximately 50,000 human EST clones, which can be amplified and spotted on custom arrays for a very modest fee. This is a much less expensive approach for researchers who know which genes to focus their efforts on, or who have already done preliminary experiments with the Affymetrix system to identify the relevant genes. Researchers interested in custom human arrays should contact the KUGR facility staff for more details.



Tips for Getting Grants to do Microarray Experiments

Recently, as microarray experiments have become accessible, adding such experiments to grant proposals has become commonplace. In addition, the NIH has offered special supplements to help investigators pay for microarray experiments. However, applicants often make a few common mistakes when including such sections in their proposals:

  • It is a mistake to add a paragraph to the end of a Specific Aim, saying "we will also pursue this aim using microarray experiments." These assays are too difficult to be tacked onto the end of an aim. The reviewers know they are difficult and expensive. If you include microarray experiments in your proposal, treat them seriously, with the thought and projected effort that they will require.

  • It is a mistake to say "we will use microarrays to identify and clone genes induced by..." Microarray assays are the ultimate fishing experiment. Any criticism that can be used for other methods, such as two-hybrid assays, cDNA library screening, etc., could also be used for proposed microarray experiments. There must be follow up experiments for confirming microarray data. Different types of experiments must be used to confirm that the genes are actually induced and that they are important.

  • It is a mistake to think that only a handful of genes will be induced by any treatment. Published experiments have shown that as many as 10% of the genes change expression when proteins like p53, for example, are activated. Important events are likely to cause very many changes in gene expression. Don't expect microarray assays to identify the one and only drug-induced gene. Instead, be prepared to study hundreds of genes whose expression changes.

  • It is a mistake to think that all the EST clones on the microarrays encode proteins with known functions. Remember, only about 10% of all genes even have names. Far fewer have been studied in any real detail. The microarrays containing up to 60,000 EST clones allow researchers to study the expression of many thousands of genes of totally unknown function. Furthermore, most EST clones have been incompletely sequenced. Until the human genome project is finished (1-2 years), it will not be possible to analyze an EST clone sequence and identify the open reading frame. Therefore, it may not be possible to analyze the structure of proteins encoded by EST clones whose expression changes in interesting ways.

  • Microarrays offer a tremendous tool for analyzing gene expression. However, from an experimental and grant application point of view, they are nothing but fishing experiments. Consider this when writing a grant application, and be prepared for the criticism.

Some of the most successful uses of microarrays to date have been in the analysis of gene expression patterns, rather than for the identification of specific inducible genes. For example, researchers have used microarrays to identify patterns of gene expression that identify specific types of tumors (e.g. ALL vs. CLL). This is an excellent approach for a grant application, since it does not rely on subsequent, unpredictable analysis of identified genes, but still pays for and requires the use of microarrays.

One might propose to use microarrays to compare two cell types (e.g. control vs. drug treated), in order to see whether gene expression changes occur. Furthermore, one could study these patterns of gene expression in different drug-treated individuals, to see whether the patterns are indicative of clinical outcome. In this case, the microarrays provide an assay, rather than a fishing expedition. This approach will yield valuable preliminary data that can be used to justify the detailed analysis of specific drug-induced genes at a later date (or in a separate grant application). This approach is much more likely to be viewed favorably by reviewers, and will still provide resources for microarray experiments.



Strategic and Experimental Issues

What microarrays can do.

Microarrays (Affymetrix or glass slide) allow researchers to analyze the expression of many genes simultaneously. In contrast to Northern blots, RT-PCR or other assays, microarrays can measure the expression of thousands of different genes at the same time.

Outline of the procedure.

  • Total RNA is prepared from the samples. These might be different cell lines, tumor samples, normal vs. disease, control or drug-treated, etc. In most cases, a minimum of 1 µg of polyA+ RNA or 5 µg of total RNA is required. However, some protocols are written using as little as 0.2 µg of "good quality" polyA+ RNA. As a general rule, 1x10^6 tissue culture cells should yield 10-15 µg of total RNA.

  • Using Reverse Transcriptase, the RNA is converted into cDNA. At this point the cDNA can be labeled directly (by incorporation of fluorescently-tagged dNTPs). More commonly, the cDNA is prepared using an oligo-dT primer that incorporates a T7 RNA polymerase promoter. The cDNA is then used in a subsequent step to make fluorescently-tagged copy RNA, using T7 RNA polymerase. In general, at least 5 µg of labeled cRNA or cDNA is required for hybridizing to each microarray. However, the probes can be reused. For example, one labeled probe can be used sequentially to hybridize to five separate Affymetrix GeneChips.

  • The fluorescently-labeled probes are hybridized to the microarrays, much as radioactive probes are hybridized to conventional dot-blots. Affymetrix is a one-color system, so each probe is hybridized to a separate array, or GeneChip. The custom glass slide arrays can use dual colors (multiple colors in the future), so two probes, one labeled red and the other green, can be hybridized simultaneously to a single microarray.

  • After washing, the microarrays are analyzed using a fluorescent scanner: a cross between a typical flat-bed scanner and a confocal microscope. The data is an image of the fluorescent spots on the microarray.

  • The image is analyzed using software that identifies the spots and calculates the intensity of the fluorescence in each one. By comparing the intensities obtained with two different probes (e.g. control vs. drug-treated), one can determine how the expression of each gene in the array changes.

Problems and pitfalls.

The microarray approach is extremely powerful, and is being used by a wide variety of researchers in different fields. However, there are some things to consider:

  • Microarray experiments, especially using Affymetrix GeneChips, can be very expensive. For example, using Affymetrix to analyze the expression of all 60,000 human EST clones with just two probes (e.g. control vs. drug treated), in duplicate, would cost at least $5,000.

  • Microarrays can produce huge amounts of data very quickly. Users must have ideas about how to interpret the data before beginning the experiment.

  • The results obtained from microarray experiments can be very complex. Instead of observing just one or a few genes during a time-course experiment, microarrays allow researchers to study the expression of all the genes simultaneously. In addition, especially with patient samples, microarrays detect differences between individuals. Researchers must have ways of deciphering the results, in order to identify genes that change because of the drug treatment, not just because the patients were different individuals.

  • Microarrays, which measure gene expression, cannot detect changes that occur post-transcriptionally (e.g. signaling mechanisms). Consequently, many important events are not accessible using microarray experiments.



Data Analysis

Microarray experiments generate very large data sets. One of the most difficult, challenging, and often overlooked parts of these assays is the data analysis. Microarray data analysis can be divided into two parts. First, there is a significant amount of post-processing involved in converting the actual microarray images into numeric values. For the Affymetrix system, this initial analysis is mostly automatic, with the exception that the user must stipulate how the data should be normalized. For the custom glass slide arrays, which are composed of individual spots of DNA attached to a glass microscope slide, the first key step in the analysis is 'spot-finding'. This is an extremely important step, since the spotted arrays, and the spots in them, can be variable. Spot-finding is accomplished by software packages available in the KUGR facility.
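To illustrate what a normalization choice means in practice, here is a minimal sketch of one simple option, global scaling, in which every value on an array is multiplied by a single factor so that the array's mean intensity matches a fixed target. The target of 500, the function name and the intensities are assumptions made for this example; they are not the settings of any particular software package.

```python
from statistics import mean

def scale_normalize(intensities, target=500.0):
    """Scale one array's intensities so that their mean equals `target`.

    `target` is an arbitrary illustrative value, not a software default.
    """
    current = mean(intensities)
    if current <= 0:
        raise ValueError("mean intensity must be positive")
    factor = target / current
    return [value * factor for value in intensities]

# Made-up raw intensities for the same four genes on two arrays;
# the second array was simply scanned brighter overall.
array_a = [120.0, 800.0, 45.0, 2300.0]
array_b = [240.0, 1500.0, 95.0, 4700.0]

norm_a = scale_normalize(array_a)
norm_b = scale_normalize(array_b)

# After scaling, both arrays have the same mean intensity,
# so values can be compared between them.
print(round(mean(norm_a), 6), round(mean(norm_b), 6))
```

Global scaling assumes that most genes do not change between the samples. Other normalization schemes exist, and the appropriate choice depends on the experiment.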

Pairwise comparisons with scatter plots:

The simplest way to analyze microarray data is to compare two types of samples using a scatter plot. In this analysis, the X and Y axes represent expression levels in the two samples. For each gene on the array, a single dot is placed at the intersection of these two values. Most of the dots line up along the middle of the plot; these represent genes that are expressed at similar levels in the two samples. Dots that fall off the middle line represent genes that are expressed at significantly higher levels in one of the two samples.
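The same comparison can be made numerically. The sketch below flags the genes that would fall off the middle line, using the log2 ratio of the two intensities. The two-fold cutoff, the intensity floor, the function name and the gene names are all assumptions chosen for illustration.

```python
import math

def off_diagonal_genes(sample_x, sample_y, min_fold=2.0, floor=1.0):
    """Return {gene: log2(y / x)} for genes differing by at least `min_fold`.

    `sample_x` and `sample_y` map gene names to normalized intensities.
    Values below `floor` are raised to `floor` to avoid dividing by zero.
    """
    cutoff = math.log2(min_fold)
    flagged = {}
    for gene in sample_x.keys() & sample_y.keys():
        x = max(sample_x[gene], floor)
        y = max(sample_y[gene], floor)
        log_ratio = math.log2(y / x)
        if abs(log_ratio) >= cutoff:
            flagged[gene] = log_ratio
    return flagged

# Made-up intensities for three genes.
control = {"gene1": 100.0, "gene2": 400.0, "gene3": 50.0}
treated = {"gene1": 110.0, "gene2": 100.0, "gene3": 400.0}

# gene1 stays near the diagonal; gene2 is lower and gene3 higher in treated.
print(off_diagonal_genes(control, treated))
```

A positive log ratio means higher expression in the second sample and a negative one means lower. A fixed fold-change cutoff says nothing about noise, which is why replicates matter.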

Cluster analyses:

Although scatter plots are useful for pairwise analyses, they are difficult to use when analyzing more complex data sets. Clustering algorithms provide a way of identifying genes that are co-expressed under different conditions. For example, if comparing just two samples (e.g. mouse strains A and B), one can imagine that there should be at least nine clusters:

Off in both.
Moderate in both.
High in both.
High in A, off in B.
High in A, moderate in B.
Moderate in A, off in B.
Moderate in A, high in B.
Off in A, moderate in B.
Off in A, high in B.
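The nine clusters listed above are simply every pairing of three expression levels across the two samples. As a minimal sketch, the following bins each gene by level and groups genes by their pair of levels. The intensity cutoffs, function names and gene names are hypothetical. Real clustering algorithms do not rely on fixed cutoffs like these.

```python
from itertools import product

LEVELS = ("off", "moderate", "high")

def level(intensity, moderate_min=100.0, high_min=1000.0):
    """Bin an intensity as off / moderate / high (cutoffs are hypothetical)."""
    if intensity >= high_min:
        return "high"
    if intensity >= moderate_min:
        return "moderate"
    return "off"

def assign_clusters(strain_a, strain_b):
    """Group genes by their (level in A, level in B) pair."""
    # Start with all 3 x 3 = 9 possible clusters, each empty.
    clusters = {pair: [] for pair in product(LEVELS, repeat=2)}
    for gene in sorted(strain_a.keys() & strain_b.keys()):
        pair = (level(strain_a[gene]), level(strain_b[gene]))
        clusters[pair].append(gene)
    return clusters

# Made-up intensities for three genes in mouse strains A and B.
strain_a = {"g1": 20.0, "g2": 1500.0, "g3": 300.0}
strain_b = {"g1": 30.0, "g2": 40.0, "g3": 2500.0}

clusters = assign_clusters(strain_a, strain_b)
print(len(clusters))                   # 9
print(clusters[("high", "off")])       # ['g2']
print(clusters[("moderate", "high")])  # ['g3']
```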

Of course, the problem is that genes can 'land' in these clusters either because they are expressed at meaningfully different levels, or because of random 'noise'. Distinguishing between these possibilities is the difficult part. In general, by doing multiple replicate experiments, one should be able to get rid of noise, and find the meaningful differences.

Most good software packages (e.g. Affymetrix Data Mining Tool v.2) allow the user to analyze data from replicate samples. This can include averaging the replicates and performing statistics, such as t-tests, prior to performing cluster analyses. No matter how the data is analyzed, it is important to think of how the microarray data might be validated using other methods (e.g. real time PCR).
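As a rough illustration of what averaging replicates and computing a t statistic involves for a single gene, here is a sketch using only the Python standard library. It uses Welch's form of the t statistic, which does not assume equal variances in the two groups; that choice, the function name and the replicate values are assumptions for this example and are not taken from any particular software package.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of replicate measurements."""
    if len(group_a) < 2 or len(group_b) < 2:
        raise ValueError("need at least two replicates per group")
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    if standard_error == 0:
        raise ValueError("no variation within either group; t is undefined")
    return (mean(group_a) - mean(group_b)) / standard_error

# Made-up intensities for one gene, three replicates per condition.
control = [100.0, 110.0, 90.0]
treated = [400.0, 380.0, 420.0]

print(mean(control), mean(treated))  # averaged replicates
print(welch_t(treated, control))     # large |t| suggests a real difference
```

Turning the t statistic into a p-value requires the t distribution and the degrees of freedom, which a statistics package would provide. With thousands of genes tested at once, some will look significant by chance, which is one more reason to validate by other methods.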




Example experimental details

In the following example, adenovirus vectors will be used to express v-Myb in tissue culture cells, then changes in gene expression will be monitored using Affymetrix microarrays. Custom glass slide microarrays could be used in a very similar fashion.  Also, I have included a lot of detail here. You may want to summarize this more briefly.

Expression vectors and preparation of RNA and labeled probes:

To identify v-Myb-regulated genes, recombinant adenovirus vectors will be used to express GFP (as a negative control) or GFP plus v-Myb. The human cell line HepG2 will be used as the recipient target, since preliminary experiments have already shown that v-Myb can activate Myb-responsive genes in this cell line. Briefly, HepG2 cells will be grown to near confluence, then infected with the recombinant adenoviruses at a multiplicity of infection previously determined to give maximal Myb (or GFP) protein expression (usually an MOI of about 100). After 16 hours, total RNA will be prepared and purified using the RNeasy kit (Qiagen). RNA will also be prepared from control, uninfected cells for comparison. In our hands, this method yields approximately 100 µg of total RNA from one 150 cm culture dish, or approximately 1x10^7 HepG2 cells.

The following procedures are all recommended by Affymetrix: Approximately 10 µg of purified total RNA will be converted to double stranded cDNA using the Superscript Choice system (Gibco) and an mRNA-specific primer incorporating both a T7-RNA polymerase promoter and a 24 nucleotide long stretch of dT (T7-(dT)24 primer, GENSET). Biotin-labeled cRNA will then be generated using the BioArray HighYield kit (Enzo), followed by purification on RNeasy spin columns (Qiagen). Based on preliminary experiments, we expect to recover approximately 45 µg of purified, biotin-labeled cRNA in this procedure. Biotin-labeled control probe (spike) cRNAs will be prepared directly from plasmids as recommended by Affymetrix.

Microarrays and hybridization:

The biotin-labeled experimental and control probe cRNAs will be fragmented to an average size of approximately 100-200 nucleotides to facilitate hybridization. Control oligo B2 (Affymetrix) will also be included, to aid in alignment of the array grids. Probes will be hybridized overnight against Affymetrix U95 GeneChip arrays A-E. These arrays contain spots representing approximately 60,000 unique human EST clones. The GeneChip arrays will be hybridized, washed and scanned according to the manufacturer's recommendations, using the equipment available in our microarray facility.

Data analysis:

The GeneChip software (Affymetrix) will be used to analyze the gene expression data, and to compare the results from uninfected cells and from cells infected with control or v-Myb-expressing adenoviruses. Our expectation is that v-Myb will strongly induce the Myb-regulated genes in HepG2 cells. Induction of any gene by the control GFP virus is assumed to be caused by the adenovirus vector, and such genes will be excluded from our data set.
