GenePattern 3.0 is now available. New features in this release include:
- Web Client:
The Web Client is completely redesigned to improve usability and include features
previously available only in the Java Desktop Client.
- Pipelines: You can now build pipelines that are more complex and easier to use.
You can build pipelines that include other pipelines. You can provide your own names and descriptions for the
parameters passed to a pipeline.
- Module execution: You can now define a command line prefix for all modules or for individual modules.
This allows you, for example,
to use clustering service software (SGE, LSF) to send different modules to different job queues.
- Security: GenePattern now provides individual user accounts,
optional password protection, and a more granular security model.
- User settings: Individual user accounts allow
users to customize GenePattern by determining how many jobs to display in
the "recent jobs" list and how much memory to assign to GenePattern visualizers.
- Server Administration: A greater number of
GenePattern server settings are now customizable from the Web Client. Problems that
you may have encountered using proxy settings to install modules from the Broad repository have been corrected.
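As a rough illustration of the command line prefix idea described above, the following Python sketch shows how a prefix might be prepended to a module's command line so that different modules go to different job queues. It is not GenePattern's own configuration mechanism or code. The function name, the module-to-prefix mapping, the queue names, and the module command line are all hypothetical. The only real tool syntax assumed is that LSF's bsub accepts a queue through its -q option.

```python
import shlex

# Hypothetical per-module prefixes; queue names are illustrative only.
MODULE_PREFIXES = {
    "ConsensusClustering": "bsub -q long",
    "HeatMapViewer": "bsub -q short",
}


def build_command(module_name, command_line, default_prefix="", module_prefixes=None):
    """Return the argument list for a module run, with its prefix prepended.

    A prefix defined for the individual module takes precedence over the
    default prefix that applies to all modules.
    """
    prefixes = module_prefixes if module_prefixes is not None else {}
    prefix = prefixes.get(module_name, default_prefix)
    return shlex.split(prefix) + shlex.split(command_line)


if __name__ == "__main__":
    # The module command line here is made up for the example.
    print(build_command("ConsensusClustering",
                        "java -jar consensus.jar data.gct",
                        module_prefixes=MODULE_PREFIXES))
```

A module without its own entry falls back to the default prefix, and an empty default leaves the command line unchanged.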
GenePattern 3.0 is available at http://www.genepattern.org/download/.
GenePattern 3.0 release notes are available at http://www.genepattern.org/doc/relnotes/3.0/.
We welcome your feedback and encourage you to send questions and comments to gp-help@broad.mit.edu.
A number of modules have been added to the GenePattern module repository since our last newsletter:
- SNP Analysis modules provide support for the analysis of
Affymetrix high-density SNP arrays:
- SNPFileCreator creates a GenePattern .snp file from a
collection of Affymetrix CEL files, determining the probe intensity of each SNP
by summarizing probe intensities across probe sets.
- CopyNumberDivideByNormals determines SNP copy numbers by
dividing the intensity value of the target SNP by the intensity value of the normal SNP.
- GLAD invokes the R package GLAD (Gain and Loss Analysis of DNA),
which detects the altered regions in the genomic pattern and assigns a status (normal, gained or lost)
to each chromosomal region.
- LOHPaired detects loss of heterozygosity (LOH).
- SNPFileSorter sorts SNPs by chromosome and physical location.
This is a prerequisite for some modules, such as SNPViewer.
- SNPViewer provides a powerful visual representation of
SNP copy number and LOH data.
- XChromosomeCorrect, for each sample from a male donor,
doubles the intensity value for each SNP on the X chromosome.
Optionally, create your own SNP Analysis pipeline by using GenePattern pipelines to
combine the SNP Analysis modules into a single customized workflow.
- LandmarkMatch and PeakMatch
provide peak and landmark matching for advanced analysis of LC-MS data.
They are based on work published by
Jaffe, Mani, et al. in
PEPPeR, a Platform for Experimental Proteomic Pattern Recognition
(Molecular & Cellular Proteomics 5:1927-1941, 2006).
- Multiplot modules allow you to create 2-parameter scatter
plots from microarray data. The plots, which are customizable and interactive,
display each probe (gene) as an individual dot whose identity and characteristics can be queried.
Use the MultiplotPreprocess module to prepare your expression data for plotting,
Multiplot to view the interactive plots, and MultiplotExtractor to
create expression datasets based on the multiplot data.
- CART and CARTXValidation provide class prediction
based on building classification and regression trees, which predict continuous
dependent variables (regression) or categorical dependent variables (classification)
(Breiman, et al., 1984).
- GSEALeadingEdgeViewer runs the Leading Edge Analysis, which helps you
visualize the overlap among the top gene sets returned by the Gene Set Enrichment Analysis (GSEA).
- HierarchicalClusteringImage creates an image of the
dendrogram generated from HierarchicalClustering, including support for the
coloring of dendrogram nodes.
- KMeansClustering clusters samples or features based
on a randomly selected set of k cluster centers. Data points are assigned to the
nearest cluster center and each cluster center is recalculated to be
the mean value of its members. KMeansClustering repeats this process until the cluster
centers stabilize.
- MergeColumns and MergeRows create new
datasets by merging existing datasets.
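The iteration described for KMeansClustering above (pick k random centers, assign each point to the nearest center, recompute each center as the mean of its members, repeat until the centers stabilize) can be sketched in a few lines of Python. This is a minimal illustration of the general k-means procedure, not the module's implementation. The function names, the squared Euclidean distance, the iteration cap, and the handling of empty clusters are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster a list of numeric tuples into k groups.

    Returns (centers, labels), where labels[i] is the index of the
    center assigned to points[i].
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # randomly selected initial centers
    labels = []
    for _ in range(max_iter):
        # Assign every point to its nearest cluster center.
        labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                  for p in points]
        # Recalculate each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # assumption: keep an empty cluster's center
        if new_centers == centers:  # centers have stabilized
            break
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, labels = k_means(data, k=2)
    print(centers, labels)
```

Because the starting centers are random, results can differ between runs on less clearly separated data; the fixed seed here only makes the sketch repeatable.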
In addition, the following modules have been updated:
- ConsensusClustering clusters samples or features by building
consensus clusters across multiple runs of a selected clustering algorithm.
KMeansClustering has been added as one of the supported clustering algorithms.
- ExpressionFileCreator creates an expression dataset from
Affymetrix CEL files. When you use the MAS5 conversion algorithm, expression data
is now normalized using the method that you select.
- GEOImporter can now be used to download GEO Datasets.
The URL for downloading GEO files has also been updated.
- GSEA now uses GSEA v2.0.1, the latest version of the
Gene Set Enrichment Analysis software.
- HierarchicalClusteringViewer now displays expression profiles for
selected samples and features. It also saves images in eps format,
as well as bmp, jpeg, png, and tiff formats.
- HeatMapViewer now displays expression profiles for
selected samples and features. It also saves images in eps format,
as well as bmp, jpeg, png, and tiff formats.
- SelectFeaturesColumns and SelectFeaturesRows
now work with SNP files.
- SVM no longer requires optional parameters. In addition,
its R libraries are updated for compatibility with GenePattern 3.0.
SVM versions 1 and 2 cannot be run on GenePattern 3.0.
To install new and updated modules, open the GenePattern Web Client and click Modules>Install from Repository.
For comprehensive documentation on the modules in the repository, see our
module page.
We've updated our popular GenePattern workshop to introduce participants to the
features of GenePattern 3.0, including:
- intuitive web and application interfaces for users at all levels of computational sophistication
- comprehensive repository of analysis and visualization modules for analyzing gene expression data,
proteomic data, and high-density SNP array data
- pipelines that allow users to chain modules together to create and share methodologies
- easy module creation that allows rapid, code-free integration of new tools
- a programming environment that allows you to access GenePattern modules from the Java, MATLAB, and R programming languages
This one-day workshop is being offered on the following dates:
- Thursday, May 17 (Broad employees only)
- Monday, May 21
- Tuesday, May 22
All workshops will be held 9am-5pm at MIT's Digital Instruction Resource Center (14N-132) in
Cambridge, Massachusetts. Registration is free for attendees from academic or other nonprofit organizations and $600 for attendees from for-profit organizations.
Register now at http://www.broad.mit.edu/genepattern/workshop/.
Or, if these dates are inconvenient, use the registration form to request
that we notify you of future workshops.
GenePattern at Harvard-Partners Center for Genetics and Genomics
The Gateway for Integrated Genomics-Proteomics Applications and Data (GIGPAD) is a
software platform that allows investigators and clinicians at the
Harvard-Partners Center for Genetics and Genomics (HPCGG)
to share data and analysis results without compromising patient confidentiality.
GIGPAD relies on GenePattern to provide the computational analysis framework for HPCGG laboratories.
Eugene Clark, HPCGG senior software architect, explains why his team chose to incorporate GenePattern
rather than build an in-house system: "Using GenePattern allows us to decouple bioinformatics from our main application infrastructure, thereby providing our biologists greater freedom to innovate without being constrained by formal software development practices."
Prior to GIGPAD, each HPCGG lab maintained unique processes that required manual intervention, custom scripts, and IT support. Data and analysis results were difficult to share, manual processes were time-consuming, and parallel IT support was expensive. Today the labs are fully automated. GIGPAD receives raw data files from the lab machines and sends the files to GenePattern for processing. GenePattern runs selected computational analysis pipelines and forwards the results to GIGPAD. HPCGG associates can review the raw data and analysis results without compromising patient confidentiality. The GIGPAD-GenePattern integration centralizes data access, reduces processing time, and simplifies maintenance.
"We wanted the labs to retain their independence, but enable collaboration by having a central location for data and analysis results," explains Clark. The computational analysis framework was a critical component of the laboratory infrastructure. The framework had to be flexible enough to allow each lab to adapt its own methodologies, rigorous enough to enable reproducible research, extensible, and maintainable. Clark chose GenePattern based on the benefits it offered:
- Easy definition and maintenance of analysis pipelines
- Server-based architecture that supports clusters for efficient processing
- Multiple platform technology that supports Windows, Mac, and UNIX
- Comprehensive technical support and training
- Freely available
Integrating GenePattern with GIGPAD makes it easy for the IT team to build, deploy, and maintain customized computational analysis pipelines for individual HPCGG laboratories.
Please let us know how you're using GenePattern.
Publications
If you've published a paper that makes use of GenePattern, we'd love to hear about it:
email the GenePattern team.
Even if you're just using GenePattern in a novel way, let us know!
User Survey
If you use GenePattern, we would like to know how your experience has been. Our
user survey
is a brief online form that lets you give us feedback about the software and other aspects of using GenePattern.
Your responses are greatly appreciated - they help us to understand how GenePattern is being used and
how to make it a more valuable tool.
Early Adopters
If you'd like early access to new GenePattern releases to help us test new GenePattern features,
join the early adopters mailing list.