If you haven't found what you are looking for, please send an email to gp-help(at)broadinstitute.org.
For information about the latest version of GenePattern and its components, see the Release Notes.
The Release Notes list hardware requirements, supported operating systems, and supported browsers.
In addition to this FAQ, the GenePattern team provides the following online resources:
To provide feedback or ask a question not addressed by the online resources, send email to gp-help(at)broadinstitute.org.
To cite GenePattern, please use the following citation:
Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP (2006) GenePattern 2.0. Nature Genetics 38(5): 500-501. doi:10.1038/ng0506-500.
To cite a GenePattern analysis or visualization module, cite the GenePattern software and the original paper or other source for the module as specified in the module documentation. Documentation for each module is available on the Modules page and in GenePattern (click Help when prompted to enter the module's parameters).
Jobs with large datasets, or with parameter settings that place a heavy computational load on the system, can fail because they run out of memory. Usually you will get an error message stating that the module ran out of memory, though sometimes this causes a "silent" failure, meaning that your job failed but there are no error files. In either case, GenePattern administrators can track down and resolve the problem. GenePattern administrators can assign memory settings on a per-user basis, allowing your jobs to run with the amount of memory your analysis requires.
To have an administrator look into your errors and adjust the memory settings for your jobs please contact us at gp-help(at)broadinstitute.org, making sure to provide your username and job id.
You can follow GenePattern on Twitter, Facebook, or our RSS feed. Check our Twitter feed in the news items at genepattern.org, join our mailing list, or add yourself to the Workshops notification list.
Yes. The GenePattern team has released a GenePattern Amazon Machine Image (AMI), which is available in us-east-1.
To install GenePattern, go to GenePattern Download and follow the instructions for your operating system.
To uninstall GenePattern, use the utility provided as part of the GenePattern installation. If the GenePattern uninstall utility is unavailable, deleting the GenePattern installation folder removes all GenePattern files other than the desktop icons.
Mac users: If R2.5 is not already installed, GenePattern installs it in the /Library/Frameworks/R.framework/Versions/2.5 folder. Uninstalling GenePattern does not uninstall R. To uninstall R, use the utility provided by R.
Simply install the new version of GenePattern into the same directory as your previous version. Do not uninstall first. It is unnecessary and will delete your existing modules, pipelines and suites. When you overwrite the previous version:
User groups: The userGroups.xml file for GenePattern 3.2 omits the group named Public. In GenePattern 3.2, all users are now in a predefined group named Public. To avoid confusion, do not recreate the group named Public.
R versions: Installing GenePattern 3.1 (or later) installs R2.5 and sets the full path to R2.5. See Using Different Versions of R for information on how to create and/or use GenePattern modules written for other versions of R.
Large uploaded data files or output files will significantly slow down your GenePattern upgrade installation. If you have less than approximately 10 GB of data files either uploaded (via the Upload tab) or output by GenePattern jobs, you can just follow the GenePattern server installation instructions. However, if you have more than 10 GB of uploaded data files or output files, we suggest that you:
mv <GenePatternServer>/Tomcat/temp <GenePatternServer>_data/temp
java.io.tmpdir=<GenePatternServer>_data/temp
Replace <GenePatternServer> with an actual path.
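To judge which side of the 10 GB guideline an installation falls on, you can total the size of the upload and job-result directories before upgrading. The following is a minimal Python sketch, not a GenePattern utility: the function names are hypothetical, and you must supply the directories that hold your uploads and job results yourself.

```python
import os

TEN_GB = 10 * 1024 ** 3  # the approximate guideline from the text, in bytes


def directory_size_bytes(root):
    """Return the total size of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked data is not counted.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def exceeds_threshold(roots, threshold=TEN_GB):
    """True if the combined size of the given directories is above threshold."""
    return sum(directory_size_bytes(r) for r in roots) > threshold
```

Call exceeds_threshold with a list of the directories you want to measure; a True result suggests moving the temp directory as described above before you upgrade.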
Yes. Source code for GenePattern and its modules is available under the GenePattern software license. Source code for the GenePattern Server application is on GitHub. Source code for most modules is available directly from GenePattern. From the job input form, open the gear menu to 'Export' the module or view 'Properties' to select source files individually.
No. The R, Perl, and Java installations that come with GenePattern are installed within the GenePattern directory and do not affect any other versions that you may currently have.
You can configure GenePattern to work with other versions of R/Perl/Java; however, the versions of R, Perl, and Java bundled with GenePattern are the ones that have been fully tested. We cannot guarantee that other versions will work.
Java VM: If you install a GenePattern server without the Java VM, choosing instead to use a Java VM that you have already installed, ensure that the file tools.jar (provided by Sun separately from the JRE and JDK) is on your classpath. When you install a GenePattern server with an included VM, the GenePattern installation does this for you. If this file is not on your classpath, the MatlabComponentRuntime (MCR) Installer fails when you attempt to install a module that requires it.
R versions: GenePattern modules can be written for any version of R. For details on how to specify which version to use, see Using Different Versions of R.
GenePattern supports the Basic Latin character set. Characters other than those in the Basic Latin character set may not be displayed correctly. Asian character sets are not currently supported.
All analysis and visualization modules support the decimal point (.) as the separator between the integral and fractional parts of a decimal number. Using a decimal comma (,) may cause unexpected behavior in some modules.
If you did not indicate that you were behind a web proxy/firewall when you installed GenePattern, you must update the proxy settings for your server before you can install the modules:
If you still cannot connect to the repository, email us at gp-help(at)broadinstitute.org.
You need to use the war file installation. Instructions are available here.
If you already have a server such as Tomcat running on this port, you need to install the GenePattern server on a different port to avoid conflicts.
When GenePattern is installed on 64-bit Windows systems in the default C:\Program Files (x86) directory, modules fail because some code expects only "C:\Program Files" and truncates that location to "C:\Progr~1". There is a similar bug in ComparativeMarkerSelection. These errors are corrected in the 3.2.2 release of GenePattern. If you do not upgrade to release 3.2.2 or later, the workaround is to reinstall GenePattern in a directory that has no spaces in its name.
There are two useful sections of the GenePattern User Guide that explain how to do this:
The StartGenePatternServer application only starts the server on Linux machines (on Mac it will start the server and launch a web browser). To access the web client interface for your GenePattern server, click the GenePatternHome.html shortcut icon, or, if you did not install icons in your task bar or on your desktop, GenePatternHome.html can be found at the top level of your GenePattern install directory.
Some Mac users have found that the R library is not installing correctly when they try to install GenePattern. Even after making sure that the folder into which GenePattern is installing R has write permissions, upon running a module, they receive the following error message:
java.io.IOException: Cannot run program "/Library/Frameworks/R.framework/Versions/2.5/Resources/bin/R": error=2, No such file or directory while running R command [/Library/Frameworks/R.framework/Versions/2.5/Resources/bin/R, --no--save, --quiet, --slave, --no-restore]
This may be a simple GenePattern server configuration problem. First, check that something is installed at that path. Open the Terminal.app and run the following commands:
ls /Library/Frameworks/R.framework/Versions/2.5
ls /Library/Frameworks/R.framework/Versions
If there is something installed at this path, then check that the path to R is correctly configured in your GenePattern server. Go to the Administration>Server Settings>Programming Languages GenePattern page and verify that:
R 2.5 Home: /Library/Frameworks/R.framework/Versions/2.5/Resources
If this is configured correctly, you may be able to correct the problem by manually downloading and installing R 2.5.
If you have further issues installing R, contact us at gp-help(at)broadinstitute.org.
See Increasing Memory Allocation.
Yes. If you are running more than one installation of GenePattern on the same machine, you must make sure that the port numbers for the GenePattern server and the HSQL server are unique to each installation. The Tomcat server listens on two ports, 8080 (requests) and 8005 (shutdown) by default, and the HSQL server listens on port 9001. All 3 ports need to be modified on the second copy of Tomcat. For example, you can set the GenePattern server port to 8080 and 8005 on one install and 8081 and 8086 on the other, and set the HSQL port to 9001 on one and 9002 on the other. You can configure these port numbers when you are installing the server.
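Before installing a second copy, it can help to write down the ports each installation will use and confirm that none is claimed twice. The following Python sketch is illustrative only: the function name and the role labels ("http", "shutdown", "hsql") are hypothetical, and the port numbers are the ones from the example above.

```python
def find_port_conflicts(installations):
    """Return {port: [installation names]} for every port claimed more than once.

    installations maps an installation name to a dict of role -> port.
    """
    owners = {}
    for name, ports in installations.items():
        for port in ports.values():
            owners.setdefault(port, []).append(name)
    return {port: names for port, names in owners.items() if len(names) > 1}


# Port values taken from the example in the text above.
installs = {
    "first": {"http": 8080, "shutdown": 8005, "hsql": 9001},
    "second": {"http": 8081, "shutdown": 8086, "hsql": 9002},
}
```

For the example values, find_port_conflicts(installs) returns an empty dictionary, meaning the two installations do not collide. The check only compares the planned numbers with each other; it does not detect other programs already listening on those ports.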
Choose one hostname for the GenePattern server; for example, http://servername.domainname.edu:8080/gp/. Edit the genepattern.properties file and set the following properties:
GenePatternURL=http\://servername.domainname.edu:8080/gp/
GENEPATTERN_PORT=8080
gpServerHostAddress=servername.domainname.edu
fqHostName=servername.domainname.edu
fullyQualifiedHostName=servername.domainname.edu
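Because the same hostname and port appear in all five properties, generating the lines from one place helps keep them consistent. This is a minimal Python sketch under that assumption; the function name is hypothetical, and it only produces text for you to paste into genepattern.properties.

```python
def host_properties(hostname, port):
    """Build the genepattern.properties lines listed above for one hostname."""
    # The first ':' is written as '\:' to match the escaped form shown above.
    url = "http\\://{}:{}/gp/".format(hostname, port)
    return [
        "GenePatternURL=" + url,
        "GENEPATTERN_PORT={}".format(port),
        "gpServerHostAddress=" + hostname,
        "fqHostName=" + hostname,
        "fullyQualifiedHostName=" + hostname,
    ]
```

For example, print("\n".join(host_properties("servername.domainname.edu", 8080))) reproduces the five lines shown above.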
Session timeout is set in the Tomcat configuration file of the GenePattern server. To modify this setting for a local GenePattern server:
<session-config>
  <session-timeout>1440</session-timeout>
</session-config>
On the public GenePattern server, session timeout is set to four hours and cannot be modified by a user.
Queuing systems such as the Load Sharing Facility (LSF) and the Sun Grid Engine (SGE) allow computational resources to be used effectively. If you have such a queuing system installed at your site and you have installed a local GenePattern server, you can configure the GenePattern server to work with the queuing system. For instructions on how to do so, see Using a Queuing System.
From your GenePattern server, go to Administration>Server Settings and click File Purge in the menu at the left. From here you can specify when analysis result files are deleted from the server:
Click Save to save your changes. Click Restore to return to the values set at the installation.
Note: This setting can only be modified on local GenePattern servers for which you have administrative rights. You cannot change this setting on the Public Server.
A refused connection is most likely due to a proxy issue. If you are behind a proxy or firewall, verify that you have correctly configured GenePattern and/or asked your local system administrator to allow GenePattern access to your machine.
To configure a proxy connection in GenePattern please do the following:
If this does not resolve the issue, please contact us at gp-help(at)broadinstitute.org.
If you install your own GenePattern server, the default setting is not to allow input file paths. To change this, if you have administrator privileges on the server, add or edit the following in your genepattern.properties file:
allow.input.file.paths=true
Then restart your server. This will allow users to input an arbitrary network file path (such as file:///server/directory/file.gct) as the value for an input file parameter. When input file paths are allowed, you can use the server.browse.file.system.root property to set a root directory where the GenePattern server begins browsing for the specified network file path.
Note: On the GenePattern Cloud public server, we prevent users from entering an input file path (file://urls) as an input file for a module in order to better secure the machine running the public server.
If you tried to install GenePattern on Ubuntu, you may have received an installation error: "An internal LaunchAnywhere application error has occurred and this application cannot proceed. (LAX)" with "java.lang.IllegalArgumentException: Malformed \uxxxx encoding." in the stack trace.
LaunchAnywhere can interfere with the prompt string formatter PS1. In order to work around this problem, you need to use the following command:
$ export PS1=">"
>sudo sh ./GPServer.bin
This is important not only for installing GenePattern on Ubuntu, but also for launching GenePatternServer. Run the command before the GenePatternServer startup command, like so:
$ export PS1=">"
>./StartGenePatternServer
Updated Oct 14, 2021
Please direct any questions about these modules to our help forum.
ABSOLUTE
ABSOLUTE.review
ABSOLUTE.summarize
BlastNPipeline
BlastParser
BlastSubtraction
BlastSubtractionLoop
BlastXml
Bowtie.aligner
Bowtie.indexer
BWA.Unmapped
BWAPipeline
BWASubtraction
CaArray2.1.0ImportViewer
CaArray2.3.0ImportViewer
CaArray2ImportViewer
caArrayImportViewer
CatalogueReads
CBS
ConcatenateFiles
Cuffcompare
Cuffdiff
Cufflinks
CufflinksCuffmergePipeline
Cuffmerge
ExternalSort
ExtractFullQuery
ExtractPairsBam
ExtractUnmapped
ExtractUnmappedAdapterBlast
Fasta2FQone
Fastq2FQone
FilterLength
FQone2Fasta
FQone2Fastq
GENE-E
GeneCruiser
GetUnmappedReads
GISTICPreprocess
HAPSEG
HierarchicalClustering.MATLAB
HierarchicalClusteringImage.MATLAB
Lu.Getz.Miska.Nature.June.2005.clustering.ep.mRNA.pipeline
Lu.Getz.Miska.Nature.June.2005.clustering.ep.miRNA.pipeline
Lu.Getz.Miska.Nature.June.2005.clustering.ALL.pipeline
Lu.Getz.Miska.Nature.June.2005.clustering.miGCM218.pipeline
Lu.Getz.Miska.Nature.June.2005.PDT.miRNA.pipeline
Lu.Getz.Miska.Nature.June.2005.PDT.mRNA.pipeline
MAGeCK
MegaBlastPipeline
MultiplotPreprocess
ParallelCBS
ParsedBlastParser
PathSeq.BlastN
PathSeq.BlastX
PathSeq.BWA.aln
PathSeq.MegaBlast
PathSeqPrototype
PathseqReport
PNN
PNNXValidationOptimization
PostBlastN.All
PostBlastX.All
PostMegaBlast.Bacterial
PostMegaBlast.Ribosomal
PostSubtraction
PostSubtraction.Assembly
PostSubtraction.Contigs
PostSubtraction.Contigs.Scatter
PostSubtraction.Contigs.Step
PostSubtraction.Unmapped
PostSubtraction.Unmapped.Scatter
PostSubtraction.Unmapped.Step
PreSubtraction
QualFilter
RankNormalize
RemoveDuplicates
RepeatMaskerFormatChange
RepeatMaskerRead
SAMTools.FastaIndex
ScatterBlastSubtraction
ScatterBWASubtraction
Sit2Pairs
SraToFastQ
Subtraction
TopHat
Trinity_r2012.06.08
UniqueIdentifier
This is a known issue: when users click the Browse Server File System button, the Internet Explorer web browser window (instead of a pop-up window) becomes the file system browser.
If you want to continue using Internet Explorer, you can copy and paste or manually enter the server file path rather than clicking the Browse Server File System button. We recommend using another browser for full functionality.
Yes. Most GenePattern analyses can run on 2-channel or ratio-based data as easily as on single channel or absolute value data. To run 2-channel data in GenePattern, do the following:
Your data is now in a GCT file that can be analyzed by most GenePattern modules. (If you want to use non-negative matrix factorization (NMF) and your data contains negative values, see the NMF note in the Modules & Pipelines section below.)
Ratio values for cDNA data can be computed using a variety of methods. How the ratios are computed determines whether it is possible to create a class (CLS) file for the cDNA ratio data. For example:
normal sample (Cy3) / common reference (Cy5) = phenotype 1
treated sample (Cy3) / common reference (Cy5) = phenotype 2
normal sample (Cy3) / treated sample (Cy5) = phenotype
If you cannot create a CLS file, you can analyze your data using modules that do not require class files (such as ConsensusClustering), but will not be able to use modules that require the CLS file (such as ComparativeMarkerSelection).
Information on file formats supported by the modules currently in GenePattern is available in File Formats.
Run your file through PreprocessDataset. Select the desired output format for your file. If you only want to convert the file type without filtering, select "no filter" as the choice for the "filter flag" parameter.
File Formats describes the file formats used in GenePattern and, where applicable, suggests methods for converting files to these formats.
The ExpressionFileCreator module converts a set of individual CEL files into an expression data set that is usable by GenePattern modules. The MAGEMLImportViewer module imports data in MAGE-ML format into GenePattern, and similarly, the MAGETABImportViewer module imports data in MAGE-TAB format into GenePattern.
This generally occurs for one of two reasons:
Pipeline input files with spaces in their names may give file-not-found errors. If this happens, use the DOS "dir /x" command to get the 8.3 version of the directory and filename, and use that instead of the long filename. If you are using a Unix-based platform, you may need to quote the filename parameters in the command line definition.
To run NMF on data that contains negative values, you must do the following (using the method of Kim, P. M. & Tidor, B. (2003) Genome Res. 13, 1706-1718):
To do this in MATLAB, you can execute the following:
anew=[max(a,0);-min(a,0)];
where a is the original data.
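If you do not have MATLAB available, the same transformation can be sketched in plain Python. This is an illustrative equivalent of the MATLAB one-liner above, not a GenePattern module; the function name is hypothetical, and the data is assumed to be a list of rows of numbers.

```python
def split_signs(a):
    """Stack max(a, 0) on top of -min(a, 0), row-wise.

    a is a list of rows of numbers. The result has twice as many rows
    as a and contains no negative values.
    """
    positive = [[max(v, 0) for v in row] for row in a]
    negative = [[-min(v, 0) for v in row] for row in a]
    return positive + negative
```

For example, split_signs([[1, -2], [-3, 4]]) returns [[1, 0], [0, 4], [0, 2], [3, 0]]: the first two rows keep the positive parts and the last two rows hold the magnitudes of the negative parts.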
We are currently developing a GenePattern module to perform this operation as well.
No, you can use the two files that are created and leave the remaining input box blank. HierarchicalClustering creates a cdt file and one or two additional files: an atr file if you clustered by samples (columns), a gtr file if you clustered by genes (rows), or both atr and gtr files if you clustered by both samples and genes (columns and rows). The JavaTreeView module accepts the two or three files created by HierarchicalClustering.
The HeatMapViewer module currently does not include gene annotations with the saved image. Use the HeatMapImage module to include gene annotations.
When computing the t-test or signal to noise ratio, ClassNeighbors thresholds the standard deviation to ensure that it is at least twenty percent of the mean. Additionally, if the standard deviation is zero, ClassNeighbors sets it to 0.1.
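The thresholding rule can be written out as a short Python sketch. The function name is hypothetical, and two details are assumptions rather than statements about the ClassNeighbors source: the absolute value of the mean is used, and the zero check is applied after the twenty-percent floor.

```python
def adjusted_std(mean, std, min_fraction=0.2, zero_fallback=0.1):
    """Apply the standard-deviation floor described above.

    Assumptions: the floor uses abs(mean), and the zero check runs last.
    """
    floor = min_fraction * abs(mean)
    std = max(std, floor)
    if std == 0:
        # Only reachable when both the mean and the original std are zero.
        std = zero_fallback
    return std
```

With these assumptions, adjusted_std(10, 1) returns 2.0 (raised to twenty percent of the mean), adjusted_std(10, 5) returns 5 (unchanged), and adjusted_std(0, 0) returns 0.1.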
Yes. You can use the GSEA module with the c3 (motif) gene sets. The GSEA module is documented on the Modules page.
Most errors reported by users running the GISTIC module are caused by a mismatch between the segmentation and markers files. If an error occurs, verify that all markers indicated in the segmentation file appear in the markers file and only those markers indicated by the segmentation file appear in the markers file.
The CBS and GLAD segmentation methods produce GISTIC-friendly marker positions. Partek's latest beta version also produces GISTIC-friendly marker positions. However, if you used an earlier version of the Partek algorithm to create the segmentation file, the algorithm did not report the exact physical position of the first and last markers of the segments. If you run GISTIC on a segmentation file generated using the earlier version of the algorithm, the physical positions of the marker file will not agree with the start or stop positions of the segmentation file. Note that Partek also uses the control probes in the generation of the CN/segmentation.
??? Error using ==> plus
Matrix dimensions must agree.
Error in ==> make_D_from_seg at 158
Error in ==> run_gistic_from_seg at 58
Error in ==> gp_gistic_from_seg at 177
MATLAB:dimagree
If you are running GISTIC and get the error above in your stderr.txt file, you should verify that your segmentation file and markers file are exactly matched. Only the markers from the markers file should be indicated in the segmentation file and only those markers indicated by the segments should be in the markers file.
For example, the segmentation file should be
1-4 5-6
and markers file should be
1 2 3 4 5 6
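A mismatch of this kind can be checked before submitting the job. The Python sketch below is a simplified illustration, not part of GISTIC: the function name and the in-memory tuples are hypothetical, and you would need to parse your own segmentation and markers files into these structures first.

```python
def check_seg_against_markers(segments, markers):
    """Compare segment boundaries with marker positions.

    segments: list of (chromosome, start, end)
    markers:  list of (chromosome, position)
    Returns (missing_boundaries, uncovered_markers).
    """
    marker_set = set(markers)
    # Segment start/end positions that are not present in the markers file.
    missing = []
    for chrom, start, end in segments:
        for pos in (start, end):
            if (chrom, pos) not in marker_set:
                missing.append((chrom, pos))
    # Markers that do not fall inside any segment.
    uncovered = [
        (chrom, pos)
        for chrom, pos in markers
        if not any(c == chrom and s <= pos <= e for c, s, e in segments)
    ]
    return missing, uncovered
```

For the example above (segments 1-4 and 5-6 with markers 1 through 6 on one chromosome), both returned lists are empty. A non-empty list points to the positions to fix in one file or the other.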
If your run of GISTIC fails with the error below in the stderr.txt file, check your segmentation file format. Please see the sections on the segmentation file format in the GISTIC documentation for more details and examples.
??? Attempted to access rl(:,2); index out of bounds because size(rl)=[0,1].
Error in ==> derunlength at 25
Error in ==> smooth_cbs at 148
Error in ==> run_gistic_from_seg at 125
Error in ==> gp_gistic_from_seg at 177
MATLAB:badsubscript
If this does not resolve the issue, please contact us on the GenePattern help forum.
If your run of GISTIC fails with the error below in the stderr.txt file, check your markers file format. Please see the sections on the markers file format in the GISTIC documentation for more details and examples.
??? Index exceeds matrix dimensions.
Error in ==> check_if_has_header at 13
Error in ==> make_D_from_seg at 21
Error in ==> run_gistic_from_seg at 58
Error in ==> gp_gistic_from_seg at 179
MATLAB:badsubscript
If this does not resolve the issue, please contact us on the GenePattern help forum.
Yes, GISTIC supports the Affymetrix Human SNP 6.0 array.
If you have further questions please contact us on the GenePattern help forum.
Some computationally-intense modules can take a day or more to run. Some examples are FLAMEMetacluster, NMFConsensusClustering, GISTIC, and GLAD. In addition, server load can affect queuing times on the GenePattern public server, and this can affect the length of time a module can take to complete.
If your job does not use a computationally-intense module or a large data set, and it takes longer than about 4 hours to complete, please contact us on the GenePattern help forum.
If you receive the following errors while performing an analysis with ComparativeMarkerSelection:
Error in if (min(p) < 0 || max (p) > 1) {: missing value where TRUE/FALSE needed
Execution halted
or
ERROR: The estimated pi0 <=0. Check that you have valid p-values or use another lambda method.
then a gene in your data has insufficient variation in its expression values. Use the PreprocessDataset module with a filter that is more stringent than you have previously used on your data set before running ComparativeMarkerSelection.
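To see which genes are likely to trigger this, you can look for rows whose values barely vary. The following Python sketch is only a rough screen: the function name is hypothetical, the data is assumed to be already loaded into a dictionary of gene name to expression values, and the cutoff is yours to choose. It does not reproduce the filters in PreprocessDataset.

```python
import statistics


def low_variation_rows(rows, min_std=0.0):
    """Return names of rows whose standard deviation is at most min_std.

    rows maps a gene name to its list of expression values.
    """
    return [
        name
        for name, values in rows.items()
        if statistics.pstdev(values) <= min_std
    ]
```

For example, low_variation_rows({"g1": [5, 5, 5], "g2": [1, 2, 3]}) returns ["g1"], because g1 has identical values in every sample.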
If you continue to experience problems, please contact us on the GenePattern help forum.
The usual cause of this error is spaces in any of the input file names.
If you run an imported pipeline on your own GenePattern server, and you get the error, "No such module [module name]", when you know you have that module on your server, then the pipeline requires a version of the module that is not on your server. If you return to the pipeline page and click Properties, you can view the modules that are required but not installed. If you install these module versions from the repository, the pipeline will run.
The default IGV display option for a GCT or RES file is the Heatmap. For the heatmap to make sense, the data must be row-centered, scaled and possibly have a threshold applied.
For complete information, see the blog post about Using IGV Through GenePattern.
There are limitations on file upload size. Files uploaded via the Browse button on the module input page must be under 1.2 GB. To use larger files, there are a few options:
This error can be produced if there are hidden files or directories in the ZIP archive. This usually occurs on a Mac when using the "Compress" option from the right-click pop-up menu. If this is the case, you may want to use the zip command from the terminal window to zip files instead. If you didn't Compress on a Mac, then you should check that there are no hidden files in the ZIP archive.
The recommended format for RNA-seq data in IGV is the BAM file. If you run your SAM or BAM file as the input file for the SortSam module, you can sort and index it, and can convert a SAM file to BAM.
In addition, the IGVTools.sort and IGVTools.index modules can sort and index a SAM file. These modules are currently in beta. If you would like to use them, please contact us at gp-help(at)broadinstitute.org.
If your ZIP file has a directory in it, GenePattern cannot resolve it. Unfortunately, if you generated your ZIP archive using the Finder on the Macintosh OS, the Mac builds a directory structure into your ZIP archive and GenePattern cannot resolve it. To zip on a Mac, use the zip command from a terminal window; for example, if you wanted to create a ZIP archive called "all_foo" that contains the files all_foo.cls and all_foo.gct, you could use the following command:
zip all_foo all_foo.cls all_foo.gct
Some other reasons that your ZIP file may fail include spaces in the names of the files or hidden files. If you cannot locate the issue with your ZIP file, please contact us at gp-help(at)broadinstitute.org.
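If you are not sure what is inside a ZIP archive, you can list its entries and flag the three problems mentioned above. This is a minimal Python sketch using the standard zipfile module; the function name and the problem labels are hypothetical, and treating "__MACOSX" entries as hidden files is an assumption about archives made with the Finder.

```python
import zipfile


def zip_problems(path):
    """List archive entries with a directory, a hidden file, or a space."""
    problems = []
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            base = name.rstrip("/").split("/")[-1]
            if "/" in name:
                problems.append((name, "inside a directory"))
            if base.startswith(".") or name.startswith("__MACOSX"):
                problems.append((name, "hidden file"))
            if " " in name:
                problems.append((name, "space in name"))
    return problems
```

An empty list means none of these three issues was found; it does not guarantee that GenePattern will accept the archive.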
The first place to look for the reason is the stderr.txt file, which should be available in the job summary or job status page. This file often contains plain text indicating what went wrong with a job, such as formatting or filtering errors. If you find that this file does not help you resolve the error, please contact us at gp-help(at)broadinstitute.org.
You can use the RNA-seq modules either on the public server or by installing them on a GenePattern server installed on your machine or a network-accessible server.
If you have not installed GenePattern on your local machine, instructions for installing a local GenePattern server are provided on the Download GenePattern page.
If you have already installed a GenePattern server, select Modules & Pipelines>Install from repository. The page will present all available modules. You only need to select the checkboxes for the modules you want and click Install Checked.
Note: The main analysis RNA-seq modules (Bowtie, BWA, Cufflinks, TopHat, and Scripture) currently only run on Macintosh and Linux. If you do not have access to machines with these operating systems, you can use the modules on the GenePattern Cloud public server. The conversion/utility modules that are related to the RNA-seq modules are available for Macintosh, Linux, and Windows.
You may find it helpful to enable your GenePattern server to accept file paths in order to handle large input files that are already present on the system where your local server is installed. To do this, edit genepattern.properties (located in the resources directory under your GenePattern server directory) and make allow.input.file.paths=true. This allows users to input a network file path (such as file:///server/directory/file.gct) as the value for an input file parameter. When this value is set to true, you can define a root directory where the GenePattern server begins browsing for network files by setting server.browse.file.system.root to the root directory you want to specify.
Example: In genepattern.properties, setting server.browse.file.system.root=/Users/mydata/ngs will cause the browser window to open to /Users/mydata/ngs when a user chooses Specify File Path or URL.
Example: In the config_default.yaml file, setting server.browse.file.system.root: [ "/Users/mydata/ngs", "/Users/shared" ] will add two folders to the browser window.
There are a few reasons why this might occur. Jobs are often PENDING because GenePattern is a shared resource. When your job is in the PENDING state, it means that it is waiting in the queue behind other jobs for the GenePattern server to submit the job to the server farm. Jobs that use large files and access them via an external URL may hold up the line while those files are transferred to the GenePattern server, even keeping jobs that normally take a few seconds in PENDING.
The job will run when the queue clears up.
If this is a common issue on your GenePattern server, it is possible to configure it to help reduce the wait. If you want to reconfigure your GenePattern server in this way, please contact us at gp-help(at)broadinstitute.org.
If you run your preprocessed GCT file and CLS file through ComparativeMarkerSelection and receive the following error:
An error occurred while reading the file ClassFile.cls. Cause: Header line needs three numbers!
make sure your CLS file is space-delimited and not tab-delimited. This is the most common cause of this error. If this does not stop the error, please contact us at gp-help(at)broadinstitute.org.
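The header check can be approximated with a few lines of Python before you upload the file. This sketch is not the parser GenePattern uses; the function name and messages are hypothetical, and it only inspects the first line of the CLS file for tabs and for three space-separated whole numbers.

```python
def check_cls_header(first_line):
    """Return a list of problems found in the first line of a CLS file."""
    problems = []
    if "\t" in first_line:
        problems.append("header contains a tab; use single spaces")
    fields = first_line.strip().split(" ")
    if len(fields) != 3:
        problems.append("expected 3 fields, found {}".format(len(fields)))
    elif not all(f.isdigit() for f in fields):
        problems.append("all three fields must be whole numbers")
    return problems
```

To use it, read the first line of the file, for example with open("ClassFile.cls") and readline(), and pass it to check_cls_header. An empty list means the header passed these checks.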
If you try to run an indexed BAM file through a module and receive a warning that your index file (BAI) is older than your BAM file, it means that the timestamps for these files are out of sync. If you receive this warning, you should index your BAM file by using the SortSam module.
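For files on a local disk, you can check for this condition yourself by comparing modification times. The Python sketch below is a simple illustration with a hypothetical function name; it assumes the warning is based on file modification times and says nothing about whether the index content is actually valid.

```python
import os


def index_is_stale(bam_path, bai_path):
    """True if the index is missing or was modified before the BAM file."""
    if not os.path.exists(bai_path):
        return True
    return os.path.getmtime(bai_path) < os.path.getmtime(bam_path)
```

If the function returns True, re-index the BAM file as described above.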
You can do this by creating a pipeline for the jobs you want to run in parallel.
Then you can submit your set of data files to the pipeline as a batch job. For more information, see Batch Processing.
There are additional features that make it easier to work with large input files and to run batches of jobs in parallel:
On Windows, you need to select the files to be added to the ZIP archive (hold down the Control or Shift key while selecting to select a group). Then right-click on the group and select WinZip (or whichever zip application you have on your machine). Do not select a folder and zip it – that will create a directory inside the ZIP archive; if your ZIP archive has a directory in it, GenePattern cannot resolve it.
On Macintosh, if you generate your ZIP archive using the Finder, Mac builds a directory structure into your ZIP archive and GenePattern cannot resolve it. To zip files on a Mac, use the zip command from a terminal window (launched from Applications/Utilities); for example, if you wanted to create a ZIP archive called "all_foo" that contains the files all_foo.cls and all_foo.gct, you could use the following command:
zip all_foo all_foo.cls all_foo.gct
If you follow these instructions and find that GenePattern does not accept your ZIP file, check for spaces in the names of the files or hidden files in the ZIP archive. If you cannot locate the issue with your ZIP file, please contact us. The GenePattern team plans to develop a ZIP module to help users with creating ZIP archives.
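As a platform-independent alternative to the commands above, a flat ZIP archive can also be written with Python's standard zipfile module. This is a sketch under the assumption that the only requirement is to avoid directories inside the archive; the function name is hypothetical.

```python
import os
import zipfile


def make_flat_zip(zip_path, files):
    """Write files to zip_path using only their base names (no directories)."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # arcname drops the directory part so entries sit at the top level.
            archive.write(path, arcname=os.path.basename(path))
```

For example, make_flat_zip("all_foo.zip", ["all_foo.cls", "all_foo.gct"]) creates an archive containing just those two files. Because only base names are stored, two input files with the same name in different folders would collide.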
As of GenePattern 3.3.3, GenePattern supports batch jobs. To use this feature:
The module will be run once for each file selected. All the job results for the batch will be listed under a single batch ID.
If you have difficulties with the batch upload function in GenePattern 3.3.3, please contact us at gp-help(at)broadinstitute.org.
As of GenePattern 3.3.3, GenePattern supports the use of directories as input for modules, but not all modules support this feature.
A few quick ways to tell if a module does accept directories are:
ExpressionFileCreator does not currently support Exon arrays. The GenePattern module development team is working on a module for this.
There are currently no modules in GenePattern for submitting data to GEO. The NCBI has webtools for this purpose, such as GEOarchive.
To generate a new heat map image at a resolution near 300 dpi, you can:
If you already have a 72 dpi heat map image that you cannot recreate, you can use an image manipulation application that can scale images (like Adobe Photoshop or GIMP) to increase the resolution to 180 dpi. This shrinks the printed dimensions to 40% of their original size (72/180), but 180 dpi is usually the minimum resolution necessary for print publication.
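The relationship between pixel dimensions, resolution, and printed size is simple arithmetic: printed size in inches equals pixels divided by dpi. The Python sketch below works through it; the function names are hypothetical and the pixel values in the example are illustrative, not taken from any GenePattern output.

```python
def print_size_inches(width_px, height_px, dpi):
    """Printed size of an image when its pixels are laid out at dpi."""
    return width_px / dpi, height_px / dpi


def pixels_needed(width_in, height_in, dpi):
    """Pixel dimensions required to print at the given size and dpi."""
    return round(width_in * dpi), round(height_in * dpi)
```

For example, a 720 x 360 pixel image prints at 10.0 x 5.0 inches at 72 dpi (print_size_inches(720, 360, 72)) and at 4.0 x 2.0 inches at 180 dpi. To print that 4 x 2 inch figure at 300 dpi, pixels_needed(4, 2, 300) shows you would need 1200 x 600 pixels.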
Yes: the RNAseQC module in GenePattern calculates standard RNA-seq related metrics, including depth of coverage, ribosomal RNA contamination, continuity of coverage and GC bias. See the module documentation for the recommended data processing workflow for optimal use of this QC analysis.
Try the ClsFileCreator module in GenePattern. The ClsFileCreator is a wizard-based tool that can be used to create class label (CLS) files from array data in the GCT or RES file formats.
To install GISTIC, follow the instructions below. Note that you must install it on a 64-bit Linux machine.
To install GISTIC on your 64-bit Linux machine, export it from the public GenePattern Server.
Select GISTIC from the list of modules. Click the export link to save the module in a zip file.
If you do not already have MATLAB installed, you will need to install it. An executable and instructions can be found in the GISTIC 2.0 GitHub repository.
Once MATLAB is installed, you will need to add lines like the following to your <GenePatternServer>/resources/genepattern.properties or custom.properties file:
MATLAB_LIBRARY=/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/sys/java/jre/glnxa64/jre/lib/amd64\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/sys/java/jre/glnxa64/jre/lib/amd64/server\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/sys/java/jre/glnxa64/jre/lib/amd64/native_threads\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/sys/os/glnxa64\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/runtime/glnxa64\:/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/bin/glnxa64
and
APPLERES_DIR=/xchip/sqa/Modules/GISTIC/GISTIC2.0_stdAlone/MATLAB_Component_Runtime/v714/X11/app-defaults
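Because the paths above are specific to one installation, you will need to substitute your own runtime location. The following sketch generates both property lines from a single root directory. It assumes your runtime uses the same subdirectory layout as the example above; the function name and the `/opt/mcr/v714` root are hypothetical.

```python
# Subdirectories taken from the example MATLAB_LIBRARY value above.
MCR_LIBRARY_SUBDIRS = [
    "sys/java/jre/glnxa64/jre/lib/amd64",
    "sys/java/jre/glnxa64/jre/lib/amd64/server",
    "sys/java/jre/glnxa64/jre/lib/amd64/native_threads",
    "sys/os/glnxa64",
    "runtime/glnxa64",
    "bin/glnxa64",
]


def gistic_property_lines(mcr_root):
    """Build the two property lines for a given runtime root directory."""
    root = mcr_root.rstrip("/")
    # Entries are joined with an escaped colon, as in the example above.
    library = "\\:".join(f"{root}/{sub}" for sub in MCR_LIBRARY_SUBDIRS)
    return [
        f"MATLAB_LIBRARY={library}",
        f"APPLERES_DIR={root}/X11/app-defaults",
    ]


if __name__ == "__main__":
    for line in gistic_property_lines("/opt/mcr/v714"):  # hypothetical root
        print(line)
```

Verify that each generated directory exists on your machine before adding the lines to your properties file.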
Then restart your GenePattern Server. (To avoid restarting, enter these properties via the Administration>Custom page instead.)
Then connect to your GenePattern Server, and import GISTIC from the zip file you just exported.
If you then wish to run it from the command line, please use one of the programming interfaces, as described in the Programmers Guide.
The GenePattern server itself does not connect to any database, but modules can connect to databases and retrieve data from them; existing examples include caArray (caArrayImportViewer) and Gene Expression Omnibus (GEOImporter). To connect to a database of your choice, write a simple command-line program that connects to the database and retrieves data into a file, then install this program as a module in GenePattern (see Creating Modules).
When creating a MATLAB visualizer using MATLAB 7.0 compiled m-code (any release before 7.4), any figures that you create in MATLAB must have the Visible property set to on, or they will not be drawn to the screen.
GenePattern modules can be written for any version of R. For details on how to specify which version to use, see Using Different Versions of R.
You can simply make it available in zip format on the web wherever you choose. If you would like to have it on the public GenePattern servers, contact us on our help forum.
GenePattern does not have a valid CPU Type for 64-bit platforms, so if you specify a 64-bit CPU Type, the module will fail on 64-bit platforms, whether or not they are running in compatibility mode. Set the CPU Type to 'any' and describe the appropriate platforms in your documentation. If the module still fails on appropriate platforms, contact us at gp-help(at)broadinstitute.org.
The reference guide for accessing GenePattern modules from Java, MATLAB, and R is the Programmers Guide.
GenePattern provides a REST API for use by web applications. A WADL file for the REST API can be accessed at the URL below:
http://your_server:your_port/gp/rest/application.wadl
GenePattern also provides an older SOAP API. This API is deprecated, but is still available. The WSDL file for the GenePattern SOAP API is available at:
http://your_server:your_port/gp/services
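As a rough illustration, the sketch below builds the two URLs above for a given server and port, and lists the resource paths found in a WADL document once you have retrieved it. The function names and the sample WADL text are hypothetical; the sample is a made-up placeholder and does not reflect the actual GenePattern REST resources.

```python
import xml.etree.ElementTree as ET


def rest_wadl_url(server, port):
    """URL of the REST API WADL file, following the pattern shown above."""
    return f"http://{server}:{port}/gp/rest/application.wadl"


def soap_services_url(server, port):
    """URL of the (deprecated) SOAP services, following the pattern above."""
    return f"http://{server}:{port}/gp/services"


def resource_paths(wadl_text):
    """Return the path attribute of every <resource> element in a WADL file."""
    root = ET.fromstring(wadl_text)
    paths = []
    for element in root.iter():
        # Drop any XML namespace prefix such as "{...}resource".
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "resource" and "path" in element.attrib:
            paths.append(element.attrib["path"])
    return paths


# Made-up WADL fragment used only to demonstrate the parsing step.
SAMPLE_WADL = """<application xmlns="http://wadl.dev.java.net/2009/02">
  <resources base="http://localhost:8080/gp/rest/">
    <resource path="placeholder/one"/>
    <resource path="placeholder/two"/>
  </resources>
</application>"""

if __name__ == "__main__":
    print(rest_wadl_url("localhost", 8080))
    print(soap_services_url("localhost", 8080))
    print(resource_paths(SAMPLE_WADL))
```

In practice you would download the WADL text from your own server (for example with `urllib.request.urlopen`) and pass it to `resource_paths`.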
For more information about the programming libraries, see the Programmers Guide.
A pipeline or module with a period in its name cannot be called from MATLAB.
CSV stands for "comma-separated values". While CSV files will open in Excel or similar spreadsheet applications, it is important to remember that the values in these files are comma-delimited, not space- or tab-delimited.
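If you need a tab-delimited version of a CSV file, a short script can do the conversion while handling quoted values that themselves contain commas. This is a minimal sketch using Python's standard `csv` module; the function name and the sample data are hypothetical.

```python
import csv
import io


def csv_to_tab_delimited(csv_text):
    """Convert comma-separated text to tab-delimited text."""
    reader = csv.reader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.writer(output, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return output.getvalue()


if __name__ == "__main__":
    # Made-up data; the quoted description contains a comma.
    sample = 'Name,Description,S1\nGENE1,"kinase, putative",1.5\n'
    print(csv_to_tab_delimited(sample))
```

A simple find-and-replace of commas with tabs would split the quoted description in two, which is why a CSV-aware parser is used here.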
We do not currently distribute the source code for GISTIC. The executable is available and can be exported from our public GenePattern server. Note that the GISTIC module and executable are currently compiled only for 64-bit Linux.
The GISTIC developers are working on a version that will allow us to distribute the source code, but it is still in development.
First, please look at GenePattern Archive (GParc) to see if this will satisfy your requirements. If it seems that GParc is not the right answer for you, please contact the GenePattern team at gp-help(at)broadinstitute.org to begin discussing the possibility of releasing your module on the GenePattern public server. When you contact us, please provide your code and any documentation you currently have.
Yes, GISTIC will support Agilent data. However, you must convert your aCGH data into SEG (segmented) format. GenePattern does not currently provide a module for converting Agilent data to SEG format.
There are several points you need to check in your gene sets. If you are not using the collapse to gene symbols option, check that your gene identifiers are all uppercase. For the full list of points to check, please see the GSEA FAQ entry on error 1001.
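The uppercase check can be done quickly in a script. The sketch below assumes you have already read your gene identifiers into a list; the function name and the example identifiers are illustrative only.

```python
def non_uppercase_identifiers(identifiers):
    """Return the identifiers that are not entirely uppercase."""
    return [gene for gene in identifiers if gene != gene.upper()]


if __name__ == "__main__":
    genes = ["TP53", "Brca1", "MYC", "egfr"]  # example identifiers
    print(non_uppercase_identifiers(genes))
```

An empty result means every identifier in the list is already uppercase.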
There are several things you can check in your files that commonly cause file errors:
The ConsensusClustering module does not work with Java 1.6.0_33 on Macintosh. As a workaround, you can run ConsensusClustering on the GenePattern public server, or on a server that is on a Windows machine or a Macintosh with a Java version other than 1.6.0_33.
Licensed modules can only be installed on servers running GenePattern 3.5 or higher. Upgrade your GenePattern server and try again.
GISTIC expects that the segments for a sample should cover almost all of its genome, even the regions where the copy number is normal. Any gaps in coverage for any sample are removed from the GISTIC analysis.
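To see where a sample's segments leave part of a chromosome uncovered, you can look for gaps between consecutive segments. The following sketch is not GISTIC's own logic: it assumes each segment has been read into a (sample, chromosome, start, end) tuple, and the `min_gap` threshold is a hypothetical parameter. It finds only gaps between segments, since gaps at chromosome ends would require chromosome lengths.

```python
from collections import defaultdict


def find_coverage_gaps(segments, min_gap=1):
    """Return (sample, chromosome, gap_start, gap_end) for uncovered stretches.

    A gap is reported when the next segment starts more than min_gap
    positions after the furthest position covered so far.
    """
    by_key = defaultdict(list)
    for sample, chrom, start, end in segments:
        by_key[(sample, chrom)].append((start, end))

    gaps = []
    for (sample, chrom), spans in sorted(by_key.items()):
        spans.sort()
        covered_to = spans[0][1]
        for start, end in spans[1:]:
            if start - covered_to > min_gap:
                gaps.append((sample, chrom, covered_to, start))
            covered_to = max(covered_to, end)
    return gaps


if __name__ == "__main__":
    # Made-up segments for two samples on one chromosome.
    example = [
        ("S1", "1", 1, 1000),
        ("S1", "1", 1001, 5000),
        ("S1", "1", 9000, 12000),
        ("S2", "1", 1, 12000),
    ]
    print(find_coverage_gaps(example))
```

Adjacent segments are often separated by the spacing between markers, so you may want a larger `min_gap` for real data.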
Several of the modules also accept reference genome annotation files (GTF files) and/or whole genome FASTA files. These files are available from our FTP site in the following locations:
The modules can usually accept an FTP URL directly wherever a file input is allowed, so there is no need for you to download the reference file; instead, just copy and paste the file's FTP URL into the file input parameter.
The best way to create a markers file that matches your data correctly is to take the first 3 columns from the copy number file that you used as input to the segmentation method that created the seg file for GISTIC.
The SNP 6.0 markers file used for our TCGA GISTIC analyses is available here: ftp://ftp.broadinstitute.org/
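Extracting the first 3 columns of a copy number file can be scripted. This sketch assumes the copy number file is tab-delimited and keeps every row, including any header row; the function name and the sample data are hypothetical, so check the result against the markers file format expected by GISTIC.

```python
import csv
import io


def markers_from_copy_number(copy_number_text):
    """Keep only the first three tab-delimited columns of each row."""
    reader = csv.reader(io.StringIO(copy_number_text), delimiter="\t")
    output = io.StringIO()
    writer = csv.writer(output, delimiter="\t", lineterminator="\n")
    for row in reader:
        if row:  # skip blank lines
            writer.writerow(row[:3])
    return output.getvalue()


if __name__ == "__main__":
    # Made-up copy number data with two sample columns.
    sample = (
        "Marker\tChromosome\tPosition\tS1\tS2\n"
        "SNP_1\t1\t1000\t0.1\t-0.2\n"
    )
    print(markers_from_copy_number(sample))
```

For a file on disk, read its contents with `open(path).read()` and write the returned text to a new file.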
It is likely that you installed the Java 8 JRE via your browser, which allows you to run Java apps but is not sufficient for running GenePattern. You need the Java Development Kit (JDK): https://java.com/en/download/manual.jsp
An easy way to see which JDK you have installed is to open a terminal window and type "java -version" (without the quotes).
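One additional way to tell a JRE-only installation from a JDK is to check whether the Java compiler, `javac`, is on your PATH, since `javac` ships with the JDK but not with the JRE. The sketch below is an illustration only; the function name is hypothetical, and a JDK that is installed but not on your PATH will not be detected.

```python
import shutil


def jdk_tools_on_path(which=shutil.which):
    """Report whether the java and javac commands can be found on the PATH."""
    return {tool: which(tool) is not None for tool in ("java", "javac")}


if __name__ == "__main__":
    found = jdk_tools_on_path()
    print(found)
    if found["java"] and not found["javac"]:
        print("java was found but javac was not; you may have only a JRE.")
```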