Overview
In this tutorial, you'll learn how to convert VCF files to HapMap format using the command-line interface (CLI) of Tassel. The CLI version of Tassel is particularly useful for large datasets, where the GUI version might be less efficient. Follow the steps below to convert your VCF file to HapMap format.
Downloading Tassel Command-Line Version
First, you'll need to download the command-line version of Tassel 5.0. The following link provides detailed instructions for installation:
Prerequisites
Before you begin, ensure you have the following installed on your system:
- Java Development Kit (JDK) 8 or later, which is required to run Tassel.
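To confirm the Java requirement before going further, a quick check like the following can help. This is a minimal sketch: the helper name java_major is hypothetical, and it assumes that `java -version` prints a quoted version string such as "1.8.0_292" or "17.0.2" on its first line, which is the common format but can vary between vendors.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the major Java version from a version string.
# Handles both the legacy "1.8.0_292" scheme (major = 8) and the
# modern "17.0.2" scheme (major = 17).
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;   # legacy scheme: drop the leading "1."
  esac
  echo "${v%%[._-]*}"     # keep everything before the first '.', '_' or '-'
}

# Report the installed version if java is on the PATH.
if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; the version is the first quoted token.
  ver="$(java -version 2>&1 | head -n 1 | sed -E 's/[^"]*"([^"]+)".*/\1/')"
  major="$(java_major "$ver")"
  if [ "$major" -ge 8 ] 2>/dev/null; then
    echo "Java $ver found (major version $major): OK for Tassel 5"
  else
    echo "Java $ver found: Tassel 5 needs Java 8 or later" >&2
  fi
else
  echo "java not found on PATH; install a JDK (8 or later) first" >&2
fi
```

If the version line on your system is not in the assumed format, running `java -version` by hand and reading the output is the more reliable check.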
Converting VCF to HapMap
Once you’ve installed Tassel and have your VCF file ready, follow these steps to convert it to HapMap format.
Place your VCF file in your current working directory, then run the following command:
./run_pipeline.pl -Xms10g -Xmx40g -fork1 -vcf data.vcf -export outputfilename -exportType Hapmap
Explanation of Command
- ./run_pipeline.pl: Executes the pipeline script for Tassel.
- -Xms10g: Sets the initial Java heap size (10 GB). You can adjust this based on your system's available memory.
- -Xmx40g: Sets the maximum Java heap size (40 GB). Keep this below your system's physical memory.
- -fork1: Marks the start of a pipeline segment, here labeled 1. The options that follow belong to that segment; the flag does not by itself run anything in parallel.
- -vcf data.vcf: Specifies the input VCF file to be converted.
- -export outputfilename: Defines the name of the output file.
- -exportType Hapmap: Specifies that the output format should be HapMap.
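The options above can be collected into a small wrapper so that the file names and memory limits are set in one place. This is a sketch, not part of Tassel: the function name tassel_vcf_to_hapmap_cmd and its argument order are hypothetical, and it assumes run_pipeline.pl sits in the current directory, as in the command above. It only prints the command; nothing is executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Tassel): print the conversion command
# for a given input VCF, output name, and JVM heap sizes.
tassel_vcf_to_hapmap_cmd() {
  local vcf="$1"          # input VCF file
  local out="$2"          # output file name passed to -export
  local xms="${3:-10g}"   # initial JVM heap (default mirrors the tutorial)
  local xmx="${4:-40g}"   # maximum JVM heap (default mirrors the tutorial)
  echo "./run_pipeline.pl -Xms${xms} -Xmx${xmx} -fork1 -vcf ${vcf} -export ${out} -exportType Hapmap"
}

# Dry run: show the command for a machine with less memory.
tassel_vcf_to_hapmap_cmd data.vcf outputfilename 4g 16g
```

Printing first makes it easy to check the heap sizes and file names before committing to a long run. Once the printed line looks right, it can be pasted into the terminal, provided the file names contain no spaces or shell metacharacters.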
Output File
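The command writes a HapMap file named after the value passed to -export; Tassel typically appends the .hmp.txt extension if the name does not already have it. The file holds the genotype data from your VCF in HapMap format, ready for further analysis.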
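A quick sanity check on the result is to compare the number of sites and samples against the input. The sketch below assumes the standard HapMap layout (one header line, 11 fixed metadata columns, then one column per sample, all tab-separated) and a VCF in which every header line starts with `#`. The function names are hypothetical, and the file names in the usage comments are placeholders; substitute whatever file the export step actually wrote.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a tab-separated HapMap file.

# Number of samples: header columns minus the 11 fixed HapMap columns.
hapmap_sample_count() {
  head -n 1 "$1" | awk -F'\t' '{ print NF - 11 }'
}

# Number of sites: every line after the single header line.
hapmap_site_count() {
  awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# Number of variant records in a VCF: lines that do not start with '#'.
vcf_site_count() {
  awk '!/^#/ { n++ } END { print n + 0 }' "$1"
}

# Example usage (file names are placeholders):
#   hapmap_sample_count outputfilename.hmp.txt
#   hapmap_site_count   outputfilename.hmp.txt
#   vcf_site_count      data.vcf
```

If the site counts differ, that does not necessarily mean the conversion failed, but it is worth checking the Tassel log output for skipped or filtered records.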
Using the GUI Version (Optional)
While the command-line version is recommended for large datasets, Tassel also has a graphical user interface (GUI) version. If your VCF file is small enough to load comfortably, you can use the GUI to perform the same conversion.
Reference
For more information on the Tassel software and its applications, refer to the following paper:
"TASSEL: software for association mapping of complex traits in diverse samples", Peter J. Bradbury, Zhiwu Zhang, Dallas E. Kroon, Terry M. Casstevens, Yogesh Ramdoss, Edward S. Buckler, Bioinformatics, Volume 23, Issue 19, 1 October 2007, Pages 2633–2635.