[Tue Aug 16 04:11:51 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 RUN_BARCODE=HiMom READ_STRUCTURE=25T8B25T LIBRARY_PARAMS=/tmp/multiplexedBarcode.7057667497876239334.dir/barcode.params SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:51 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:51 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25T8B25T
INFO 2016-08-16 04:11:51 IlluminaBasecallsConverter All work is complete.
INFO 2016-08-16 04:11:51 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:51 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1776287744
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 RUN_BARCODE=HiMom READ_STRUCTURE=25T8B4M4M17T LIBRARY_PARAMS=/tmp/multiplexedBarcode2.88065765387413032.dir/barcode.params SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:52 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:52 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25T8B4M4M17T
INFO 2016-08-16 04:11:52 IlluminaBasecallsConverter All work is complete.
INFO 2016-08-16 04:11:52 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1776287744
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 RUN_BARCODE=HiMom READ_STRUCTURE=25T8B4M21T LIBRARY_PARAMS=/tmp/multiplexedBarcode.6506162777973282532.dir/barcode.params SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:52 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:52 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25T8B4M21T
INFO 2016-08-16 04:11:52 IlluminaBasecallsConverter All work is complete.
INFO 2016-08-16 04:11:52 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1761607680
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 RUN_BARCODE=HiMom READ_STRUCTURE=25T8B25T LIBRARY_PARAMS=/tmp/singleBarcodeAltName.4557201023634981693.dir/multiplexed_positive_rgtags.params SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:52 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:52 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25T8B25T
INFO 2016-08-16 04:11:52 IlluminaBasecallsConverter All work is complete.
INFO 2016-08-16 04:11:52 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1761607680
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 OUTPUT=/tmp/nonBarcoded.6232941135011143044.sam RUN_BARCODE=HiMom SAMPLE_ALIAS=HiDad LIBRARY_NAME=Hello, World READ_STRUCTURE=25S8S25T SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:52 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:52 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25S8S25T
INFO 2016-08-16 04:11:52 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1761607680
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 OUTPUT=/tmp/nonBarcodedWithDualMI.3742079275455323593.sam RUN_BARCODE=HiMom SAMPLE_ALIAS=HiDad LIBRARY_NAME=Hello, World READ_STRUCTURE=25S4M4M25T SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:52 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:52 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25S4M4M25T
INFO 2016-08-16 04:11:52 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1761607680
[Tue Aug 16 04:11:52 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 OUTPUT=/tmp/nonBarcodedWithMI.7985993685673628530.sam RUN_BARCODE=HiMom SAMPLE_ALIAS=HiDad LIBRARY_NAME=Hello, World READ_STRUCTURE=25S8M25T SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:52 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:53 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25S8M25T
INFO 2016-08-16 04:11:53 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:53 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1761607680
[Tue Aug 16 04:11:53 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 OUTPUT=/tmp/nonBarcodedWithDualMI.5368413457939786653.sam RUN_BARCODE=HiMom SAMPLE_ALIAS=HiDad LIBRARY_NAME=Hello, World READ_STRUCTURE=25S4M4M25T TAG_PER_MOLECULAR_INDEX=[ZA, ZB] SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:53 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:53 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25S4M4M25T
INFO 2016-08-16 04:11:53 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:53 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1761607680
[Tue Aug 16 04:11:53 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 OUTPUT=/tmp/nonBarcodedWithDualMI.7330189718968577274.sam RUN_BARCODE=HiMom SAMPLE_ALIAS=HiDad LIBRARY_NAME=Hello, World READ_STRUCTURE=25S2M2M2M2M25T TAG_PER_MOLECULAR_INDEX=[ZA, ZB, ZC, ZD] SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:53 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:53 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25S2M2M2M2M25T
INFO 2016-08-16 04:11:53 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:53 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1761607680
The number of tags given in TAG_PER_MOLECULAR_INDEX does not match the number of molecular indexes in READ_STRUCTURE
USAGE: IlluminaBasecallsToSam [options]
Documentation: http://broadinstitute.github.io/picard/command-line-overview.html#IlluminaBasecallsToSam
Transforms raw Illumina sequencing data into an unmapped SAM or BAM file.
The IlluminaBasecallsToSam program collects, demultiplexes, and sorts reads across all of the tiles of a lane via barcode to produce an unmapped SAM/BAM file. An unmapped BAM file is often referred to as a uBAM. All barcode, sample, and library data is provided in the LIBRARY_PARAMS file. Note that this LIBRARY_PARAMS file should be formatted according to the specifications indicated below. The following is an example of a properly formatted LIBRARY_PARAMS file:
BARCODE_1 OUTPUT SAMPLE_ALIAS LIBRARY_NAME
AAAAAAAA SA_AAAAAAAA.bam SA_AAAAAAAA LN_AAAAAAAA
AAAAGAAG SA_AAAAGAAG.bam SA_AAAAGAAG LN_AAAAGAAG
AACAATGG SA_AACAATGG.bam SA_AACAATGG LN_AACAATGG
N SA_non_indexed.bam SA_non_indexed LN_NNNNNNNN
The BARCODES_DIR file is produced by the ExtractIlluminaBarcodes (http://broadinstitute.github.io/picard/command-line-overview.html#ExtractIlluminaBarcodes) tool for each lane of a flow cell.
Usage example:
java -jar picard.jar IlluminaBasecallsToSam \
BASECALLS_DIR=/BaseCalls/ \
LANE=001 \
READ_STRUCTURE=25T8B25T \
RUN_BARCODE=run15 \
IGNORE_UNEXPECTED_BARCODES=true \
LIBRARY_PARAMS=library.params
Version: null
Options:
--help
-h Displays options specific to this tool.
--stdhelp
-H Displays options specific to this tool AND options common to all Picard command line
tools.
--version Displays program version.
BASECALLS_DIR=File
B=File The basecalls directory. Required.
BARCODES_DIR=File
BCD=File The barcodes directory with _barcode.txt files (generated by ExtractIlluminaBarcodes). If
not set, use BASECALLS_DIR. Default value: null.
LANE=Integer
L=Integer Lane number. Required.
OUTPUT=File
O=File Deprecated (use LIBRARY_PARAMS). The output SAM or BAM file. Format is determined by
extension. Required. Cannot be used in conjunction with option(s) LIBRARY_PARAMS
BARCODE_PARAMS
RUN_BARCODE=String The barcode of the run. Prefixed to read names. Required.
SAMPLE_ALIAS=String
ALIAS=String Deprecated (use LIBRARY_PARAMS). The name of the sequenced sample. Required. Cannot be
used in conjunction with option(s) LIBRARY_PARAMS BARCODE_PARAMS
READ_GROUP_ID=String
RG=String ID used to link RG header record with RG tag in SAM record. If these are unique in SAM
files that get merged, merge performance is better. If not specified, READ_GROUP_ID will
be set to <first 5 chars of RUN_BARCODE>. . Default value: null.
LIBRARY_NAME=String
LIB=String Deprecated (use LIBRARY_PARAMS). The name of the sequenced library. Default value: null.
Cannot be used in conjunction with option(s) LIBRARY_PARAMS BARCODE_PARAMS
SEQUENCING_CENTER=String The name of the sequencing center that produced the reads. Used to set the RG.CN tag.
Default value: BI. This option can be set to 'null' to clear the default value.
RUN_START_DATE=Date The start date of the run. Default value: null.
PLATFORM=String The name of the sequencing technology that produced the read. Default value: illumina.
This option can be set to 'null' to clear the default value.
READ_STRUCTURE=String
RS=String A description of the logical structure of clusters in an Illumina Run, i.e. a description
of the structure IlluminaBasecallsToSam assumes the data to be in. It should consist of
integer/character pairs describing the number of cycles and the type of those cycles (B
for Sample Barcode, M for molecular barcode, T for Template, and S for skip). E.g. If
the input data consists of 80 base clusters and we provide a read structure of
"28T8M8B8S28T" then the sequence may be split up into four reads:
* read one with 28 cycles (bases) of template
* read two with 8 cycles (bases) of molecular barcode (ex. unique molecular barcode)
* read three with 8 cycles (bases) of sample barcode
* 8 cycles (bases) skipped.
* read four with 28 cycles (bases) of template
The skipped cycles would NOT be included in an output SAM/BAM file or in read groups
therein. Required.
BARCODE_PARAMS=File Deprecated (use LIBRARY_PARAMS). Tab-separated file for creating all output BAMs for
barcoded run with single IlluminaBasecallsToSam invocation. Columns are BARCODE, OUTPUT,
SAMPLE_ALIAS, and LIBRARY_NAME. Row with BARCODE=N is used to specify a file for no
barcode match. Required. Cannot be used in conjunction with option(s) LIBRARY_PARAMS
SAMPLE_ALIAS (ALIAS) OUTPUT (O) LIBRARY_NAME (LIB)
LIBRARY_PARAMS=File Tab-separated file for creating all output BAMs for a lane with single
IlluminaBasecallsToSam invocation. The columns are OUTPUT, SAMPLE_ALIAS, and
LIBRARY_NAME, BARCODE_1, BARCODE_2 ... BARCODE_X where X = number of barcodes per cluster
(optional). Row with BARCODE_1 set to 'N' is used to specify a file for no barcode
match. You may also provide any 2 letter RG header attributes (excluding PU, CN, PL, and
DT) as columns in this file and the values for those columns will be inserted into the
RG tag for the BAM file created for a given row. Required. Cannot be used in conjunction
with option(s) SAMPLE_ALIAS (ALIAS) OUTPUT (O) LIBRARY_NAME (LIB) BARCODE_PARAMS
ADAPTERS_TO_CHECK=IlluminaAdapterPair
Which adapters to look for in the read. Default value: [INDEXED, DUAL_INDEXED,
NEXTERA_V2, FLUIDIGM]. This option can be set to 'null' to clear the default value.
Possible values: {PAIRED_END, INDEXED, SINGLE_END, NEXTERA_V1, NEXTERA_V2, DUAL_INDEXED,
FLUIDIGM, TRUSEQ_SMALLRNA, ALTERNATIVE_SINGLE_END} This option may be specified 0 or more
times. This option can be set to 'null' to clear the default list.
NUM_PROCESSORS=Integer The number of threads to run in parallel. If NUM_PROCESSORS = 0, number of cores is
automatically set to the number of cores available on the machine. If NUM_PROCESSORS < 0,
then the number of cores used will be the number available on the machine less
NUM_PROCESSORS. Default value: 0. This option can be set to 'null' to clear the default
value.
FIRST_TILE=Integer If set, this is the first tile to be processed (used for debugging). Note that tiles are
not processed in numerical order. Default value: null.
TILE_LIMIT=Integer If set, process no more than this many tiles (used for debugging). Default value: null.
FORCE_GC=Boolean If true, call System.gc() periodically. This is useful in cases in which the -Xmx value
passed is larger than the available memory. Default value: true. This option can be set
to 'null' to clear the default value. Possible values: {true, false}
APPLY_EAMSS_FILTER=Boolean Apply EAMSS filtering to identify inappropriately quality scored bases towards the ends
of reads and convert their quality scores to Q2. Default value: true. This option can be
set to 'null' to clear the default value. Possible values: {true, false}
MAX_READS_IN_RAM_PER_TILE=Integer
Configure SortingCollections to store this many records before spilling to disk. For an
indexed run, each SortingCollection gets this value/number of indices. Default value:
1200000. This option can be set to 'null' to clear the default value.
MINIMUM_QUALITY=Integer The minimum quality (after transforming 0s to 1s) expected from reads. If qualities are
lower than this value, an error is thrown. The default of 2 is what Illumina's spec
describes as the minimum, but in practice the value has been observed lower. Default
value: 2. This option can be set to 'null' to clear the default value.
INCLUDE_NON_PF_READS=Boolean
NONPF=Boolean Whether to include non-PF reads. Default value: true. This option can be set to 'null' to
clear the default value. Possible values: {true, false}
IGNORE_UNEXPECTED_BARCODES=Boolean
INGORE_UNEXPECTED=Boolean Whether to ignore reads whose barcodes are not found in LIBRARY_PARAMS. Useful when
outputting BAMs for only a subset of the barcodes in a lane. Default value: false. This
option can be set to 'null' to clear the default value. Possible values: {true, false}
MOLECULAR_INDEX_TAG=String The tag to use to store any molecular indexes. If more than one molecular index is
found, they will be concatenated and stored here. Default value: RX. This option can be
set to 'null' to clear the default value.
MOLECULAR_INDEX_BASE_QUALITY_TAG=String
The tag to use to store any molecular index base qualities. If more than one molecular
index is found (i.e. READ_STRUCTURE contains more than one "M" operator), their qualities
will be concatenated and stored here. Default value: QX. This option can be set to 'null'
to clear the default value.
TAG_PER_MOLECULAR_INDEX=String The list of tags to store each molecular index. The number of tags should match the
number of molecular indexes. Default value: null. This option may be specified 0 or more
times.
The number of tags given in TAG_PER_MOLECULAR_INDEX does not match the number of molecular indexes in READ_STRUCTURE
[Tue Aug 16 04:11:53 CDT 2016] picard.illumina.IlluminaBasecallsToSam BASECALLS_DIR=testdata/picard/illumina/25T8B25T/Data/Intensities/BaseCalls LANE=1 OUTPUT=/tmp/nonBarcodedWithMI.4719340022267850949.sam RUN_BARCODE=HiMom SAMPLE_ALIAS=HiDad LIBRARY_NAME=Hello, World READ_STRUCTURE=25S8M25T TAG_PER_MOLECULAR_INDEX=[] SEQUENCING_CENTER=BI PLATFORM=illumina ADAPTERS_TO_CHECK=[INDEXED, DUAL_INDEXED, NEXTERA_V2, FLUIDIGM] NUM_PROCESSORS=0 FORCE_GC=true APPLY_EAMSS_FILTER=true MAX_READS_IN_RAM_PER_TILE=1200000 MINIMUM_QUALITY=2 INCLUDE_NON_PF_READS=true IGNORE_UNEXPECTED_BARCODES=false MOLECULAR_INDEX_TAG=RX MOLECULAR_INDEX_BASE_QUALITY_TAG=QX VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=3098916 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 16 04:11:53 CDT 2016] Executing as jli@corona on Linux 3.19.0-61-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14; Picard version: null
INFO 2016-08-16 04:11:53 IlluminaBasecallsToSam DONE_READING STRUCTURE IS 25S8M25T
INFO 2016-08-16 04:11:53 IlluminaBasecallsConverter All work is complete.
[Tue Aug 16 04:11:53 CDT 2016] picard.illumina.IlluminaBasecallsToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1767374848