Genome Analysis Toolkit

Variant Discovery in High-Throughput Sequencing Data


Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.

Read groups

3 comments

  • lee Ethan

    Hello, when I ran the gatk MarkDuplicates command on my BAM file (produced with bwa mem and samtools sort), it failed with the error shown at the end of this post. The commands, header lines, and error output are as follows:

    $bwa mem -t 10 \
        -M \
        -R "@RG\tID:${ID}\tSM:${ID}\tLB:Targeted\tPL:Illumina" \
        $wkdir1/5.ref/hg_38/hg38.fa \
        $wkdir1/3.clean/Yangliying/fastp/${sample}_R1.fastp.fastq.gz \
        $wkdir1/3.clean/Yangliying/fastp/${sample}_R2.fastp.fastq.gz \
        | samtools sort -@ 10 -o $wkdir1/4.align/bwa/Yangliying/test1/${ID}.bam - \
        2>$wkdir1/4.align/bwa/Yangliying/test1/${ID}.sorted.log


    $gatk MarkDuplicates \
        -I /home/data/vip8t13/wes_pro1/4.align/bwa/Yangliying/R18067578LU01.bam \
        -M /home/data/vip8t13/wes_pro1/6.gatk/Yangliying/R18067578LU01.markdup_metrics.txt \
        -O /home/data/vip8t13/wes_pro1/6.gatk/Yangliying/R18067578LU01.sort.markdup.bam

    $ samtools view -H R18067578LU01.bam | grep '^@RG'
    @RG     ID:R18067578LU01        SM:R18067578LU01        LB:Targeted     PL:Illumina
    $ samtools view -H R18067578LU01.bam | grep '^@PG'
    @PG     ID:bwa  PN:bwa  VN:0.7.17-r1188 CL:bwa mem -t 10 -M -R @RG\tID:R18067578LU01\tSM:R18067578LU01\tLB:Targeted\tPL:Illumina /home/data/vip8t13/wes_pro1/5.ref/hg_38/hg38.fa /home/data/vip8t13/wes_pro1/3.clean/Yangliying/R18067578LU01-Yangliying_R1_val_1.fq.gz /home/data/vip8t13/wes_pro1/3.clean/Yangliying/R18067578LU01-Yangliying_R2_val_2.fq.gz
    @PG     ID:samtools     PN:samtools     PP:bwa  VN:     CL:samtools sort -@ 10 -o /home/data/vip8t13/wes_pro1/4.align/bwa/Yangliying/R18067578LU01.bam -
    @PG     ID:samtools.1   PN:samtools     PP:samtools     VN:     CL:samtools view -H R18067578LU01.bam

    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    htsjdk.samtools.SAMFormatException: Error parsing SAM header. Problem parsing @PG key:value pair. Line:
    @PG     ID:samtools     PN:samtools     PP:bwa  VN:     CL:samtools sort -@ 10 -o /home/data/vip8t13/wes_pro1/4.align/bwa/Yangliying/R18067578LU01.bam -; File /home/data/vip8t13/wes_pro1/4.align/bwa/Yangliying/R18067578LU01.bam; Line number 644
            at htsjdk.samtools.SAMTextHeaderCodec.reportErrorParsingLine(SAMTextHeaderCodec.java:258)
            at htsjdk.samtools.SAMTextHeaderCodec.access$200(SAMTextHeaderCodec.java:46)
            at htsjdk.samtools.SAMTextHeaderCodec$ParsedHeaderLine.<init>(SAMTextHeaderCodec.java:307)
            at htsjdk.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:97)
            at htsjdk.samtools.BAMFileReader.readHeader(BAMFileReader.java:704)
            at htsjdk.samtools.BAMFileReader.<init>(BAMFileReader.java:298)
            at htsjdk.samtools.BAMFileReader.<init>(BAMFileReader.java:176)
            at htsjdk.samtools.SamReaderFactory$SamReaderFactoryImpl.open(SamReaderFactory.java:406)
            at picard.sam.markduplicates.util.AbstractMarkDuplicatesCommandLineProgram.openInputs(AbstractMarkDuplicatesCommandLineProgram.java:265)
            at picard.sam.markduplicates.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:507)
            at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:258)
            at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:308)
            at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:37)
            at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
            at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
            at org.broadinstitute.hellbender.Main.main(Main.java:289)

  • maryam montazeri

    Hi,
    How can I find the RGID and RGSM values to pass to AddOrReplaceReadGroups for RNA-seq data downloaded from SRA?
    I only know the platform.

  • Shin Lin

    Just wanted to confirm that ONT is not a valid value for PL.  Thanks.

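Regarding the SAMFormatException in the first comment: the @PG line quoted in the error contains a `VN:` field with no value, and that empty field looks like the likely reason htsjdk rejects the header. The sketch below is one possible workaround, not a confirmed fix from the GATK team. It filters SAM header text and drops any field of the form `TAG:` with nothing after the colon. The function name `strip_empty_header_tags` is hypothetical, and the example input is a shortened copy of the line from the error message.

```shell
# Hypothetical helper: drop header fields whose value is empty (e.g. "VN:").
# Reads SAM header text on stdin and writes the cleaned header to stdout.
strip_empty_header_tags() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
    {
        out = $1
        for (i = 2; i <= NF; i++) {
            # Keep a field unless it is a two-character tag with an empty value.
            if ($i !~ /^[A-Za-z][A-Za-z0-9]:$/) out = out OFS $i
        }
        print out
    }'
}

# Example: a shortened version of the @PG line from the error (tabs written as \t).
printf '@PG\tID:samtools\tPN:samtools\tPP:bwa\tVN:\tCL:samtools sort -@ 10 -o out.bam -\n' \
    | strip_empty_header_tags
```

To apply this to a real file, one option is to pipe `samtools view -H in.bam` through the function into a file such as `header.sam`, then run `samtools reheader header.sam in.bam > fixed.bam` (the file names here are placeholders). Upgrading to a samtools build that records its version in `VN:` would avoid the problem at the source.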
