Using CrosscheckFingerprints for panelseq data
Hi,
I have a set of tumour-normal pairs of panelseq data and was wondering whether CrosscheckFingerprints can be used with this kind of data.
The results I currently get are inconclusive. Do I need to adjust the LOD threshold for panelseq data?
## htsjdk.samtools.metrics.StringHeader
# CrosscheckFingerprints --INPUT malign.bam --INPUT control.bam --CROSSCHECK_MODE CHECK_ALL_OTHERS --OUTPUT malign_control.crosscheck_metrics --HAPLOTYPE_MAP Homo_sapiens_assembly19.haplotype_database.txt --LOD_THRESHOLD -5.0 --CROSSCHECK_BY FILE --EXPECT_ALL_GROUPS_TO_MATCH true --REQUIRE_INDEX_FILES false --NUM_THREADS 1 --CALCULATE_TUMOR_AWARE_RESULTS true --ALLOW_DUPLICATE_READS false --GENOTYPING_ERROR_RATE 0.01 --OUTPUT_ERRORS_ONLY false --LOSS_OF_HET_RATE 0.5 --EXIT_CODE_WHEN_MISMATCH 1 --EXIT_CODE_WHEN_NO_VALID_CHECKS 1 --MAX_EFFECT_OF_EACH_HAPLOTYPE_BLOCK 3.0 --TEST_INPUT_READABILITY true --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
## htsjdk.samtools.metrics.StringHeader
# Started on: Tue Apr 16 18:03:59 CEST 2024
## METRICS CLASS picard.fingerprint.CrosscheckMetric
LEFT_GROUP_VALUE RIGHT_GROUP_VALUE RESULT DATA_TYPE LOD_SCORE LOD_SCORE_TUMOR_NORMAL LOD_SCORE_NORMAL_TUMOR LEFT_RUN_BARCODE LEFT_LANE LEFT_MOLECULAR_BARCODE_SEQUENCE LEFT_LIBRARY LEFT_SAMPLE LEFT_FILE RIGHT_RUN_BARCODE RIGHT_LANE RIGHT_MOLECULAR_BARCODE_SEQUENCE RIGHT_LIBRARY RIGHT_SAMPLE RIGHT_FILE
malign.bam::malign malign.bam::malign EXPECTED_MATCH FILE 5.55237 5.372379 5.372379 ? -1 ? Agilent_SureSelect_XTHS malign malign.bam ? -1 ? Agilent_SureSelect_XTHS malign malign.bam
malign.bam::malign control.bam::control INCONCLUSIVE FILE 1.918966 1.808918 1.818223 ? -1 ? Agilent_SureSelect_XTHS malign malign.bam ? -1 ? Agilent_SureSelect_XTHS control control.bam
control.bam::control malign.bam::malign INCONCLUSIVE FILE 1.918966 1.818223 1.808918 ? -1 ? Agilent_SureSelect_XTHS control control.bam ? -1 ? Agilent_SureSelect_XTHS malign malign.bam
control.bam::control control.bam::control INCONCLUSIVE FILE 4.201154 4.018887 4.018887 ? -1 ? Agilent_SureSelect_XTHS control control.bam ? -1 ? Agilent_SureSelect_XTHS control control.bam
-
Hi Shashwat Sahay,
Yes, you can use CrosscheckFingerprints for panel data.
The low LOD scores you are seeing generally imply that not very many of the fingerprinting sites are well covered. This could either be due to very few of the fingerprinting sites being covered by your targets, or your data having low coverage in general.
You could adjust the LOD_THRESHOLD parameter for your data, but this will increase the chances that the conclusion of the tool will be incorrect.
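To make that trade-off concrete, here is a minimal Python sketch of how the RESULT column seems to follow from the LOD score and the threshold. The rule in it is an assumption inferred from the metrics output in the question (with `--LOD_THRESHOLD -5.0`, a score of 5.55 is reported as EXPECTED_MATCH while 4.20 and 1.92 are INCONCLUSIVE). It is not copied from the Picard source, and `classify` is a hypothetical helper, not a Picard function. The value -1.5 is arbitrary and only for illustration, not a recommendation.

```python
def classify(lod, lod_threshold=-5.0, expected_match=True):
    """Classify one comparison.

    Assumed rule (inferred from the posted output, not taken from Picard):
    a score above |threshold| counts as a match, a score below -|threshold|
    counts as a mismatch, and anything in between is inconclusive.
    """
    cutoff = abs(lod_threshold)
    if lod > cutoff:
        return "EXPECTED_MATCH" if expected_match else "UNEXPECTED_MATCH"
    if lod < -cutoff:
        return "UNEXPECTED_MISMATCH" if expected_match else "EXPECTED_MISMATCH"
    return "INCONCLUSIVE"


# LOD_SCORE values copied from the metrics output in the question
scores = {
    "malign vs malign": 5.55237,
    "malign vs control": 1.918966,
    "control vs control": 4.201154,
}

for threshold in (-5.0, -1.5):
    for pair, lod in scores.items():
        print(f"threshold={threshold}  {pair}: {classify(lod, threshold)}")
```

With the looser threshold the tumour/normal comparison would be called a match, but on much weaker evidence than the default asks for, which is why relaxing the threshold makes a wrong call more likely.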
A better approach, imo, would be to use a fingerprint map that is better suited to your data (i.e., one where more sites are covered). You could probably spend a bunch of time trying to optimize for your particular data type, but I think a much faster and nearly as effective approach would be to just grab one of the fingerprinting maps from here and try using that (it shouldn't really matter which one, just make sure it was built for the same reference as your data). These maps contain way more sites than the standard maps that are generally used, so you should see your LOD scores increase significantly.
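If you want a rough idea of how well a given map fits your panel before rerunning the tool, you could count how many of its sites fall inside your target regions. The sketch below is only an illustration under a few assumptions: the haplotype map is a whitespace-separated text file with the chromosome in the first column and a 1-based position in the second, its header lines start with `@` or `#`, and your targets are available as a BED file (0-based, half-open intervals). The function names are made up for this example.

```python
from collections import defaultdict


def load_targets(bed_lines):
    """Return {chrom: [(start, end), ...]} from BED lines (0-based, half-open)."""
    targets = defaultdict(list)
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        targets[chrom].append((int(start), int(end)))
    return targets


def count_covered_sites(map_lines, targets):
    """Return (covered, total) for the haplotype-map sites.

    Assumes column 1 is the chromosome and column 2 a 1-based position,
    and that header lines start with '@' or '#'.
    """
    covered = total = 0
    for line in map_lines:
        if not line.strip() or line.startswith(("@", "#")):
            continue
        chrom, pos = line.split()[:2]
        pos0 = int(pos) - 1  # convert to 0-based to compare with BED intervals
        total += 1
        # Linear scan per site: simple, and fine for panel-sized target lists
        if any(start <= pos0 < end for start, end in targets.get(chrom, ())):
            covered += 1
    return covered, total


# Tiny in-memory example; real use would pass open file handles instead
bed = ["1\t100\t200\n", "2\t50\t60\n"]
hap_map = [
    "@HD\tVN:1.5\n",
    "#CHROMOSOME\tPOSITION\tNAME\n",
    "1\t101\trs1\n",
    "1\t201\trs2\n",
    "2\t60\trs3\n",
    "3\t10\trs4\n",
]
covered, total = count_covered_sites(hap_map, load_targets(bed))
print(f"{covered} of {total} map sites fall inside the targets")
```

Chromosome names have to match between the two files (for example `1` versus `chr1`), otherwise every site will look uncovered. A site inside a target can still have low read depth, so this only gives an upper bound on the number of usable sites.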