GNU bug report logs - #36130: split bug
Reported by: Heather Wick <heather.c.wick <at> gmail.com>
Date: Fri, 7 Jun 2019 18:47:01 UTC
Severity: normal
Tags: notabug
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
Message #8 received at 36130 <at> debbugs.gnu.org (full text, mbox):
Hello,
On Fri, Jun 07, 2019 at 02:23:15PM -0400, Heather Wick wrote:
> I am using split to split up some large, paired fastq files [...]:
>
> zcat MH1_R1.fastq.gz | split - -l 40000000 DHT_R1_
> zcat MH1_R2.fastq.gz | split - -l 40000000 DHT_R2_
>
> This creates 96 chunks for the R1 and 95 chunks for R2, even though the
> original fastq files have the same number of reads.
>
> Do you have any suggestions for how to proceed? Perhaps zcatting and piping
> the files is not the best way to call split?
To help diagnose the issue, please run the following commands
and tell us the results:
1. Number of lines in each file:
     zcat MH1_R1.fastq.gz | wc -l
     zcat MH1_R2.fastq.gz | wc -l
2. The first two sequence IDs:
     zcat MH1_R1.fastq.gz | head -n8 | grep ^@
     zcat MH1_R2.fastq.gz | head -n8 | grep ^@
3. The last two sequence IDs:
     zcat MH1_R1.fastq.gz | tail -n8 | grep ^@
     zcat MH1_R2.fastq.gz | tail -n8 | grep ^@
These will just verify the FASTQ files are indeed paired with no
surprises. The files should have the same number of lines,
and matching sequence IDs in the first and last lines.
regards,
- assaf
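The three checks requested in the message above could be combined into one pass per file. The following is only a sketch, not part of the original reply. It assumes unwrapped FASTQ (exactly four lines per record) and read IDs that match between mates once an optional trailing /1 or /2 is removed. The function names summarize and check_pair are hypothetical. It uses "gzip -dc", which behaves like zcat on gzip files, and it selects header lines by position with awk instead of "grep ^@", because a quality line can also begin with "@".

```shell
#!/bin/sh
# Hypothetical helpers, not part of coreutils or of the original reply.

# Print "<line count> <first read ID> <last read ID>" for one gzipped FASTQ.
# Assumes 4 lines per record, so header lines are lines 1, 5, 9, ...
summarize() {
    gzip -dc "$1" | awk '
        NR % 4 == 1 {
            id = $1                  # drop any comment after the first space
            sub(/\/[12]$/, "", id)   # drop a trailing /1 or /2 mate suffix
            if (NR == 1) first = id
            last = id
        }
        END { print NR, first, last }'
}

# Compare two files that are expected to be paired.
# $3 is the split -l value (defaults to the 40000000 used in the report).
check_pair() {
    lines_per_chunk=${3:-40000000}
    s1=$(summarize "$1")
    s2=$(summarize "$2")
    n1=${s1%% *}
    n2=${s2%% *}
    echo "lines: R1=$n1 R2=$n2"
    # split -l N writes ceil(lines / N) chunks.
    echo "chunks: R1=$(( (n1 + lines_per_chunk - 1) / lines_per_chunk )) R2=$(( (n2 + lines_per_chunk - 1) / lines_per_chunk ))"
    echo "ids: R1=[${s1#* }] R2=[${s2#* }]"
    if [ "$s1" = "$s2" ] && [ $(( n1 % 4 )) -eq 0 ]; then
        echo "OK: line counts and first/last IDs match"
        return 0
    fi
    echo "MISMATCH: the files do not look like a clean pair"
    return 1
}
```

For the files in this report it would be called as "check_pair MH1_R1.fastq.gz MH1_R2.fastq.gz". Because split -l writes one chunk per 40000000 lines plus one for any remainder, a difference of 96 versus 95 chunks would show up here as differing line counts.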
This bug report was last modified 5 years and 332 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.