Aidenlab Juicer Versions

A One-Click System for Analyzing Loop-Resolution Hi-C Experiments

1.6

3 years ago

Major updates to CPU, SLURM, and UGER (note that we cannot test PBS or LSF and thus are not updating; since AWS will soon be superseded by ENCODE, we have also ceased development on that branch).

This is the last release before we convert entirely to Juicer 2.0 (the ENCODE version, currently under pre-release).

Major:

Intra-fragment reads are NO LONGER discarded by default. To discard them from the hic file, use the flag --skip-intra-frag when calling "pre". This requires the latest Juicer Tools jar (https://github.com/aidenlab/juicer/wiki/Download); older jars keep the old behavior and silently discard intra-fragment reads.
The latest jar has extensive bug fixes
BWA now aligns in paired-end mode. This requires BWA version 0.7.17 or higher; short read and short end modes are now deprecated
Changed chimeric blacklist to handle quadruple reads and eliminate MT exception
Rewrite of generate_site_positions
The default site is now "none" if no site is sent in
Fragment maps are no longer included in the Hi-C file by default. Previously you excluded them with the -x flag; now use -f to include them.
Dups has a bug fix for some degenerate cases that resulted in large memory usage. There is also a new flag, -j, for "just exact matches", which eliminates only exact-match duplicates. Overall, this flag is not recommended. However, if your jobs often get stuck at the dedup phase, the cause may be low complexity or low mapping quality, and this flag will let the jobs finish much faster. You will still be left with near-duplicates in your library, so use caution when interpreting results. In particular, note that near-duplicates are usually machine errors, not true biological results, and thus ought to be removed.
Multiple ligation junctions now supported in juicer and statistics.pl script
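
The difference between exact-match deduplication (-j) and the default near-duplicate removal can be illustrated with a small sketch. This is not the Juicer dups script: the Pair record, the dedup function, and the wobble parameter are hypothetical names, the 4 bp wobble is an illustrative assumption, and the quadratic comparison is kept only for readability.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    # Hypothetical minimal record: strand, chromosome, position for each read end.
    str1: int
    chr1: str
    pos1: int
    str2: int
    chr2: str
    pos2: int


def dedup(pairs, wobble=4):
    """Keep the first pair of each duplicate group.

    wobble=0 drops only exact matches (the "just exact matches" idea);
    wobble>0 also drops pairs whose two positions each lie within
    `wobble` bp of an already-kept pair on the same strands/chromosomes.
    """
    kept = []
    for p in sorted(pairs):
        is_dup = any(
            (p.str1, p.chr1, p.str2, p.chr2) == (k.str1, k.chr1, k.str2, k.chr2)
            and abs(p.pos1 - k.pos1) <= wobble
            and abs(p.pos2 - k.pos2) <= wobble
            for k in kept
        )
        if not is_dup:
            kept.append(p)
    return kept


example = [
    Pair(0, "chr1", 100, 16, "chr1", 5000),
    Pair(0, "chr1", 100, 16, "chr1", 5000),  # exact duplicate
    Pair(0, "chr1", 102, 16, "chr1", 5001),  # near-duplicate (2 bp / 1 bp away)
    Pair(0, "chr1", 200, 16, "chr1", 9000),  # distinct contact
]
```

With this input, exact-only deduplication keeps three pairs (the near-duplicate survives), while a 4 bp wobble keeps two, which is why results produced with -j should be interpreted with caution.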

Minor:

The chimera handling script now includes the header and prints out tab-delimited, for better conversion to BAM; it also no longer looks for the /1, /2 but rather looks for the SAM flag
We dedup collisions now
An addition to the dups script makes it run faster and with less memory when there are a lot of duplicates
Statistics updated in CPU to properly handle multiple ligations; also added scripts in CPU that were missing for mega
Made the names correct in the stats_sub script
Count ligations explicitly excludes the readname in the fastq file
LibraryComplexity no longer a separate jar
No more stats calculation on duplicates
Java memory options now exported instead of separate scripts
Multiple ligation junctions handled
Flag added for no wobble / just exact duplicates (-j)
Adding in options to mega script; updating memory requirements
Making scripts more consistent by having check for system for juicer_tools path
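
As a rough illustration of identifying mates from the SAM flag rather than a /1 or /2 suffix on the read name, the sketch below reads the FLAG column of a tab-delimited SAM line. The bit values 0x40 and 0x80 come from the SAM specification; the function name mate_number is hypothetical, and this is not the chimera handling script itself.

```python
FIRST_IN_PAIR = 0x40   # SAM flag bit: first segment in the template
SECOND_IN_PAIR = 0x80  # SAM flag bit: last segment in the template


def mate_number(sam_line):
    """Return 1 or 2 for a tab-delimited SAM alignment line, 0 if neither bit is set."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
    if flag & FIRST_IN_PAIR:
        return 1
    if flag & SECOND_IN_PAIR:
        return 2
    return 0


# Made-up records for illustration only.
read1 = "read7\t65\tchr1\t100\t60\t4M\t=\t300\t204\tACGT\tFFFF"
read2 = "read7\t145\tchr1\t300\t60\t4M\t=\t100\t-204\tACGT\tFFFF"
```

Here mate_number(read1) gives 1 (flag 65 = 1 + 64) and mate_number(read2) gives 2 (flag 145 = 1 + 16 + 128), with no dependence on how the read name is written.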

1.6.2

6 years ago

1.6.1

6 years ago

Bug fix in HiC_tmp folder creation

1.6.0

6 years ago

Differences from Juicer 1.5.6:

bwa mem -SP5M paired-end mode; most recent BWA release 0.7.17
nofrag default
no short end read alignment anymore
alignonly, mergeonly, deduponly stages added
simplification of chimeric_blacklist in terms of output stem
chimeric_blacklist now partitions reads into bams:
- alignable.bam with duplicates marked
- collisions.bam (these two are ideally ENCODE products that one can download)
- collisions_low_mapq.bam
- unmapped.bam
- mapq0.bam
mitochondria no longer treated as a special case
collisions now include contigs as well (blacklist instead of whitelist)
The merged_nodups file still appears as part of the pipeline. To recover merged_nodups from alignable.bam produced by ENCODE (for example), use the bamtotxt.sh script
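
For readers who want to reproduce the paired-end alignment step listed above outside the pipeline, here is a minimal sketch that builds a bwa mem -SP5M command and checks the 0.7.17 version requirement. The helper names and file paths are hypothetical, the command is only constructed, not executed, and the version parser assumes the "Version: x.y.z" line that bwa prints in its usage text.

```python
import re

MIN_BWA = (0, 7, 17)  # minimum BWA version stated in these release notes


def parse_bwa_version(usage_text):
    """Extract (major, minor, patch) from bwa's usage text, e.g. 'Version: 0.7.17-r1188'."""
    m = re.search(r"Version:\s*(\d+)\.(\d+)\.(\d+)", usage_text)
    if m is None:
        raise ValueError("no version line found")
    return tuple(int(x) for x in m.groups())


def bwa_mem_command(reference, fastq_r1, fastq_r2, threads=1):
    """Build (but do not run) a paired-end bwa mem command.

    -S skips mate rescue, -P skips pairing, -5 takes the smallest-coordinate
    split alignment as primary, -M marks shorter split hits as secondary.
    """
    return ["bwa", "mem", "-SP5M", "-t", str(threads), reference, fastq_r1, fastq_r2]


# Hypothetical file names for illustration.
cmd = bwa_mem_command("ref.fasta", "sample_R1.fastq.gz", "sample_R2.fastq.gz", threads=8)
```

The resulting list can be handed to subprocess.run once parse_bwa_version confirms that the installed bwa is at least MIN_BWA.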

1.5.6

6 years ago

The biggest change is that information describing how this file was produced is now printed at the top of the inter.txt file and added to the .hic file as usual.
Adding in new flags -b ligation and -t threads
Changed juicer_arrowhead, juicer_hiccups, and juicer_postprocessing to not fail immediately when bed files not found
Corrected scripts so bc is not required
Proper unzipping logic in SLURM and UGER
Will always split if chunk size sent in
Bug fixes in terms of looking for endings in splits directory
Exits changed to 1
Debugging changed to better determine exactly where script failed
Touch files added to deal with jobs killed by the cluster
Warning for genomePath
Header added to hic file containing all the information about versions of software used to create that file
Fixed bug in CPU version where multiple fastqs were not processed
Some other bug fixes in SLURM and CPU
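
One general way touch files and a nonzero exit status can work together is sketched below: a stage leaves a marker file only if it runs to completion, and a later step treats a missing marker as a failure. This is an assumption about the general pattern, not a description of the Juicer scripts; the function names, the ".done" suffix, and the directory name are all hypothetical.

```python
import sys
from pathlib import Path


def run_stage(name, work, marker_dir):
    """Run `work()` and create a marker file only if it returns normally."""
    Path(marker_dir).mkdir(parents=True, exist_ok=True)
    work()  # if the job is killed or raises here, no marker is written
    Path(marker_dir, f"{name}.done").touch()


def stage_finished(name, marker_dir):
    """Return True if the marker file for `name` exists."""
    return Path(marker_dir, f"{name}.done").exists()


def require_stage(name, marker_dir):
    """Exit with status 1 (not 0) when an upstream stage left no marker."""
    if not stage_finished(name, marker_dir):
        print(f"***! Stage '{name}' did not finish", file=sys.stderr)
        sys.exit(1)
```

A downstream job would call require_stage("align", marker_dir) first, so a job silently killed by the scheduler surfaces as an explicit failure rather than as missing output later on.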

1.5.5

6 years ago

Bug fixes:

Naming of opt_dups versus optdups in CPU script
nofrag bug in all scripts when creating hic files
early exit now correct in CPU script

Added threads and ligation flag to CPU script

README improvements
SLURM script improvements
Reconciling scripts for next dev push

1.5.4

7 years ago

Minor changes and features added

no frag flag added
naming changes, everything consistent with the Juicer Tools terminology
get rid of unnecessary bc dependency
remove jars so that users download the newest juicer_tools jar instead

1.5.3

7 years ago

Added in LSF and CPU versions

1.5.2

7 years ago

Updates to mega script: added in resolutions flag to mega.sh map
Changed naming to be more consistent, separate out arrowhead and hiccups jobs

1.5.1

7 years ago

Minor bug fixes, mostly for LSF and AWS versions:

Changed shortread check from -v to -n, problem for LSF users otherwise
small change to echo in the Experiment description so that newlines are handled properly
updated jar
updated shortread test to be consistent across scripts
Fixing bug in relaunch at final stage on AWS
adding in wait time to split rmdups
Updated juicer to fix bug with none as restriction enzyme
fixed bug when restriction site file sent in