CISA consists of four phases for contig integration.
Four folders are generated after running CISA.
In phase 1 (CISA1):
- The representative contigs and the contigs they explain are recorded in the file Explained.txt.
- Information about contig extensions is recorded in the file Extend_info.
- For example,
- Head: >Skeleton contig >Extended contig
- Tail: >Skeleton contig >Extended contig
- The processed contigs are written to R1_contigs.fa, which is placed outside the CISA1 folder.
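
The Head/Tail records in Extend_info can be read with a few lines of Python. This is only a sketch: it assumes each record is a single line in exactly the form shown above ("Head:" or "Tail:" followed by two ">"-prefixed contig names), and the function name and the contig names in the usage example are hypothetical, not part of CISA.

```python
from typing import NamedTuple, Optional


class Extension(NamedTuple):
    end: str        # "Head" or "Tail"
    skeleton: str   # name of the skeleton contig
    extended: str   # name of the contig used to extend it


def parse_extend_line(line: str) -> Optional[Extension]:
    """Parse one 'Head: >A >B' or 'Tail: >A >B' record (assumed layout)."""
    end, sep, rest = line.partition(":")
    end = end.strip()
    if not sep or end not in ("Head", "Tail"):
        return None  # not an extension record
    # Contig names are assumed to be introduced by '>' as in FASTA headers.
    names = [tok.strip() for tok in rest.split(">") if tok.strip()]
    if len(names) != 2:
        return None
    return Extension(end, names[0], names[1])


# Hypothetical contig names, for illustration only.
record = parse_extend_line("Head: >CLC_1 >Abyss_2")
print(record)
```

Lines that do not match the assumed layout return None, so the function can be applied to every line of the file and the results filtered.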
In phase 2 (CISA2):
- The uncertain regions located at the ends of contigs are clipped (clip_info).
- The clipped-out sequences are recorded in the file clip_out.
- The unalignable gaps are recorded in the file Gaps, and gaps whose size exceeds the 95th quantile (R2_gap=0.95) are clipped.
- The misassembled contigs recorded in the file Remove_Info are removed, and extra contigs are introduced if necessary.
- For example, in the case of E. coli:
- 1 14641 | 36367 21727 | 14641 14641 | 99.99 | CLC_100_len:33932 Abyss_133_len:170886
- 14608 33932 | 58206 38882 | 19325 19325 | 100.00 | CLC_100_len:33932 Abyss_129_len:72302
- 1 14641 | 31800 17160 | 14641 14641 | 99.99 | CLC_100_len:33932 Edena_65_len:79603
- 14608 33932 | 58206 38882 | 19325 19325 | 100.00 | CLC_100_len:33932 Edena_51_len:126254
- 1 14641 | 31804 17164 | 14641 14641 | 99.99 | CLC_100_len:33932 Velvet_45_len:70264
- 14608 33932 | 13945 33269 | 19325 19325 | 100.00 | CLC_100_len:33932 Velvet_42_len:58831
- The representative contig CLC_100 is misassembled. We remove this contig and introduce an extra representative contig, Abyss_129.
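
The pattern in the E. coli example can be checked by hand with a short script. The sketch below is not CISA's detection method. It assumes the columns are: start and end on the first contig, start and end on the second contig, the two aligned lengths, percent identity, and the two contig names. It also assumes the text before the first underscore in a contig name identifies the assembler. Under those assumptions it lists, for each representative contig, the assemblers whose alignments place its segments on more than one of their own contigs.

```python
from collections import defaultdict

# Alignment rows copied from the E. coli example above.
ROWS = """\
1 14641 | 36367 21727 | 14641 14641 | 99.99 | CLC_100_len:33932 Abyss_133_len:170886
14608 33932 | 58206 38882 | 19325 19325 | 100.00 | CLC_100_len:33932 Abyss_129_len:72302
1 14641 | 31800 17160 | 14641 14641 | 99.99 | CLC_100_len:33932 Edena_65_len:79603
14608 33932 | 58206 38882 | 19325 19325 | 100.00 | CLC_100_len:33932 Edena_51_len:126254
1 14641 | 31804 17164 | 14641 14641 | 99.99 | CLC_100_len:33932 Velvet_45_len:70264
14608 33932 | 13945 33269 | 19325 19325 | 100.00 | CLC_100_len:33932 Velvet_42_len:58831
"""


def parse_row(line):
    """Split one '|'-separated alignment row into named fields (assumed layout)."""
    if line.startswith("- "):  # tolerate the list marker used in this README
        line = line[2:]
    fields = [part.split() for part in line.split("|")]
    (s1, e1), (s2, e2), _lengths, (identity,), (rep, other) = fields
    return {
        "rep": rep,
        "rep_span": (int(s1), int(e1)),
        "other": other,
        "other_span": (int(s2), int(e2)),
        "identity": float(identity),
    }


def split_support(rows):
    """Map each representative contig to the assemblers that split it."""
    hits = defaultdict(lambda: defaultdict(set))
    for row in rows:
        assembler = row["other"].split("_")[0]  # assumed naming convention
        hits[row["rep"]][assembler].add(row["other"])
    return {
        rep: sorted(a for a, contigs in by_assembler.items() if len(contigs) > 1)
        for rep, by_assembler in hits.items()
    }


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
print(split_support(rows))
```

For the rows above, all three other assemblers (Abyss, Edena, Velvet) align the two halves of CLC_100 to different contigs, which matches the conclusion that CLC_100 is misassembled.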