[ Source: pychopper ]
Package: python3-pychopper (2.7.10-1)
identify, orient and trim full-length Nanopore cDNA reads
Pychopper v2 is a Python module to identify, orient and trim full-length Nanopore cDNA reads. It is also able to rescue fused reads and provides the script 'pychopper.py'. The general approach of Pychopper v2 is the following:
* Pychopper first identifies alignment hits of the primers across the length of the sequence. The default method for doing this is nhmmscan with the pre-trained, strand-specific profile HMMs included with the package. Alternatively, one can use the edlib backend, which uses a combination of global and local alignment to identify the primers within the read.
* After the primer hits are identified by either backend, the reads are divided into segments defined by two consecutive primer hits. The score of a segment is its length if the configuration of the flanking primer hits is valid (such as SSP,-VNP for forward reads) and zero otherwise.
* The segments are assigned to rescued reads using a dynamic programming algorithm that maximizes the sum of the used segment scores (hence the number of rescued bases). A crucial observation about the algorithm is that if a segment is included as a rescued read, then the next segment must be excluded, because one of the primer hits defining it was "used up" by the previous segment. This puts constraints on the dynamic programming graph. The arrows in red define the optimal path for rescuing two fused reads with a total score of l1 + l3.
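The segment scoring and selection steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Pychopper's actual implementation: the hit representation (position, signed primer name), the set of valid configurations, the use of hit-to-hit distance as segment length, and the function names segment_scores and select_segments are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical hit representation: (position in read, signed primer name).
Hit = Tuple[int, str]

# Assumed valid configurations of flanking primer hits
# (forward reads: SSP,-VNP; reverse reads: VNP,-SSP).
VALID_CONFIGS = {("SSP", "-VNP"), ("VNP", "-SSP")}


def segment_scores(hits: List[Hit]) -> List[int]:
    """Score each segment between two consecutive hits.

    A segment scores its length if its flanking hits form a valid
    configuration, and zero otherwise. Length is simplified here to the
    distance between hit positions.
    """
    scores = []
    for left, right in zip(hits, hits[1:]):
        length = right[0] - left[0]
        valid = (left[1], right[1]) in VALID_CONFIGS
        scores.append(length if valid else 0)
    return scores


def select_segments(scores: List[int]) -> Tuple[int, List[int]]:
    """Pick non-adjacent segments maximizing the total score.

    Adjacent segments share a primer hit, so choosing segment i
    excludes segment i + 1.
    """
    n = len(scores)
    # best[i] = maximum total score obtainable from segments i..n-1
    best = [0] * (n + 2)
    for i in range(n - 1, -1, -1):
        best[i] = max(best[i + 1], scores[i] + best[i + 2])

    # Trace back through the table to recover the chosen segments.
    chosen = []
    i = 0
    while i < n:
        if scores[i] > 0 and scores[i] + best[i + 2] >= best[i + 1]:
            chosen.append(i)
            i += 2  # the next segment shares a primer hit, skip it
        else:
            i += 1
    return best[0], chosen


# Toy fused read: two transcripts joined head to tail.
hits = [(0, "SSP"), (500, "-VNP"), (520, "SSP"), (1200, "-VNP")]
scores = segment_scores(hits)
total, chosen = select_segments(scores)
print(scores, total, chosen)
```

In this toy read the middle segment has an invalid configuration (-VNP followed by SSP) and scores zero, so the first and third segments are rescued, which mirrors the l1 + l3 case described above.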
A crucial parameter of Pychopper v2 is -q, which determines the stringency of primer alignment (the E-value cutoff in the case of the pHMM backend). The user can specify it explicitly; by default, it is optimized on a random sample of input reads to produce the maximum number of classified reads.
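The idea of tuning the cutoff on a sample can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the read model (a list of primer hits with E-values), the classify rule, the candidate cutoff grid, and the names pick_cutoff and classify do not come from Pychopper itself. The sketch only shows why neither the strictest nor the most lenient cutoff necessarily classifies the most reads.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical read model: a list of (signed primer name, E-value) hits.
Read = List[Tuple[str, float]]

# Assumed valid hit configurations for a full-length read.
VALID_CONFIGS = {("SSP", "-VNP"), ("VNP", "-SSP")}


def classify(read: Read, q: float) -> bool:
    """Toy rule: keep hits with E-value <= q and require that the kept
    hits form exactly one valid configuration."""
    kept = tuple(name for name, evalue in read if evalue <= q)
    return kept in VALID_CONFIGS


def pick_cutoff(
    reads: List[Read],
    classify_fn: Callable[[Read, float], bool],
    cutoffs: Sequence[float],
    sample_size: int = 1000,
    seed: int = 0,
) -> float:
    """Return the cutoff that classifies the most reads in a random sample."""
    rng = random.Random(seed)
    if len(reads) <= sample_size:
        sample = reads
    else:
        sample = rng.sample(reads, sample_size)

    def n_classified(q: float) -> int:
        return sum(1 for read in sample if classify_fn(read, q))

    return max(cutoffs, key=n_classified)


reads = [
    [("SSP", 1e-5), ("-VNP", 1e-4)],
    [("SSP", 1e-3), ("VNP", 0.5), ("-VNP", 1e-3)],  # weak spurious VNP hit
    [("SSP", 1e-2), ("-VNP", 1e-2), ("SSP", 1.0)],  # weak spurious SSP hit
]
best_q = pick_cutoff(reads, classify, cutoffs=[1e-6, 1e-2, 1.0])
print(best_q)
```

With these toy reads, the strictest cutoff discards every hit and the most lenient one admits spurious hits that break the configuration, so the intermediate cutoff classifies the most reads.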
Other packages related to python3-pychopper
- dep: python3
  interactive high-level object-oriented language (default python3 version)
- dep: python3-biopython
  Python3 library for bioinformatics
- dep: python3-matplotlib
  Python based plotting system in a style similar to Matlab
- dep: python3-numpy
  fast array computation library for the Python language (Python 3)
- dep: python3-pandas
  data structures for "relational" or "labeled" data
- dep: python3-parasail
  Python3 bindings for the parasail C library
- dep: python3-pysam
  interface for the SAM/BAM sequence alignment and mapping format (Python 3)
- dep: python3-pytest
  Simple, powerful testing in Python3
- dep: python3-tqdm
  fast, extensible progress bar for Python 3 and CLI tool