###########################################################################
#                                                                         #
# This file is part of Counter RNAseq Window (craw) package.              #
#                                                                         #
# Authors: Bertrand Neron                                                 #
# Copyright (c) 2017-2019 Institut Pasteur (Paris).                       #
# see COPYRIGHT file for details.                                         #
#                                                                         #
# craw is free software: you can redistribute it and/or modify            #
# it under the terms of the GNU General Public License as published by    #
# the Free Software Foundation, either version 3 of the License, or       #
# (at your option) any later version.                                     #
#                                                                         #
# craw is distributed in the hope that it will be useful,                 #
# but WITHOUT ANY WARRANTY; without even the implied warranty of          #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.                    #
# See the GNU General Public License for more details.                    #
#                                                                         #
# You should have received a copy of the GNU General Public License       #
# along with craw (see COPYING file).                                     #
# If not, see <http://www.gnu.org/licenses/>.                             #
#                                                                         #
###########################################################################
""" This function return a new function :func:`get_sum_coverage`
:param input_data: the input either a samfile (see pysam library) or a genome build from a wig file (see wig module) :type input_data: :class:`craw.wig.Genome` or :class:`pysam.AlignmentFile` object :param qual_thr: The quality threshold if input data come from wig this parameter is not used, :type qual_thr: int :return: :func:`get_sum_coverage`, a function which compute the sum of coverage for a gene on each strand between position [start, stop[ This function take 3 parameters:
- **annot_entry**: an entry of the annotation file. - **start**: The position to start to compute the coverage(coordinates are 0-based, start position is included). - **stop**: The position to stop to compute the coverage (coordinates are 0-based, stop position is excluded).
and
- **return**: a tuple with 2 tuple of float or int representing the coverage on strand forward and reverse.
:rtype: function """
# if start is negative,
# which happens when start is computed from a large window and reads map at the beginning of the reference,
# pysam crashes (see issue #10)
# and numpy returns an empty slice.
# So we consider that negative positions (which do not really exist) have a coverage of 0.
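# A minimal sketch of the closure pattern and of the negative-start rule described above.
# Assumptions: `make_sum_coverage`, `toy_coverage` and the list-based back-end are
# hypothetical names used for illustration only (they are not craw's API), and the
# `annot_entry` argument is left out to keep the example short.

```python
def make_sum_coverage(get_coverage):
    """Return a function summing the per-strand coverage over [start, stop[.

    `get_coverage` is a hypothetical callable (start, stop) -> (forward, reverse)
    standing in for the wig or bam back-end.
    """
    def get_sum_coverage(start, stop):
        # Negative positions do not exist on the reference: they count as 0,
        # so the request is clamped to position 0 before querying the back-end.
        real_start = max(start, 0)
        if stop <= real_start:
            return 0, 0
        forward, reverse = get_coverage(real_start, stop)
        return sum(forward), sum(reverse)
    return get_sum_coverage


# Toy back-end: coverage stored in plain lists indexed by 0-based position.
FORWARD = [1, 2, 3, 4, 5]
REVERSE = [0, 1, 0, 1, 0]


def toy_coverage(start, stop):
    return FORWARD[start:stop], REVERSE[start:stop]


sum_cov = make_sum_coverage(toy_coverage)
clamped = sum_cov(-3, 2)   # same result as sum_cov(0, 2)
regular = sum_cov(1, 4)
```

# Clamping before the query avoids handing a negative start to the back-end at all.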
"""
:param input_data: the input, either a samfile (see the pysam library)
                   or a genome built from a wig file (see the wig module).
:type input_data: :class:`craw.wig.Genome` or :class:`pysam.AlignmentFile` object
:param new_size: the number of values in the coverage vector.
:type new_size: positive int
:param qual_thr: the quality threshold; this parameter is not used
                 when the input data come from a wig file.
:type qual_thr: int
:return: :func:`get_resized_coverage`, a function which computes the coverage
         for a gene on each strand between positions [start, stop[.
         The coverage values are generated by linear interpolation of the raw values
         between [start, stop[ using scipy, see
         https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.interpolate.interp1d.html

         This function takes 3 parameters:

         - **annot_entry**: an entry of the annotation file.
         - **start**: the position at which to start computing the coverage
           (coordinates are 0-based, the start position is included).
         - **stop**: the position at which to stop computing the coverage
           (coordinates are 0-based, the stop position is excluded).

         and

         - **return**: a tuple of 2 tuples of float or int representing the coverage
           on the forward and reverse strands.

:rtype: function
"""
"""
:param annot_entry: an entry of the annotation file.
:type annot_entry: :class:`annotation.Entry` object.
:param start: the position at which to start computing the coverage
              (coordinates are 0-based, the start position is included).
:type start: int
:param stop: the position at which to stop computing the coverage
             (coordinates are 0-based, the stop position is excluded).
:type stop: int
:return: a new series of length new_size
:rtype: list of float
"""
# if start is negative,
# which happens when start is computed from a large window and reads map at the beginning of the reference,
# pysam crashes (see issue #10)
# and numpy returns an empty slice.
# So we omit the negative values and generate the new values using real positions.
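# A rough sketch of resizing a coverage vector by linear interpolation.
# Assumptions: the docstring above refers to scipy's interp1d; this example swaps in
# a hand-written, pure-Python linear interpolation so that it runs without
# dependencies. `resize_linear` and `resized_window` are hypothetical names.

```python
def resize_linear(values, new_size):
    """Resample `values` to `new_size` evenly spaced points (linear interpolation)."""
    if new_size < 1:
        raise ValueError("new_size must be a positive integer")
    if not values:
        return []
    if len(values) == 1 or new_size == 1:
        return [float(values[0])] * new_size
    # Distance, in original index units, between two consecutive new points.
    step = (len(values) - 1) / (new_size - 1)
    resized = []
    for i in range(new_size):
        x = i * step
        left = min(int(x), len(values) - 2)   # index of the left neighbour
        frac = x - left                       # relative position between neighbours
        resized.append(values[left] * (1 - frac) + values[left + 1] * frac)
    return resized


def resized_window(coverage, start, stop, new_size):
    # Negative positions are omitted: only real positions feed the interpolation.
    real_start = max(start, 0)
    return resize_linear(coverage[real_start:stop], new_size)


stretched = resize_linear([0, 10], 3)
window = resized_window([4, 4, 0, 2, 4, 6], -2, 3, 5)
```

# Whatever the length of the raw window, the result always holds new_size values.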
"""
:param input_data: the input, either a samfile (see the pysam library)
                   or a genome built from a wig file (see the wig module).
:type input_data: :class:`wig.Genome` or :class:`pysam.AlignmentFile` object
:param max_left: the highest number of bases before the reference position to take into account.
:type max_left: int
:param max_right: the highest number of bases after the reference position to take into account.
:type max_right: int
:param qual_thr: the quality threshold; this parameter is not used
                 when the input data come from a wig file.
:type qual_thr: int
:return: :func:`get_padded_coverage`, a function which computes the coverage
         for a gene on each strand between positions *[start, stop[*.

         The coverage values are centered on the annot_entry.ref position,
         the matrix is padded with ``None`` values.::

             [.......[ coverage ref.pos ] .....]
             [....[coverage ref.pos ] .....]
             [............[ cov ref.pos ]]

         This function takes 3 parameters:

         - **annot_entry**: an entry of the annotation file.
         - **start**: the position at which to start computing the coverage
           (coordinates are 0-based, the start position is included).
         - **stop**: the position at which to stop computing the coverage
           (coordinates are 0-based, the stop position is excluded).

         and

         - **return**: a tuple of 2 tuples of float or int representing the coverage
           on the forward and reverse strands.

:rtype: function
"""
"""
:param annot_entry: an entry of the annotation file.
:type annot_entry: :class:`annotation.Entry` object.
:param start: the position at which to start computing the coverage
              (coordinates are 0-based, the start position is included).
:type start: int
:param stop: the position at which to stop computing the coverage
             (coordinates are 0-based, the stop position is excluded).
:type stop: int
:return: the coverage for the forward and reverse strands, padded with ``None``.
:rtype: tuple of 2 lists containing int or float (forward coverage, reverse coverage)
"""
# if start is negative,
# which happens when start is computed from a large window and reads map at the beginning of the reference,
# pysam crashes (see issue #10)
# so we ask for the coverage from 0 and pad with None values for the negative positions
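# A minimal sketch of the padding scheme described above, for one strand only.
# Assumptions: `pad_coverage` is a hypothetical helper, the reference position is
# expressed in the same 0-based coordinates as start and stop, and every row is
# assumed to be max_left + 1 + max_right long so that the reference position
# always lands at index max_left.

```python
def pad_coverage(coverage, ref, start, stop, max_left, max_right):
    """Center the window [start, stop[ on `ref` and pad it with None."""
    # Positions before 0 do not exist: query from 0 and pad them with None.
    missing = max(0, -start)
    real_start = max(start, 0)
    window = [None] * missing + list(coverage[real_start:stop])
    # Number of bases actually present on each side of the reference position;
    # stop is excluded, so the last covered position is stop - 1.
    bases_left = ref - start
    bases_right = (stop - 1) - ref
    pad_left = [None] * (max_left - bases_left)
    pad_right = [None] * (max_right - bases_right)
    return pad_left + window + pad_right


chromosome_cov = [5, 6, 7, 8, 9, 10]

# Window 1..4 around ref 2: one base on the left, two on the right.
row = pad_coverage(chromosome_cov, ref=2, start=1, stop=5, max_left=3, max_right=3)

# Window starting before the beginning of the reference.
row_neg = pad_coverage(chromosome_cov, ref=1, start=-2, stop=3, max_left=3, max_right=3)
```

# Both rows have the same length, so they can be stacked into one matrix.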
# -1 because the ref must not be taken into account in the pad
# start and stop are 0-based (see docstring)
# but stop is excluded in get_bam and included in annot_entry
# so it (stop -1) - ( ref -1) => stop -1
else:
"""
:param input: the input, either a samfile (see the pysam library)
              or a genome built from a wig file (see the wig module).
:type input: :class:`wig.Genome` or :class:`pysam.calignmentfile.AlignmentFile` object
:return: get_wig_coverage or get_bam_coverage, according to the type of input
:rtype: function
:raise RuntimeError: when input is not an instance of
                     :class:`pysam.calignmentfile.AlignmentFile` or :class:`wig.Genome`
"""
else:
"'pysam.calignmentfile.AlignmentFile' as Input, not {}".format(input.__class__.__name__))
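# A minimal sketch of the type-based dispatch documented above.
# Assumptions: `Genome` and `AlignmentFile` are empty stand-in classes defined here
# (the real ones live in craw.wig and pysam), the two back-end functions are
# placeholders, and the error message is illustrative, not the original wording.

```python
class Genome:
    """Hypothetical stand-in for craw.wig.Genome."""


class AlignmentFile:
    """Hypothetical stand-in for pysam.AlignmentFile."""


def get_wig_coverage(genome, annot_entry, start, stop, qual_thr):
    return "wig back-end"   # placeholder body


def get_bam_coverage(sam_file, annot_entry, start, stop, qual_thr):
    return "bam back-end"   # placeholder body


def get_coverage_function(input_data):
    """Pick the coverage function matching the type of `input_data`."""
    if isinstance(input_data, Genome):
        return get_wig_coverage
    if isinstance(input_data, AlignmentFile):
        return get_bam_coverage
    raise RuntimeError(
        "unsupported input type: {}".format(input_data.__class__.__name__))


wig_func = get_coverage_function(Genome())
bam_func = get_coverage_function(AlignmentFile())

try:
    get_coverage_function(42)
except RuntimeError as err:
    error_message = str(err)
```

# Both back-ends share one signature, so callers never need to know which one they got.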
""" :param genome: The genome which store all coverages. :type genome: :class:`craw.wig.Genome` object :param annot_entry: an entry of the annotation file :type annot_entry: :class:`annotation.Entry` object :param start: The position to start to compute the coverage(coordinates are 0-based, start position is included). :type start: int :param stop: The position to stop to compute the coverage (coordinates are 0-based, stop position is excluded). :type stop: int :param qual_thr: this parameter is not used, It's here to have the same api as get_bam_coverage. :type qual_thr: None :return: the coverage (all bases) :rtype: tuple of 2 list containing int or float """
""" Compute the coverage for a region position by position on each strand
:param sam_file: the samfile opened with pysam
:type sam_file: :class:`pysam.AlignmentFile` object.
:param annot_entry: an entry of the annotation file
:type annot_entry: :class:`annotation.Entry` object
:param start: the position at which to start computing the coverage
              (coordinates are 0-based, the start position is included).
:type start: positive int
:param stop: the position at which to stop computing the coverage
             (coordinates are 0-based, the stop position is excluded).
:type stop: positive int
:param qual_thr: the quality threshold
:type qual_thr: int
:return: the coverage (all bases)
:rtype: tuple of 2 lists containing int
"""
"""
:param al_seg: a pysam aligned segment (the object used by pysam to represent an aligned read)
:type al_seg: :class:`pysam.AlignedSegment`
:return: True if the read is mapped to the forward strand
:rtype: boolean
"""
""" :param al_seg: a pysam aligned segment (the object used by pysam to represent an aligned read) :type al_seg: :class:`pysam.AlignedSegment` :return: True if read is mapped to reverse strand. :rtype: boolean """
""" Compute the coverage for each position between start and stop on the chromosome on the strand.
:param sam_file: the sam alignment to use
:type sam_file: a :class:`pysam.AlignmentFile` object
:param chromosome: the name of the chromosome
:type chromosome: str
:param start: the position at which to start computing the coverage
              (coordinates are 0-based, the start position is included).
:type start: int
:param stop: the position at which to stop computing the coverage
             (coordinates are 0-based, the stop position is excluded).
:type stop: int
:param qual: the quality threshold.
:type qual: int
:param strand: the strand on which the reads match
:type strand: string
:return: the coverage on forward then on reverse strand.
         The coverage is the sum of all kinds of bases mapped at each position
:rtype: tuple of 2 lists containing int
"""
start=start,
stop=stop,
quality_threshold=qual,
read_callback=call_back)
"reference= {chromosome} \n"
"start={start}\n"
"end={stop}\n"
"quality_threshold={qual}\n"
"read_callback={call_back}".format(chromosome=chromosome,
                                   start=start,
                                   stop=stop,
                                   qual=qual,
                                   call_back=call_back)
)
annot_entry.chromosome, start, stop, qual_thr, '+' )
annot_entry.chromosome, start, stop, qual_thr, '-' )
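# A minimal sketch of "the coverage is the sum of all kinds of bases mapped at each position".
# Assumptions: pysam's count_coverage returns four arrays of equal length, one per
# base (A, C, G, T); the arrays below are made-up values standing in for such a
# result, and `sum_base_counts` is a hypothetical helper name.

```python
from array import array


def sum_base_counts(per_base_counts):
    """Collapse per-base count arrays into one coverage value per position."""
    # zip(*...) walks the arrays position by position: (A[i], C[i], G[i], T[i]).
    return [sum(counts) for counts in zip(*per_base_counts)]


# Made-up counts for a window of 3 positions, in the order A, C, G, T.
fake_counts = (
    array("L", [1, 0, 2]),
    array("L", [0, 1, 0]),
    array("L", [0, 0, 3]),
    array("L", [2, 2, 0]),
)

window_cov = sum_base_counts(fake_counts)
```

# Running this once per strand, with a strand-specific read filter, gives the (forward, reverse) pair.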