#! /usr/bin/env python3
###########################################################################
#
# This file is part of Counter RNAseq Window (craw) package.
#
# Authors: Bertrand Neron
# Copyright (c) 2017-2019 Institut Pasteur (Paris).
# see COPYRIGHT file for details.
#
# craw is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# craw is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with craw (see COPYING file).
# If not, see <http://www.gnu.org/licenses/>.
#
###########################################################################
""" Parse value given by the parser

:param value: the value given by the parser
:type value: string
:return: the integer corresponding to the value
:rtype: int
:raise: :class:`argparse.ArgumentTypeError`
"""
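
The docstring above describes a converter used as an argparse `type=` callable (the name `positive_int` appears later in this file). The body was lost in this listing, so the following is only a minimal sketch of how such a converter could look; the acceptance of 0 and the error messages are assumptions.

```python
import argparse


def positive_int(value):
    """Sketch of an argparse ``type=`` callable (body is an assumption)."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError("must be a positive integer, got: {!r}".format(value))
    if number < 0:
        # Assumption: 0 is accepted, negative values are rejected.
        raise argparse.ArgumentTypeError("must be a positive integer, got: {}".format(number))
    return number


# argparse calls the converter on the raw command line string.
parser = argparse.ArgumentParser()
parser.add_argument('--window', type=positive_int)
parsed = parser.parse_args(['--window', '100'])
```

When the converter raises `argparse.ArgumentTypeError`, argparse reports the message as a usage error and exits, so no extra validation is needed after parsing.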
""" Parse value given by the parser

:param value: the value given by the parser
:type value: string
:return: the integer >=0 and <=42 corresponding to the value
:rtype: int
:raise: :class:`argparse.ArgumentTypeError` if value does not represent an integer >=0 and <=42
"""
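
The range check documented above (an integer between 0 and 42 inclusive) could be sketched as follows. The name `quality_checker` is taken from the `--qual-thr` option definition later in this file; the body and the messages are assumptions.

```python
import argparse


def quality_checker(value):
    """Sketch: convert ``value`` to an int and enforce 0 <= value <= 42."""
    try:
        quality = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError("must be an integer between 0 and 42, got: {!r}".format(value))
    if not 0 <= quality <= 42:
        raise argparse.ArgumentTypeError("must be an integer between 0 and 42, got: {}".format(quality))
    return quality
```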
"""
Compute the header for the results.
The first lines start with # and contain general information about craw (version)
and the options used (for traceability).
The last line is the column header, with fields separated by the --sep value;
it can be used as a header with pandas.

:param annot_parser: the annotation parser
:type annot_parser: :class:`annotation.AnnotationParser` object
:param parsed_args: the command line arguments parsed with argparse
:type parsed_args: :class:`argparse.Namespace`
:return: The header of the result file
:rtype: str
"""
# Version: {}
#
# craw_coverage run with the following arguments:
""".format(commented_ver)
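
To illustrate the layout described in the docstring (commented lines for traceability, then one line of column names), here is a small sketch. `build_header`, its parameters and the way options are rendered are hypothetical and do not reproduce the craw implementation.

```python
def build_header(version, options, columns, sep='\t'):
    """Return commented metadata lines followed by a column header line."""
    lines = [
        "# Version: {}".format(version),
        "#",
        "# craw_coverage run with the following arguments:",
    ]
    # One commented line per option, sorted so that the output is reproducible.
    for name, value in sorted(options.items()):
        lines.append("# --{}={}".format(name.replace('_', '-'), value))
    # Last line: the column names, joined with the field separator.
    lines.append(sep.join(columns))
    return '\n'.join(lines)


header = build_header('x.y', {'qual_thr': 15, 'suffix': 'cov'},
                      ['sense', 'name', 'chromosome', 'position'])
```

Because every metadata line starts with `#`, a reader that skips comment lines (for instance pandas `read_csv` with `comment='#'`) sees the last line as the column header.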
else: else:
else:
else:
"""
:return: The human readable CRAW version.
:rtype: str
"""
Using:
    - pysam {pysam_ver} (samtools {samtools_ver})
    - scipy {sp_ver} (only for --justify opt)
""".format(pysam_ver=pysam.__version__,
           samtools_ver=pysam.__samtools_version__,
           sp_ver=craw.coverage.scipy.__version__)
"""
:param str sense_opt: how to manage the sense and antisense results

    * **mixed**: sense and antisense are interleaved in the same file
    * **split**: sense and antisense are in separate files
    * **S**: only sense results are written
    * **AS**: only antisense results are written

:param str basename: the basename of the results file
:param str suffix: the suffix of the results file
:return: the file objects where to write sense and antisense results
:rtype: tuple (`file object` sense, `file object` antisense)
"""
else:
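
The four modes documented above can be sketched as below. The function name `open_result_files`, the file naming scheme and the use of `os.devnull` for the discarded strand are assumptions made for illustration; only the fact that `mixed` returns the same object twice is confirmed by a comment further down in this file.

```python
import os


def open_result_files(sense_opt, basename, suffix):
    """Return a (sense, antisense) tuple of writable file objects."""
    if sense_opt == 'mixed':
        # Both results are interleaved in one file: same object returned twice.
        handle = open("{}.{}".format(basename, suffix), 'w')
        return handle, handle
    if sense_opt == 'split':
        sense = open("{}.sense.{}".format(basename, suffix), 'w')
        antisense = open("{}.antisense.{}".format(basename, suffix), 'w')
        return sense, antisense
    if sense_opt == 'S':
        # Assumption: antisense output is discarded by writing to the null device.
        return open("{}.sense.{}".format(basename, suffix), 'w'), open(os.devnull, 'w')
    if sense_opt == 'AS':
        return open(os.devnull, 'w'), open("{}.antisense.{}".format(basename, suffix), 'w')
    raise ValueError("unknown sense option: {!r}".format(sense_opt))
```

Returning two handles in every mode lets the calling code write sense and antisense rows the same way whatever the option; in `mixed` mode the caller must take care to close the shared handle only once.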
"""
:param args: The options set on the command line (without the program name)
:type args: list of string
:return:
"""
help="""The path of the bam file to analyse.
--bam option is not compatible with any --wig, --wig-for or --wig-rev options,
but at least --bam or one of the --wig* options is required.""")
help="""The path of the wig file to analyse.
The file encodes the coverage for both strands.
The positive coverages are on the forward strand whereas the negative coverages
are located on the reverse one.
The --wig option is incompatible with the --bam, --wig-for and --wig-rev options.""")
metavar='FORWARD WIG',
help="""The path of a wig file to analyse.
This file encodes the coverage for the forward strand.
The --wig-for option is incompatible with the --bam and --wig options.""")
metavar='REVERSE WIG',
help="""The path of a wig file to analyse.
This file encodes the coverage for the reverse strand.
The --wig-rev option is incompatible with the --bam and --wig options.""")
required=True,
help="The path of the annotation file (required).")
dest='qual_thr',
type=quality_checker,
default=15,
help="The minimal mapping quality for a read to be taken into account.")
default="cov",
help="The name of the suffix to use for the output file.")
dest='output',
help="The path of the output (default= base name of annotation file with --suffix)")
default='\t',
help="The separator used to delimit the annotation fields.")
type=positive_int,
help="Resize the coverage of all genes to this new size.")
action='store_true',
default=False,
help="Sum all the coverages on the window.")
description="""Parameters which define the regions to compute.

There are 2 ways to define regions:
    * all regions have the same length.
    * each region has a different length.

In both cases a reference position must be defined (--ref-col).

If all regions have the same length:

    --window defines the number of nucleotides to take into account before and after
      the reference position (the window will be centered on the reference).
    --before defines the number of nucleotides to take into account before the reference position.
    --after defines the number of nucleotides to take into account after the reference position.
    --before and --after allow to define a non-centered window.

    --after and --before options must be set together and are incompatible with the --window option.

If regions have different lengths:

    The regions must be specified in the annotation file.
    --start-col defines the name of the column in the annotation file which holds
      the start position of the region to compute.
    --stop-col defines the name of the column in the annotation file which holds
      the stop position of the region to compute.
""")
default="position",
help="The name of the column for the reference position (default: position).")
type=positive_int,
help="The number of bases to compute after the reference position.")
type=positive_int,
help="The number of bases to compute before the reference position.")
type=positive_int,
help="The number of bases to compute around the reference position.")
help="The name of the column which defines the start position.")
help="The name of the column which defines the stop position.")
default='strand',
help="Specify the name of the column representing the strand (default: strand)")
default='chromosome',
help="Specify the name of the column representing the chromosome (default: chromosome)")
choices=('S', 'AS', 'split', 'mixed'),
default='mixed',
help="compute result only on: "
     "sense (S), "
     "antisense (AS), "
     "on both senses but produce two separate files (split), "
     "or in one file (mixed). "
     "(default: mixed)"
)
action=argparse_util.VersionAction,
version=get_version_message())
action="count",
default=0,
help="Reduce verbosity.")
action="count",
default=0,
help="Increase verbosity.")
#############################
# Check wig and bam options #
#############################
" '--bam', '--wig' , '--wig-for', '--wig-rev'.")
"'--bam', '--wig' , '--wig-for', '--wig-rev' cannot be specified at the same time.")
" '--wig', '--wig-for' or '--wig-rev' options.")
" '--wig-for' or '--wig-rev' options.")
###########################
# Checking window options #
###########################
" must be specified")
"are mutually exclusive.")
" Both options must be specified at the same time")
else:
    pass
    # window is None, before and after are specified
    # => nothing to do
else:  # parsed_args.window is not None:
else:  # --before, --after are None
"Both options must be specified at the same time")
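
The window checks, whose code was lost in this listing, can be approximated from the option group description: `--before` and `--after` go together, they exclude `--window`, and start/stop columns describe variable length regions. The sketch below is an assumption-laden reconstruction; `check_region_options` and its return values are hypothetical, and treating fixed and variable options as mutually exclusive is a guess based on the surviving message fragments.

```python
def check_region_options(window=None, before=None, after=None,
                         start_col=None, stop_col=None):
    """Validate region options; return 'fixed' or 'variable', else raise ValueError."""
    fixed = (window, before, after)
    variable = (start_col, stop_col)
    has_fixed = any(opt is not None for opt in fixed)
    has_variable = any(opt is not None for opt in variable)
    if not has_fixed and not has_variable:
        raise ValueError("--window or --before/--after or --start-col/--stop-col must be specified")
    if has_fixed and has_variable:
        raise ValueError("fixed window and --start-col/--stop-col options are mutually exclusive")
    if has_variable:
        if any(opt is None for opt in variable):
            raise ValueError("--start-col and --stop-col must be specified at the same time")
        return 'variable'
    if window is not None:
        if before is not None or after is not None:
            raise ValueError("--window is incompatible with --before/--after")
        return 'fixed'
    if before is None or after is None:
        raise ValueError("--before and --after must be specified at the same time")
    return 'fixed'
```

In an argparse based script, the `ValueError` messages would typically be passed to `parser.error()` so that the user gets a usage message and a non-zero exit status.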
""" The entrypoint for craw_coverage script It will generate a coverage matrix around the position of interest and write the results in files

:param args: the arguments and options given on the command line
:type args: list of string as given by sys.argv without the program name
:param log_level: the level of logger
:type log_level: positive int or logging flag logging.DEBUG, logging.INFO, logging.ERROR, logging.CRITICAL
"""
else: verbosity = log_level
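
The verbose and quiet options are both declared with `action="count"`, and the entry point accepts an explicit `log_level`. How the counts are turned into a logging level is not visible in this listing; one common approach, shown here purely as an assumption, shifts a base level by 10 per flag and clamps the result.

```python
import logging


def compute_log_level(verbose=0, quiet=0, base=logging.WARNING):
    """Map -v / -q counts to a logging level (hypothetical helper)."""
    # Each -v lowers the threshold by one level, each -q raises it by one level.
    level = base + 10 * (quiet - verbose)
    # Clamp to the range of standard levels.
    return max(logging.DEBUG, min(logging.CRITICAL, level))
```

With these assumptions, no flag gives `logging.WARNING`, `-v` gives `logging.INFO`, `-vv` gives `logging.DEBUG`, and `-q` gives `logging.ERROR`.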
#######################
# Parsing input files #
#######################
chr_col=parsed_args.chr_col,
strand_col=parsed_args.strand_col,
start_col=parsed_args.start_col,
stop_col=parsed_args.stop_col,
sep=parsed_args.sep)

# input_data is a samfile
# input_data is a wig.Genome object
else:
    # input_data is a wig.Genome object

###########################
# Checking output options #
###########################
parsed_args.output = os.path.splitext(input_file)[0]
out_name = parsed_args.output
suffix = parsed_args.suffix
else:
    suffix = parsed_args.suffix

###########################
# Computing output matrix #
###########################
# if parsed_args.sense is mixed the sense_file and antisense_file are the same object
print(header, file=antisense_file)

# get the appropriate function according to the input type
# the 2 functions
#  - get_wig_coverage
#  - get_bam_coverage
# have exactly the same api
else:
    max_left = max_right = parsed_args.window
else:

progress(annot_num, annot_line_number)
# positions in the get_coverage functions are 0-based
# whereas in the annotation they are 1-based
# start is included, stop is excluded
else:
else:
    # if the feature is on the reverse strand
    # before and after are inverted
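
The comments above state three conventions: annotation positions are 1-based, the coverage functions use 0-based positions with an included start and an excluded stop, and on the reverse strand before and after are swapped. A small worked sketch of that conversion follows; `window_bounds` is a hypothetical helper, not a craw function, and it does not handle windows that run past the start of the chromosome.

```python
def window_bounds(ref_pos, before, after, strand):
    """Convert a 1-based reference position into a 0-based half-open window.

    :param ref_pos: reference position as written in the annotation (1-based)
    :param before: number of bases before the reference, in transcript orientation
    :param after: number of bases after the reference, in transcript orientation
    :param strand: '+' or '-'
    :return: (start, stop) with start included and stop excluded
    """
    if strand == '-':
        # On the reverse strand, "before" lies to the right on the genome.
        before, after = after, before
    zero_based_ref = ref_pos - 1
    start = zero_based_ref - before
    stop = zero_based_ref + after + 1  # +1 because stop is excluded
    return start, stop


forward = window_bounds(100, before=5, after=3, strand='+')
reverse = window_bounds(100, before=5, after=3, strand='-')
```

In both cases the window holds `before + 1 + after` positions; only its placement around the reference changes with the strand.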
else:
else:
main()