hyb_analyze

Read one or more ‘.hyb’ format files and analyze the contained hybrid sequences.

Analysis Types:

type : Type Analysis

Analyze the segment types included in the analyzed hyb records

mirna : miRNA Analysis

Analyze counts/types of miRNA in hyb records

summary : Summary Analysis

Combined Type and miRNA Analysis

target : Target Analysis

Analyze sequences targeted by >= 1 individual miRNA.

This utility reads in one or more files in hyb-format (see the hybkit Hyb File Specification) and analyzes hybrid record properties.

type Analysis:

Utilizes hybkit.analysis.TypeAnalysis to analyze hybrid types.

Requires the record flags: seg1_type and seg2_type to be set by the hybkit.HybRecord.eval_types() method.

Example system calls:
$ hyb_analyze -a type -i my_file_1.hyb

$ hyb_analyze -a type -i my_file_1.hyb \
           --make_plots False

mirna Analysis:

Utilizes hybkit.analysis.MirnaAnalysis to analyze miRNA counts and details in hyb file.

Requires the record flag: miRNA_seg to be set by the hybkit.HybRecord.eval_mirna() method.

Example system calls:
$ hyb_analyze -a mirna -i my_file_1.hyb

summary Analysis:

Utilizes hybkit.analysis.SummaryAnalysis to analyze hybrid types and miRNA counts / details in hyb file(s).

Requires the record flags: seg1_type and seg2_type to be set by the hybkit.HybRecord.eval_types() method, and the record flag: miRNA_seg to be set by the hybkit.HybRecord.eval_mirna() method.

Example system calls:
$ hyb_analyze -a summary -i my_file_1.hyb

target Analysis:

Utilizes hybkit.analysis.TargetAnalysis to analyze the sequences targeted by each miRNA in hyb file(s).

Requires the record flag: miRNA_seg to be set by hybkit.HybRecord.eval_mirna(). Only records with miRNA_seg set to one of ‘3p’ or ‘5p’ will be evaluated. If ‘allow_mirna_dimers=True’, then miRNA_seg == ‘B’ will also be included.
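
The record selection rule above can be sketched in plain Python. This is an illustration only, not hybkit's implementation: the function name include_in_target_analysis is hypothetical, and it receives the value of the miRNA_seg flag as a plain string.

```python
def include_in_target_analysis(mirna_seg, allow_mirna_dimers=False):
    """Return True if a record with this miRNA_seg value would be evaluated.

    Hypothetical helper mirroring the rule in the text, not hybkit's code.
    """
    # Records with a miRNA in exactly one segment are always evaluated.
    allowed = {'3p', '5p'}
    if allow_mirna_dimers:
        # 'B' marks records where both segments are miRNAs (dimers).
        allowed.add('B')
    return mirna_seg in allowed


print(include_in_target_analysis('5p'))
print(include_in_target_analysis('B'))
print(include_in_target_analysis('B', allow_mirna_dimers=True))
```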

Example system calls:
$ hyb_analyze -a target -i my_file_1.hyb

$ hyb_analyze -a target -i my_file_1.hyb \
           --allow_mirna_dimers True

Output File Naming:

Output files can be named in two fashions: via automatic name generation, or by providing specific out file names.

Automatic Name Generation:

When output names are generated automatically, the following default naming scheme is used:

hyb_script -i PATH_TO/MY_FILE_1.HYB [...]
    -->  OUT_DIR/MY_FILE_1_ADDSUFFIX.HYB

This output file path can be modified with the arguments --out_dir and --out_suffix described below.

The output directory defaults to the current working directory ($PWD), and can be modified with the --out_dir <dir> argument. Note: The provided directory must exist, or an error will be raised. For example:

hyb_script -i PATH_TO/MY_FILE_1.HYB [...] --out_dir MY_OUT_DIR
    -->  MY_OUT_DIR/MY_FILE_1_ADDSUFFIX.HYB

The suffix used for output files is based on the primary actions of the script. It can be specified using --out_suffix <suffix>. This can optionally include the final “.hyb” suffix. For example:

hyb_script -i PATH_TO/MY_FILE_1.HYB [...] --out_suffix MY_SUFFIX
    -->  OUT_DIR/MY_FILE_1_MY_SUFFIX.HYB
#OR
hyb_script -i PATH_TO/MY_FILE_1.HYB [...] --out_suffix MY_SUFFIX.HYB
    -->  OUT_DIR/MY_FILE_1_MY_SUFFIX.HYB

Specific Output Names:

Alternatively, specific file names can be provided via the -o/--out_hyb argument, ensuring that the same number of input and output files are provided. This argument takes precedence over all automatic output file naming options (--out_dir, --out_suffix), which are ignored if -o/--out_hyb is provided. For example:

hyb_script [...] --out_hyb MY_OUT_DIR/OUT_FILE_1.HYB MY_OUT_DIR/OUT_FILE_2.HYB
    -->  MY_OUT_DIR/OUT_FILE_1.hyb
    -->  MY_OUT_DIR/OUT_FILE_2.hyb

Note: The directory provided with output file paths (MY_OUT_DIR above) must exist, otherwise an error will be raised.
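
The two naming modes described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the scripts' actual code: the function resolve_out_paths and its parameters are hypothetical, the suffix is assumed to be joined to the file stem with an underscore (as in the examples above), and "ADDSUFFIX" is only the placeholder used in this document.

```python
from pathlib import Path


def resolve_out_paths(in_paths, out_dir='.', out_suffix='ADDSUFFIX',
                      out_paths=None):
    """Hypothetical sketch of automatic vs. explicit output naming."""
    if out_paths is not None:
        # Explicit names take precedence; out_dir and out_suffix are ignored.
        if len(out_paths) != len(in_paths):
            raise ValueError('Need exactly one output path per input file.')
        results = [Path(p) for p in out_paths]
    else:
        # Drop an optional trailing ".hyb" so it is not doubled.
        suffix = out_suffix
        if suffix.lower().endswith('.hyb'):
            suffix = suffix[:-len('.hyb')]
        results = [Path(out_dir) / f'{Path(p).stem}_{suffix}.hyb'
                   for p in in_paths]
    for path in results:
        # The target directory must already exist.
        if not path.parent.is_dir():
            raise FileNotFoundError(
                f'Output directory does not exist: {path.parent}')
    return results


print(resolve_out_paths(['PATH_TO/my_file_1.hyb'], out_suffix='my_suffix'))
```

Checking the directory up front, before any analysis runs, reflects the note above that a missing output directory is an error rather than something created on demand.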

usage: hyb_analyze [-h] -i PATH_TO/MY_FILE.HYB [PATH_TO/MY_FILE.HYB ...]
                   [-o PATH_TO/OUT_BASENAME [PATH_TO/OUT_BASENAME ...]]
                   [-d OUT_DIR] [-u OUT_SUFFIX]
                   [-a {type,mirna,summary,target}]
                   [--write_individual [{True,False}]] [-n ANALYSIS_NAME]
                   [-p {True,False}] [-v | -s]
                   [--mirna_types MIRNA_TYPES [MIRNA_TYPES ...]]
                   [--custom_flags CUSTOM_FLAGS [CUSTOM_FLAGS ...]]
                   [--hyb_placeholder HYB_PLACEHOLDER]
                   [--reorder_flags {True,False}]
                   [--allow_undefined_flags [{True,False}]]
                   [--allow_unknown_seg_types [{True,False}]]
                   [--check_complete_seg_types [{True,False}]]
                   [--hybformat_id [{True,False}]]
                   [--hybformat_ref [{True,False}]]
                   [--count_mode {read,record}] [--mirna_sort {True,False}]
                   [--allow_mirna_dimers [{True,False}]]
                   [--type_sep TYPE_SEP] [--out_delim OUT_DELIM]

Named Arguments

-i, --in_hyb

REQUIRED path to one or more hyb-format files with a “.hyb” suffix for use in the evaluation.

-o, --out_basename

Optional path to one or more basename prefixes to use for analysis output. The appropriate suffix will be added to each basename based on the specific analysis output. If not provided, the input file path “PATH_TO/MY_FILE.HYB” will be used as a template for the basename “OUT_DIR/MY_FILE”.

-d, --out_dir

Path to directory for output of evaluation files. Defaults to the current working directory.

Default: $PWD

-u, --out_suffix

Suffix to add to the name of output files, before any file- or analysis-specific suffixes. The file-type appropriate suffix will be added automatically.

-a, --analysis_type

Possible choices: type, mirna, summary, target

Analysis to perform on input hyb file.

--write_individual

Possible choices: True, False

Additionally write / plot output per individual miRNA.

Default: False

-n, --analysis_name

Name / title of analysis data.

-p, --make_plots

Possible choices: True, False

Create plots of analysis output.

Default: True

-v, --verbose

Print verbose output during run.

Default: False

-s, --silent

Print no output during run.

Default: False

Hyb Record Settings

--mirna_types

“seg_type” fields identifying a miRNA

Default: [‘miRNA’, ‘microRNA’]

--custom_flags

Custom flags to allow in addition to those specified in the hybkit specification.

Default: []

--hyb_placeholder

Placeholder character/string for missing data in hyb files.

Default: “.”

--reorder_flags

Possible choices: True, False

Re-order flags to the hybkit-specification order when writing hyb records.

Default: True

--allow_undefined_flags

Possible choices: True, False

Allow use of flags not defined in the hybkit specification when reading and writing hyb records. As the preferred alternative to using this setting, the --custom_flags argument can be used to supply custom allowed flags.

Default: False
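
How --custom_flags and --allow_undefined_flags interact can be sketched as a small check. This is an illustration under assumptions, not hybkit's code: check_flag is a hypothetical name, SPEC_FLAGS lists only the flags named in this document (the full specification defines more), and raising ValueError on a rejected flag is an assumed behavior.

```python
# Only the flags mentioned in this document; the real specification has more.
SPEC_FLAGS = {'seg1_type', 'seg2_type', 'miRNA_seg'}


def check_flag(flag, custom_flags=(), allow_undefined_flags=False):
    """Return True if the flag is accepted, else raise ValueError.

    Hypothetical helper illustrating the settings described above.
    """
    if flag in SPEC_FLAGS:
        return True
    if flag in custom_flags:
        # Preferred route: explicitly allow specific extra flags.
        return True
    if allow_undefined_flags:
        # Blanket route: accept any flag name at all.
        return True
    raise ValueError(f'Flag {flag!r} is not defined or allowed.')


print(check_flag('miRNA_seg'))
print(check_flag('my_flag', custom_flags=['my_flag']))
```

Listing custom flags explicitly still rejects misspelled flag names, which is one reason the text recommends it over the blanket setting.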

--allow_unknown_seg_types

Possible choices: True, False

Allow unknown segment types when assigning segment types.

Default: False

--check_complete_seg_types

Possible choices: True, False

Check every segment possibility when assigning segment types, rather than breaking after the first match is found. If True, finding segment types is slower but better at catching errors.

Default: False

Hyb File Settings

--hybformat_id

Possible choices: True, False

The Hyb Software Package places further information in the “id” field of the hybrid record that can be used to infer the number of contained read counts. When set to True, the identifiers will be parsed as: “<read_id>_<read_count>”

Default: False
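
The “<read_id>_<read_count>” layout can be parsed as in the sketch below. The helper parse_hybformat_id is hypothetical, and splitting on the last underscore (so that read ids containing underscores stay intact) is an assumption, not a statement about hybkit's parser.

```python
def parse_hybformat_id(record_id):
    """Split "<read_id>_<read_count>" into (read_id, read_count).

    Hypothetical helper; assumes the count follows the last underscore.
    """
    read_id, _, count_str = record_id.rpartition('_')
    if not read_id or not count_str.isdigit():
        raise ValueError(
            f'Not in "<read_id>_<read_count>" form: {record_id!r}')
    return read_id, int(count_str)


print(parse_hybformat_id('695_804'))
```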

--hybformat_ref

Possible choices: True, False

The Hyb Software Package uses a reference database with identifiers that contain sequence type and other sequence information. When set to True, all hyb file identifiers will be parsed as: “<gene_id>_<transcript_id>_<gene_name>_<seg_type>”

Default: False
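
A reference identifier in the “<gene_id>_<transcript_id>_<gene_name>_<seg_type>” layout can be split into its fields as sketched here. The helper parse_hybformat_ref is hypothetical, the identifier in the example is illustrative only, and the sketch assumes that none of the four fields contains an underscore.

```python
def parse_hybformat_ref(ref_id):
    """Split a Hyb-style reference identifier into its four named fields.

    Hypothetical helper; assumes exactly four underscore-separated fields.
    """
    fields = ref_id.split('_')
    if len(fields) != 4:
        raise ValueError(f'Expected 4 "_"-separated fields in {ref_id!r}')
    gene_id, transcript_id, gene_name, seg_type = fields
    return {
        'gene_id': gene_id,
        'transcript_id': transcript_id,
        'gene_name': gene_name,
        'seg_type': seg_type,
    }


# Illustrative identifier, not taken from a specific dataset.
print(parse_hybformat_ref('ENSG00000146425_ENST00000367089_DYNLT1_mRNA'))
```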

Analysis Settings

--count_mode

Possible choices: read, record

Method for counting records. “read”: use the number of reads per hyb record as the count (may contain PCR duplicates); “record”: count the number of records represented by each hyb record entry (1 for “unmerged” records, >= 1 for “merged” records).

Default: “record”
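
The difference between the two counting modes can be shown with a short sketch. The dictionary keys read_count and record_count are hypothetical stand-ins for the per-record values described above, and count_records is not a hybkit function.

```python
def count_records(records, count_mode='record'):
    """Total count over records under the chosen counting mode.

    Each record is a dict with hypothetical keys:
      'read_count'   - number of reads behind the entry
      'record_count' - number of records the entry represents
                       (1 if unmerged, >= 1 if merged)
    """
    if count_mode == 'read':
        return sum(r['read_count'] for r in records)
    if count_mode == 'record':
        return sum(r['record_count'] for r in records)
    raise ValueError(f'Unknown count_mode: {count_mode!r}')


records = [
    {'read_count': 5, 'record_count': 1},   # an unmerged record
    {'read_count': 12, 'record_count': 3},  # a merged record
]
print(count_records(records, 'read'))    # 17
print(count_records(records, 'record'))  # 4
```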

--mirna_sort

Possible choices: True, False

During TypeAnalysis, sort miRNAs first for “miRNA”-“Other” segtype pairs. If False, sort alphabetically.

Default: True
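
The ordering rule can be sketched as follows. The function type_pair_label is hypothetical; its defaults reuse the --mirna_types and --type_sep defaults listed in this document, and "alphabetically" is taken here to mean Python's default (case-sensitive) string ordering, which is an assumption.

```python
def type_pair_label(seg1_type, seg2_type, mirna_sort=True,
                    mirna_types=('miRNA', 'microRNA'), type_sep='-'):
    """Build a label for a pair of segment types.

    Hypothetical helper: sorts the pair alphabetically, then (optionally)
    moves miRNA types to the front.
    """
    types = sorted([seg1_type, seg2_type])
    if mirna_sort:
        # Stable sort: False (is a miRNA type) sorts before True.
        types.sort(key=lambda t: t not in mirna_types)
    return type_sep.join(types)


print(type_pair_label('mRNA', 'miRNA'))                    # miRNA-mRNA
print(type_pair_label('mRNA', 'miRNA', mirna_sort=False))  # mRNA-miRNA
```

Because the second sort is stable, pairs without a miRNA (or with two miRNAs) keep their alphabetical order.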

--allow_mirna_dimers

Possible choices: True, False

Include miRNA / miRNA dimers in TargetAnalysis. If False, exclude these from analysis results.

Default: False

--type_sep

Separator-string to place between types in analysis output.

Default: “-”

--out_delim

Delimiter-string to place between fields in analysis output.

Default: “,”