hybkit.analysis
Functions for analysis of HybRecord and FoldRecord objects.
Analysis
- class hybkit.analysis.Analysis(analysis_types=None, name=None, quant_mode=None)
Class for analysis of hybkit HybRecord and FoldRecord objects.
This class contains multiple conceptual analyses for HybRecord/FoldRecord data: Energy Analysis, Type Analysis, miRNA Analysis, Target Analysis, and Fold Analysis.
This class is used by selecting the desired analysis types on object initialization. Records are then added with either the add_hyb_record() or the add_hyb_records() method. The results of the analysis are available through the get_all_results(), get_analysis_results(), get_specific_result(), and plot_analysis_results() methods, which can return (or plot) the results of all analyses or of a specific subset of analyses.
Details for each respective analysis are provided here:
Energy Analysis:
This analysis evaluates the energy of each HybRecord object and provides a binned histogram of all energy values represented.
- Output Results:
energy_analysis_count (int): Count of energy values evaluated
has_energy_val (int): Count of hyb_records with an energy value
no_energy_val (int): Count of hyb_records without an energy value
energy_min (float): Minimum energy value
energy_max (float): Maximum energy value
energy_mean (float): Mean energy value
energy_std (float): Standard deviation of energy values
binned_energy_vals (Counter): Counter with integer keys of energy values from energy_min to energy_max, storing the count of any hyb_records with energy values that fall within that range (rounded to the next highest integer, e.g. -12.5 -> -12).
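The binning rule described for binned_energy_vals can be sketched in plain Python. This is a minimal illustration, not hybkit's implementation: the function name bin_energy_values and the example values are hypothetical, and filling empty bins with a zero count is an assumption.

```python
import math
from collections import Counter


def bin_energy_values(energies):
    """Bin energies by rounding each value up to the next highest integer.

    Hypothetical helper for illustration; not part of hybkit.
    """
    ceilings = [math.ceil(energy) for energy in energies]
    if not ceilings:
        return Counter()
    # Assumption: every integer bin between the lowest and highest
    # bin is present, with a count of zero where no value falls.
    binned = Counter({key: 0 for key in range(min(ceilings), max(ceilings) + 1)})
    binned.update(ceilings)
    return binned


# Made-up energy values, not taken from real data.
example_energies = [-12.5, -12.1, -11.9, -20.0]
binned = bin_energy_values(example_energies)
```

Here -12.5 and -12.1 both fall in the -12 bin, matching the "-12.5 -> -12" example above.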
Type Analysis:
This analysis evaluates the counts of each type of segment included in the HybRecord objects. The types of segments are determined by the seg1_type and seg2_type flags, which are set by the hybkit.HybRecord.eval_types() method.
- Requirements:
- seg1_type and seg2_type flags must be set for each HybRecord (can be done by hybkit.HybRecord.eval_types()).
- Output Results:
types_analysis_count (int): Count of hybrid types analyzed
hybrid_types (Counter): Counter containing annotated types of seg1 and seg2 (in original 5p / 3p order)
reordered_hybrid_types (Counter): Counter containing annotated types of seg1 and seg2. This is provided in "sorted" order, where types are sorted alphabetically (independent of 5p / 3p position).
mirna_hybrid_types (Counter): Counter containing annotated types of seg1 and seg2. This is provided in "miRNA-prime" order, where a miRNA type is always listed before other types, and then remaining types are sorted alphabetically (independent of 5p / 3p position).
seg1_types (Counter): Counter containing annotated type of segment in position seg1
seg2_types (Counter): Counter containing annotated type of segment in position seg2
all_seg_types (Counter): Counter containing position-independent annotated types
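The three orderings (original, "sorted", and "miRNA-prime") can be illustrated with a short sketch. The helper names, the use of tuples as Counter keys, and the type label "miRNA" are assumptions made for this example; hybkit's internal key format may differ.

```python
from collections import Counter


def sorted_order(seg1_type, seg2_type):
    """Return the two types alphabetically, ignoring 5p / 3p position."""
    return tuple(sorted((seg1_type, seg2_type)))


def mirna_prime_order(seg1_type, seg2_type, mirna_types=("miRNA",)):
    """Return the two types with any miRNA type first, then alphabetical."""
    pair = sorted((seg1_type, seg2_type))
    # The sort is stable, so non-miRNA types keep their alphabetical order.
    pair.sort(key=lambda seg_type: seg_type not in mirna_types)
    return tuple(pair)


# Made-up (seg1_type, seg2_type) pairs in original 5p / 3p order.
pairs = [("mRNA", "miRNA"), ("miRNA", "mRNA"), ("lncRNA", "mRNA")]

hybrid_types = Counter(pairs)
reordered_hybrid_types = Counter(sorted_order(*pair) for pair in pairs)
mirna_hybrid_types = Counter(mirna_prime_order(*pair) for pair in pairs)
all_seg_types = Counter(seg_type for pair in pairs for seg_type in pair)
```

In this example the two miRNA-containing hybrids are counted separately in hybrid_types but collapse into a single key in the two reordered counters.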
miRNA Analysis:
Analysis of miRNA segments in hybrids.
The mirna_analysis provides an analysis of what miRNA types are present in the hyb records. If a miRNA dimer is present in a hybrid, this is counted in mirna_dimers. If a single miRNA is present in a hybrid, this is counted in mirnas_5p or mirnas_3p depending on the miRNA location.
- Requirements:
- mirna_seg flag must be set for each HybRecord (can be done by hybkit.HybRecord.eval_mirna()).
- Output Results:
mirna_analysis_count (int): Count of miRNA hybrids analyzed
mirnas_5p (int): Count of 5p miRNAs detected
mirnas_3p (int): Count of 3p miRNAs detected
mirna_dimers (int): Count of miRNA dimers (5p + 3p) detected
non_mirna (int): Count of non-miRNA hybrids detected
has_mirna (int): Hybrids with 5p, 3p, or both as miRNA
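The relationship between these counts can be sketched as a simple tally. The position labels "5p", "3p", "both", and "none" are placeholders chosen for this example and are not claimed to be the values hybkit stores in the mirna_seg flag; the function name is likewise hypothetical.

```python
from collections import Counter


def tally_mirna_positions(labels):
    """Tally hybrids by where a miRNA was found (illustrative only)."""
    counts = Counter(
        {"mirnas_5p": 0, "mirnas_3p": 0, "mirna_dimers": 0,
         "non_mirna": 0, "has_mirna": 0}
    )
    key_for_label = {
        "5p": "mirnas_5p",
        "3p": "mirnas_3p",
        "both": "mirna_dimers",
        "none": "non_mirna",
    }
    for label in labels:
        if label not in key_for_label:
            raise ValueError(f"Unknown miRNA position label: {label!r}")
        counts[key_for_label[label]] += 1
        # has_mirna covers 5p, 3p, and dimer hybrids alike.
        if label != "none":
            counts["has_mirna"] += 1
    return counts


# Made-up labels for five hybrids.
mirna_counts = tally_mirna_positions(["5p", "3p", "both", "none", "5p"])
```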
Target Analysis:
Analysis of targets in miRNA-containing hybrids.
The target analysis provides an analysis of what annotated sequences and sequence types are targeted by any miRNA within the hyb records. If a miRNA is not present in a hybrid, the hybrid is not included in the analysis. If a miRNA dimer is present in a hybrid, the 5p miRNA is used for the analysis, and the 3p miRNA is considered the "target."
- Requirements:
- mirna_seg flag must be set for each HybRecord (can be done by hybkit.HybRecord.eval_mirna()).
- Output Results:
Fold Analysis:
This analysis evaluates the predicted binding of miRNA within hyb records that contain a miRNA and have an associated FoldRecord object as the attribute fold_record. This includes an analysis and plotting of the predicted binding by position among the provided miRNA.
- Requirements:
- The mirna_seg flag must be set for each HybRecord (can be done by hybkit.HybRecord.eval_mirna()).
- The fold_record attribute must be set for each HybRecord with a corresponding FoldRecord object. This can be done using the hybkit.HybRecord.set_fold_record() method.
- Output Results:
fold_analysis_count (int): Count of miRNA fold predictions analyzed
folds_recorded (int): Count of fold predictions with a mirna fold
mirna_nt_fold_counts (Counter): Counter with keys of miRNA position index and values of number of miRNAs with a predicted bound state at that index.
mirna_nt_fold_props (Counter): Counter with keys of miRNA position index and values of proportion (0.0 - 1.0) of miRNAs with a predicted bound state at that index.
fold_match_counts (Counter): Counter with keys of count of predicted matches between miRNA and target with values of count of miRNAs with that number of predicted matches.
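The per-position counts and proportions can be sketched as below. This is an illustration under stated assumptions: the input is taken to be the miRNA portion of each fold in dot-bracket notation (any character other than "." counts as bound), positions are indexed from 1, and the function name is hypothetical. hybkit's own conventions may differ.

```python
from collections import Counter


def summarize_mirna_folds(mirna_folds):
    """Summarize predicted bound positions across miRNA fold strings."""
    nt_fold_counts = Counter()
    fold_match_counts = Counter()
    for fold in mirna_folds:
        # Assumption: 1-based positions; "." marks an unbound nucleotide.
        bound = [pos for pos, char in enumerate(fold, start=1) if char != "."]
        nt_fold_counts.update(bound)
        fold_match_counts[len(bound)] += 1
    total = len(mirna_folds)
    nt_fold_props = Counter()
    if total:
        for pos, count in nt_fold_counts.items():
            nt_fold_props[pos] = count / total
    return nt_fold_counts, nt_fold_props, fold_match_counts


# Made-up miRNA fold strings (dot-bracket), four nucleotides each.
example_folds = ["((..", "(.(.", "...."]
nt_counts, nt_props, match_counts = summarize_mirna_folds(example_folds)
```

In this example position 1 is bound in two of three folds, so its proportion is 2/3, and two folds have exactly two predicted matches.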
- Parameters
analysis_types (str or list of str) -- Analysis types to perform
name (str, optional) -- Name of the analysis
quant_mode (str, optional) -- Mode to use for record quantification. Options are "single": One count per record; "reads": If the "read_count" flag is set, count all reads in record (else count 1); "records": If the "record_count" flag is set, count all individual records within combined record (else count 1). If not provided, defaults to the value in Analysis.settings['quant_mode'].
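The three quant_mode options can be read as the following counting rule. This is a sketch only: the function name is hypothetical, and representing a record's flags as a plain dict is an assumption made so the example is self-contained.

```python
def count_for_record(flags, quant_mode="single"):
    """Return how much one record contributes under a given quant_mode.

    `flags` is a plain dict standing in for a record's flags
    (an assumption for this sketch, not hybkit's record type).
    """
    if quant_mode == "single":
        return 1
    if quant_mode == "reads":
        # Fall back to 1 when the read_count flag is not set.
        return int(flags.get("read_count", 1))
    if quant_mode == "records":
        # Fall back to 1 when the record_count flag is not set.
        return int(flags.get("record_count", 1))
    raise ValueError(f"Unknown quant_mode: {quant_mode!r}")


# Made-up flags for a combined record.
example_flags = {"read_count": 5, "record_count": 2}
single_count = count_for_record(example_flags, "single")
read_count = count_for_record(example_flags, "reads")
record_count = count_for_record(example_flags, "records")
```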
- Variables
- settings = {'out_delim': ',', 'quant_mode': 'single'}
Class-level settings. See hybkit.settings.Analysis_settings for descriptions.
- analysis_options = ['energy', 'type', 'mirna', 'target', 'fold']
- add_hyb_record(hyb_record)
Add a HybRecord object to the analysis.
- Parameters
hyb_record (HybRecord) -- HybRecord object to be added to the analysis.
- add_hyb_records(hyb_records, eval_types=False, eval_mirna=False)
Add a list of HybRecord objects to the analysis.
- Parameters
hyb_records (HybFile or list of HybRecord) -- HybFile to iterate over, or iterable of HybRecord objects to be added to the analysis.
eval_types (bool) -- If True, evaluate the hybrid type of the HybRecord before adding it to the analysis using hybkit.HybRecord.eval_types().
eval_mirna (bool) -- If True, evaluate the miRNA segment of the HybRecord before adding it to the analysis using hybkit.HybRecord.eval_mirna().
- get_all_results()
Return a dictionary with all results for all active analyses.
See Analyses for details on the results for each analysis type.
- Returns
Dictionary with keys of analysis type and values of dictionaries with results for that analysis type.
- Return type
dict
- get_analysis_results(analysis)
Return a dictionary with all results for a specific analysis.
See Analyses for details on the results for each analysis type.
- get_specific_result(result_key)
Get a specific result from the analysis.
See Analyses for details on the results for each analysis type.
- Parameters
result_key (str) -- Result key to return from one of the enabled analyses.
- Returns
Result value for the specified result key.
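The lookup of a single result key across the nested results dictionary (analysis type mapped to a dictionary of results) can be pictured as below. The helper find_result and the example values are hypothetical and only mirror the documented structure; this is not hybkit's implementation.

```python
def find_result(all_results, result_key):
    """Search each analysis's results for a key and return its value."""
    for analysis_results in all_results.values():
        if result_key in analysis_results:
            return analysis_results[result_key]
    raise KeyError(result_key)


# Made-up nested results: analysis type -> results for that analysis.
example_results = {
    "energy": {"energy_min": -25.3, "energy_max": -8.1},
    "mirna": {"mirnas_5p": 40, "mirnas_3p": 12},
}
energy_min = find_result(example_results, "energy_min")
mirnas_3p = find_result(example_results, "mirnas_3p")
```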
- get_analysis_delim_str(analysis=None, out_delim=None)
Return a delimited string containing the results of the analysis.
See Analyses for details on the results for each analysis type.
- Parameters
analysis (str or list of str) -- Analysis type for return results. If not provided, return the results for all active analyses.
out_delim (str) -- Delimiter to use for output. If not provided, defaults to the value in settings['out_delim'].
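How a delimiter such as out_delim shapes the output can be sketched with the standard csv module. The one-row-per-result layout and the function name are assumptions made for illustration; the exact layout hybkit produces is not specified here.

```python
import csv
import io


def results_to_delim_str(results, out_delim=","):
    """Render a flat results dict as delimited "key<delim>value" lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=out_delim, lineterminator="\n")
    for key, value in results.items():
        # Assumed layout: one row per result key.
        writer.writerow([key, value])
    return buffer.getvalue()


# Made-up results for a single analysis.
example_results = {"mirnas_5p": 2, "mirnas_3p": 1}
comma_text = results_to_delim_str(example_results)
tab_text = results_to_delim_str(example_results, out_delim="\t")
```

Passing a tab as the delimiter yields tab-separated output from the same data, which is the role out_delim plays in the methods above.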
- write_analysis_delim_str(out_file_name=None, analysis=None, out_delim=None)
Write the results of the analysis to a delimited text file.
See Analyses for details on the results for each analysis type.
- Parameters
out_file_name (str) -- Path to output file. If not provided, defaults to: ./<analysis_name>_<analysis>.csv if analysis/analyses provided, or ./<analysis_name>_multi_analysis.csv if no analysis/analyses provided.
analysis (str or list of str) -- Analysis type for return results. If not provided, return the results for all active analyses.
out_delim (str) -- Delimiter to use for output. If not provided, defaults to the value in settings['out_delim'].
- write_analysis_results_special(out_basename=None, analysis=None, out_delim=None)
Write the results of the analyses to specialized text files.
See Analyses for details on the results for each analysis type.
- Parameters
out_basename (str) -- Path for basename of output file. Files will be named using the provided path as the base name. If not provided, defaults to: ./<analysis_name>_<analysis> if name is set, or ./Analysis_multi_<analysis> if name is not set.
analysis (str or list of str) -- Analysis type to write results files for. If not provided, write results files for all active analyses.
out_delim (str) -- Delimiter to use for output where applicable. If not provided, defaults to the value in settings['out_delim'].
- plot_analysis_results(out_basename=None, analysis=None)
Plot the results of the analyses.
See Analyses for details on the results for each analysis type.
- key = 'fold'