repic.utils.build_subsets
Creates cross-validation subsets for iterative ensemble particle picking
Attributes
name | module name (used by argparse subparser)
rng | NumPy random generator (seeded with zero for reproducibility)
parser | argparse parse_args() object
Functions
add_arguments(parser) | Adds argparse command line arguments for build_subsets.py
calc_subsets(n, s=3) | Calculates subsets of examples (micrographs) for desired sampling percentages (1, 25, 50, and 100%)
create_symlinks(args, files, label) | Creates symlinks for cross-validation files
plot_defocus(data, low, med, out_file) | Creates Matplotlib line plot of CTFFIND4 defocus values
sample_from_bin(bins, i) | Samples an example from a defocus bin (low, medium, or high) if the bin has items; otherwise randomly chooses another bin to sample from
main(args) | Builds training, validation, and testing subsets (cross-validation files) for machine learning algorithm training
Module Contents
- repic.utils.build_subsets.name = 'build_subsets'
module name (used by argparse subparser)
- Type:
str
- repic.utils.build_subsets.rng
NumPy random generator (seeded with zero for reproducibility)
- repic.utils.build_subsets.add_arguments(parser)
Adds argparse command line arguments for build_subsets.py
- Parameters:
parser (object) – argparse parser object to which the arguments are added
- Returns:
None
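The sketch below illustrates how a module-level name, an add_arguments(parser) function, and a main(args) function are commonly wired into an argparse subparser. It is not the REPIC source: the argument names (defocus_file, out_dir), the top-level program name, and the body of main are hypothetical and exist only to make the example runnable.

```python
import argparse

name = "build_subsets"  # mirrors the documented module attribute


def add_arguments(parser):
    # Hypothetical positional arguments, for illustration only.
    parser.add_argument("defocus_file", help="file of defocus values (assumed)")
    parser.add_argument("out_dir", help="output directory (assumed)")


def main(args):
    # Placeholder body: simply echo the parsed values back.
    return (args.defocus_file, args.out_dir)


# Register the module as a subcommand of a top-level parser.
top = argparse.ArgumentParser(prog="repic")  # program name is an assumption
subparsers = top.add_subparsers(dest="command")
sub = subparsers.add_parser(name)
add_arguments(sub)
sub.set_defaults(func=main)

args = top.parse_args([name, "defocus.txt", "out"])
result = args.func(args)
```

With this layout, each utility module only needs to expose name, add_arguments, and main for a top-level command to dispatch to it.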
- repic.utils.build_subsets.calc_subsets(n, s=3)
Calculates subsets of examples (micrographs) for desired sampling percentages (1, 25, 50, and 100%)
- Parameters:
n (int) – total number of examples to sample from
s (int) – number of examples to sample each iteration (s = 3 represents the low, medium, and high defocus bins)
- Returns:
Python dictionary containing the number of examples (values) per subset (key)
- Return type:
dict
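One way to picture the returned dictionary is sketched below. The rounding rule is an assumption, not the documented behavior: each percentage of n is rounded up to a whole number of sampling iterations of s examples, then capped at n. Using the percentages themselves as dictionary keys is also an assumption.

```python
import math


def calc_subsets_sketch(n, s=3, percentages=(1, 25, 50, 100)):
    """Hypothetical stand-in for calc_subsets; the rounding rule is assumed."""
    subsets = {}
    for pct in percentages:
        target = n * pct / 100
        # Round up to a multiple of s (one example per defocus bin per
        # iteration), use at least one iteration, and never exceed n.
        count = min(n, max(s, math.ceil(target / s) * s))
        subsets[pct] = count
    return subsets


sizes = calc_subsets_sketch(100)
```

Under these assumptions, 100 micrographs yield subsets of 3, 27, 51, and 100 examples for the 1, 25, 50, and 100% settings.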
- repic.utils.build_subsets.create_symlinks(args, files, label)
Creates symlinks for cross-validation files
- Parameters:
args (obj) – argparse command line argument object
files (list) – list of micrograph filenames to be symlinked
label (str) – name for created subdirectory that will contain linked files
- Returns:
None
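A minimal sketch of the linking step follows. Because the attributes of args are not documented here, the sketch takes a plain out_dir string instead; that parameter, the function name, and the placeholder filenames are all hypothetical.

```python
import os
import tempfile


def create_symlinks_sketch(out_dir, files, label):
    """Hypothetical stand-in: link each file into out_dir/label."""
    subdir = os.path.join(out_dir, label)
    os.makedirs(subdir, exist_ok=True)
    for src in files:
        dst = os.path.join(subdir, os.path.basename(src))
        if os.path.lexists(dst):
            os.remove(dst)  # replace a stale link left by an earlier run
        os.symlink(os.path.abspath(src), dst)
    return subdir


# Demo in a throwaway directory with two empty placeholder files.
work_dir = tempfile.mkdtemp()
sources = []
for fname in ("mic_001.mrc", "mic_002.mrc"):
    path = os.path.join(work_dir, fname)
    open(path, "w").close()
    sources.append(path)

train_dir = create_symlinks_sketch(os.path.join(work_dir, "out"), sources, "train")
linked = sorted(os.listdir(train_dir))
all_links = all(os.path.islink(os.path.join(train_dir, f)) for f in linked)
```

Linking rather than copying keeps each subset directory small, since the micrographs themselves stay in one place.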
- repic.utils.build_subsets.plot_defocus(data, low, med, out_file)
Creates Matplotlib line plot of CTFFIND4 defocus values
- Parameters:
data (list) – list of paired micrograph filenames and CTFFIND4 defocus values
low (float) – low defocus bin upper threshold
med (float) – medium defocus bin upper threshold
out_file (str) – filepath of the produced line plot
- Returns:
None
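The low and med parameters are upper thresholds for the low and medium defocus bins. The sketch below shows how such thresholds could partition the data into three bins. The helper name, the inclusive boundary handling, and the defocus values are assumptions made up for illustration.

```python
def bin_by_defocus(data, low, med):
    """Hypothetical helper: split (filename, defocus) pairs into three bins."""
    bins = [[], [], []]  # low, medium, high
    for filename, defocus in data:
        if defocus <= low:
            bins[0].append((filename, defocus))
        elif defocus <= med:
            bins[1].append((filename, defocus))
        else:
            bins[2].append((filename, defocus))
    return bins


# Made-up defocus values, for illustration only.
data = [
    ("mic_001.mrc", 8000.0),
    ("mic_002.mrc", 15000.0),
    ("mic_003.mrc", 24000.0),
    ("mic_004.mrc", 11000.0),
]
bins = bin_by_defocus(data, low=10000.0, med=20000.0)
counts = [len(b) for b in bins]
```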
- repic.utils.build_subsets.sample_from_bin(bins, i)
Samples an example from a defocus bin (low, medium, or high) if the bin has items; otherwise randomly chooses another bin to sample from
- Parameters:
bins (list) – list of defocus bins
i (int) – index of defocus bin to sample from
- Returns:
filename (str) and CTFFIND4 defocus value (float) of sampled example
- Return type:
tuple
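The fallback behavior described above can be sketched as follows, using a NumPy generator seeded with zero. The function name is hypothetical, and removing the sampled item from its bin (sampling without replacement) is an assumption that the documentation does not confirm.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility


def sample_from_bin_sketch(bins, i):
    """Hypothetical stand-in: sample from bin i, or from another bin if empty."""
    if not bins[i]:
        # Fall back to a randomly chosen bin that still has items.
        non_empty = [j for j, b in enumerate(bins) if b]
        if not non_empty:
            raise ValueError("all bins are empty")
        i = int(rng.choice(non_empty))
    idx = int(rng.integers(len(bins[i])))
    # Assumption: the sampled example is removed from its bin.
    return bins[i].pop(idx)


# Only the medium bin has an item, so a request for the low bin falls back.
example_bins = [[], [("mic_002.mrc", 15000.0)], []]
filename, defocus = sample_from_bin_sketch(example_bins, 0)
```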
- repic.utils.build_subsets.main(args)
Builds training, validation, and testing subsets (cross-validation files) for machine learning algorithm training
- Parameters:
args (obj) – argparse command line argument object
- repic.utils.build_subsets.parser
argparse parse_args() object
- Type:
obj