API reference

This section details the functions and classes available in MDDB Workflow.

class mddb_workflow.mwf.Project(directory: str = '.', accession: str | None = None, database_url: str = 'https://irb-dev.mddbr.eu/api/', inputs_filepath: str | None = None, input_topology_filepath: str | None = None, input_structure_filepath: str | None = None, input_trajectory_filepaths: list[str] | None = None, md_directories: list[str] | None = None, input_md_config: list[list[str]] | None = None, reference_md_index: int | None = None, forced_inputs: list[list[str]] | None = None, populations_filepath: str = 'populations.json', transitions_filepath: str = 'transitions.json', aiida_data_filepath: str | None = None, filter_selection: bool | str = False, pbc_selection: str | None = None, cg_selection: str | None = None, dummy_selection: str | None = None, forced_class_selections: dict | None = None, image: bool = False, fit: bool = False, translation: list[float] = [0, 0, 0], mercy: list[str] | bool = [], trust: list[str] | bool = [], faith: bool = False, ssleep: bool = False, pca_analysis_selection: str = "(protein and name N CA C) or (nucleic and name P O5' O3' C5' C4' C3')", pca_fit_selection: str = "(protein and name N CA C) or (nucleic and name P O5' O3' C5' C4' C3')", rmsd_cutoff: float = 9, interaction_cutoff: float = 0.1, interactions_auto: str | None = None, guess_bonds: bool = False, ignore_bonds: bool = False, sample_trajectory: int | None = None, screenshot_frame: int | None = None, local_blast: bool = False)[source]

Bases: object

Class for the main project of an MDDB accession. A project is a set of related MDs. These MDs share all or most topology and metadata.

__init__(directory: str = '.', accession: str | None = None, database_url: str = 'https://irb-dev.mddbr.eu/api/', inputs_filepath: str | None = None, input_topology_filepath: str | None = None, input_structure_filepath: str | None = None, input_trajectory_filepaths: list[str] | None = None, md_directories: list[str] | None = None, input_md_config: list[list[str]] | None = None, reference_md_index: int | None = None, forced_inputs: list[list[str]] | None = None, populations_filepath: str = 'populations.json', transitions_filepath: str = 'transitions.json', aiida_data_filepath: str | None = None, filter_selection: bool | str = False, pbc_selection: str | None = None, cg_selection: str | None = None, dummy_selection: str | None = None, forced_class_selections: dict | None = None, image: bool = False, fit: bool = False, translation: list[float] = [0, 0, 0], mercy: list[str] | bool = [], trust: list[str] | bool = [], faith: bool = False, ssleep: bool = False, pca_analysis_selection: str = "(protein and name N CA C) or (nucleic and name P O5' O3' C5' C4' C3')", pca_fit_selection: str = "(protein and name N CA C) or (nucleic and name P O5' O3' C5' C4' C3')", rmsd_cutoff: float = 9, interaction_cutoff: float = 0.1, interactions_auto: str | None = None, guess_bonds: bool = False, ignore_bonds: bool = False, sample_trajectory: int | None = None, screenshot_frame: int | None = None, local_blast: bool = False)[source]

Initialize a Project.

Parameters:

directory (str) – Project directory where the whole workflow is to be run.
accession (Optional[str]) – Project accession to download missing input files from the database (if already uploaded).
database_url (str) – API URL to download missing data when an accession is provided.
inputs_filepath (Optional[str]) – Path to a file with inputs for metadata, simulation parameters and analysis config.
input_topology_filepath (Optional[str]) – Path or glob pattern to input topology file relative to the project directory.
input_structure_filepath (Optional[str]) – Path or glob pattern to input structure file. It may be relative to the project or to each MD directory. If this value is not passed then the standard structure file is used as input by default.
input_trajectory_filepaths (Optional[list[str]]) – Paths or glob patterns to input trajectory files relative to each MD directory. If this value is not passed then the standard trajectory file path is used as input by default.
md_directories (Optional[list[str]]) – Path or glob pattern to the different MD directories. Each directory is to contain an independent trajectory and structure. Several output files will be generated in every MD directory.
input_md_config (Optional[list[list[str]]]) – Configuration of a specific MD. You may declare as many as you want. Every MD requires a directory name and at least one trajectory path. The structure is -md <directory> <trajectory_1> <trajectory_2> … Note that all trajectories from the same MD will be merged. For legacy reasons, you may also provide a specific structure for an MD. e.g. -md <directory> <structure> <trajectory_1> <trajectory_2> …
reference_md_index (Optional[int]) – Index of the reference MD (used by project-level functions; defaults to first MD).
forced_inputs (Optional[list]) – Force a specific input through the command line. Inputs passed through command line have priority over the ones from the inputs file. In fact, these values will be overwritten or appended in the inputs file. Every forced input requires an input name and a value. The structure is -fi <input name> <new input value>
populations_filepath (str) – Path to equilibrium populations file (Markov State Model only).
transitions_filepath (str) – Path to transition probabilities file (Markov State Model only).
aiida_data_filepath (Optional[str]) – Path to the AiiDA data file. This file may be generated by the aiida-gromacs plugin and contains provenance data.
filter_selection (bool|str) – Atoms selection to be filtered in VMD format. If the argument is passed alone (i.e. with no selection) then water and counter ions are filtered.
pbc_selection (Optional[str]) – Selection of atoms which stay in Periodic Boundary Conditions even after imaging the trajectory. e.g. remaining solvent, ions, membrane lipids, etc. Selection passed through console overrides the one in inputs file.
cg_selection (Optional[str]) – Selection of atoms which are not actual atoms but Coarse Grained beads. Selection passed through console overrides the one in inputs file.
dummy_selection (Optional[str]) – Selection of atoms which are not real atoms but dummy atoms.
forced_class_selections (Optional[dict]) – Custom forced selections for molecular classification.
image (bool) – Set if the trajectory is to be imaged so atoms stay in the PBC box. See -pbc for more information.
fit (bool) – Set if the trajectory is to be fitted (both rotation and translation) to minimize the RMSD to PROTEIN_AND_NUCLEIC_BACKBONE selection.
translation (list[float]) – Set the x y z translation for the imaging process. e.g. -trans 0.5 -1 0
mercy (list[str]|bool) – Failures to be tolerated (or boolean to set all/none).
trust (list[str]|bool) – Tests to skip/trust (or boolean to set all/none).
faith (bool) – If True, require input files to match expected output files and skip processing.
ssleep (bool) – If True, SSL certificate authentication is skipped when downloading data from an API.
pca_analysis_selection (str) – Atom selection for PCA analysis in VMD syntax.
pca_fit_selection (str) – Atom selection for the PCA fitting in VMD syntax.
rmsd_cutoff (float) – Set the cutoff for the RMSD sudden jumps analysis to fail. This cutoff stands for the number of standard deviations away from the mean an RMSD value is to be.
interaction_cutoff (float) – Set the cutoff for the interactions analysis to fail. This cutoff stands for the fraction of the trajectory where the interaction happens (from 0 to 1).
interactions_auto (Optional[str]) – Guess input interactions automatically. A VMD selection may be passed to limit guessed interactions to a specific subset of atoms.
guess_bonds (bool) – Force the workflow to guess atom bonds based on distance and atom radii in different frames along the trajectory instead of mining topology bonds.
ignore_bonds (bool) – Force the workflow to ignore atom bonds. This will result in many check-ins being skipped.
sample_trajectory (Optional[int]) – If passed, download the first 10 (by default) frames from the trajectory. You can specify a different number by providing an integer value.
screenshot_frame (Optional[int]) – If passed, the project screenshot is made using the specified frame (0-based), from the reference MD. Negative numbers may be passed to select frames counting from the end, e.g. -1 is the last frame. By default the screenshot is made using the reference frame from the reference MD.
local_blast (bool) – Run the protein mapping blast locally against a Swiss-Prot database (downloaded on first use) instead of using the remote NCBI blast service. Requires BLAST+ to be installed, which can be done with ‘conda install -c bioconda blast’ or ‘apt install ncbi-blast+’.
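
As a sketch of the input_md_config shape described above, and assuming the non-legacy form only (a directory followed by one or more trajectory paths), each inner list could be split as shown below. The helper split_md_config and all file names are hypothetical; they are not part of MDDB Workflow.

```python
def split_md_config(md_config: list[str]) -> tuple[str, list[str]]:
    """Split one MD entry into its directory and its trajectory paths.

    Hypothetical helper: assumes the first value is the MD directory
    and every remaining value is a trajectory path.
    """
    if len(md_config) < 2:
        raise ValueError("Every MD requires a directory and at least one trajectory")
    directory, *trajectories = md_config
    return directory, trajectories


# Equivalent to: -md replica_1 part_1.xtc part_2.xtc -md replica_2 traj.xtc
input_md_config = [
    ["replica_1", "part_1.xtc", "part_2.xtc"],
    ["replica_2", "traj.xtc"],
]
parsed = [split_md_config(entry) for entry in input_md_config]
```

According to the parameter description, the two trajectories listed for replica_1 would be merged into a single trajectory for that MD.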

property aiida_data_file: File | None: AiiDA data file (read only)

autofill_inputs()[source]: Make the workflow create (if it does not exist yet) and automatically fill the inputs file. Note that, by definition, most inputs are not to be defined by the workflow but by the user. However, there are a few exceptions where the workflow may attempt to guess some values. For instance, the PBC selection or the dummy atoms selection are often correctly guessed. In addition, other selections will be parsed to atomic index selections, which are safer.

property cg_residues: list[int]: Indices of residues in coarse grain (read only)

property cg_selection: Selection: Coarse grain atom selection (read only)

property charges: Atom charges (read only)

check_inputs_file_available() → bool[source]: Check if the inputs file is available. Note that asking for it when it is not available will lead to raising an input error.

check_is_time_dependent() → bool[source]: Check if MDs are time dependent.

static create_cache(directory: str = '.') → Cache[source]: Create or load the project cache. Cannot fail unless the directory doesn’t exist, which is a pre-condition anyway.

property dihedrals: list[dict]: Topology dihedrals (read only)

property dummy_selection: Selection: Dummy atoms selection (read only)

property file_inputs: dict: Inputs from the inputs file (read only)

get_aiida_data_file() → File | None[source]: Get the AiiDA data file.

get_cg_residues() → list[int][source]: Get indices of coarse grain residues. Make sure they are coherent among all MDs.

get_cg_selection() → Selection[source]: Get the coarse grain atom selection. Make sure it is coherent among all MDs.

get_chain_references = <Task (chains)>

get_charges()[source]

get_dihedrals() → list[dict][source]: Get the topology dihedrals.

get_dummy_selection() → Selection[source]: Get the dummy atoms selection. Make sure it is coherent among all MDs.

get_file(target_file: File) → bool[source]: Check if a file exists. If not, try to download it from the database. If the file is not found in the database it is fine, we do not even warn the user. Note that nowadays this function is used to get populations and transitions files, which are not common.

get_file_input(name: str) → Any[source]: Get a specific ‘input’ value from the inputs file.

get_file_inputs() → dict[source]: Get inputs.

get_inchi_references = <Task (inchimap)>

get_inchikeys = <Task (inchikeys)>

get_input_default_value(name: str) → Any[source]: Get a specific ‘input’ default value used when the user provides no value by any means. If no default value is specified then return None.

get_input_structure_file() → File[source]: Get the input structure file.

get_input_topology_file() → File | None[source]: Get the input topology file. If the file is not found try to download it.

get_input_topology_filepath() → str | None[source]

Get the input topology filepath from the inputs or try to guess it.

If the input topology filepath is a ‘no’ flag then we consider there is no topology at all. So far we extract atom charges and atom bonds from the topology file. In this scenario we can keep working but there are some consequences:

Analysis using atom charges such as ‘energies’ will be skipped

The standard topology file will not include atom charges

Bonds will be guessed from atom radii and distance along multiple frames

get_input_trajectory_files() → list[File][source]: Get the input trajectory file(s) from the reference MD. If file(s) are not found then try to download them.

get_inputs_file() → File[source]: Load the inputs YAML file.

get_ligand_references = <Task (ligmap)>

get_lipid_references = <Task (lipmap)>

get_md_charges()[source]

get_mds() → list[MD][source]: Get the available MDs (read only).

get_membrane_map = <Task (memmap)>

get_pbc_residues() → list[int][source]: Get the indices of residues in periodic boundary conditions.

get_pbc_selection() → Selection[source]: Get the periodic boundary conditions atom selection. Make sure it is coherent among all MDs.

get_pdb_ids() → list[str][source]: Get the tested and standardized PDB ids.

get_pdb_references = <Task (pdbs)>

get_populations() → list[float] | None[source]: Get the equilibrium populations from a MSM.

get_populations_file() → File | None[source]: Get the MSM equilibrium populations file.

get_processed_interactions() → dict[source]: Get the processed interactions from the reference replica, which are the same for all replicas.

static get_project_directory(directory: str) → str[source]: Get the project directory from the input directory.

get_protein_map = <Task (protmap)>

get_protein_references_file() → File[source]

get_reference_md() → MD[source]: Get the reference MD.

get_reference_md_index() → int[source]: Get the reference MD index.

get_residue_map = <Task (resmap)>

get_screenshot_filename = <Task (screenshot)>

get_snapshots() → int[source]: Get the reference MD snapshots.

get_standard_topology_file() → File[source]

get_structure() → Structure[source]: Get a reference structure. Use the reference MD structure but make sure there are no inconsistencies with other MDs.

get_structure_file() → File[source]: Get the processed structure from the reference MD.

get_topology_file() → File[source]: Get the processed topology from the reference MD.

get_topology_filepath() → str[source]: Get the processed topology file path.

get_topology_reader() → Topology[source]: Get the topology data reader.

get_trajectory_file() → File[source]: Get the processed trajectory from the reference MD.

get_transitions() → list[list[float]] | None[source]: Get the transition probabilities from a MSM.

get_transitions_file() → File | None[source]: Get the MSM transition probabilities file.

get_universe() → int[source]: Get the MDAnalysis Universe from the reference MD.

get_warnings() → list[source]

Get the warnings.

The warnings list should not be reassigned, but it was back in the day. To avoid silent bugs, we read it directly from the register every time.

property inchikey_map: InChI references (read only)

property inchikeys: InChI keys (read only)

inherit_topology_filename() → str | None[source]: Set the expected output topology filename given the input topology filename. Note that topology formats are conserved.

property input_authors: Input authors (read only)

property input_boxtype: Input boxtype (read only)

property input_cg_selection: Selection of atoms which are not actual atoms but Coarse Grained beads (read only)

property input_chain_names: Input chain names (read only)

property input_citation: Input citation (read only)

property input_collections: Input collections (read only)

property input_contact: Input contact (read only)

property input_customs: Input custom representations (read only)

property input_cv19_abs: Input Covid-19 antibodies (read only)

property input_cv19_nanobs: Input Covid-19 nanobodies (read only)

property input_cv19_startconf: Input Covid-19 starting conformation (read only)

property input_cv19_unit: Input Covid-19 Unit (read only)

property input_dataset: Dataset storage file. (read only)

property input_description: Input description (read only)

property input_dummy_selection: The original user input dummy atoms selection (read only)

property input_ensemble: Input ensemble (read only)

property input_force_fields: Input force fields (read only)

property input_forced_class_selections: The original input custom forced selections for molecular classification (read only)

property input_framestep: Input framestep (read only)

property input_groups: Input groups (read only)

property input_interactions: Interactions to be analyzed (read only)

property input_license: Input license (read only)

property input_ligands: Input ligand references (read only)

property input_linkcense: Input license link (read only)

property input_links: Input links (read only)

property input_metadditions: Author-customizable metadata additional fields (read only)

property input_method: Input method (read only)

property input_multimeric: Input multimeric labels (read only)

property input_name: Input name (read only)

property input_orientation: Input orientation (read only)

property input_pbc_selection: Selection of atoms which are still in periodic boundary conditions (read only)

property input_pdb_ids: Protein Data Bank IDs used for the setup of the system (read only)

property input_program: Input program (read only)

property input_protein_references: Uniprot IDs to be used first when aligning protein sequences (read only)

property input_structure_file: File: Input structure file for each MD (read only)

property input_temperature: Input temperature (read only)

property input_thanks: Input acknowledgements (read only)

property input_timestep: Input timestep (read only)

property input_topology_file: File | None: Input topology file (read only)

property input_trajectory_files: list[File]: Input trajectory files for each MD (read only)

property input_type: Set if it is a trajectory or an ensemble (read only)

property input_version: Input version (read only)

property input_water: Input water force field (read only)

property inputs_file: File: Inputs filename (read only)

inputs_property(doc: str = '')[source]: Set a function to get a specific ‘input’ value by its key/name. Note that we return the property without calling the getter.

property interactions: dict: Processed interactions (read only)

property is_inputs_file_available: bool: Inputs file availability (read only)

property is_time_dependent: bool: Check if trajectory frames are time dependent (read only)

property ligand_references: Ligand references (read only)

property lipid_references: Lipid references (read only)

property md_charges: Atom charges from each MD (read only)

property mds: list[MD]: Available MDs (read only)

property membrane_map: Membrane mapping (read only)

pathify(filename_or_relative_path: str) → str[source]: Given a filename or relative path, add the project directory path at the beginning.

property pbc_residues: list[int]: Indices of residues in periodic boundary conditions (read only)

property pbc_selection: Selection: Periodic boundary conditions atom selection (read only)

property pdb_ids: list[str]: Tested and standardized PDB ids (read only)

property pdb_references: PDB references (read only)

pdb_references_file() → File[source]

property populations: list[float] | None: Equilibrium populations from a MSM (read only)

property populations_file: File | None: MSM equilibrium populations file (read only)

prepare_metadata = <Task (pmeta)>

prepare_standard_topology = <Task (stopology)>

produce_provenance = <Task (aiidata)>

property protein_map: Protein residues mapping (read only)

property protein_references_file: File: File including protein references data mined from UniProt (read only)

property reference_md: MD: Reference MD (read only)

property reference_md_index: int: Reference MD index (read only)

property residue_map: Residue map (read only)

property screenshot_filename: Screenshot filename (read only)

property snapshots: int: Reference MD snapshots (read only)

property standard_topology_file: File: Standard topology filename (read only)

property structure: Structure: Parsed structure from the reference MD (read only)

property structure_file: File: Structure filename from the reference MD (read only)

property topology_file: File: Topology file (read only)

property topology_reader: Topology: Topology reader (read only)

property trajectory_file: File: Trajectory filename from the reference MD (read only)

property transitions: list[list[float]] | None: Transition probabilities from a MSM (read only)

property transitions_file: File | None: MSM transition probabilities file (read only)

property universe: int: MDAnalysis Universe object (read only)

update_cache_inputs(arg_key: str, new_value: Any, verbose: bool = False)[source]: Update an input argument cksum in the cache of all tasks using it. This may be useful after updating an input value if you don’t want all tasks using it to run again in the next workflow run.

update_file_inputs(nested_key: str, new_value: Any) → bool[source]: Permanently update the inputs file. This may be done when command line inputs do not match file inputs. Return True if the inputs is updated correctly. Return False if there is no update.

property warnings: list: Project warnings to be written in metadata

class mddb_workflow.mwf.MD(project: Project, name: str, number: int, directory: str, input_topology_filepath: str, input_structure_filepath: str, input_trajectory_filepaths: list[str])[source]

Bases: object

A Molecular Dynamics (MD) is the union of a structure and a trajectory. Having this data several analyses are possible. Note that an MD is always defined inside of a Project and thus it has additional topology and metadata.

property average_structure_file: File: Average structure filename (read only)

property cg_residues: list[int]: Indices of residues in coarse grain (read only)

property cg_selection: Selection: Coarse grain atom selection (read only)

property charges: Atom charges (read only)

check_inputs_file_available() → bool[source]: Check if the inputs file is available. Note that asking for it when it is not available will lead to raising an input error. This function is inherited from the project.

property dihedrals: list[dict]: Topology dihedrals (read only)

property dummy_selection: Selection: Dummy atoms selection (read only)

property first_frame_file: File: First frame (read only)

property forced_class_selections: dict[str, Selection] | None: Custom forced selections for molecular classification (read only)

get_MDAnalysis_Universe = <Task (mda_univ)>

get_average_structure = <Task (average)>

get_average_structure_file() → File[source]

get_cg_residues() → list[int][source]: Get indices of residues in coarse grain.

get_cg_selection() → Selection[source]: Get the coarse grain atom selection.

get_charges = <Task (charges)>

get_check_stable_bonds() → bool[source]: Check if we must check stable bonds.

get_dihedrals() → list[dict][source]: Get the topology dihedrals.

get_dummy_selection() → Selection[source]: Get the dummy atoms selection.

get_file(target_file: File) → bool[source]: Check if a file exists. If not, try to download it from the database. If the file is not found in the database it is fine, we do not even warn the user. Note that this function is used to get populations and transitions files, which are not common.

get_file_input(name: str)[source]: Get a specific ‘input’ value from MD inputs.

get_first_frame = <Task (firstframe)>

get_first_frame_file() → File[source]

get_forced_class_selections() → dict[str, Selection] | None[source]: Get the forced class atoms selection.

get_input_structure_file() → File[source]: Get the input pdb filename from the inputs. If the file is not found try to download it.

get_input_structure_filepath() → str[source]: Get the input structure file path.

get_input_topology_file() → File | None[source]: Get the input topology file. If the file is not found try to download it.

get_input_topology_filepath() → str | None[source]

Get the input topology filepath from the inputs or try to guess it.

If the input topology filepath is a ‘no’ flag then we consider there is no topology at all. So far we extract atom charges and atom bonds from the topology file. In this scenario we can keep working but there are some consequences:

Analysis using atom charges such as ‘energies’ will be skipped

The standard topology file will not include atom charges

Bonds will be guessed from atom radii and distance along multiple frames

get_input_trajectory_filepaths() → list[str][source]: Get the input trajectory file paths.

get_input_trajectory_files() → list[File][source]: Get the input trajectory filename(s) from the inputs. If file(s) are not found try to download them.

get_pbc_residues() → list[int][source]: Get indices of residues in periodic boundary conditions.

get_pbc_selection() → Selection[source]: Get the periodic boundary conditions atom selection.

get_populations() → list[float][source]: Get equilibrium populations from a MSM from the project.

get_processed_interactions = <Task (inter)>

get_protein_map() → dict[source]: Get the residues mapping from the project.

get_reference_bonds = <Task (refbonds)>

get_reference_frame = <Task (reframe)>

get_snapshots = <Task (frames)>

get_structure() → Structure[source]: Get the parsed structure.

get_structure_file() → File[source]

get_thickness_analysis = <Task (thickness)>

get_topology_file() → File[source]

get_topology_filepath() → str[source]: Get the processed topology file path.

get_topology_reader() → Topology[source]: Get the topology data reader.

get_trajectory_file() → File[source]

get_trajectory_filepath() → str[source]

get_transitions() → list[list[float]][source]: Get transition probabilities from a MSM from the project.

get_warnings() → list[source]

Get the warnings.

The warnings list should not be reassigned, but it was back in the day. To avoid silent bugs, we read it directly from the register every time.

inherit_topology_filename() → str | None[source]: Set the expected output topology filename given the input topology filename. Note that topology formats are conserved.

property input_cg_selection: Selection of atoms which are not actual atoms but coarse grain beads (read only)

property input_dummy_selection: Selection of atoms which are not real atoms but dummy atoms (read only)

input_files_processing = <Task (inpro)>

property input_forced_class_selections: Custom forced selections for molecular classification (read only)

input_getter()[source]: Get input values which may be MD specific. If the MD input is missing then we use the project input value.

property input_interactions: Interactions to be analyzed (read only)

property input_pbc_selection: Selection of atoms which are still in periodic boundary conditions (read only)

property input_structure_file: File: Input structure filename (read only)

property input_topology_file: File | None: Input topology file (read only)

property input_trajectory_files: list[File]: Input trajectory filenames (read only)

property interactions: Processed interactions (read only)

property is_inputs_file_available: bool: Inputs file availability (read only)

is_trajectory_integral() → bool | None[source]: Sudden jumps test.

property must_check_stable_bonds: bool: Check if we must check stable bonds (read only)

pathify(filename_or_relative_path: str) → str[source]: Given a filename or relative path, add the MD directory path at the beginning.

property pbc_residues: list[int]: Indices of residues in periodic boundary conditions (read only)

property pbc_selection: Selection: Periodic boundary conditions atom selection (read only)

property populations: list[float]: Equilibrium populations from a MSM (read only)

prepare_metadata = <Task (mdmeta)>

print_tests_summary()[source]: Make a summary of tests and their status.

property protein_map: dict: Residues mapping (read only)

property reference_bonds: Atom bonds to be trusted (read only)

property reference_frame: Reference frame to be used to represent the MD (read only)

run_apl_analysis = <Task (apl)>

run_channels_analysis = <Task (channels)>

run_clusters_analysis = <Task (clusters)>

run_density_analysis = <Task (density)>

run_dihedral_energies = <Task (dihedrals)>

run_dist_perres_analysis = <Task (dist)>

run_energies_analysis = <Task (energies)>

run_hbonds_analysis = <Task (hbonds)>

run_helical_analysis = <Task (helical)>

run_lipid_interactions_analysis = <Task (linter)>

run_lipid_order_analysis = <Task (lorder)>

run_markov_analysis = <Task (markov)>

run_pca_analysis = <Task (pca)>

run_pockets_analysis = <Task (pockets)>

run_rgyr_analysis = <Task (rgyr)>

run_rmsd_pairwise_analysis = <Task (pairwise)>

run_rmsd_perres_analysis = <Task (perres)>

run_rmsds_analysis = <Task (rmsds)>

run_rmsf_analysis = <Task (rmsf)>

run_sas_analysis = <Task (sas)>

run_tmscores_analysis = <Task (tmscore)>

property snapshots: Trajectory snapshots (read only)

property structure: Structure: Parsed structure (read only)

property structure_file: File: Structure file (read only)

property thickness_analysis: Membrane thickness analysis

property topology_file: File: Topology file (read only)

property topology_filepath: str: Topology file path (read only)

property topology_reader: Topology: Topology reader (read only)

property trajectory_file: File: Trajectory file (read only)

property transitions: list[list[float]]: Transition probabilities from a MSM (read only)

property universe: MDAnalysis Universe object (read only)

update_file_inputs(key: str, new_value) → bool[source]: Permanently update current MD inputs in the inputs file. Do it only if the project value is not already the same. Return True if the inputs is updated correctly. Return False if there is no update.

property warnings: list: MD warnings to be written in metadata

class mddb_workflow.utils.structures.Structure(atoms: list[Atom] = [], residues: list[Residue] = [], chains: list[Chain] = [], residue_numeration_base: int = 10)[source]

Bases: object

A structure is a group of atoms organized in chains and residues.

SUPPORTED_SELECTION_SYNTAXES = {'gmx', 'pytraj', 'vmd'}

property atom_count: int: The number of atoms in the structure (read only)

auto_chainer(verbose: bool = False)[source]: Smart function to set chains automatically. Original chains will be overwritten.

property bonds: list[list[int]]: The structure bonds

property chain_count: int: Number of chains in the structure (read only)

chainer(selection: Selection | None = None, letter: str | None = None, whole_fragments: bool = False)[source]: Set chains on demand. If no selection is passed then the whole structure will be affected. If no chain is passed then a “chain by fragment” logic will be applied.

check_available_chains()[source]: Check if there are more chains than available letters.

check_incoherent_bonds() → bool[source]: Check bonds to be incoherent, i.e. check atoms not to have more or fewer bonds than expected according to their element. Return True if any incoherent bond is found.
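
The idea behind this check can be sketched in plain Python. This is not the library's implementation: the EXPECTED_BONDS table holds illustrative values only, the helper find_incoherent_atoms is hypothetical, and the sketch assumes that bonds[i] lists the indices of the atoms bonded to atom i.

```python
# Hypothetical (minimum, maximum) bond counts per element; illustrative values only
EXPECTED_BONDS = {"H": (1, 1), "O": (1, 2), "N": (1, 4), "C": (2, 4)}


def find_incoherent_atoms(elements: list[str], bonds: list[list[int]]) -> list[int]:
    """Return indices of atoms whose bond count is outside the expected range."""
    incoherent = []
    for index, (element, bonded) in enumerate(zip(elements, bonds)):
        limits = EXPECTED_BONDS.get(element)
        if limits is None:
            continue  # Unknown element: nothing to compare against
        minimum, maximum = limits
        if not minimum <= len(bonded) <= maximum:
            incoherent.append(index)
    return incoherent


# A water molecule plus a spurious hydrogen bonded to another hydrogen
elements = ["O", "H", "H", "H"]
bonds = [[1, 2], [0], [0, 3], [2]]
incoherent_atoms = find_incoherent_atoms(elements, bonds)
has_incoherent_bonds = len(incoherent_atoms) > 0
```

Here the hydrogen at index 2 has two bonds, so it is the only atom reported.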

check_merged_residues(fix_residues: bool = False, display_summary: bool = False) → bool[source]

There may be residues which contain unconnected (unbonded) atoms. They are not allowed. They may come from a wrong parsing and actually be duplicated residues.

Search for merged residues. Create new residues for every group of connected atoms if the fix_residues argument is True. Note that the new residues will be repeated, so you will need to run check_repeated_residues after. Return True if there were any merged residues.
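
Finding "every group of connected atoms" inside a residue amounts to a connected components search over its bonds. The following is a minimal sketch under stated assumptions, not the method used by check_merged_residues: the helper split_connected_groups is hypothetical and bonds are given as a mapping from each atom index to its bonded atom indices.

```python
def split_connected_groups(atom_indices: list[int], bonds: dict[int, list[int]]) -> list[list[int]]:
    """Group the atoms of one residue by bond connectivity (hypothetical helper)."""
    pending = set(atom_indices)
    groups = []
    while pending:
        # Start a new group from the lowest atom index not yet assigned
        start = min(pending)
        pending.remove(start)
        group = [start]
        stack = [start]
        while stack:
            current = stack.pop()
            for neighbour in bonds.get(current, []):
                # Atoms outside the residue are never in 'pending', so they are ignored
                if neighbour in pending:
                    pending.remove(neighbour)
                    group.append(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups


# One residue whose six atoms form two unconnected groups
residue_atoms = [0, 1, 2, 3, 4, 5]
bonds = {0: [1, 2], 1: [0], 2: [0], 3: [4, 5], 4: [3], 5: [3]}
groups = split_connected_groups(residue_atoms, bonds)
is_merged = len(groups) > 1
```

A residue yielding more than one group would be reported as merged, and each group would become its own residue when fixing is requested.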

check_repeated_atoms(fix_atoms: bool = False, display_summary: bool = False) → bool[source]

Check atoms to search for repeated atoms. Atoms with identical chain, residue and name are considered repeated atoms.

Parameters:

fix_atoms (bool) – If True, rename repeated atoms.
display_summary (bool) – If True, display a summary of repeated atoms.

Returns:

True if there were any repeated atoms, False otherwise.

Return type:

bool
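
The check can be pictured with a minimal sketch, assuming atoms are reduced to plain `(chain, residue_number, atom_name)` tuples. The `rename_repeated_atoms` helper and its renaming scheme (appending the occurrence number) are assumptions made for illustration; the names the library actually assigns may differ.

```python
from collections import Counter

def rename_repeated_atoms(atoms):
    """Rename atoms sharing chain, residue number and name.

    Each atom is a (chain, residue_number, atom_name) tuple.
    Return the renamed atoms and whether any repeat was found.
    """
    seen = Counter()
    renamed = []
    any_repeated = False
    for chain, residue_number, name in atoms:
        key = (chain, residue_number, name)
        seen[key] += 1
        if seen[key] > 1:
            any_repeated = True
            # Assumed renaming scheme: append the occurrence number.
            # This sketch does not check the new name against existing names.
            name = f'{name}{seen[key]}'
        renamed.append((chain, residue_number, name))
    return renamed, any_repeated

atoms = [('A', 1, 'CA'), ('A', 1, 'H'), ('A', 1, 'H'), ('A', 2, 'H')]
renamed_atoms, had_repeats = rename_repeated_atoms(atoms)
```

Only the second `H` of residue 1 is renamed; the `H` in residue 2 is a different atom because its residue differs.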

check_repeated_chains(fix_chains: bool = False, display_summary: bool = False) → bool[source]

There may be chains which are equal in the structure (i.e. same chain name). This means we have a duplicated/splitted chain. Repeated chains are common and usually supported, although with some problems. Also, repeated chains usually come with repeated residues, which means more problems (see explanation below).

In the context of this structure class we may have 2 different problems with a different solution each:

There is more than one chain with the same letter (repeated chain) -> rename the duplicated chains
There is a chain with atom indices which are not consecutive (splitted chain) -> create new chains

Rename repeated chains or create new chains if the fix_chains argument is True.

WARNING: These fixes are possible only if there are fewer chains than the number of letters in the alphabet. Although there is no limitation in this code for chain names, setting long chain names is not compatible with the PDB format.

Check splitted chains (chains with non-consecutive residues) and try to fix them if requested. Check repeated chains (two chains with the same name) and return True if there were any repeats.

check_repeated_residues(fix_residues: bool = False, display_summary: bool = False) → bool[source]

There may be residues which are equal in the structure (i.e. same chain, number and icode). In case 2 residues in the structure are equal we must check distance between their atoms. If atoms are far it means they are different residues with the same notation (duplicated residues). If atoms are close it means they are indeed the same residue (splitted residue).

Splitted residues are found in some pdbs and they are supported by some tools. These tools consider all atoms with the same ‘record’ as the same residue. However, there are other tools which would consider the splitted residue as two different residues. This causes inconsistency across different tools, besides a long list of problems. The only possible fix is changing the order of atoms in the topology. Note that this is a breaking change for associated trajectories, which must change the order of coordinates. However, here we provide tools to fix associated trajectories as well.

Duplicated residues are common and usually supported, although with some problems. For example, pytraj analysis outputs usually sort results by residue and each residue is tagged. If there are duplicated residues with the same tag it may not be possible to know which result belongs to each residue. Another example is NGL selections in the web client. If you select residue ‘:A and 1’ and there are multiple residues 1 in chain A, all of them will be displayed.

Check residues to search for duplicated and splitted residues. Renumerate repeated residues if the fix_residues argument is True. Return True if there were any repeats.
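
The distance criterion described above can be sketched as follows. This is only an illustration under stated assumptions: the `classify_repeated_residues` helper is hypothetical, and the 3.0 cutoff (in the same units as the coordinates) is an arbitrary value chosen for the example, not the one used by the library.

```python
import math

def classify_repeated_residues(coords_a, coords_b, cutoff: float = 3.0) -> str:
    """Classify two residues that share chain, number and icode.

    coords_a and coords_b are lists of (x, y, z) tuples, one per atom.
    Close atoms suggest a single residue written in two pieces (splitted);
    distant atoms suggest two different residues with the same notation (duplicated).
    """
    minimum_distance = min(
        math.dist(atom_a, atom_b) for atom_a in coords_a for atom_b in coords_b
    )
    return 'splitted' if minimum_distance <= cutoff else 'duplicated'

# Two pieces within bonding range of each other
close_case = classify_repeated_residues([(0.0, 0.0, 0.0)], [(1.5, 0.0, 0.0)])
# Two residues far apart in space
far_case = classify_repeated_residues([(0.0, 0.0, 0.0)], [(30.0, 0.0, 0.0)])
```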

check_splitted_chains(fix_chains: bool = False, display_summary: bool = False) → bool[source]

Check if non-consecutive atoms belong to the same chain. If so, separate the pieces of non-consecutive atoms into different chains. Note that the new chains will be duplicated, so you will need to run check_repeated_chains after.

Parameters:

fix_chains (bool) – If True then the splitted chains will be fixed.
display_summary (bool) – If True then a summary of the splitted chains will be displayed.

Returns:

True if we encountered splitted chains, False otherwise.

Return type:

bool
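
A minimal sketch of the detection step, assuming the structure is reduced to the chain name of each atom in atom index order. The `find_split_chains` helper is hypothetical and only reports which chain names appear in more than one run of consecutive atoms; it does not create the new chains.

```python
from itertools import groupby

def find_split_chains(atom_chain_names: list[str]) -> set[str]:
    """Return chain names that appear in more than one run of consecutive atoms."""
    # Collapse consecutive atoms of the same chain into a single run
    runs = [name for name, _ in groupby(atom_chain_names)]
    seen = set()
    split = set()
    for name in runs:
        if name in seen:
            split.add(name)
        seen.add(name)
    return split

# Chain A is interrupted by chain B, so it is considered splitted
atom_chain_names = ['A', 'A', 'B', 'B', 'A', 'C']
split_chains = find_split_chains(atom_chain_names)
```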

copy() → Structure[source]: Make a copy of the current structure.

copy_bonds() → list[list[int]][source]: Make a copy of the bonds list.

del_bonds()[source]: Delete structure bonds.

display_summary()[source]: Get a summary of the structure.

property dummy_atom_indices: set: Atom indices for what we consider dummy atoms

filter(selection: Selection | str, selection_syntax: str = 'vmd') → Structure[source]: Create a new structure from the current using a selection to filter the atoms we want to keep.

filter_away(selection: Selection | str, selection_syntax: str = 'vmd') → Structure[source]: Create a new structure from the current using a selection to filter the atoms we want to remove.

find_chain(name: str) → Chain | None[source]: Get a chain by its name.

find_covalent_bonds(selection: Selection | None = None, safe_elements: bool = True) → list[list[int]][source]: Get all atomic covalent (strong) bonds. Bonds are defined as a list of atom indices for each atom in the structure. Rely on VMD logic to do so.

find_fragments(selection: Selection | None = None, coherent: bool = True, exclude_dummy_fragments: bool = False, atom_bonds: list[list[int]] | None = None) → Generator[Selection, None, None][source]

Find fragments in a selection of atoms. A fragment is a selection of covalently bonded atoms. All atoms are searched if no selection is provided.

WARNING: Note that fragments generated from a specific selection may not match the structure fragments. A selection including 2 separated regions of a structure fragment will yield 2 fragments.

For convenience, bonds between non-consecutive residues are excluded from this logic. This is useful to ignore disulfide bonds. May also help to properly find chains in CG simulations where chains may be bonded.

There is also a flag to exclude fragments which are made of dummy atoms only.
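
The warning about selections can be illustrated with a small connected-components sketch over a bonds list in the same layout as the `bonds` property (one list of bonded atom indices per atom). The `connected_fragments` helper is hypothetical and works on plain index lists rather than Selection objects. It also leaves out the exclusion of bonds between non-consecutive residues and the dummy atom filter.

```python
from typing import Optional

def connected_fragments(bonds: list[list[int]], selection: Optional[list[int]] = None) -> list[list[int]]:
    """Group selected atoms into fragments of covalently bonded atoms."""
    selected = set(range(len(bonds))) if selection is None else set(selection)
    visited = set()
    fragments = []
    for start in sorted(selected):
        if start in visited:
            continue
        # Depth-first walk restricted to the selected atoms
        visited.add(start)
        stack = [start]
        fragment = []
        while stack:
            current = stack.pop()
            fragment.append(current)
            for bonded in bonds[current]:
                if bonded in selected and bonded not in visited:
                    visited.add(bonded)
                    stack.append(bonded)
        fragments.append(sorted(fragment))
    return fragments

# Linear molecule 0-1-2-3 plus an isolated atom 4
bonds = [[1], [0, 2], [1, 3], [2], []]
all_fragments = connected_fragments(bonds)
partial_fragments = connected_fragments(bonds, [0, 3])
```

Atoms 0 and 3 belong to the same structure fragment, but selecting only those two yields two fragments because the atoms connecting them are not in the selection.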

find_or_create_chain(name: str) → Chain[source]: Get a chain by its name or create it if it does not exist.

find_ptms() → Generator[dict, None, None][source]: Find Post Translational Modifications (PTM) in the structure.

find_residue(chain_name: str, number: int, icode: str = '') → Residue | None[source]: Find a residue by its chain, number and insertion code.

find_rings(max_ring_size: int, selection: Selection | None = None) → list[list[Atom]][source]: Find rings with a maximum specific size or less in the structure and yield them as they are found.

find_whole_fragments(selection: Selection) → Generator[Selection, None, None][source]: Given a selection of atoms, find all whole structure fragments on them.

fix_atom_elements(trust: bool = True, show_warnings: bool = True) → bool[source]

Fix atom elements by guessing them when missing. Set all elements with the first letter upper and the second (if any) lower. Also check if atom elements are coherent with atom names.

Parameters: trust (bool) – If ‘trust’ is set as False then we impose elements according to what we can guess from the atom name.
Returns: Return True if any element was modified or False if not.
Return type: bool
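
The capitalization rule mentioned above (first letter upper, second lower) can be sketched in a few lines. The `normalize_element` helper is hypothetical and covers only the formatting step, not the guessing of missing elements or the coherence check against atom names.

```python
def normalize_element(element: str) -> str:
    """Format an element symbol with the first letter upper and the rest lower."""
    return element[:1].upper() + element[1:].lower()

normalized = [normalize_element(element) for element in ['CL', 'na', 'c', 'Fe']]
```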

force_classifications(classifications: dict[str, Selection])[source]: Apply a set of forced residue classifications.

property fragments: list[Selection]: The structure fragments (read only)

classmethod from_file(mysterious_filepath: str)[source]: Set the structure from a file if the file format is supported.

classmethod from_mdanalysis(mdanalysis_universe)[source]: Set the structure from an MDAnalysis universe.

classmethod from_mmcif(mmcif_content: str, model: int = 1, author_notation: bool = False)[source]: Set the structure from mmcif. You may filter the content for a specific model. You may ask for the author notation instead of the standardized notation for legacy reasons. This may have an effect in atom names, residue names, residue numbers and chain names. Read the pdb content line by line and set the parsed atoms, residues and chains.

classmethod from_mmcif_file(mmcif_filepath: str, model: int = 1, author_notation: bool = False)[source]: Set the structure from a mmcif file.

classmethod from_pdb(pdb_content: str, model: int = 1, flexible_numeration: bool = True)[source]: Set the structure from a pdb file. You may filter the PDB content for a specific model. Some weird numeration systems are not supported and, when encountered, they are ignored. In these cases we set our own numeration system. Set flexible_numeration to False to disable this behaviour and fail instead.

classmethod from_pdb_file(pdb_filepath: str, model: int = 1, flexible_numeration: bool = True)[source]: Set the structure from a pdb file. You may filter the input PDB file for a specific model. Some weird numeration systems are not supported and, when encountered, they are ignored. In these cases we set our own numeration system. Set flexible_numeration to False to disable this behaviour and fail instead.

classmethod from_pdb_id(pdb_id: str, model: int = 1, author_notation: bool = False)[source]: Download and parse the structure from a PDB entry.

classmethod from_prmtop_file(prmtop_filepath: str)[source]: Set the structure from a prmtop file.

classmethod from_tpr_file(tpr_filepath: str)[source]: Set the structure from a tpr file.

generate_pdb(show_warnings: bool = True)[source]: Generate a pdb file content with the current structure.

generate_pdb_file(pdb_filepath: str, show_warnings: bool = True)[source]: Generate a pdb file with current structure.

get_atom_count() → int[source]: Get the number of atoms in the structure.

get_available_chain_name() → str | None[source]: Get an available chain name. Find alphabetically the first letter which is not yet used as a chain name. If all letters in the alphabet are used already then raise an error.
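
The alphabetical search can be sketched as below. The `first_available_letter` helper is hypothetical, and restricting the alphabet to uppercase ASCII letters is an assumption; the set of letters the library considers may be larger.

```python
from string import ascii_uppercase

def first_available_letter(used_names: set[str]) -> str:
    """Return the first letter, in alphabetical order, not yet used as a chain name."""
    for letter in ascii_uppercase:
        if letter not in used_names:
            return letter
    # Assumed behaviour when the alphabet is exhausted
    raise ValueError('There are more chains than available letters')

next_letter = first_available_letter({'A', 'B', 'D'})
```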

get_bonds(safe: bool = True) → list[list[int]][source]: Get the bonds between atoms. The safe argument makes sure elements are corrected before the calculation. Note that elements are important since atom radii are taken into account to calculate bonds.

get_chain_count() → int[source]: Get the number of chains in the structure.

get_dummy_atom_indices() → set[source]: Get all dummy atom indices together in a set.

get_fragments() → list[Selection][source]: Get the groups of atoms which are covalently bonded.

get_ion_atom_indices() → set[source]: Get all supported ion atom indices together in a set.

get_next_available_chain_name(anterior: str) → str[source]

Get the next available chain name.

Parameters: anterior (str) – The last chain name used, which is expected to be a single letter
Raises: ValueError – If the anterior is not a letter or if there are more chains than available letters

get_parsed_chains(only_protein: bool = False) → list[source]: Get each chain name and amino acid sequence in a topology.

get_pytraj_topology()[source]: Get the structure equivalent pytraj topology.

get_rechained_structure(atom_chain_map: list) → Structure[source]: Given a chain map, copy this structure but applying the new chain map.

get_residue_count() → int[source]: Get the number of residues in the structure (read only).

get_selection_chain_indices(selection: Selection) → list[int][source]: Given an atom selection, get a list of chain indices for chains implicated. Note that if a single atom from the chain is in the selection then the chain index is returned.

get_selection_chains(selection: Selection) → list[Chain][source]: Given an atom selection, get a list of chains implicated. Note that if a single atom from the chain is in the selection then the chain is returned.

get_selection_classification(selection: Selection) → str[source]: Get type of the chain.

get_selection_outer_bonds(selection: Selection) → list[int][source]: Given an atom selection, get all bonds between these atoms and any other atom in the structure. Note that inner bonds between atoms in the selection are discarded.
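
A minimal sketch of the idea, assuming plain atom index lists instead of Selection objects and assuming the result lists the indices of the outside atoms bonded to the selection. The `outer_bonds` helper is hypothetical.

```python
def outer_bonds(bonds: list[list[int]], selection: list[int]) -> list[int]:
    """Return indices of atoms outside the selection bonded to atoms inside it."""
    selected = set(selection)
    outer = set()
    for atom_index in selection:
        for bonded in bonds[atom_index]:
            # Bonds between two selected atoms are inner bonds and are discarded
            if bonded not in selected:
                outer.add(bonded)
    return sorted(outer)

# Linear molecule 0-1-2-3 where the two middle atoms are selected
bonds = [[1], [0, 2], [1, 3], [2]]
outer_atoms = outer_bonds(bonds, [1, 2])
```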

get_selection_residue_indices(selection: Selection) → list[int][source]: Given an atom selection, get a list of residue indices for residues implicated. Note that if a single atom from the residue is in the selection then the residue index is returned.
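
The "single atom is enough" rule can be shown with a short sketch, assuming a plain list that maps every atom index to its residue index. The `selection_residue_indices` helper and the `atom_residue_indices` list are hypothetical stand-ins for the structure internals.

```python
def selection_residue_indices(atom_residue_indices: list[int], selection: list[int]) -> list[int]:
    """Return sorted indices of residues having at least one selected atom."""
    return sorted({atom_residue_indices[atom_index] for atom_index in selection})

# Atoms 0-2 belong to residue 0, atoms 3-4 to residue 1 and atom 5 to residue 2
atom_residue_indices = [0, 0, 0, 1, 1, 2]
touched_residues = selection_residue_indices(atom_residue_indices, [2, 3])
```

Selecting one atom from residue 0 and one from residue 1 returns both residue indices.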

get_selection_residues(selection: Selection) → list[Residue][source]: Given an atom selection, get a list of residues implicated. Note that if a single atom from the residue is in the selection then the residue is returned.

get_sequences(polymer_type: str | None = None) → list[str][source]: Get list of protein sequences in the structure.

has_cg() → bool[source]: Ask if the structure has at least one coarse grain atom/residue.

invert_selection(selection: Selection) → Selection[source]: Invert a selection.

property ion_atom_indices: set: Atom indices for what we consider supported ions

is_missing_any_bonds() → bool[source]

merge(other: Structure) → Structure[source]: Merge current structure with another structure.

name_selection(selection: Selection) → str[source]: Name an atom selection depending on the chains it contains. This is used for debug purposes.

ptm_options = {'acetyl': 'Acetylation', 'amide': 'Amidation', 'carbohydrate': 'Glycosilation', 'dna': 'DNA linkage', 'fatty': 'Lipidation', 'ion': Warning('Ion is covalently bonded to protein'), 'other': Warning('Unknow type of PTM'), 'protein': ValueError('A PTM residue must never be protein'), 'rna': 'RNA linkage', 'solvent': Warning('Solvent is covalently bonded to protein'), 'steroid': 'Steroid linkage'}

purge_chain(chain: Chain)[source]: Purge chain from the structure. This can be done only when the chain has no residues left in the structure. Renumerate all chain indices which have been offset as a result of the purge.

purge_residue(residue: Residue)[source]: Purge residue from the structure and its chain. This can be done only when the residue has no atoms left in the structure. Renumerate all residue indices which have been offset as a result of the purge.

raw_protein_chainer()[source]: This is an alternative system to find protein chains (anything else is chained as ‘X’). This system does not depend on VMD. It totally overrides previous chains since it is expected to be used only when chains are missing.

property residue_count: int: Number of residues in the structure (read only)

select(selection_string: str, syntax: str = 'vmd') → Selection | None[source]

Select atoms from the structure thus generating an atom indices list.

Different tools may be used to make the selection:

vmd (default)
pytraj

select_all() → Selection[source]: Get a selection with all atoms.

select_atom_indices(atom_indices: list[int]) → Selection[source]: Set a function to make selections using atom indices.

select_by_classification(classification: str) → Selection[source]: Select atoms according to the classification of its residue.

select_carbohydrates() → Selection[source]: Select carbohydrates.

select_cartoon(include_terminals: bool = False) → Selection[source]

Select cartoon representable regions for VMD.

Rules are:

Residues must be protein (i.e. must contain C, CA, N and O atoms) or nucleic (P, OP1, OP2, O3’, C3’, C4’, C5’, O5’)
There must be at least 3 covalently bonded residues

Their chain, numeration and even index order do not matter as long as they are bonded. Note that we can represent cartoon while we display one residue alone, but it must be connected anyway. Also, we have the option to include terminals in the cartoon selection although they are not representable. This is helpful for the screenshot: terminals are better hidden than represented as ligands.

select_cg() → Selection[source]: Select coarse grain atoms.

select_counter_ions(charge: str | None = None) → Selection[source]: Select counter ion atoms. WARNING: This logic is heuristic and may fail for structures with non-standard atom names.

select_dummy() → Selection[source]: Select dummy atoms.

select_heavy_atoms() → Selection[source]: Select heavy atoms.

select_ions() → Selection[source]: Select ions.

select_ligands(inchikey_map: list[dict]) → Selection[source]: Get a selection of all the ligand residues in the system based on the inchikey map.

select_lipids() → Selection[source]: Select lipids.

select_missing_bonds() → Selection[source]

select_nucleic() → Selection[source]: Select nucleic atoms.

select_pbc_guess() → Selection[source]: Return a selection of the typical PBC atoms: solvent, counter ions and lipids. WARNING: This is just a guess.

select_protein() → Selection[source]: Select protein atoms. WARNING: Note that there is a small difference between VMD protein and our protein. This function is improved to consider terminal residues as proteins. VMD considers protein any residue including N, C, CA and O while terminals may have OC1 and OC2 instead of O.

select_residue_indices(residue_indices: list[int]) → Selection[source]: Set a function to make selections using residue indices.

select_water() → Selection[source]: Select water atoms. WARNING: This logic is heuristic and may fail for structures with non-standard residue names.

select_water_and_counter_ions() → Selection[source]: Select both water and counter ions.

set_bonds(bonds: list[list[int]])[source]: Set structure bonds.

set_new_atom(atom: Atom)[source]: Set a new atom in the structure.

set_new_chain(chain: Chain)[source]: Set a new chain in the structure. WARNING: Residues and atoms must be set already before setting chains.

set_new_coordinates(new_coordinates: list[tuple[float, float, float]])[source]: Set new coordinates.

set_new_residue(residue: Residue)[source]: Set a new residue in the structure. WARNING: Atoms must be set already before setting residues.

set_selection_chain_name(selection: Selection, chain_name: str)[source]: Given an atom selection, set the chain for all these atoms. Note that the chain is changed in every whole residue, no matter if only one atom was selected.

sort_residues()[source]: Coherently sort residues according to the indices of the atoms they hold.

class mddb_workflow.utils.structures.Chain(name: str | None = None, classification: str | None = None)[source]

Bases: object

A chain of residues.

add_atom(new_atom: Atom)[source]: Add an atom to the chain.

add_residue(residue: Residue)[source]: Add a residue to the chain.

property atom_count: int: Number of atoms in the chain (read only)

property atom_indices: list[int]: Atom indices for all atoms in the chain (read only)

property atoms: list[int]: Atoms in the chain (read only)

property classification: str: Classification of the chain (manual or automatic)

clone() → Chain[source]

copy() → Chain[source]: Make a copy of the current chain.

find_or_create_residue(name: str, number: int, icode: str = '') → Residue[source]: Find a residue by its number and insertion code or create it if it does not exist.

find_residue(number: int, icode: str = '', name: str = None) → Residue | None[source]: Find a residue by its number and insertion code. Name is optional.

get_atom_count() → int[source]: Get the number of atoms in the chain (read only).

get_atom_indices() → list[int][source]: Get atom indices for all atoms in the chain (read only). In order to change atom indices they must be changed in their corresponding residues.

get_atoms() → list[int][source]: Get the atoms in the chain (read only). In order to change atoms they must be changed in their corresponding residues.

get_classification() → str[source]: Get the chain classification.

get_index() → int | None[source]: Get the chain index according to parent structure chains (read only).

get_residue_count() → int[source]: Get the number of residues in the chain (read only).

get_residue_indices() → list[int][source]

get_residues() → list[Residue][source]: Get the residues in this chain. If residues are set then make changes in all the structure to make this change coherent.

get_selection() → Selection[source]: Generate a selection for this chain.

get_sequence() → str[source]: Get the residues sequence in one-letter code.

get_structure() → Structure | None[source]: Get the parent structure (read only).

has_cg() → bool[source]: Ask if the current chain has at least one coarse grain atom/residue.

property index: int | None: The chain index according to parent structure chains (read only)

is_missing_any_bonds() → bool[source]

remove_atom(current_atom: Atom)[source]: Remove an atom from the chain.

remove_residue(residue: Residue)[source]: Remove a residue from the chain. WARNING: Note that this function does not trigger the set_residue_indices.

property residue_count: int: Number of residues in the chain (read only)

property residue_indices: list[int]: The residue indices according to parent structure residues for residues in this chain

property residues: list[Residue]: The residues in this chain

set_classification(classification: str)[source]: Force the chain classification.

set_index(index: int)[source]: Set the chain index according to parent structure chains.

set_residue_indices(new_residue_indices: list[int])[source]

set_residues(new_residues: list[Residue])[source]: Find indices for new residues and set their indices as the new residue indices. Note that residues must be set in the structure already.

property structure: Structure | None: The parent structure (read only)

class mddb_workflow.utils.structures.Residue(name: str | None = None, number: int | None = None, icode: str | None = None)[source]

Bases: object

A residue class.

add_atom(new_atom: Atom)[source]: Add an atom to the residue.

property atom_count: int: The number of atoms in the residue (read only)

property atom_indices: list[int]: The atom indices according to parent structure atoms for atoms in this residue

property atoms: list[Atom]: The atoms in this residue

property chain: Chain | None: The residue chain

property chain_index: int: The residue chain index according to parent structure chains

property classification: str

Get the residue biochemistry classification.

WARNING: Note that this logic will not work in a structure without hydrogens.

Available classifications: protein, dna, rna, carbohydrate, fatty, steroid, ion, solvent, acetyl, amide, other

clone() → Residue[source]

copy() → Residue[source]: Make a copy of the current residue.

find_rings(max_ring_size: int) → list[list[Atom]][source]: Find rings in the residue.

get_atom_by_name(atom_name: str) → Atom[source]: Get a residue atom given its name.

get_atom_count() → int[source]: Get the number of atoms in the residue.

get_atom_indices() → list[int][source]: Get the atom indices according to parent structure atoms for atoms in this residue. If atom indices are set then make changes in all the structure to make this change coherent.

get_atoms() → list[Atom][source]: Get the atoms in this residue. If atoms are set then make changes in all the structure to make this change coherent.

get_bonded_atom_indices() → list[int][source]: Get atom indices from atoms bonded to this residue.

get_bonded_atoms() → list[Atom][source]: Get atoms bonded to this residue.

get_bonded_residue_indices() → list[int][source]: Get residue indices from residues bonded to this residue.

get_bonded_residues() → list[Residue][source]: Get residues bonded to this residue.

get_chain() → Chain | None[source]

get_chain_index() → int[source]: Get the chain index of the residue according to parent structure.

get_classification() → str[source]

Get the residue biochemistry classification.

WARNING: Note that this logic will not work in a structure without hydrogens.

Available classifications: protein, dna, rna, carbohydrate, fatty, steroid, ion, solvent, acetyl, amide, other

get_classification_by_name() → str[source]: Set an alternative function to “try” to classify the residues according only to its name. This is useful for coarse grain residues whose atoms may not reflect the real atoms. WARNING: This logic is very limited and will return “unknown” most of the time. WARNING: This logic relies mostly on atom names, which may not be standard.

get_formula() → str[source]: Get the formula of the residue.

get_index() → int | None[source]: Get the residue index according to parent structure residues (read only). This value is set by the structure itself.

get_label() → str[source]: Get a standard label.

get_letter() → str[source]: Get the residue equivalent single letter code. Note that this makes sense for amino acids and nucleotides only. Unrecognized residue names return ‘X’.
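
The fallback to ‘X’ can be sketched with a lookup table. The `THREE_TO_ONE` table below is deliberately partial and the `residue_letter` helper is hypothetical. Uppercasing the residue name before the lookup is also an assumption.

```python
# Partial table for illustration only (standard amino acid codes)
THREE_TO_ONE = {'ALA': 'A', 'GLY': 'G', 'LYS': 'K', 'SER': 'S'}

def residue_letter(residue_name: str) -> str:
    """Return the one-letter code of a residue name, or 'X' if it is not recognized."""
    return THREE_TO_ONE.get(residue_name.upper(), 'X')

# HOH (water) is not in the table, so it maps to 'X'
sequence = ''.join(residue_letter(name) for name in ['ALA', 'GLY', 'HOH', 'LYS'])
```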

get_selection() → Selection[source]: Generate a selection for this residue.

get_structure() → Structure | None[source]: Get the parent structure (read only). This value is set by the structure itself.

property index: int | None: The residue index according to parent structure residues (read only)

is_bonded_with_residue(other: Residue) → bool[source]: Given another residue, check if it is bonded with this residue.

is_cg() → bool[source]: Ask if the current residue is in coarse grain. Note that we assume there are no hybrid aa/cg residues.

is_coherent() → bool[source]: Make sure atoms within the residue are all bonded.

is_large_aa() → bool[source]: Check if this residue is a large amino acid.

is_missing_any_bonds() → bool[source]

property label: str: The residue standard label (read only)

remove_atom(current_atom: Atom)[source]: Remove an atom from the residue.

set_atom_indices(new_atom_indices: list[int])[source]

set_atoms(new_atoms: list[Atom])[source]

set_chain(new_chain: Chain | str)[source]

set_chain_index(new_chain_index: int)[source]

set_index(index)[source]

split(first_residue_atom_indices: list[int], second_residue_atom_indices: list[int], first_residue_name: str | None = None, second_residue_name: str | None = None, first_residue_number: int | None = None, second_residue_number: int | None = None, first_residue_icode: str | None = None, second_residue_icode: str | None = None) → tuple[Residue, Residue][source]: Split this residue in 2 residues and return them in a tuple. Keep things coherent in the structure (renumerate all residues below this one). Note that all residue atoms must be covered by the splits.
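
The coverage requirement can be checked before splitting with a small sketch like the one below. The `validate_split` helper is hypothetical, and rejecting atoms that appear in both halves is an extra assumption not stated above.

```python
def validate_split(residue_atom_indices, first_atom_indices, second_atom_indices):
    """Raise ValueError unless the two groups cover every atom of the residue."""
    residue_atoms = set(residue_atom_indices)
    first, second = set(first_atom_indices), set(second_atom_indices)
    # Assumption: an atom cannot end up in both new residues
    if first & second:
        raise ValueError('An atom cannot belong to both new residues')
    missing = residue_atoms - (first | second)
    if missing:
        raise ValueError(f'Atoms not covered by the split: {sorted(missing)}')

# A complete split passes silently
validate_split([10, 11, 12, 13], [10, 11], [12, 13])

# A split which forgets atom 13 is rejected
try:
    validate_split([10, 11, 12, 13], [10, 11], [12])
    split_error = None
except ValueError as error:
    split_error = str(error)
```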

split_by_atom_names(first_residue_atom_names: list[str], second_residue_atom_names: list[str], first_residue_name: str | None = None, second_residue_name: str | None = None, first_residue_number: int | None = None, second_residue_number: int | None = None, first_residue_icode: str | None = None, second_residue_icode: str | None = None) → tuple[Residue, Residue][source]: Parse atom names to atom indices and then call the split function.

property structure: Structure | None: The parent structure (read only)

class mddb_workflow.utils.structures.Atom(name: str | None = None, element: str | None = None, coords: tuple[float, float, float] | None = None)[source]

Bases: object

An atom class.

property bonds: list[int] | None: Atom indices of atoms in the structure which are covalently bonded to this atom

property chain: Chain | None: The atom chain

property chain_index: int | None: The atom chain index according to parent structure chains

clone() → Atom[source]

copy() → Atom[source]: Make a copy of the current atom.

get_bonded_atoms() → list[Atom][source]: Get bonded atoms.

get_bonds(skip_ions: bool = False, skip_dummies: bool = False, only_residue: bool = False, safe: bool = True) → list[int] | None[source]: Get indices of other atoms in the structure which are covalently bonded to this atom.

get_chain() → Chain | None[source]: The atom chain (read only). In order to change the chain it must be changed in the atom residue.

get_chain_index() → int | None[source]: Get the atom chain index according to parent structure chains.

get_index() → int | None[source]: Get the atom index according to parent structure atoms (read only). This value is set by the structure itself.

get_label() → str[source]: Get a standard label.

get_name_suggested_element() → str[source]: Guess an atom element from its name only.

get_residue() → Residue | None[source]: The atom residue. If residue is set then make changes in all the structure to make this change coherent.

get_residue_index() → int[source]: Get the atom residue index according to parent structure residues. If residue index is set then make changes in all the structure to make this change coherent.

get_selection() → Selection[source]: Generate a selection for this atom.

get_structure() → Structure | None[source]: Get the parent structure (read only). This value is set by the structure itself.

guess_element() → str[source]: Guess an atom element from its name and number of bonds.

property index: int | None: The atom index according to parent structure atoms (read only)

is_carbohydrate_candidate() → bool[source]: Check if this atom meets specific criteria: 1. It is a carbon. 2. It is connected only to other carbons, hydrogens or oxygens. 3. It is connected to 1 or 2 carbons. 4. It is connected to 1 oxygen.

is_cg() → bool[source]: Ask if the current atom is in coarse grain.

is_fatty_candidate() → bool[source]: Check if this atom meets specific criteria: 1. It is a carbon. 2. It is connected only to other carbons and hydrogens. 3. It is connected to 1 or 2 carbons.
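
The criteria listed for is_carbohydrate_candidate and is_fatty_candidate can be written as plain predicates. This sketch assumes an atom is described only by its element and the elements of the atoms bonded to it. The standalone functions below are illustrative and are not the methods themselves.

```python
from collections import Counter

def carbohydrate_candidate(element: str, bonded_elements: list[str]) -> bool:
    """Carbon bonded only to C, H or O, with 1 or 2 carbons and exactly 1 oxygen."""
    counts = Counter(bonded_elements)
    return (
        element == 'C'
        and set(counts) <= {'C', 'H', 'O'}
        and counts['C'] in (1, 2)
        and counts['O'] == 1
    )

def fatty_candidate(element: str, bonded_elements: list[str]) -> bool:
    """Carbon bonded only to C or H, with 1 or 2 carbons."""
    counts = Counter(bonded_elements)
    return (
        element == 'C'
        and set(counts) <= {'C', 'H'}
        and counts['C'] in (1, 2)
    )

# Ring carbon of a sugar: two carbons, one oxygen and one hydrogen
sugar_carbon = carbohydrate_candidate('C', ['C', 'C', 'O', 'H'])
# Methylene carbon in an aliphatic tail: two carbons and two hydrogens
tail_carbon = fatty_candidate('C', ['C', 'C', 'H', 'H'])
```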

is_ion() → bool[source]: Check if it is an ion by checking if it has no bonds with other atoms.

property label: str: Get a standard label.

property residue: Residue | None: The atom residue

property residue_index: int: The atom residue index according to parent structure residues

set_chain(new_chain: Chain | str)[source]

set_chain_index(new_chain_index: int)[source]

set_residue(new_residue: Residue)[source]: Find the new residue index and set it as the atom residue index. Note that the residue must be set in the structure already.

set_residue_index(new_residue_index: int)[source]

property structure: Structure | None: The parent structure (read only)