Basic loading of atoms, images, and volumes

cryojax.io implements reading and writing of basic cryo-EM data formats, such as MRC and PDB/PDBx.

Loading atomic structures

cryojax.io.read_atoms_from_pdb

read_atoms_from_pdb(filename: str | pathlib.Path, *, loads_properties: bool = False, loads_b_factors: bool = False, center: bool = True, selection_string: str = 'all', model_index: int | None = None, stack_models: bool = False, standardizes_names: bool = True, topology: mdtraj.Topology | None = None) -> tuple[Float[ndarray, '... n_atoms 3'], Int[ndarray, '... n_atoms']] | tuple[Float[ndarray, '... n_atoms 3'], Int[ndarray, 'n_atoms'], dict | numpy.ndarray]

Load the atomic information needed to simulate cryo-EM images from a PDB or mmCIF file. This function wraps mmdf_to_atoms.

Info

The selection_string argument enables usage of mdtraj atom selection syntax.

Arguments:

  • filename: The name of the PDB/PDBx file to open.
  • center: If True, center the model so that its center of mass coincides with the origin.
  • loads_properties: If True, return a dictionary of the atom properties.
  • selection_string: A selection string in mdtraj's format.
  • model_index: An optional index selecting a particular model stored in the PDB. If None, all models are loaded: atom_positions has a leading model dimension if stack_models = True, and the models are concatenated if stack_models = False.
  • stack_models: If True, model_index = None, and there are multiple models in the PDB, assume that each model is of the same protein and return atom positions and properties with a stacked leading dimension.
  • standardizes_names: If True, non-standard atom names and residue names are standardized. If set to False, this step is skipped.
  • topology: If None, use the function mmdf_to_topology to build a topology on-the-fly. If stack_models = True, model_index = None, and there are multiple models in the PDB, use the first model index to build the topology.
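The center, model_index, and stack_models options describe array layouts that are easy to picture with plain NumPy. The following is a minimal sketch on synthetic data, not the library's implementation: the coordinates, the masses array, and the helper names center_of_mass and center_on_center_of_mass are all hypothetical, and how cryojax actually weights atoms when centering is an assumption here.

```python
import numpy as np

# Synthetic stand-in for a multi-model PDB: 3 models of the same 5-atom molecule.
n_models, n_atoms = 3, 5
rng = np.random.default_rng(0)
models = [rng.normal(size=(n_atoms, 3)) for _ in range(n_models)]

# Layout described for stack_models = True: a leading dimension per model.
stacked = np.stack(models, axis=0)  # shape (n_models, n_atoms, 3)

# Layout described for stack_models = False: all models concatenated.
concatenated = np.concatenate(models, axis=0)  # shape (n_models * n_atoms, 3)

# Picking one model, analogous to passing model_index = 1.
single_model = stacked[1]  # shape (n_atoms, 3)


def center_of_mass(positions, masses):
    """Mass-weighted mean position over the atom axis (second to last)."""
    return (masses[:, None] * positions).sum(axis=-2) / masses.sum()


def center_on_center_of_mass(positions, masses):
    """Shift positions so that their center of mass sits at the origin."""
    com = center_of_mass(positions, masses)
    return positions - com[..., None, :]


# Hypothetical per-atom masses, used only for this illustration.
masses = np.array([12.0, 14.0, 16.0, 1.0, 32.0])
centered = center_on_center_of_mass(single_model, masses)
```

After centering, center_of_mass(centered, masses) is zero up to floating-point error. The same helper also works on the stacked array, in which case each model is centered independently.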

Returns:

A tuple whose first element is a numpy array of atomic positions and whose second element is an array of atomic numbers. If loads_properties = True, the tuple has a third element containing the atom properties. To be clear,

atom_positions, atomic_numbers = read_atoms_from_pdb(...)

Info

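If your PDB has multiple models and model_index = None, all models are loaded: arrays such as the atom positions gain a leading dimension for each model when stack_models = True, and the models are concatenated otherwise. To load a single model at index 0,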

atom_positions, atomic_numbers = read_atoms_from_pdb(..., model_index=0)
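Putting the arguments together, a call might look like the sketch below. The wrapper name load_first_model_calpha is hypothetical, the selection string "name CA" is ordinary mdtraj selection syntax for C-alpha atoms, and the output shapes in the comment are what the signature above suggests rather than something verified here. The import is placed inside the function so that the wrapper can be defined even where cryojax is not installed.

```python
from pathlib import Path
from typing import Union


def load_first_model_calpha(pdb_path: Union[str, Path]):
    """Load centered C-alpha atoms of the first model in a PDB/mmCIF file.

    Hypothetical convenience wrapper around cryojax.io.read_atoms_from_pdb.
    """
    # Lazy import: cryojax is only required when the function is called.
    from cryojax.io import read_atoms_from_pdb

    atom_positions, atomic_numbers = read_atoms_from_pdb(
        Path(pdb_path),
        center=True,                 # place the center of mass at the origin
        selection_string="name CA",  # mdtraj atom selection syntax
        model_index=0,               # load only the first model
    )
    # Expected shapes for a single model: (n_atoms, 3) and (n_atoms,).
    return atom_positions, atomic_numbers
```

Calling load_first_model_calpha("my_structure.pdb"), with a path of your own, should return the two arrays described under Returns.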

Reading and writing MRC files