PDBNucleicAcids
PDBNucleicAcids is a Biopython-based package that parses all nucleic acids in a PDB structure, with a special focus on base-pair representation.
Free software: MIT license
Documentation: https://pdbnucleicacids.readthedocs.io.
Get Started
The official release is available on the Python Package Index (PyPI):
$ pip install pdbnucleicacids
You can parse both single-stranded and double-stranded nucleic acids. For example, to extract the double strands of a structure and their base pairs:
from Bio.PDB.MMCIFParser import MMCIFParser
from PDBNucleicAcids.NucleicAcid import DSNABuilder
# parse and build structure with Biopython
parser = MMCIFParser()
structure = parser.get_structure(
    structure_id="1A02", filename="1a02-assembly1.cif"
)
# extract all double strand nucleic acids
builder = DSNABuilder()
dsna_list = builder.build_double_strands(structure)
# take the first double strand nucleic acid as an example
dsna = dsna_list[0]
# extract base-pairs data from double stranded nucleic acid
df = dsna.get_dataframe()
df.head()
i_chain_id i_residue_index ... j_residue_index j_chain_id
0 A 4003 ... 5020 B
1 A 4004 ... 5019 B
2 A 4005 ... 5018 B
3 A 4006 ... 5017 B
4 A 4007 ... 5016 B
Check the official documentation for more information.
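As an aid to reading the base-pair table above, here is a minimal sketch in plain Python. It does not use the package: the rows are copied by hand from the `df.head()` output, the columns elided by `...` are left out, and `is_antiparallel` is a hypothetical helper name. It assumes contiguous residue numbering on both chains and checks that the `i` indices rise by one while the `j` indices fall by one, which is the pattern expected of an antiparallel duplex.

```python
# Rows copied from the df.head() output above, as
# (i_chain_id, i_residue_index, j_residue_index, j_chain_id).
rows = [
    ("A", 4003, 5020, "B"),
    ("A", 4004, 5019, "B"),
    ("A", 4005, 5018, "B"),
    ("A", 4006, 5017, "B"),
    ("A", 4007, 5016, "B"),
]


def is_antiparallel(pairs):
    """Return True if i indices increase by 1 while j indices decrease by 1.

    Hypothetical helper; assumes residue numbering is contiguous on both
    strands, which need not hold for every PDB entry.
    """
    return all(
        nxt[1] - cur[1] == 1 and nxt[2] - cur[2] == -1
        for cur, nxt in zip(pairs, pairs[1:])
    )


print(is_antiparallel(rows))
```

Structures with insertion codes or numbering gaps would need a more tolerant check than this one.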
TODO
- in search_paired_base, maybe add a scoring function instead of simple distance
- in search_paired_base, add a warning if there is more than one candidate, or maybe more than one candidate with a similar distance or score
- in BasePair, get other information: shear, stretch, buckle, propeller, opening
- explore is_nucleic(non_standard) and maybe check if it needs updating
- proper tests (WIP)
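The two search_paired_base items above could be prototyped along the following lines. This is only a sketch under stated assumptions, not the package's implementation: `pick_paired_base`, its parameters, the residue-id strings, and the two thresholds are all hypothetical, and plain Euclidean distance stands in for whatever score is eventually chosen.

```python
import math
import warnings


def pick_paired_base(base_xyz, candidates, max_dist=3.5, tie_margin=0.3):
    """Return the id of the closest candidate within max_dist, else None.

    Hypothetical sketch: `candidates` maps a residue id to an (x, y, z)
    reference point; both thresholds are illustrative values in angstroms.
    """
    # Score every candidate; here the score is simply the distance.
    scored = sorted(
        (math.dist(base_xyz, xyz), res_id)
        for res_id, xyz in candidates.items()
    )
    in_range = [(dist, res_id) for dist, res_id in scored if dist <= max_dist]
    if not in_range:
        return None

    best_dist, best_id = in_range[0]
    # Other candidates whose score is close to the best one.
    rivals = [
        res_id for dist, res_id in in_range[1:]
        if dist - best_dist <= tie_margin
    ]
    if rivals:
        warnings.warn(
            f"{best_id} chosen, but {rivals} lie within "
            f"{tie_margin} angstrom of the best distance",
            stacklevel=2,
        )
    return best_id
```

Swapping the distance for a richer score would only change the line that builds `scored`, so the ambiguity warning can be kept as it is.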
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.