=====
Usage
=====

To use PDBNucleicAcids in a project:

.. code-block:: python

    import PDBNucleicAcids

Build All Strands of Nucleic Acids
----------------------------------

PDBNucleicAcids can build every nucleic acid strand found in a Biopython structure.

.. code-block:: python

    from Bio.PDB.PDBList import PDBList
    from Bio.PDB.MMCIFParser import MMCIFParser
    from PDBNucleicAcids.NucleicAcid import NABuilder

    # retrieve the files from the PDB using Biopython
    pdbl = PDBList()
    pdbl.retrieve_pdb_file(pdb_code="1A02", pdir=".")
    pdbl.retrieve_assembly_file(pdb_code="1A02", assembly_num=1, pdir=".")
    # ... or else use your own

    # parse and build structure with Biopython
    parser = MMCIFParser()
    structure = parser.get_structure(
        structure_id="1A02", filename="1a02-assembly1.cif"
    )

    # build all nucleic acids
    builder = NABuilder()
    na_list = builder.build_nucleic_acids(structure)
    na_list

.. code-block:: console

    [, ]

Every nucleic acid behaves like a Python list of residues:

.. code-block:: python

    na = na_list[0]
    na[:5]

.. code-block:: console

    [, , , , ]

PDBNucleicAcids can return the sequence of a nucleic acid:

.. code-block:: python

    na.get_seq()

.. code-block:: console

    Seq('TTGGAAAATTTGTTTCATAG')

PDBNucleicAcids can also return the chain id, the nucleic acid type and all atoms of a nucleic acid:

.. code-block:: python

    print(na.get_chain_id(), na.get_na_type())
    print(na.get_atoms()[:5])

.. code-block:: console

    A DNA
    [, , , , ]

Build All Double-Stranded Nucleic Acids
---------------------------------------

PDBNucleicAcids can build every double-stranded nucleic acid found in a Biopython structure.

.. code-block:: python

    from PDBNucleicAcids.NucleicAcid import DSNABuilder

    builder = DSNABuilder()
    dsna_list = builder.build_double_strands(structure)
    dsna_list

.. code-block:: console

    []

Get All Base-Pairs
------------------

PDBNucleicAcids can extract all base-pair objects from a double-stranded nucleic acid. A double-stranded nucleic acid behaves like a list of base-pairs:

.. code-block:: python

    dsna = dsna_list[0]
    dsna[:5]

.. code-block:: console

    [, , , , ]

PDBNucleicAcids can also export the base-pair data of a double-stranded nucleic acid as a dataframe.

.. code-block:: python

    dsna = dsna_list[0]
    df = dsna.get_dataframe()
    df.head()

.. code-block:: console

      i_chain_id  i_residue_index  ...  j_residue_index  j_chain_id
    0          A             4003  ...             5020           B
    1          A             4004  ...             5019           B
    2          A             4005  ...             5018           B
    3          A             4006  ...             5017           B
    4          A             4007  ...             5016           B

Search Individual Pair Bases
----------------------------

Given an input nucleotide, PDBNucleicAcids can search for the nucleotide it is paired with.

.. code-block:: python

    from PDBNucleicAcids.NucleicAcid import search_paired_base

    # input nucleotide
    base = structure[0]["A"][4003]  # DG

    # search for paired nucleotide
    paired_base = search_paired_base(base)
    paired_base

PDBNucleicAcids also recognizes unpaired bases: the search returns ``None`` for them.

.. code-block:: python

    # input nucleotide
    base = structure[0]["A"][4001]  # DT

    # search for paired nucleotide
    paired_base = search_paired_base(base)
    print(paired_base)

.. code-block:: console

    None

DNA-RNA Complexes
-----------------

PDBNucleicAcids base-pairing also works for DNA-RNA base-pairs.

.. code-block:: python

    from Bio.PDB.PDBList import PDBList
    from Bio.PDB.MMCIFParser import MMCIFParser
    from PDBNucleicAcids.NucleicAcid import search_paired_base

    # retrieve the file from the PDB using Biopython
    pdbl = PDBList()
    pdbl.retrieve_assembly_file(pdb_code="9K7R", assembly_num=1, pdir=".")

    # parse and build structure with Biopython
    parser = MMCIFParser()
    structure = parser.get_structure(
        structure_id="9K7R", filename="9k7r-assembly1.cif"
    )

    # input nucleotide
    base = structure[0]["B"][8]  # DT

    # search for paired nucleotide
    paired_base = search_paired_base(base)

    # the paired base is an RNA base
    paired_base

Custom Rules for Base-Pairing
-----------------------------

PDBNucleicAcids base-pairing can be customized by changing the parameters used in the base-pairing rules.

.. code-block:: python

    from Bio.PDB.MMCIFParser import MMCIFParser
    from PDBNucleicAcids.NucleicAcid import search_paired_base
    from PDBNucleicAcids.BasePairRules import dsDNAWatsonCrickBasePairRules

    parser = MMCIFParser()
    structure = parser.get_structure(
        structure_id="1A02", filename="1a02-assembly1.cif"
    )

    # custom base-pairing rules
    my_rules = dsDNAWatsonCrickBasePairRules(
        max_distance=3.5,
        max_angle=60,
        max_stagger=2.0,
    )

    # input nucleotide
    base = structure[0]["A"][4003]  # DG

    # search for paired nucleotide
    paired_base = search_paired_base(base, pairing_rules=my_rules)

PDBNucleicAcids base-pairing can be extended even further by writing your own base-pairing rules.

.. code-block:: python

    from Bio.PDB.MMCIFParser import MMCIFParser
    from PDBNucleicAcids.NucleicAcid import search_paired_base
    from PDBNucleicAcids.BasePairRules import WatsonCrickBasePairRules

    parser = MMCIFParser()
    structure = parser.get_structure(
        structure_id="1A02", filename="1a02-assembly1.cif"
    )

    # input nucleotide
    base = structure[0]["A"][1]  # G

    # search for paired nucleotide with default rules
    pairing_rules = WatsonCrickBasePairRules()
    paired_base = search_paired_base(base, pairing_rules=pairing_rules)
    # this returns None because the base is paired with
    # a non-standard DNA base: 5CM

    # to work around this, we can write our own rules
    class MyRules(WatsonCrickBasePairRules):
        def __init__(self):
            super().__init__()
            self.complementary_pairs += [("5CM", "G"), ("G", "5CM")]
            self.pyrimidines.append("5CM")
            self.accepted_nucleotides.append("5CM")

    # search for paired nucleotide with custom base-pairing rules
    pairing_rules = MyRules()
    paired_base = search_paired_base(base, pairing_rules=pairing_rules)
    paired_base

Limitations
-----------

PDBNucleicAcids does not yet support the recognition of flipped bases, gaps and nicks.
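
Until such support exists, one rough way to spot candidate gaps is to look for breaks in the residue numbering along a strand. The sketch below is not part of PDBNucleicAcids: ``find_numbering_breaks`` is a hypothetical helper, and the index list is made-up illustrative data in the style of the ``i_residue_index`` column. A break in numbering is only a hint, since author numbering and insertion codes can produce jumps without any physical gap.

.. code-block:: python

    from typing import List, Tuple


    def find_numbering_breaks(residue_indices: List[int]) -> List[Tuple[int, int]]:
        """Return (previous, next) pairs whose indices differ by more than 1.

        Hypothetical helper, not part of the PDBNucleicAcids API.
        Works for ascending and descending numbering alike.
        """
        breaks = []
        for prev, nxt in zip(residue_indices, residue_indices[1:]):
            if abs(nxt - prev) > 1:
                breaks.append((prev, nxt))
        return breaks


    # illustrative residue indices along one strand (made-up data)
    strand_indices = [4003, 4004, 4005, 4008, 4009]
    print(find_numbering_breaks(strand_indices))  # [(4005, 4008)]

Any break reported this way should be checked against the structure itself before treating it as a real gap or nick.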