Hi everybody!
I’d like to build a Pipeline Pilot protocol that computes both the overall and the binding-site sequence identity for several proteins, each from several species. The binding site could be defined, for example, as the parts of the sequence within x Angstrom of a given ligand. The desired result is a heatmap matrix that shows the binding-site sequence identity in one half and the overall sequence identity in the other, with the two halves separated by the diagonal:
                    Protein1                                              Protein2
                    Species1          Species2          Species3          Species1          Species2          Species3
          Species1  (1)               [id_bindingsite]  [id_bindingsite]
Protein1  Species2  [id_overall]      (1)               [id_bindingsite]
          Species3  [id_overall]      [id_overall]      (1)
          Species1
Protein2  Species2
          Species3
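To make the layout concrete, here is a minimal Python sketch (not Pipeline Pilot) of the matrix I have in mind. It assumes the sequences are already aligned to equal length with "-" for gaps, that the binding-site alignment columns are already known, and that a residue facing a gap counts as a mismatch. The function names, labels and toy sequences are made up for illustration only.

```python
def percent_identity(seq_a, seq_b, positions=None):
    """Fraction of identical residues between two pre-aligned sequences.

    positions: optional 0-based alignment columns to restrict the comparison
    to (e.g. binding-site columns); None means all columns.
    Columns where both sequences have a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    cols = range(len(seq_a)) if positions is None else positions
    matches = compared = 0
    for i in cols:
        a, b = seq_a[i], seq_b[i]
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def combined_identity_matrix(aligned, site_positions):
    """Upper triangle: binding-site identity. Lower triangle: overall identity."""
    names = list(aligned)
    n = len(names)
    matrix = [[1.0] * n for _ in range(n)]  # diagonal stays at 1
    for i in range(n):
        for j in range(i + 1, n):
            a, b = aligned[names[i]], aligned[names[j]]
            matrix[i][j] = percent_identity(a, b, site_positions)
            matrix[j][i] = percent_identity(a, b)
    return names, matrix


# Hypothetical toy input: one protein from three species, already aligned.
aligned = {
    "Protein1_Species1": "MKTAYIAK",
    "Protein1_Species2": "MKTAYLAK",
    "Protein1_Species3": "MRTAYLGK",
}
site_positions = [2, 3, 4]  # assumed binding-site columns

names, matrix = combined_identity_matrix(aligned, site_positions)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{value:.3f}" for value in row))
```

The resulting matrix could then be passed to any heatmap component; the open question for me is how to do the alignment, the binding-site selection and the plotting inside a protocol.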
Looking forward to any tips and help!
Kind regards,
Bonina