Sequence similarity of overall protein sequence as well as binding site

AB 2015-02-04

Hi everybody!

I’d like to build a Pipeline Pilot protocol that computes both the overall sequence identity and the binding-site sequence identity (the binding site being defined, e.g., as the residues within x Ångström of a given ligand) for several proteins, each from several species. The desired result is a heatmap matrix that shows the binding-site sequence identity in one half and the overall sequence identity in the other, with the two halves separated by the diagonal:

                     Protein1                                            Protein2
                     Species1      Species2          Species3            Species1  Species2  Species3
Protein1  Species1   (1)           [id_bindingsite]  [id_bindingsite]
          Species2   [id_overall]  (1)               [id_bindingsite]
          Species3   [id_overall]  [id_overall]      (1)
Protein2  Species1
          Species2
          Species3
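
To make the intended layout concrete, here is a minimal Python sketch of the combined matrix (plain Python, not a Pipeline Pilot component). It assumes the sequences are already aligned to the same length with "-" as the gap character, and that the binding-site residues have already been mapped to alignment column indices. The names identity, combined_matrix and site_columns, and the toy sequences, are made up for illustration. Identity is taken here as identical residues divided by the number of columns where neither sequence has a gap, which is only one of several common conventions.

```python
def identity(a, b):
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def combined_matrix(aligned, site_columns):
    """Upper triangle: binding-site identity; lower triangle: overall identity."""
    names = list(aligned)
    # Restrict each aligned sequence to the binding-site alignment columns.
    site = {n: "".join(aligned[n][i] for i in site_columns) for n in names}
    size = len(names)
    matrix = [[1.0] * size for _ in range(size)]  # diagonal stays at 1
    for r in range(size):
        for c in range(size):
            if r < c:
                matrix[r][c] = identity(site[names[r]], site[names[c]])
            elif r > c:
                matrix[r][c] = identity(aligned[names[r]], aligned[names[c]])
    return names, matrix


# Hypothetical toy alignment; real input would come from a multiple sequence alignment.
aligned = {
    "Protein1_Species1": "MKTAYIAK",
    "Protein1_Species2": "MKTAYLAK",
    "Protein1_Species3": "MRTA-LAR",
}
site_columns = [2, 3, 4, 5]  # assumed binding-site columns (0-based)

names, matrix = combined_matrix(aligned, site_columns)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{value:.2f}" for value in row))
```

The resulting list of rows could then be handed to any heatmap tool, with the row and column labels taken from names.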

Looking forward to any tips and help!

Kind regards,

Bonina