TABLE 1.

Data sets of DUBs used in this study

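| Protein family | Characteristic domain | No. of human DUBs | No. of verified DUBs (a) | No. of orthologs | No. of DUBs in pattern search set (b) | No. of DUBs in HMM training set (d): C-box | No. of DUBs in HMM training set (d): H-box |
|---|---|---|---|---|---|---|---|
| USP | Peptidase 19 | 45 | 37 |  | 45 | 36 | 42 |
| UCHL | Peptidase 12 | 4 | 4 | 15 | 19 | 10 | 8 |
| OTU | OTU | 8 | 5 | 9 | 17 | 10 | 11 |
| MJD | Josephin | 3 | 1 | 9 | 12 | 7 | 7 |
| JAMM (c) | MPN+ | 8 | 2 | 29 | 34 |  | 21 |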
  • a That is, the number of DUBs with experimentally verified enzymatic activity.

  • b Due to the small number of identified sequences in the human UCH, OTU, JAMM, and MJD subfamilies, orthologs from other species were included in the data set.

  • c The JAMM metalloproteases contain only H-box domains, and sequence homologs with catalytic His residues were considered for deriving the patterns.

  • d Only C- and H-boxes with <90% sequence identity were used for training the HMMs.