/usr/share/EMBOSS/test/data/prosite.doc is in emboss-test 6.6.0-1.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
{PDOC00000}
{BEGIN}
**********************************
*** PROSITE documentation file ***
**********************************
Release : 16.0 of July 1999
Copyright: Amos Bairoch
Swiss Institute of Bioinformatics (SIB)
CMU
University of Geneva
1, Rue Michel Servet, 1211 Geneva 4
Switzerland
Email : bairoch@medecine.unige.ch
Telephone: +41-22-702 54 77
Fax : +41-22-702 55 02
Acknowledgements:
- To all those mentioned in this document who have reviewed the entry(ies)
for which they are listed as experts. With specific thanks to Rein Aasland,
Mark Boguski, Peer Bork, Josh Cherry, Andre Chollet, Frank Kolakowski,
David Landsman, Bernard Henrissat, Eugene Koonin, Steve Henikoff, Manuel
Peitsch and Jonathan Reizer.
- Brigitte Boeckmann is the author of the PDOC00691, PDOC00703, PDOC00829,
PDOC00796, PDOC00798, PDOC00799, PDOC00906, PDOC00907, PDOC00908,
PDOC00912, PDOC00913, PDOC00924, PDOC00928, PDOC00929, PDOC00955,
PDOC00961, PDOC00966, PDOC00988 and PDOC50020 entries.
- Philipp Bucher is the author of the PDOC50001 and PDOC50002 entries.
- Kay Hofmann is the author of the PDOC50003, PDOC50006, PDOC50007 and
PDOC50017 entries.
- Keith Robison is the author of the PDOC00830 and PDOC00861 entries.
- Chantal Hulo is the author of the PDOC00987 entry.
- Vivienne Baillie Gerritsen for undertaking the major task of correcting the
grammar and style of this document.
------------------------------------------------------------------------
PROSITE is copyright. It is produced by the Swiss Institute of
Bioinformatics (SIB). There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme see: http://www.isb-sib.ch/announce/ or send
an email to license@isb-sib.ch.
------------------------------------------------------------------------
{END}
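Each entry in this file opens with a {PDOCnnnnn} line, optionally followed by
one or more {PSnnnnn; NAME} signature lines, and wraps its text between
{BEGIN} and {END}. A minimal Python sketch of a reader for that layout follows.
The function name parse_entries, the dictionary keys and the sample text are
illustrative assumptions, not part of PROSITE or EMBOSS.

```python
import re

def parse_entries(text):
    """Split PROSITE documentation text into entries keyed by PDOC accession."""
    entries = {}
    current = None      # entry being filled, or None between entries
    in_body = False     # True between {BEGIN} and {END}
    for raw in text.splitlines():
        line = raw.rstrip()
        doc = re.fullmatch(r"\{(PDOC\d{5})\}", line)
        if doc and not in_body:
            current = {"signatures": [], "body": []}
            entries[doc.group(1)] = current
            continue
        if current is None:
            continue
        sig = re.fullmatch(r"\{(PS\d{5}); (\w+)\}", line)
        if sig and not in_body:
            # Signature lines sit between the PDOC line and {BEGIN}.
            current["signatures"].append((sig.group(1), sig.group(2)))
        elif line == "{BEGIN}":
            in_body = True
        elif line == "{END}":
            in_body = False
            current = None
        elif in_body:
            current["body"].append(line)
    return entries

# Shortened sample built from header lines of this file (bodies abridged).
sample = """{PDOC00000}
{BEGIN}
Release : 16.0 of July 1999
{END}
{PDOC00210}
{PS00237; G_PROTEIN_RECEPTOR}
{BEGIN}
* G-protein coupled receptors signature *
{END}
"""
entries = parse_entries(sample)
```

The header entry PDOC00000 carries no signature lines, so its "signatures"
list stays empty.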
{PDOC00210}
{PS00237; G_PROTEIN_RECEPTOR}
{BEGIN}
*****************************************
* G-protein coupled receptors signature *
*****************************************
G-protein coupled receptors [1 to 4,E1,E2] (also called R7G) are an extensive
group of receptors for hormones, neurotransmitters, odorants and light which
transduce extracellular signals by interaction with guanine nucleotide-
binding (G) proteins. The receptors that are currently known to belong to this
family are listed below.
- 5-hydroxytryptamine (serotonin) 1A to 1F, 2A to 2C, 4, 5A, 5B, 6 and 7 [5].
- Acetylcholine, muscarinic-type, M1 to M5.
- Adenosine A1, A2A, A2B and A3 [6].
- Adrenergic alpha-1A to -1C; alpha-2A to -2D; beta-1 to -3 [7].
- Angiotensin II types I and II.
- Bombesin subtypes 3 and 4.
- Bradykinin B1 and B2.
- C3a and C5a anaphylatoxin.
- Cannabinoid CB1 and CB2.
- Chemokines C-C CC-CKR-1 to CC-CKR-8.
- Chemokines C-X-C CXC-CKR-1 to CXC-CKR-4.
- Cholecystokinin-A and cholecystokinin-B/gastrin.
- Dopamine D1 to D5 [8].
- Endothelin ET-a and ET-b [9].
- fMet-Leu-Phe (fMLP) (N-formyl peptide).
- Follicle stimulating hormone (FSH-R) [10].
- Galanin.
- Gastrin-releasing peptide (GRP-R).
- Gonadotropin-releasing hormone (GNRH-R).
- Histamine H1 and H2 (gastric receptor I).
- Lutropin-choriogonadotropic hormone (LSH-R) [10].
- Melanocortin MC1R to MC5R.
- Melatonin.
- Neuromedin B (NMB-R).
- Neuromedin K (NK-3R).
- Neuropeptide Y types 1 to 6.
- Neurotensin (NT-R).
- Octopamine (tyramine), from insects.
- Odorants [11].
- Opioids delta-, kappa- and mu-types [12].
- Oxytocin (OT-R).
- Platelet activating factor (PAF-R).
- Prostacyclin.
- Prostaglandin D2.
- Prostaglandin E2, EP1 to EP4 subtypes.
- Prostaglandin F2.
- Purinoreceptors (ATP) [13].
- Somatostatin types 1 to 5.
- Substance-K (NK-2R).
- Substance-P (NK-1R).
- Thrombin.
- Thromboxane A2.
- Thyrotropin (TSH-R) [10].
- Thyrotropin releasing factor (TRH-R).
- Vasopressin V1a, V1b and V2.
- Visual pigments (opsins and rhodopsin) [14].
- Proto-oncogene mas.
- A number of orphan receptors (whose ligand is not known) from mammals and
birds.
- Caenorhabditis elegans putative receptors C06G4.5, C38C10.1, C43C3.2,
T27D1.3 and ZC84.4.
- Three putative receptors encoded in the genome of cytomegalovirus: US27,
US28, and UL33.
- ECRF3, a putative receptor encoded in the genome of herpesvirus saimiri.
The structure of all these receptors is thought to be identical. They have
seven hydrophobic regions, each of which most probably spans the membrane.
The N-terminus is located on the extracellular side of the membrane and is
often glycosylated, while the C-terminus is cytoplasmic and generally
phosphorylated. Three extracellular loops alternate with three intracellular
loops to link the seven transmembrane regions. Most, but not all, of these
receptors lack a signal peptide. The most conserved parts of these proteins
are the transmembrane regions and the first two cytoplasmic loops. A conserved
acidic-Arg-aromatic triplet is present in the N-terminal extremity of the
second cytoplasmic loop [15] and could be implicated in the interaction with G
proteins.
To detect this widespread family of proteins we have developed a pattern that
contains the conserved triplet and that also spans the major part of the third
transmembrane helix.
-Consensus pattern: [GSTALIVMFYWC]-[GSTANCPDE]-{EDPKRH}-x(2)-[LIVMNQGA]-x(2)-
[LIVMFT]-[GSTANC]-[LIVMFYWSTAC]-[DENH]-R-[FYWCSH]-x(2)-
[LIVM]
-Sequences known to belong to this class detected by the pattern: the majority
of receptors. About 5% are not detected.
-Other sequence(s) detected in SWISS-PROT: 50.
-Expert(s) to contact by email:
Attwood T.K.; attwood@bsm.bioc.ucl.ac.uk
Kolakowski L.F. Jr.; kolakowski@uthsca.edu
-Last update: July 1998 / Text revised.
[ 1] Strosberg A.D.
Eur. J. Biochem. 196:1-10(1991).
[ 2] Kerlavage A.R.
Curr. Opin. Struct. Biol. 1:394-401(1991).
[ 3] Probst W.C., Snyder L.A., Schuster D.I., Brosius J., Sealfon S.C.
DNA Cell Biol. 11:1-20(1992).
[ 4] Savarese T.M., Fraser C.M.
Biochem. J. 283:1-9(1992).
[ 5] Branchek T.
Curr. Biol. 3:315-317(1993).
[ 6] Stiles G.L.
J. Biol. Chem. 267:6451-6454(1992).
[ 7] Friell T., Kobilka B.K., Lefkowitz R.J., Caron M.G.
Trends Neurosci. 11:321-324(1988).
[ 8] Stevens C.F.
Curr. Biol. 1:20-22(1991).
[ 9] Sakurai T., Yanagisawa M., Masaki T.
Trends Pharmacol. Sci. 13:103-107(1992).
[10] Salesse R., Remy J.J., Levin J.M., Jallal B., Garnier J.
Biochimie 73:109-120(1991).
[11] Lancet D., Ben-Arie N.
Curr. Biol. 3:668-674(1993).
[12] Uhl G.R., Childers S., Pasternak G.
Trends Neurosci. 17:89-93(1994).
[13] Barnard E.A., Burnstock G., Webb T.E.
Trends Pharmacol. Sci. 15:67-70(1994).
[14] Applebury M.L., Hargrave P.A.
Vision Res. 26:1881-1895(1986).
[15] Attwood T.K., Eliopoulos E.E., Findlay J.B.C.
Gene 98:153-159(1991).
{END}
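The consensus pattern of the entry above uses the PROSITE pattern notation:
elements are separated by "-", "x" stands for any residue, [..] lists accepted
residues, {..} lists excluded residues, and (n) or (n,m) gives a repeat count.
The sketch below shows one way to translate that notation into a Python
regular expression. The function name prosite_to_regex and the test peptide
are assumptions made for illustration; the peptide is synthetic, not a real
receptor sequence.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE pattern string into Python regex syntax."""
    # Drop line breaks/spaces from wrapped patterns and any trailing period.
    pattern = "".join(pattern.split()).rstrip(".")
    parts = []
    for element in pattern.split("-"):
        if not element:
            continue
        element = element.replace("{", "[^").replace("}", "]")  # exclusions
        element = element.replace("(", "{").replace(")", "}")   # repeats
        element = element.replace("x", ".")                     # any residue
        element = element.replace("<", "^").replace(">", "$")   # terminus anchors
        parts.append(element)
    return "".join(parts)

# PS00237 as printed above, including its line wrapping.
PS00237 = """[GSTALIVMFYWC]-[GSTANCPDE]-{EDPKRH}-x(2)-[LIVMNQGA]-x(2)-
[LIVMFT]-[GSTANC]-[LIVMFYWSTAC]-[DENH]-R-[FYWCSH]-x(2)-
[LIVM]"""

regex = prosite_to_regex(PS00237)

# Hypothetical 17-residue peptide built to satisfy every pattern position,
# with the acidic-Arg-aromatic triplet realised as D-R-Y.
peptide = "ASIALLAIISLDRYLAI"
match = re.search(regex, "MK" + peptide + "VH")
```

The exclusion braces are rewritten before the repeat parentheses, so the "{"
introduced for repeat counts is never mistaken for an exclusion.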
{PDOC00559}
{PS00649; G_PROTEIN_RECEP_F2_1}
{PS00650; G_PROTEIN_RECEP_F2_2}
{BEGIN}
***************************************************
* G-protein coupled receptors family 2 signatures *
***************************************************
A number of peptide hormones bind to G-protein coupled receptors that, while
structurally similar to the majority of G-protein coupled receptors (R7G) (see
the relevant entry <PDOC00210>), do not show any similarity at the level of
their sequence, thus representing a new family whose currently known members
[1,2] are listed below:
- Calcitonin receptor.
- Calcitonin gene-related peptide receptor.
- Corticotropin releasing factor receptor types 1 and 2.
- Gastric inhibitory polypeptide receptor.
- Glucagon receptor.
- Glucagon-like peptide 1 receptor.
- Growth hormone-releasing hormone receptor.
- Parathyroid hormone / parathyroid hormone-related peptide types 1 and 2.
- Pituitary adenylate cyclase activating polypeptide receptor.
- Secretin receptor.
- Vasoactive intestinal peptide receptor types 1 and 2.
- Insect diuretic hormone receptor.
In addition to the above characterized receptors, this family also includes:
- Caenorhabditis elegans putative receptor C13B9.4.
- Caenorhabditis elegans putative receptor ZK643.3.
- Human leucocyte antigen CD97, a protein that contains, in its N-terminal
section, 3 EGF-like domains (see <PDOC00021>).
- Human cell surface glycoprotein EMR1, a protein that contains, in its N-
terminal section, 6 EGF-like domains (see <PDOC00021>).
- Mouse cell surface glycoprotein F4/80, a protein that contains, in its N-
terminal section, 7 EGF-like domains (see <PDOC00021>).
All the characterized receptors are coupled to G-proteins which activate both
adenylyl cyclase and the phosphatidylinositol-calcium pathway.
Like classical R7G they seem to contain seven transmembrane regions. Their
N-terminus is probably located on the extracellular side of the membrane and
potentially glycosylated, while their C-terminus is probably cytoplasmic. But
apart from these topological similarities they do not share any region of
sequence similarity and are therefore probably not evolutionarily related.
Every receptor gene in this family is encoded on multiple exons, and several
of these genes are alternatively spliced to yield functionally distinct
products.
The N-terminal extracellular domain of these receptors contains five conserved
cysteine residues which could be involved in disulfide bonds; we have
developed a pattern in the region that spans the first three cysteines.
One of the most highly conserved regions spans the C-terminal part of the last
transmembrane region and the beginning of the adjacent intracellular region.
We have used this region as a second signature pattern.
-Consensus pattern: C-x(3)-[FYWLIV]-D-x(3,4)-C-[FW]-x(2)-[STAGV]-x(8,9)-C-[PF]
-Sequences known to belong to this class detected by the pattern: ALL, except
for CD97, EMR1 and F4/80.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Consensus pattern: Q-G-[LMFCA]-[LIVMFT]-[LIV]-x-[LIVFST]-[LIF]-[VFYH]-C-
[LFY]-x-N-x(2)-V
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Expert(s) to contact by email:
Kolakowski L.F. Jr.; kolakowski@uthsca.edu
-Last update: July 1998 / Patterns and text revised.
[ 1] Jueppner H., Abou-Samra A.-B., Freeman M., Kong X.-F., Schipani E.,
Richards J., Kolakowski L.F. Jr., Hock J., Potts J.T. Jr.,
Kronenberg H.M., Segre G.V.
Science 254:1024-1026(1991).
[ 2] Hamann J., Hartmann E., Van Lier R.A.W.
Genomics 32:144-147(1996).
{END}
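The entry above defines two signatures, and the first one is reported to miss
CD97, EMR1 and F4/80, so a scan should report hits per signature. The sketch
below does that with hand-translated regular expressions, where x(3,4) and
x(8,9) become bounded repeats. The function name scan_family2 and both test
fragments are illustrative assumptions; the fragments are synthetic.

```python
import re

# Hand-translated from the two consensus patterns of PDOC00559.
FAMILY2_SIGNATURES = {
    "PS00649": r"C.{3}[FYWLIV]D.{3,4}C[FW].{2}[STAGV].{8,9}C[PF]",
    "PS00650": r"QG[LMFCA][LIVMFT][LIV].[LIVFST][LIF][VFYH]C[LFY].N.{2}V",
}

def scan_family2(sequence):
    """Return {accession: [(start, matched_text), ...]} with 0-based starts.

    A lookahead wrapper is used so that overlapping hits are also reported.
    """
    hits = {}
    for accession, regex in FAMILY2_SIGNATURES.items():
        lookahead = re.compile("(?=(" + regex + "))")
        hits[accession] = [(m.start(), m.group(1))
                           for m in lookahead.finditer(sequence)]
    return hits

# Synthetic fragments built to satisfy each pattern (not real receptors).
n_terminal = "CAAAFDAAACWAASAAAAAAAACP"   # three-cysteine region, PS00649
tm7_region = "QGLLVALLYCFLNGEV"           # last transmembrane region, PS00650

both = scan_family2("MM" + n_terminal + "GG" + tm7_region)
second_only = scan_family2(tm7_region)
```

A sequence that matches only PS00650, like the second call here, mirrors the
behaviour described for CD97, EMR1 and F4/80.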
{PDOC00754}
{PS00979; G_PROTEIN_RECEP_F3_1}
{PS00980; G_PROTEIN_RECEP_F3_2}
{PS00981; G_PROTEIN_RECEP_F3_3}
{BEGIN}
***************************************************
* G-protein coupled receptors family 3 signatures *
***************************************************
Glutamate and calcium bind to G-protein coupled receptors that, while
structurally similar to the majority of G-protein coupled receptors (R7G) (see
the relevant entry <PDOC00210>), do not show any similarity at the level of
their sequence, thus representing a new family whose currently known members are
listed below:
- The metabotropic glutamate receptors, which evoke a variety of functions,
such as long-term potentiation, memory acquisition and learning,
through the modulation of intracellular effectors [1,2,3]. Currently there
are eight known subtypes of metabotropic glutamate receptors: mGluR1 to
mGluR8. The subtypes mGluR1 and mGluR5 are coupled to the stimulation of
the phosphatidylinositol-calcium second messenger system while mGluR2,
mGluR3, mGluR4, mGluR6, mGluR7 and mGluR8 are coupled to G proteins that
inhibit adenylate cyclase activity.
- The extracellular calcium-sensing receptor [4], which senses changes in the
extracellular concentration of calcium ions. The activity of this receptor
is coupled to the stimulation of the phosphatidylinositol-calcium second
messenger system.
- Caenorhabditis elegans hypothetical protein ZC506.4.
Structurally these receptors are composed of:
a) A signal sequence;
b) A very large hydrophilic extracellular region of about 540 to 600 amino
acid residues. This region contains 17 conserved cysteines which could be
involved in disulfide bonds;
c) A region of about 250 residues that seems to contain seven transmembrane
domains;
d) A C-terminal cytoplasmic domain of variable length (50 to 350 residues).
There are quite a number of regions of high sequence conservation both in the
N-terminal domain and in the region containing the transmembrane domains. We
have selected three of these conserved regions as signature patterns. The
first one corresponds to a highly conserved hydrophobic segment in the central
part of the N-terminal extracellular region. The second corresponds to a
section that contains a cluster of six cysteines in the C-terminal part of the
extracellular domain. The last one corresponds to the C-terminal part of the
cytoplasmic loop between the fifth and sixth transmembrane domains.
-Expert(s) to contact by email:
Kolakowski L.F. Jr.; kolakowski@uthsca.edu
-Consensus pattern: [LV]-x-N-[LIVM](2)-x-L-F-x-I-[PA]-Q-[LIVM]-[STA]-x-
[STA](3)-[STAN]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Consensus pattern: C-C-[FYW]-x-C-x(2)-C-x(4)-[FYW]-x(2,4)-[DN]-x(2)-[STAH]-C-
x(2)-C
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Consensus pattern: F-N-E-[STA]-K-x-I-[STAG]-F-[ST]-M
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Last update: July 1998 / Patterns and text revised.
[ 1] Tanabe Y., Masu M., Ishii T., Shigemoto R., Nakanishi S.
Neuron 8:169-179(1992).
[ 2] Okamoto N., Hori S., Akazawa C., Hayashi Y., Shigemoto R., Mizuno N.,
Nakanishi S.
J. Biol. Chem. 269:1231-1236(1994).
[ 3] Duvoisin R.M., Zhang C., Ramonell K.
J. Neurosci. 15:3075-3083(1995).
[ 4] Brown E.M., Gamba G., Riccardi D., Lombardi M., Butters R., Kifor O.,
Sun A., Hediger M.A., Lytton J., Hebert S.C.
Nature 366:575-580(1993).
{END}
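In the entry above, long consensus patterns wrap onto a continuation line, and
the wrapped line always ends with the "-" element separator. The sketch below
collects the patterns of an entry on that basis. The function name
consensus_patterns is an assumption, and the rule "a trailing '-' means the
pattern continues" is inferred from the layout of this file rather than taken
from a format specification.

```python
PREFIX = "-Consensus pattern:"

def consensus_patterns(body_lines):
    """Collect consensus patterns from entry lines, joining wrapped ones."""
    patterns = []
    continuing = False  # True while the last pattern ends with "-"
    for line in body_lines:
        stripped = line.strip()
        if stripped.startswith(PREFIX):
            patterns.append(stripped[len(PREFIX):].strip())
            continuing = patterns[-1].endswith("-")
        elif continuing:
            patterns[-1] += stripped
            continuing = stripped.endswith("-")
    return patterns

# Lines copied from the PDOC00754 entry above.
lines = [
    "-Consensus pattern: [LV]-x-N-[LIVM](2)-x-L-F-x-I-[PA]-Q-[LIVM]-[STA]-x-",
    "[STA](3)-[STAN]",
    "-Sequences known to belong to this class detected by the pattern: ALL.",
    "-Other sequence(s) detected in SWISS-PROT: NONE.",
    "-Consensus pattern: F-N-E-[STA]-K-x-I-[STAG]-F-[ST]-M",
    "-Sequences known to belong to this class detected by the pattern: ALL.",
]
patterns = consensus_patterns(lines)
```

Because continuation stops at the first line that does not end with "-",
annotation lines that follow a pattern are not appended to it.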
{PDOC00211}
{PS00238; OPSIN}
{BEGIN}
*************************************************
* Visual pigments (opsins) retinal binding site *
*************************************************
Visual pigments [1,2] are the light-absorbing molecules that mediate vision.
They consist of an apoprotein, opsin, covalently linked to the chromophore
cis-retinal. Vision is effected through the absorption of a photon by cis-
retinal which is isomerized to trans-retinal. This isomerization leads to a
change of conformation of the protein. Opsins are integral membrane proteins
with seven transmembrane regions that belong to family 1 of G-protein coupled
receptors (see <PDOC00210>).
In vertebrates four different pigments are generally found. Rod cells, which
mediate vision in dim light, contain the pigment rhodopsin. Cone cells, which
function in bright light, are responsible for color vision and contain three
or more color pigments (for example, in mammals: red, blue and green).
In Drosophila, the eye is composed of 800 facets or ommatidia. Each
ommatidium contains eight photoreceptor cells (R1-R8): the R1 to R6 cells are
outer cells, R7 and R8 inner cells. Each of the three types of cells (R1-R6,
R7 and R8) expresses a specific opsin.
Proteins evolutionarily related to opsins include squid retinochrome, also known
as retinal photoisomerase, which converts various isomers of retinal into 11-
cis retinal and mammalian retinal pigment epithelium (RPE) RGR [3], a protein
that may also act in retinal isomerization.
The attachment site for retinal in the above proteins is a conserved lysine
residue in the middle of the seventh transmembrane helix. The pattern we
developed includes this residue.
-Consensus pattern: [LIVMWAC]-[PGAC]-x(3)-[SAC]-K-[STALIMR]-[GSACPNV]-[STACP]-
x(2)-[DENF]-[AP]-x(2)-[IY]
[K is the retinal binding site]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in SWISS-PROT: NONE.
-Last update: July 1998 / Pattern and text revised.
[ 1] Applebury M.L., Hargrave P.A.
Vision Res. 26:1881-1895(1986).
[ 2] Fryxell K.J., Meyerowitz E.M.
J. Mol. Evol. 33:367-378(1991).
[ 3] Shen D., Jiang M., Hao W., Tao L., Salazar M., Fong H.K.W.
Biochemistry 33:13117-13125(1994).
{END}
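The opsin pattern above marks its K as the retinal binding site. A regular
expression with a capture group around that K can report the residue number of
the lysine rather than only the start of the match. This is a minimal sketch:
the regular expression is hand-translated from the consensus pattern, and the
function name retinal_lysine_positions and the test peptide are illustrative
assumptions (the peptide is synthetic, not a real opsin).

```python
import re

# PS00238 hand-translated; the parentheses capture the retinal-binding lysine.
OPSIN_SITE = re.compile(
    r"[LIVMWAC][PGAC].{3}[SAC](K)[STALIMR][GSACPNV][STACP]"
    r".{2}[DENF][AP].{2}[IY]"
)

def retinal_lysine_positions(sequence):
    """Return 1-based residue numbers of the lysine in each pattern match."""
    # m.start(1) is the 0-based index of the captured K; add 1 for residue
    # numbering as used in sequence annotations.
    return [m.start(1) + 1 for m in OPSIN_SITE.finditer(sequence)]

# Synthetic 17-residue peptide built to satisfy every pattern position.
peptide = "LPAAASKSGSAADAAAI"
positions = retinal_lysine_positions("GG" + peptide)
```

In the test peptide the lysine is the seventh residue, so with the two-residue
prefix it is reported as residue 9.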