This file is indexed.

/var/lib/mobyle/programs/imogene.xml is in mobyle-programs 5.1.2-2.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- This file has been adapted for the needs of imogene by                      -->
<!-- Hervé Rouault <rouault@lps.ens.fr>                                          -->
<!--                                                                             -->
<!--                                                                             -->
<!-- XML Authors: Corinne Maufrais                                               -->
<!-- 'Biological Software and Databases' Group, Institut Pasteur, Paris.         -->
<!-- Distributed under LGPLv2 License. Please refer to the COPYING.LIB document. -->
<program>
  <head xmlns:xi="http://www.w3.org/2001/XInclude">
    <name>imogene</name>
    <version>1.0-253</version>
    <doc>
      <title>imogene</title>
      <description>
        <text lang="en">Genome-wide identification of cis-regulatory
                    motifs and modules</text>
      </description>
      <authors>H. Rouault, M. Santolini</authors>
      <reference> H. Rouault, M. Santolini, F. Schweisguth and V. Hakim
                (2013), Imogene: identification of motifs and cis regulatory
                modules underlying gene co-regulation, submitted. </reference>
      <reference> H. Rouault, K. Mazouni, L. Couturier, V. Hakim and F.
                Schweisguth (2010), Genome-wide identification of
                cis-regulatory motifs and modules underlying gene coregulation
                using statistics and phylogeny. Proc. Natl. Acad. Sci. U.S.A.,
                107:14615-14620. </reference>
      <doclink>https://github.com/hrouault/Imogene</doclink>
      <comment>
        <div xmlns="http://www.w3.org/1999/xhtml">
          <p><em>Imogene</em> has two main purposes:
            <ul><li>it detects statistically overrepresented motifs,
                    considering closely related genomes, in a set of training
                    sequences. These sequences are generally selected for their
                    biological relationships, i.e. their cis-regulatory role.</li>
                <li>it predicts novel cis-regulatory sequences based
                    on user-selected motifs at the whole genome scale.</li>
            </ul>
          </p>
          <p>Currently, <em>Imogene</em> supports two genome sets: mammals and
             drosophilae. We are open to integrating other sets on
             request.</p>
          <p>The source code of Imogene is available on <em>github</em>:
            <a href="https://github.com/hrouault/Imogene">https://github.com/hrouault/Imogene</a>.
            It contains the main source code, as well as documentation and
            local installation instructions.  An online manual can also be
            found <a href="http://hrouault.github.com/Imogene">here</a>. Any
            installation problems or bugs should be reported on
            <em>github</em>.</p> <p>Please feel free to contact us with any
            question concerning <em>Imogene</em>: Herve Rouault
            <a href="mailto:herve.rouault@pasteur.fr">herve.rouault@pasteur.fr</a>,
            Marc Santolini <a href="mailto:marc.santolini@lps.ens.fr">marc.santolini@lps.ens.fr</a>.</p>
        </div>
      </comment>
    </doc>
    <category>sequence:nucleic:regulation</category>
    <xi:include href="../../../Local/Services/Programs/Env/imogene_env.xml" xpointer="xpointer(//env)">
      <xi:fallback/>
    </xi:include>
  </head>
  <parameters>
    <parameter ismandatory="1" issimple="1">
      <name>model</name>
      <prompt lang="en">Execution mode</prompt>
      <type>
        <datatype>
          <class>Choice</class>
        </datatype>
      </type>
      <vdef>
        <value>null</value>
      </vdef>
      <vlist>
        <velem undef="1">
          <value>null</value>
          <label>Choice</label>
        </velem>
        <velem>
          <value>genmot</value>
          <label>genmot: Generate motifs from a training set</label>
        </velem>
        <velem>
          <value>scangen</value>
          <label>scangen: Find instances of a list of motifs in the
            genome</label>
        </velem>
      </vlist>
      <comment>
        <div xmlns="http://www.w3.org/1999/xhtml"><em>Imogene</em> has two
          different modes of execution that should be performed sequentially:
          <ul><li><strong>genmot</strong>: generates motifs from a
                  set of functionally related CRMs</li>
              <li><strong>scangen</strong>: predicts enhancers given
                  the set of motifs generated using the <em>genmot</em> mode.</li>
          </ul>
        </div>
      </comment>
    </parameter>
    <paragraph>
      <name>general</name>
      <prompt lang="en">General options</prompt>
      <argpos>1</argpos>
      <parameters>
        <parameter ismandatory="1" issimple="1">
          <name>species</name>
          <prompt lang="en">Family of species to consider</prompt>
          <type>
            <datatype>
              <class>Choice</class>
            </datatype>
          </type>
          <vdef>
            <value>null</value>
          </vdef>
          <vlist>
            <velem undef="1">
              <value>null</value>
              <label>Choice</label>
            </velem>
            <velem>
              <value>droso</value>
              <label>Drosophilae</label>
            </velem>
            <velem>
              <value>eutherian</value>
              <label>Eutherians</label>
            </velem>
          </vlist>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              Two families of genomes are currently supported by
              <em>Imogene</em>:
              <ul><li><strong>Drosophilae</strong>: genomes are taken from the
                    12 genomes project (A. G. Clark, <em>et al</em>. Evolution of
                    genes and genomes on the Drosophila phylogeny.
                    <em>Nature</em>, 450(7167):203-218, Nov 2007.)</li>
                  <li><strong>Eutherian</strong>: genomes are taken from the
                    ensembl project (<a href="http://www.ensembl.org/">http://www.ensembl.org/</a>,
                    folder epo_12_eutherian)</li>
              </ul>
            </div>
          </comment>
        </parameter>
        <parameter ismandatory="1">
          <name>width</name>
          <prompt lang="en">Width of the motifs</prompt>
          <type>
            <datatype>
              <class>Integer</class>
            </datatype>
          </type>
          <vdef>
            <value>10</value>
          </vdef>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml"> Motifs represent
              the affinity of transcription factors for binding sites (BS).
              Please indicate here the <em>width</em> of BSs in number of
              nucleotides (nts). Typical values range between 8 and 12 nts.
              Note that motifs of larger width can model motifs with a specific
              core of lower width and flanking nts with background frequencies
              (non-specific non-coding sequences). However, large motifs are
              more difficult to infer when containing many degenerate bases.
            </div>
          </comment>
        </parameter>
        <parameter ismandatory="1">
          <name>extent</name>
          <prompt lang="en">Allowed shift of a binding site position in
              orthologous species</prompt>
          <type>
            <datatype>
              <class>Integer</class>
            </datatype>
          </type>
          <vdef>
            <value>20</value>
          </vdef>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>In order to be robust to mistakes in the alignments,
                <em>Imogene</em> allows for local shifts of the binding sites
                in orthologous sequences.</p>
              <p>Considering the first base of a binding site in the reference
                species, an alignment will make it correspond to a position
                <em>p</em> in an orthologous genome. The algorithm will then
                search for binding sites in this orthologous species at
                positions <em>(p-s,p+w+s)</em>, where <em>s</em> is the shift
                indicated here and <em>w</em> is the width of the motif.</p>
            </div>
          </comment>
        </parameter>
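<!-- A minimal sketch, in Python, of the search window described in the
extent comment above. The function name search_window is hypothetical, and
reading (p-s, p+w+s) as the bounds of the scanned region is an assumption
made for illustration; the actual Imogene implementation may differ.

```python
def search_window(p, width, shift):
    """Return (start, end) of the region scanned in the orthologous genome.

    p     : aligned position of the first base of the binding site
    width : motif width w, in nucleotides
    shift : allowed shift s (the 'extent' parameter of this form)
    """
    if width <= 0 or shift < 0:
        raise ValueError("width must be positive and shift non-negative")
    return (p - shift, p + width + shift)


# With the defaults of this form (width 10, extent 20):
start, end = search_window(1000, 10, 20)
print(start, end)
```

With these defaults the scanned region spans w + 2s = 50 positions around
the aligned site. -->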
      </parameters>
    </paragraph>
    <paragraph>
      <name>genmot</name>
      <prompt lang="en">Genmot options</prompt>
      <precond>
        <code proglang="perl">$model eq 'genmot'</code>
        <code proglang="python">model == 'genmot'</code>
      </precond>
      <argpos>2</argpos>
      <parameters>
        <parameter ismandatory="1">
          <name>evolutionary</name>
          <prompt lang="en">Evolutionary model used for motif generation</prompt>
          <type>
            <datatype>
              <class>Choice</class>
            </datatype>
          </type>
          <vdef>
            <value>1</value>
          </vdef>
          <vlist>
            <velem>
              <value>1</value>
              <label>Felsenstein model</label>
            </velem>
            <velem>
              <value>2</value>
              <label>Halpern-Bruno model</label>
            </velem>
          </vlist>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p><em>Imogene</em> makes use of the evolution of binding sites
                to infer motifs. To do so, <em>Imogene</em>
                accounts for the mutation rate of bases under selection using
                an evolutionary model.</p>
              <p>Two models are currently supported by <em>Imogene</em> (see
                refs at the end of the page for a detailed description):
                <ul><li>Felsenstein: this model is simple and runs faster</li>
                    <li>Halpern-Bruno: this model has a firmer theoretical
                        grounding and better predicts the speed of evolution of
                        the motifs, but takes more time to run.</li>
                </ul></p>
            </div>
          </comment>
        </parameter>
        <parameter ismandatory="1">
          <name>Sg</name>
          <prompt lang="en">Threshold used for motif generation</prompt>
          <type>
            <datatype>
              <class>Float</class>
            </datatype>
          </type>
          <vdef>
            <value>11.0</value>
          </vdef>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>The <em>threshold</em>, expressed in bits, corresponds to the
                level of specificity of a motif.  As an order of magnitude,
                a threshold equal to 10 corresponds approximately to one
                binding site every 1000 bases on average in the whole genome.
                Since we use conservation to define motif instances, this
                number is actually an upper bound. Common values that give
                reasonable results lie between 10 and 13 bits.</p>
            </div>
          </comment>
        </parameter>
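<!-- The order of magnitude quoted in the threshold comment above can be
checked with a short calculation. This sketch assumes that a threshold of S
bits corresponds to a chance match probability of 2 ** (-S) per position;
that reading is an assumption consistent with the "10 bits, about 1000
bases" figure, not a statement of how Imogene computes scores. The function
name expected_spacing is hypothetical.

```python
def expected_spacing(threshold_bits):
    """Approximate number of bases between chance matches.

    Assumes a per-position match probability of 2 ** (-threshold_bits).
    """
    return 2.0 ** threshold_bits


for bits in (10, 11, 13):
    spacing = int(expected_spacing(bits))
    print(bits, "bits: about one site every", spacing, "bases")
```

Under this assumption, each additional bit halves the expected density of
sites, which is why small changes in Sg have a large effect. -->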
        <parameter ismandatory="1">
          <name>Ss1</name>
          <prompt lang="en">Threshold used to scan training set sequences for
            display</prompt>
          <type>
            <datatype>
              <class>Float</class>
            </datatype>
          </type>
          <vdef>
            <value>8.0</value>
          </vdef>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>Note that this scanning threshold is only used for the final
                display and does not affect motif generation. In
                particular, a lower threshold is useful to check for cryptic
                sites in the sequences. </p>
            </div>
          </comment>
        </parameter>
        <parameter ismandatory="1">
          <name>gennbmots</name>
          <prompt lang="en">Number of motifs to display at maximum (between 1
            and 10)</prompt>
          <type>
            <datatype>
              <class>Integer</class>
            </datatype>
          </type>
          <vdef>
            <value>5</value>
          </vdef>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>The algorithm output generally consists of many motifs.
                Although they are all present in the text file output, they
                cannot all be displayed and shown on the input sequences. It is
                usually fine to consider between 5 and 10 motifs. </p>
            </div>
          </comment>
        </parameter>
        <parameter ismandatory="1" issimple="1">
          <name>coord_file</name>
          <prompt lang="en">Training set sequences coordinates</prompt>
          <type>
            <datatype>
              <class>Coordinates</class>
              <superclass>AbstractText</superclass>
            </datatype>
          </type>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p> List of sequence coordinates in the BED4 format:
                <pre>chrN  start stop  seq_name </pre>
                Below we give example training sets that can be copied and
                pasted into the form.  </p>
              <!--<example>-->
              <ul>
                <li> Mouse example training set (mm9 assembly, enhancers
                  specific to the neural tube):</li>
                <pre>
chr8  91462919 91464123 CYLD-SALL1
chr4  99040833 99042291 APG4C-FOXD3
chr14 118834760   118836087   SOX21-ABCC4
chr18 69658816 69660452 TCF4(intragenic)
chr6  138199417   138201368   MGST1-LMO3
chr12 51291542 51292872 FOXG1B-PRKD1
chr4  73149468 73150526 FLJ46321-RASEF
chrX  57972482 57973750 LOC347487-SOX3
chr5  42914188 42915270 FAM44A-CPEB2
chr9  91261697 91263041 ZIC4-ZIC1
chr18 82027407 82028703 GALR1-SALL3
chr13 73170587 73173631 IRX4-IRX2</pre>
                <li> Drosophila example training set (release 5 assembly, enhancers expressed in mesoderm and somatic muscle):</li>
                <pre>
chr3R   10415381 10415780 seq2300
chr3R 17228198 17228402 seq2821
chr2R 18937351 18937843 seq1357
chr2L 21853080 21853669 seq6531
chr3R 17392100 17392639 seq2837
chr2R 19526211 19526451 seq1396
chr2R 19526657 19527033 seq1397
chr3R 17227574 17227958 seq2820
chr2R 5025930  5026695  seq257</pre>
              </ul>
              <!--</example> -->
            </div>
          </comment>
        </parameter>
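<!-- A minimal sketch of how a training set in the BED4 layout shown above
could be checked before submission. It assumes whitespace-separated fields
and start smaller than stop; the function name parse_bed4 is hypothetical
and is not part of Imogene or Mobyle.

```python
def parse_bed4(text):
    """Parse whitespace-separated BED4 lines into (chrom, start, stop, name)."""
    records = []
    for lineno, line in enumerate(text.splitlines(), 1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(
                "line %d: expected 4 fields, got %d" % (lineno, len(fields)))
        chrom, start, stop, name = fields
        start, stop = int(start), int(stop)
        if start >= stop:
            raise ValueError("line %d: start must be smaller than stop" % lineno)
        records.append((chrom, start, stop, name))
    return records


# Two lines taken from the Drosophila example training set above.
example = """
chr3R   10415381 10415780 seq2300
chr2R 5025930  5026695  seq257
"""
for record in parse_bed4(example):
    print(record)
```

Splitting on any run of whitespace tolerates the uneven column spacing seen
in the example sets. -->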
        <parameter ishidden="1">
          <name>cmd_1</name>
          <type>
            <datatype>
              <class>String</class>
            </datatype>
          </type>
          <format>
            <code proglang="python">"imogene extract -s " + species + " -i " \
              + coord_file +" &amp;&amp; "</code>
          </format>
          <argpos>1</argpos>
        </parameter>
        <parameter ishidden="1">
          <name>cmd_2</name>
          <type>
            <datatype>
              <class>String</class>
            </datatype>
          </type>
          <format>
            <code proglang="python">" imogene genmot -s " + species + " -w " \
              + str(width) + " -t " + str(Sg) + " -x " + str(extent) + " -e " \
              + str(evolutionary) + " -a align &amp;&amp;" </code>
          </format>
          <argpos>2</argpos>
        </parameter>
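<!-- To see what the hidden cmd_1 and cmd_2 format strings above produce,
this sketch evaluates them as plain Python and joins them in argpos order.
That Mobyle concatenates the fragments this way is an assumption here; the
helper name genmot_prefix and the file name coords.bed are hypothetical.

```python
def genmot_prefix(species, coord_file, width, Sg, extent, evolutionary):
    """Concatenate the cmd_1 and cmd_2 format strings in argpos order."""
    cmd_1 = "imogene extract -s " + species + " -i " + coord_file + " && "
    cmd_2 = (" imogene genmot -s " + species + " -w " + str(width)
             + " -t " + str(Sg) + " -x " + str(extent)
             + " -e " + str(evolutionary) + " -a align &&")
    return cmd_1 + cmd_2


# Default values of this form, with the Drosophila genome set.
print(genmot_prefix("droso", "coords.bed", 10, 11.0, 20, 1))
```

The trailing && leaves the command open for the display steps that follow,
so each step only runs if the previous one succeeded. -->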
        <parameter ishidden="1">
          <name>cmd_3</name>
          <type>
            <datatype>
              <class>String</class>
            </datatype>
          </type>
          <format>
            <code proglang="python">" imogene display -t " + str(Ss1) + " -s "\
              + species + " -n " + str(gennbmots)\
              + " -m motifs.txt -a align --logos --pdf --png &amp;&amp;"</code>
          </format>
          <argpos>3</argpos>
        </parameter>
        <parameter ishidden="1">
          <name>cmd_4</name>
          <type>
            <datatype>
              <class>String</class>
            </datatype>
          </type>
          <format>
            <code proglang="python">" imogene display -t "+ str(Ss1) +" -s " + \
              species + " -n " + str(gennbmots) + " -m motifs.txt -a align --html-ref"</code>
          </format>
          <argpos>4</argpos>
        </parameter>
        <parameter isout="1">
          <name>motifs</name>
          <prompt>Motif file</prompt>
          <type>
            <datatype>
              <class>ScangenMotifDefinition</class>
              <superclass>AbstractText</superclass>
            </datatype>
          </type>
          <filenames>
            <code proglang="perl">"motifs.txt"</code>
            <code proglang="python">"motifs.txt"</code>
          </filenames>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>Predicted motifs in text file format. This file can be used as
                input for the <em>scangen</em> mode</p>
            </div>
          </comment>
        </parameter>
        <parameter isout="1">
          <name>genmotoutput</name>
          <prompt>General output</prompt>
          <type>
            <datatype>
              <class>GenmotReport</class>
              <superclass>Report</superclass>
            </datatype>
            <dataFormat>HTML</dataFormat>
          </type>
          <filenames>
              <code proglang="python">"results_genmot.html"</code>
          </filenames>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>Predicted motifs in pretty-printed format (xhtml).  This page
                contains motif logos.</p>
            </div>
          </comment>
        </parameter>
        <parameter isout="1">
          <name>motif_png_out</name>
          <type>
            <datatype>
              <class>Logo</class>
              <superclass>Binary</superclass>
            </datatype>
            <dataFormat>png</dataFormat>
          </type>
          <filenames>
            <code proglang="python">"Motif*.png"</code>
          </filenames>
        </parameter>
        <parameter isout="1">
          <name>motif_pdf_out</name>
          <type>
            <datatype>
              <class>Logo</class>
              <superclass>Binary</superclass>
            </datatype>
            <dataFormat>pdf</dataFormat>
          </type>
          <filenames>
            <code proglang="python">"Motif*.pdf"</code>
          </filenames>
        </parameter>
      </parameters>
    </paragraph>
    <paragraph>
      <name>scangen</name>
      <prompt lang="en">Scangen options</prompt>
      <precond>
        <code proglang="perl">$model eq 'scangen'</code>
        <code proglang="python">model == 'scangen'</code>
      </precond>
      <argpos>2</argpos>
      <parameters>
        <parameter ismandatory="1">
          <name>Ss2</name>
          <prompt lang="en">Threshold used to scan the genome </prompt>
          <type>
            <datatype>
              <class>Float</class>
            </datatype>
          </type>
          <vdef>
            <value>8.0</value>
          </vdef>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p> Usually less than or equal to the generation threshold Sg. </p>
            </div>
          </comment>
        </parameter>
        <parameter ismandatory="1">
          <name>crmwidth</name>
          <prompt lang="en">Width of selected enhancers</prompt>
          <type>
            <datatype>
              <class>Integer</class>
            </datatype>
          </type>
          <vdef>
            <value>1000</value>
          </vdef>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>Width of the predicted enhancers in number of nucleotides.</p>
            </div>
          </comment>
        </parameter>
        <parameter ismandatory="1">
          <name>nbmots</name>
          <prompt lang="en">Number of motifs to consider at maximum</prompt>
          <type>
            <datatype>
              <class>Integer</class>
            </datatype>
          </type>
          <vdef>
            <value>5</value>
          </vdef>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>Enhancers are predicted on the basis of the presence of
                binding sites corresponding to chosen motifs. You must input an
                ordered list of motifs generated either by yourself (with the
                appropriate syntax) or by the <em>genmot</em> mode. The number
                of considered motifs will be the top <em>n</em>, where
                <em>n</em> is the integer indicated here.</p>
            </div>
          </comment>
        </parameter>
        <parameter ismandatory="1" issimple="1">
          <name>motifs_file</name>
          <prompt lang="en">File containing a list of motif
            definitions</prompt>
          <type>
            <datatype>
              <class>ScangenMotifDefinition</class>
              <superclass>AbstractText</superclass>
            </datatype>
          </type>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>List of motif definitions in a format identical to the output
                of the <em>genmot</em> mode.</p>
            </div>
          </comment>
        </parameter>
        <parameter ishidden="1">
          <name>cmdscangen_1</name>
          <type>
            <datatype>
              <class>String</class>
            </datatype>
          </type>
          <format>
            <code proglang="python">"imogene scangen -s " + species + " -t " \
              + str(Ss2) + " -x " + str(extent) + " -m " + motifs_file       \
              + " --scanwidth=" + str(crmwidth) + " &amp;&amp; "</code>
          </format>
          <argpos>1</argpos>
        </parameter>
        <parameter ishidden="1">
          <name>cmdscangen_2</name>
          <type>
            <datatype>
              <class>String</class>
            </datatype>
          </type>
          <format>
            <code proglang="python">" imogene display -e result*.dat"</code>
          </format>
          <argpos>2</argpos>
        </parameter>
        <parameter isout="1">
          <name>scangenOuthtml</name>
          <prompt>Scangen output</prompt>
          <type>
            <datatype>
              <class>ScangenReport</class>
              <superclass>Report</superclass>
            </datatype>
            <dataFormat>HTML</dataFormat>
          </type>
          <filenames>
            <code proglang="python">"results_scangen.html"</code>
          </filenames>
          <comment>
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p>List of predicted enhancers, ordered from the most enriched
                in motifs to the least enriched.
              </p>
            </div>
          </comment>
        </parameter>
      </parameters>
    </paragraph>
  </parameters>
</program>