/usr/share/doc/mcl/html/mcxload.html is in mcl-doc 1:14-137-1.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (c) 2014 Stijn van Dongen -->
<head>
<meta name="keywords" content="manual">
<style type="text/css">
/* START aephea.base.css */
body
{ text-align: justify;
margin-left: 0%;
margin-right: 0%;
}
a:link { text-decoration: none; }
a:active { text-decoration: none; }
a:visited { text-decoration: none; }
a:link { color: #1111aa; }
a:active { color: #1111aa; }
a:visited { color: #111166; }
a.local:link { color: #11aa11; }
a.local:active { color: #11aa11; }
a.local:visited { color: #116611; }
a.intern:link { color: #1111aa; }
a.intern:active { color: #1111aa; }
a.intern:visited { color: #111166; }
a.extern:link { color: #aa1111; }
a.extern:active { color: #aa1111; }
a.extern:visited { color: #661111; }
a.quiet:link { color: black; }
a.quiet:active { color: black; }
a.quiet:visited { color: black; }
div.verbatim
{ font-family: monospace;
margin-top: 1em;
margin-bottom: 1em;
font-size: 10pt;
margin-left: 2em;
white-space: pre;
}
div.indent
{ margin-left: 8%;
margin-right: 0%;
}
.right { text-align: right; }
.left { text-align: left; }
.nowrap { white-space: nowrap; }
.item_leader
{ position: relative;
margin-left: 8%;
}
.item_compact { position: absolute; vertical-align: baseline; }
.item_cascade { position: relative; }
.item_leftalign { text-align: left; }
.item_rightalign
{ width: 2em;
text-align: right;
}
.item_compact .item_rightalign
{ position: absolute;
width: 52em;
right: -2em;
text-align: right;
}
.item_text
{ position: relative;
margin-left: 3em;
}
.smallcaps { font-size: smaller; text-transform: uppercase }
/* END aephea.base.css */
body { font-family: "Garamond", "Gill Sans", "Verdana", sans-serif; }
body
{ text-align: justify;
margin-left: 8%;
margin-right: 8%;
}
</style>
<title>The mcxload manual</title>
</head>
<body>
<p style="text-align:right">
16 May 2014&nbsp;&nbsp;&nbsp;
<a class="local" href="mcxload.ps"><b>mcxload</b></a>
14-137
</p>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">1.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#name">NAME</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">2.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#synopsis">SYNOPSIS</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">3.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#started">GETTING STARTED</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">4.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#description">DESCRIPTION</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">5.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#options">OPTIONS</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">6.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#author">AUTHOR</a>
</div>
<div class=" item_compact"><div class=" item_rightalign nowrap " style="right:-3em">7.</div></div>
<div class=" item_text " style="margin-left:4em">
<a class="intern" href="#seealso">SEE ALSO</a>
</div>
</div>

<a name="name"></a>
<h2>NAME</h2>
<p style="margin-bottom:0" class="asd_par">
mcxload &mdash; load matrices and tab files from label format</p>

<a name="synopsis"></a>
<h2>SYNOPSIS</h2>
<p style="margin-bottom:0" class="asd_par">
<b>mcxload</b> <a class="intern" href="#opt-abc"><b>-abc</b> &lt;fname&gt; (<i>label file</i>)</a>
<a class="intern" href="#opt-o"><b>-o</b> &lt;fname&gt; (<i>output file</i>)</a></p>
<p style="margin-bottom:0" class="asd_par">
<a class="intern" href="#opt-abc"><b>[-abc</b> &lt;fname&gt; (<i>label file</i>)<b>]</b></a>
<a class="intern" href="#opt-123"><b>[-123</b> &lt;fname&gt; (<i>identifier file</i>)<b>]</b></a>
<a class="intern" href="#opt-o"><b>[-o</b> &lt;fname&gt; (<i>output file</i>)<b>]</b></a>
<a class="intern" href="#opt--stream-mirror"><b>[--stream-mirror</b> (<i>symmetrify, same domain</i>)<b>]</b></a>
<a class="intern" href="#opt--stream-split"><b>[--stream-split</b> (<i>assume different domains</i>)<b>]</b></a>
<a class="intern" href="#opt-re"><b>[-re</b> &lt;mode&gt; (<i>edge deduplication mode</i>)<b>]</b></a>
<a class="intern" href="#opt-ri"><b>[-ri</b> &lt;mode&gt; (<i>image symmetrification mode</i>)<b>]</b></a>
<a class="intern" href="#opt-sif"><b>[-sif</b> &lt;fname&gt; (<i>SIF label file</i>)<b>]</b></a>
<a class="intern" href="#opt-etc"><b>[-etc</b> &lt;fname&gt; (<i>'etc' label file</i>)<b>]</b></a>
<a class="intern" href="#opt-etc-ai"><b>[-etc-ai</b> &lt;fname&gt; (<i>leaderless 'etc' label file</i>)<b>]</b></a>
<a class="intern" href="#opt--expect-values"><b>[--expect-values</b> (<i>expect label:weight format</i>)<b>]</b></a>
<a class="intern" href="#opt-235"><b>[-235</b> &lt;fname&gt; (<i>leader '235' label file</i>)<b>]</b></a>
<a class="intern" href="#opt-235-ai"><b>[-235-ai</b> &lt;fname&gt; (<i>leaderless '235' label file</i>)<b>]</b></a>
<a class="intern" href="#opt-packed"><b>[-packed</b> &lt;fname&gt; (<i>file/stream in binary format</i>)<b>]</b></a>
<a class="intern" href="#opt-pack-cnum"><b>[-pack-cnum</b> &lt;num&gt; (<i>set column range</i>)<b>]</b></a>
<a class="intern" href="#opt-pack-rnum"><b>[-pack-rnum</b> &lt;num&gt; (<i>set row range</i>)<b>]</b></a>
<a class="intern" href="#opt-123-max"><b>[-123-max</b> &lt;int&gt; (<i>set domain range</i>)<b>]</b></a>
<a class="intern" href="#opt-123-maxc"><b>[-123-maxc</b> &lt;int&gt; (<i>set column range</i>)<b>]</b></a>
<a class="intern" href="#opt-123-maxr"><b>[-123-maxr</b> &lt;int&gt; (<i>set row range</i>)<b>]</b></a>
<a class="intern" href="#opt-write-tab"><b>[-write-tab</b> &lt;fname&gt; (<i>save domain tab</i>)<b>]</b></a>
<a class="intern" href="#opt-write-tabc"><b>[-write-tabc</b> &lt;fname&gt; (<i>save column tab</i>)<b>]</b></a>
<a class="intern" href="#opt-write-tabr"><b>[-write-tabr</b> &lt;fname&gt; (<i>save row tab</i>)<b>]</b></a>
<a class="intern" href="#opt-strict-tab"><b>[-strict-tab</b> &lt;fname&gt; (<i>tab universe</i>)<b>]</b></a>
<a class="intern" href="#opt-strict-tabc"><b>[-strict-tabc</b> &lt;fname&gt; (<i>tabc universe</i>)<b>]</b></a>
<a class="intern" href="#opt-strict-tabr"><b>[-strict-tabr</b> &lt;fname&gt; (<i>tabr universe</i>)<b>]</b></a>
<a class="intern" href="#opt-restrict-tab"><b>[-restrict-tab</b> &lt;fname&gt; (<i>tab world</i>)<b>]</b></a>
<a class="intern" href="#opt-restrict-tabc"><b>[-restrict-tabc</b> &lt;fname&gt; (<i>tabc world</i>)<b>]</b></a>
<a class="intern" href="#opt-restrict-tabr"><b>[-restrict-tabr</b> &lt;fname&gt; (<i>tabr world</i>)<b>]</b></a>
<a class="intern" href="#opt-extend-tab"><b>[-extend-tab</b> &lt;fname&gt; (<i>tab launch</i>)<b>]</b></a>
<a class="intern" href="#opt-extend-tabc"><b>[-extend-tabc</b> &lt;fname&gt; (<i>tabc launch</i>)<b>]</b></a>
<a class="intern" href="#opt-extend-tabr"><b>[-extend-tabr</b> &lt;fname&gt; (<i>tabr launch</i>)<b>]</b></a>
<a class="intern" href="#opt--stream-log"><b>[--stream-log</b> (<i>log transform stream values</i>)<b>]</b></a>
<a class="intern" href="#opt--stream-neg-log"><b>[--stream-neg-log</b> (<i>negative log transform stream values</i>)<b>]</b></a>
<a class="intern" href="#opt--stream-neg-log10"><b>[--stream-neg-log10</b> (<i>negative log-10 transform stream values</i>)<b>]</b></a>
<a class="intern" href="#opt-stream-tf"><b>[-stream-tf</b> (<i>transform stream values</i>)<b>]</b></a>
<a class="intern" href="#opt-tf"><b>[-tf</b> &lt;tf-spec&gt; (<i>transform (not so) final matrix</i>)<b>]</b></a>
<a class="intern" href="#opt--transpose"><b>[--transpose</b> (<i>transpose</i>)<b>]</b></a>
<a class="intern" href="#opt--write-binary"><b>[--write-binary</b> (<i>output binary format</i>)<b>]</b></a>
<a class="intern" href="#opt--debug"><b>[--debug</b> (<i>debug</i>)<b>]</b></a>
<a class="intern" href="#opt-h"><b>[-h</b> (<i>print synopsis, exit</i>)<b>]</b></a>
<a class="intern" href="#opt--apropos"><b>[--apropos</b> (<i>print synopsis, exit</i>)<b>]</b></a>
<a class="intern" href="#opt--version"><b>[--version</b> (<i>print version, exit</i>)<b>]</b></a>
</p>

<a name="started"></a>
<h2>GETTING STARTED</h2>
<div class="verbatim">   mcxload --stream-mirror -abc data1.txt -o data1.mci -write-tab data1.tab
   mcxload --stream-mirror -etc data2.txt -o data2.mci -write-tab data2.tab
   mcxload --stream-mirror -sif data3.txt -o data3.mci -write-tab data3.tab</div>
<p style="margin-top:0em; margin-bottom:0em">
When the output should be an undirected graph it is safest to always use
the <tt>--stream-mirror</tt> option. Edges are stored bidirectionally as two arcs,
and this option instructs <a class="local sibling" href="mcxload.html">mcxload</a> to ensure that both arcs are present.
The above examples read three different formats. In all formats,
the basic unit of specification is an arc, specified by a source node,
a destination node, and optionally a weight. All formats are line based,
with each <b>-abc</b> line specifying a single arc and each <b>-etc</b> or <b>-sif</b> line
specifying multiple arcs that share a source node.
For <b>-abc</b> the format is</p>
<div class="verbatim">&lt;source-label&gt;    &lt;destination-label&gt;     [&lt;weight&gt;]</div>
<p style="margin-bottom:0" class="asd_par">
The last field, specifying the arc weight, is optional. If it is absent, the arc weight is
set to the default weight of 1.0. For <b>-sif</b> the format is</p>
<div class="verbatim">&lt;source-label&gt;    &lt;relation-type&gt;   &lt;destination-label&gt;   &lt;destination-label&gt;  ...</div>
<p style="margin-bottom:0" class="asd_par">
There can be an arbitrary number of destination labels. The relation type field
in the second column is required but will be ignored. As an extension it is possible
to specify weights, requiring the use of the <b>--expect-values</b> option.
Weights are specified by tagging them onto the destination label separated by a colon:</p>
<div class="verbatim">&lt;source-label&gt;    &lt;relation-type&gt;   &lt;destination-label&gt;:&lt;weight&gt;   &lt;destination-label&gt;:&lt;weight&gt;  ...</div>
<p style="margin-bottom:0" class="asd_par">
Finally, the format for the <b>-etc</b> option is the same, except that the relation type
column is dropped.</p>
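<p style="margin-bottom:0" class="asd_par">
The multi-arc line formats can be illustrated with a short Python sketch. This is not mcxload's own code:
the function name <tt>parse_multi_arc_line</tt> and its parameters are hypothetical, fields are assumed to be
split on any white-space, and arcs without an explicit weight are assumed to get the weight 1.0.</p>

```python
def parse_multi_arc_line(line, sif=False, expect_values=False):
    """Split one -etc or -sif style line into (source, destination, weight) arcs.

    Illustrative only; white-space splitting and the 1.0 default are assumptions.
    """
    fields = line.split()
    source = fields[0]
    # In SIF the second column is the relation type: required, but ignored.
    targets = fields[2:] if sif else fields[1:]
    arcs = []
    for target in targets:
        if expect_values:
            # With weights enabled each target reads label:weight.
            label, _, weight = target.rpartition(":")
            arcs.append((source, label, float(weight)))
        else:
            arcs.append((source, target, 1.0))
    return arcs
```

<p style="margin-bottom:0" class="asd_par">
Splitting on the last colon keeps labels that themselves contain a colon intact; whether mcxload
does the same is not stated in this manual.</p>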

<a name="description"></a>
<h2>DESCRIPTION</h2>
<p style="margin-top:0em; margin-bottom:0em">
<b>mcxload</b> reads label input from a file. The format of the file should be
line-based, each line containing two white-space separated strings (labels)
and optionally a number separated from the second label by whitespace. In
the absence of a value, mcxload will use the default value 1.0. If a tab is
present on an input line, mcxload will assume that the tab character is the
separator for that line. Lines for which the first non-whitespace character
is an octothorpe ('<tt>#</tt>') are skipped.</p>
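<p style="margin-bottom:0" class="asd_par">
The line rules above can be sketched in Python as follows. The function name <tt>parse_abc_line</tt> is
hypothetical, and skipping blank lines is an assumption that the text above does not state.</p>

```python
def parse_abc_line(line):
    """Return (label1, label2, weight), or None for a skipped line."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        # Comment line, or blank line (skipping blanks is an assumption).
        return None
    if "\t" in line:
        # A tab anywhere on the line makes the tab the separator for that line.
        fields = line.rstrip("\n").split("\t")
    else:
        fields = line.split()
    label1, label2 = fields[0], fields[1]
    # Without a value the default weight 1.0 is used.
    weight = float(fields[2]) if len(fields) > 2 else 1.0
    return (label1, label2, weight)
```

<p style="margin-bottom:0" class="asd_par">
The tab rule is what allows labels that contain spaces, as in a line holding
<tt>New York</tt>, a tab, and <tt>Los Angeles</tt>.</p>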
<p style="margin-bottom:0" class="asd_par">
<b>mcxload</b> will transform the labels into mcl numerical identifiers and the
pairs of labels into graph edges or equivalently matrix entries. The weight
of an edge is the value given with the corresponding pair of labels. mcxload
constructs dictionaries (sometimes just one) that map labels onto mcl
identifiers as it goes along. It can optionally write these to file. In MCL
(family) parlance, such a dictionary written to file is called a <i>tab
file</i>.</p>
<p style="margin-bottom:0" class="asd_par">
It is possible to specify numerical identifiers directly with
the <a class="intern" href="#opt-123"><b>-123</b></a> option. In this case <b>mcxload</b> assumes a canonical
domain (cf. <a class="local sibling" href="mcxio.html">mcxio</a>) and will create the minimal canonical
domain that supports the data. Also bear in mind the caveat further
below.</p>
<p style="margin-bottom:0" class="asd_par">
It is possible to effectively predeclare labels and thus enforce
an a-priori known mapping of labels onto numerical identifiers.
Labels receive an identifier in the order in which they occur
in the input. Predeclaring labels can be achieved by
having them appear in the desired order and setting the edge
weight to zero.</p>
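<p style="margin-bottom:0" class="asd_par">
A minimal Python sketch of on-the-fly dictionary construction for the single-domain case follows.
The function name <tt>assign_identifiers</tt> is hypothetical. Numbering from zero, and numbering the
first label on a line before the second, are assumptions made for illustration.</p>

```python
def assign_identifiers(edges):
    """Map labels to consecutive integers in order of first appearance.

    edges is a list of (label1, label2, weight) tuples. Identifiers start
    at zero here, which is an assumption of this sketch.
    """
    tab = {}
    entries = []
    for label1, label2, weight in edges:
        for label in (label1, label2):
            if label not in tab:
                tab[label] = len(tab)
        entries.append((tab[label1], tab[label2], weight))
    return tab, entries
```

<p style="margin-bottom:0" class="asd_par">
In this sketch, a leading input line such as <tt>z a 0</tt> fixes the identifiers of <tt>z</tt> and
<tt>a</tt> before any real edge is read, which is the predeclaration idea described above.</p>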
<p style="margin-bottom:0" class="asd_par">
A major mcxload modality is whether the input refers to a single
domain or to two separate domains. An example of the first is where
labels are names of people and the value is the extent to which they
like one another. This encodes a <i>likability</i> graph where all
the nodes represent people. The reasonable thing to do in this
case is to create a single dictionary with all names wherever
they occur. All <b>tab</b> options (as opposed to <b>tabc</b> and <b>tabr</b>)
pertain to this scenario and likewise for the options <a class="intern" href="#opt--graph"><b>--graph</b></a>
and <a class="intern" href="#opt--stream-mirror"><b>--stream-mirror</b></a>.</p>
<p style="margin-bottom:0" class="asd_par">
An example of the second mode is where the first label is again the name of
a person, the second label is the name of an animal species, and the value
is the extent to which that person appreciates the species. In this case,
the reasonable thing to do is to create two dictionaries, one for persons
and one for species. All <b>tabc</b> and <b>tabr</b> options pertain to
this scenario. The <b>tabc</b> options <i>always refer to the first label</i>
and the <b>tabr</b> options <i>always refer to the second label</i>.
The letters <b>c</b> and <b>r</b> refer to <i>column</i> and <i>row</i> respectively.
The latter are the names of the matrix domains corresponding
to the input domains. Refer to <a class="local sibling" href="mcxio.html">mcxio</a>.</p>
<p style="margin-bottom:0" class="asd_par">
A further mcxload modality is whether it constructs dictionaries
on the fly, or whether it proceeds from a tab file already
available.
By default mcxload will construct dictionaries on the fly. You
need to save them with the appropriate <b>-write</b> option(s).
All the <b>strict</b> options read a tab file
and require any labels in the <a class="intern" href="#opt-abc"><b>-abc</b>&nbsp;<i>label input</i></a>
to be present in the corresponding tab file. mcxload will then fail in
the face of absent labels.
All the <b>restrict</b> options simply ignore labels that are
not found in the corresponding tab file.
The <b>extend</b> options extend the existing tab file with
labels that are not found.
It presumably only makes sense to do so if the corresponding
<b>-write</b> options are used as well.</p>
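<p style="margin-bottom:0" class="asd_par">
The difference between the three families of tab options can be summarised in a small Python sketch.
The function <tt>resolve_label</tt> and its <tt>mode</tt> strings are hypothetical, and giving a new
label the next unused identifier under <b>extend</b> is an assumption.</p>

```python
def resolve_label(label, tab, mode):
    """Look up a label in a preloaded tab under one of three policies.

    Returns the identifier, or None when the label is to be ignored.
    """
    if label in tab:
        return tab[label]
    if mode == "strict":
        # strict: an absent label is a fatal error.
        raise KeyError("label not in tab file: " + label)
    if mode == "restrict":
        # restrict: an absent label is silently ignored.
        return None
    if mode == "extend":
        # extend: an absent label is added (next free identifier is assumed).
        tab[label] = max(tab.values(), default=-1) + 1
        return tab[label]
    raise ValueError("unknown mode: " + mode)
```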
<p style="margin-bottom:0" class="asd_par">
The input stream is deduplicated on a per-node neighbourhood basis
using the <a class="intern" href="#opt-re"><b>-re</b></a> option.</p>
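<p style="margin-bottom:0" class="asd_par">
As an illustration, the five <b>-re</b> modes (<b>max</b>, <b>add</b>, <b>mul</b>, <b>first</b>,
<b>last</b>) can be read as rules for combining repeated weights. The Python sketch below reflects the
semantics suggested by the mode names, which is an assumption. The names <tt>COMBINE</tt> and
<tt>deduplicate</tt> are hypothetical.</p>

```python
# Combination rules inferred from the mode names (an assumption).
COMBINE = {
    "max": max,
    "add": lambda old, new: old + new,
    "mul": lambda old, new: old * new,
    "first": lambda old, new: old,
    "last": lambda old, new: new,
}


def deduplicate(arcs, mode):
    """Collapse repeated (source, destination) arcs into one weight each."""
    combine = COMBINE[mode]
    merged = {}
    for source, dest, weight in arcs:
        key = (source, dest)
        if key in merged:
            merged[key] = combine(merged[key], weight)
        else:
            merged[key] = weight
    return merged
```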
<p style="margin-bottom:0" class="asd_par">
mcxload has a few options to transform or select based on
the values in the input stream and the values in the
constructed matrix. These are
<a class="intern" href="#opt--stream-log"><b>--stream-log</b></a>,
<a class="intern" href="#opt--stream-neg-log"><b>--stream-neg-log</b></a>,
<a class="intern" href="#opt--stream-neg-log10"><b>--stream-neg-log10</b></a>,
<a class="intern" href="#opt-stream-tf"><b>-stream-tf</b></a> and
<a class="intern" href="#opt-tf"><b>-tf</b></a>.
Refer to <a class="local sibling" href="mcxio.html">mcxio</a> for a description of the syntax accepted
by the latter two options &mdash; it is a syntax accepted
by a few more mcl siblings.
Finally it is possible to transpose the final result
using the <a class="intern" href="#opt--transpose"><b>--transpose</b></a> option. Keep in mind that
mcxload does not accordingly change its idea of row and
column domains.</p>
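<p style="margin-bottom:0" class="asd_par">
The three logarithmic stream transformations correspond to the following arithmetic, sketched in Python.
The function names are hypothetical, and handling of zero or negative input values is not covered.</p>

```python
import math


def stream_log(value):
    """Natural logarithm, as for --stream-log."""
    return math.log(value)


def stream_neg_log(value):
    """Negative natural logarithm, as for --stream-neg-log."""
    return -math.log(value)


def stream_neg_log10(value):
    """Negative base-10 logarithm, as for --stream-neg-log10."""
    return -math.log10(value)
```

<p style="margin-bottom:0" class="asd_par">
For a p-value of 1e-5 the negative base-10 logarithm is 5, so smaller p-values become larger edge weights.</p>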
<p style="margin-bottom:0" class="asd_par">
The final matrix can be symmetrified using the <a class="intern" href="#opt-ri"><b>-ri</b></a> option.</p>
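<p style="margin-bottom:0" class="asd_par">
A minimal Python sketch of symmetrification with the <b>-ri</b> modes <b>max</b>, <b>add</b> and
<b>mul</b> follows, using a dictionary keyed by index pair as a stand-in for a sparse matrix.
The function name <tt>symmetrify</tt> and the dictionary representation are assumptions. The treatment of
missing entries as one under <b>mul</b> follows the description of the <b>-ri</b> option.</p>

```python
def symmetrify(matrix, mode):
    """Combine a sparse matrix {(i, j): value} with its transpose."""
    combine = {
        "max": max,
        "add": lambda a, b: a + b,
        "mul": lambda a, b: a * b,
    }[mode]
    # mul treats a missing entry as one; max and add treat it as zero.
    missing = 1.0 if mode == "mul" else 0.0
    keys = set(matrix) | {(j, i) for (i, j) in matrix}
    result = {}
    for (i, j) in keys:
        result[(i, j)] = combine(matrix.get((i, j), missing),
                                 matrix.get((j, i), missing))
    return result
```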
<p style="margin-bottom:0" class="asd_par">
The <a class="intern" href="#opt-etc"><b>-etc</b>, <b>-235</b></a> and <b>-sif</b> options
assume a format where all entries for a given
column (or equivalently all neighbours for a given node) are joined onto a
single line. This can be useful e.g. to read in externally generated
clusterings. The <b>-etc</b> and <b>-sif</b> options expect label
input, whereas the <b>-235</b> options expect numbers in the input that
are mapped directly onto mcl numerical identifiers.
The <span class="smallcaps">SIF</span> format expected by <b>-sif</b> requires a <i>relationship type</i>
in the second field on each line; this is ignored.
As an extension to the <span class="smallcaps">SIF</span> format
weights may optionally follow the labels, separated from them with a colon character.
</p>
<p style="margin-bottom:0"><b>CAVEAT</b><br>
Please note that when the line '1000000000 1' is fed to <b>mcxload</b> with either
the <b>-235</b> or the <b>-123</b> option, it will try to allocate a
matrix with one billion columns. This is most likely not what is wanted.
Assuming that the input contains fewer than one billion unique labels, one
should use the label options as described above and below.
</p>
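<p style="margin-bottom:0" class="asd_par">
The arithmetic behind this caveat can be sketched in Python. A canonical domain is taken here to be the
range from zero up to the largest identifier seen, and the <b>-123-max</b> family of options is modelled
as a filter on identifiers. Both function names are hypothetical, and dropping the whole input pair when
either identifier is out of range is an assumption.</p>

```python
def canonical_domain_size(identifiers):
    """Size of the smallest range 0..N-1 that contains all identifiers."""
    return max(identifiers) + 1


def apply_123_max(pairs, limit):
    """Keep only pairs whose identifiers are both below limit.

    Models -123-max governing both domains; dropping the whole pair is
    an assumption of this sketch.
    """
    return [(a, b) for (a, b) in pairs if a < limit and b < limit]
```

<p style="margin-bottom:0" class="asd_par">
For the input line '1000000000 1' the first function returns 1000000001, which is why a single large
identifier is enough to request a very large matrix.</p>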
<p style="margin-bottom:0"><b>STAGES</b><br>
Conceptually, input matrix creation consists of the following stages</p>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_compact"><div class=" item_rightalign " style="right:-2em">i</div></div>
<div class=" item_text " style="margin-left:4em">
<p style="margin-top:0em; margin-bottom:0em">
Read the input stream, apply <a class="intern" href="#opt-stream-tf"><b>-stream-tf</b></a> transformation
specification, and optionally push reverse elements
(<a class="intern" href="#opt--stream-mirror"><b>--stream-mirror</b></a>).</p>
</div>
<div class=" item_compact"><div class=" item_rightalign " style="right:-2em">ii</div></div>
<div class=" item_text " style="margin-left:4em">
<p style="margin-top:0em; margin-bottom:0em">
Deduplicate edges in the context of all edges/arcs originating from
a given node according to the <a class="intern" href="#opt-re"><b>-re</b></a> option.</p>
</div>
<div class=" item_compact"><div class=" item_rightalign " style="right:-2em">iii</div></div>
<div class=" item_text " style="margin-left:4em">
<p style="margin-top:0em; margin-bottom:0em">
Apply transpose symmetrification according to the
<a class="intern" href="#opt-ri"><b>-ri</b></a> option, if used.</p>
</div>
<div class=" item_compact"><div class=" item_rightalign " style="right:-2em">iv</div></div>
<div class=" item_text " style="margin-left:4em">
<p style="margin-top:0em; margin-bottom:0em">
Apply <a class="intern" href="#opt-tf"><b>-tf</b></a> transformation specification.</p>
</div>
</div>
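<p style="margin-bottom:0" class="asd_par">
The order of the four stages can be made concrete with a Python sketch. It is a conceptual model only:
the function <tt>build_matrix</tt> and its parameters are hypothetical, transformations are passed in as
plain Python callables rather than as mcxio transformation specifications, and missing entries are taken
as zero during symmetrification, which suits <b>max</b> and <b>add</b> but not <b>mul</b>.</p>

```python
def build_matrix(arcs, dedup, stream_tf=None, mirror=False,
                 symmetrify=None, final_tf=None):
    """Apply the four conceptual stages to a list of (src, dst, weight) arcs."""
    # Stage i: transform stream values, then optionally push reverse arcs.
    stream = []
    for src, dst, weight in arcs:
        if stream_tf is not None:
            weight = stream_tf(weight)
        stream.append((src, dst, weight))
        if mirror:
            stream.append((dst, src, weight))

    # Stage ii: deduplicate repeated arcs with the two-argument rule dedup.
    matrix = {}
    for src, dst, weight in stream:
        key = (src, dst)
        matrix[key] = dedup(matrix[key], weight) if key in matrix else weight

    # Stage iii: combine with the transpose (missing entries taken as zero).
    if symmetrify is not None:
        keys = set(matrix) | {(d, s) for (s, d) in matrix}
        matrix = {
            (s, d): symmetrify(matrix.get((s, d), 0.0), matrix.get((d, s), 0.0))
            for (s, d) in keys
        }

    # Stage iv: transform the assembled matrix values.
    if final_tf is not None:
        matrix = {key: final_tf(value) for key, value in matrix.items()}
    return matrix
```

<p style="margin-bottom:0" class="asd_par">
Because mirroring happens before deduplication, an input that lists an edge in both directions with
different weights yields duplicates in stage ii, and the deduplication rule decides which weight survives.</p>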

<a name="options"></a>
<h2>OPTIONS</h2>
<div class=" itemize " style="margin-top:1em; font-size:100%">
<div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-abc"></a><b>-abc</b> &lt;fname&gt; (<i>label file</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
The file to read label data from. Labels are separated by white-space. The
labels may optionally be followed by a value (again separated by
white-space), which is taken as the edge weight between the nodes
corresponding with the labels. If a tab is present on an input line it is
presumed to be the separator for that line, including the value if present.
Lines for which the first non-blank character is the octothorpe ('<tt>#</tt>')
are skipped.
</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-123"></a><b>-123</b> &lt;fname&gt; (<i>identifier file</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
The file to read numerical data from. The format is the same as
for label data, but the identifiers are directly mapped onto mcl identifiers
as described earlier.
</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-o"></a><b>-o</b> &lt;fname&gt; (<i>output file</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">The output file where the constructed matrix is written.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--stream-mirror"></a><b>--stream-mirror</b> (<i>symmetrify, same domain</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Whenever <i>label1</i> <i>label2</i> <i>value</i>
is encountered in the input, mcxload inserts
<i>label2</i> <i>label1</i> <i>value</i> in the input
stream as well. This option implies that both labels
belong to the same domain.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--stream-split"></a><b>--stream-split</b> (<i>assume different domains</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
This tells mcxload that the two labels belong to different domains.
The program will create two tab files, one for columns and one
for rows. This can be used for example to create a logical mapping of
gene identifiers to species identifiers.
</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-re"></a><b>-re</b> &lt;max|add|mul|first|last&gt; (<i>deduplication mode</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
This specifies how mcxload should collapse repeated entries, that is edges
for which a value is specified multiple times. This is done relative to a
single node at a time, taking into account all neighbours assembled from the
input stream. Note that <a class="intern" href="#opt--stream-mirror"><b>--stream-mirror</b></a> will result in
duplicated entries if the input contains edge specifications in both directions.
Also note that <b>first</b> and <b>last</b> might not result in
symmetric input if only <b>--stream-mirror</b> is used.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-write-tab"></a><b>-write-tab</b> &lt;fname&gt; (<i>save domain tab</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Write the domain to file. It applies to both label types.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-write-tabc"></a><b>-write-tabc</b> &lt;fname&gt; (<i>save column tab</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Write the column domain to file. It applies to the first label found
on each input line.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-write-tabr"></a><b>-write-tabr</b> &lt;fname&gt; (<i>save row tab</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Write the row domain to file. It applies to the second label found
on each input line.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-strict-tab"></a><b>-strict-tab</b> &lt;fname&gt; (<i>tab universe</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Read a dictionary from file and require each label to be present in the
dictionary. mcxload will exit on absentees.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-strict-tabc"></a><b>-strict-tabc</b> &lt;fname&gt; (<i>tabc universe</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Read a dictionary from file and require the first label on each line
to be present in the dictionary. mcxload will exit on absentees.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-strict-tabr"></a><b>-strict-tabr</b> &lt;fname&gt; (<i>tabr universe</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Read a dictionary from file and require the second label on each line
to be present in the dictionary. mcxload will exit on absentees.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-restrict-tab"></a><b>-restrict-tab</b> &lt;fname&gt; (<i>tab world</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Read a dictionary from file and only accept input lines (edges)
for which both labels are present in the dictionary.
mcxload will ignore absentees.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-restrict-tabc"></a><b>-restrict-tabc</b> &lt;fname&gt; (<i>tabc world</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Read a dictionary from file and ignore input lines
for which the first label is absent from the dictionary.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-restrict-tabr"></a><b>-restrict-tabr</b> &lt;fname&gt; (<i>tabr world</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Read a dictionary from file and ignore input lines
for which the second label is absent from the dictionary.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-extend-tab"></a><b>-extend-tab</b> &lt;fname&gt; (<i>tab launch</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Read a dictionary from file and extend it with any
label from the input not yet present in the dictionary.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-extend-tabc"></a><b>-extend-tabc</b> &lt;fname&gt; (<i>tabc launch</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Read a dictionary from file and extend it with all
first labels from the input not yet present in the dictionary.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-extend-tabr"></a><b>-extend-tabr</b> &lt;fname&gt; (<i>tabr launch</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Read a dictionary from file and extend it with all
second labels from the input not yet present in the dictionary.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-123-max"></a><b>-123-max</b> &lt;int&gt; (<i>set domain range</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-123-maxc"></a><b>-123-maxc</b> &lt;int&gt; (<i>set column range</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-123-maxr"></a><b>-123-maxr</b> &lt;int&gt; (<i>set row range</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
These options limit the domain ranges accepted by the <b>-123</b> option.
Numbers greater than or equal to <i>&lt;int&gt;</i> will be ignored, so the domain(s)
will range from zero up to one less than <i>&lt;int&gt;</i>.
The first, <b>-123-max</b> governs both domains, and <b>-123-maxc</b>
and <b>-123-maxr</b> respectively govern the column and row domain.
</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--stream-log"></a><b>--stream-log</b> (<i>log transform stream values</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Replace each entry by its natural logarithm.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--stream-neg-log"></a><b>--stream-neg-log</b> (<i>negative log transform stream values</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--stream-neg-log10"></a><b>--stream-neg-log10</b> (<i>negative log-10 transform stream values</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Replace each entry by the negative of its natural logarithm or the negative of its
base-10 logarithm, respectively.
This is for example useful to convert scores that denote probabilities
or p-values such as BLAST scores.
</p>
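<p style="margin-bottom:0" class="asd_par">
The effect of the two transforms on p-value-like input can be sketched in
Python as follows. The helper name <tt>neg_log</tt> is hypothetical, and the
sketch says nothing about how mcxload itself treats zero or negative input
values.
</p>

```python
import math


def neg_log(value, base10=False):
    """Return -ln(value), or -log10(value) when base10 is True."""
    return -math.log10(value) if base10 else -math.log(value)


p_values = [1e-5, 1e-50, 0.01]
# Small p-values become large positive weights.
scores_log10 = [neg_log(p, base10=True) for p in p_values]
scores_ln = [neg_log(p) for p in p_values]
```

<p style="margin-bottom:0" class="asd_par">
Smaller p-values thus yield larger edge weights, which is the orientation a
similarity graph normally needs.
</p>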
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-stream-tf"></a><b>-stream-tf</b> (<i>transform stream values</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Transform the stream values as they are read in according
to the syntax described in <a class="local sibling" href="mcxio.html">mcxio</a>.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-tf"></a><b>-tf</b> &lt;tf-spec&gt; (<i>transform (not so) final matrix</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Transform the matrix values after deduplication and symmetrification
according to the syntax described in <a class="local sibling" href="mcxio.html">mcxio</a>.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt-ri"></a><b>-ri</b> (<i>&lt;max|add|mul&gt;</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
After the initial matrix has been assembled, it can be symmetrified using
one of these modes. The mode indicates the operation used to combine the
entries of the transposed matrix and the original matrix. <b>mul</b>
is special in that it treats missing entries (which are normally considered
zero in mcl matrix operations) as one.</p>
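<p style="margin-bottom:0" class="asd_par">
The three modes can be illustrated with a minimal Python sketch on a sparse
matrix stored as a dictionary. This is an illustration under assumptions, not
mcxload's code: the function name <tt>symmetrify</tt> and the dictionary
representation are hypothetical, and the handling of diagonal entries in
mcxload may differ from what is shown.
</p>

```python
def symmetrify(matrix, mode):
    """Combine a sparse matrix with its transpose.

    matrix: dict mapping (row, col) -> value; absent keys are missing entries.
    mode: 'max', 'add' or 'mul'.
    """
    combine = {
        "max": max,
        "add": lambda a, b: a + b,
        "mul": lambda a, b: a * b,
    }[mode]
    # Missing entries count as zero, except for 'mul' where they count as one.
    missing = 1.0 if mode == "mul" else 0.0
    result = {}
    # Visit every position that is present in the matrix or in its transpose.
    for (r, c) in set(matrix) | {(c, r) for (r, c) in matrix}:
        a = matrix.get((r, c), missing)
        b = matrix.get((c, r), missing)
        # Note: a diagonal entry is combined with itself here (assumption).
        result[(r, c)] = combine(a, b)
    return result


m = {(0, 1): 2.0, (1, 0): 3.0, (0, 2): 4.0}
sym_max = symmetrify(m, "max")
sym_add = symmetrify(m, "add")
sym_mul = symmetrify(m, "mul")
```

<p style="margin-bottom:0" class="asd_par">
In this example the entry at (0, 2) has no counterpart at (2, 0). With
<b>mul</b> the missing counterpart counts as one, so the value 4.0 is
preserved rather than multiplied by zero.
</p>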
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--transpose"></a><b>--transpose</b> (<i>transpose</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Write the transposed matrix to file. This is obviously not useful
when a symmetric matrix has been generated.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-etc"></a><b>-etc</b> &lt;fname&gt; (<i>'etc' label file</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-etc-ai"></a><b>-etc-ai</b> &lt;fname&gt; (<i>leaderless 'etc' label file</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-235"></a><b>-235</b> &lt;fname&gt; (<i>'235' label file</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-235-ai"></a><b>-235-ai</b> &lt;fname&gt; (<i>leaderless '235' label file</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-sif"></a><b>-sif</b> &lt;fname&gt; (<i>SIF label file</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt--expect-values"></a><b>--expect-values</b> (<i>expect label:weight format</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
The input is read in lines; each line is split on whitespace into labels.
For <b>-etc</b> the first label is interpreted as the source node. All
other labels are interpreted as destination nodes.
Weights may optionally follow the labels, separated from them by a colon character.
In this case the <b>--expect-values</b> option is required.
The <span class="smallcaps">SIF</span> (Simple Interaction File) format expected by <b>-sif</b> is
similar except that it contains an additional field. In this format the
second column denotes the <i>relationship type</i>. It is ignored by <a class="local sibling" href="mcxload.html">mcxload</a>.
For <b>-etc-ai</b> (<i>auto-increment</i>) all labels are interpreted as
destination nodes and mcxload automatically creates a source node for each
line it reads. This option can be useful to read in files encoding a
clustering, where each line represents a cluster of white-space separated
labels.
</p>
<p style="margin-bottom:0" class="asd_par">
The <b>-235</b> options are similar except that the input is not
interpreted as labels but must consist of numbers that explicitly
specify the matrix to be built.</p>
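<p style="margin-bottom:0" class="asd_par">
The line-based label formats described above can be approximated by the
following Python sketch. It is a reading aid rather than a description of
mcxload's parser: the function name <tt>parse_etc</tt>, the default weight of
1.0, and the numbering of the automatically created source nodes are all
assumptions.
</p>

```python
def parse_etc(lines, expect_values=False, auto_increment=False):
    """Turn 'etc'-style lines into (source, destination, weight) edges."""
    edges = []
    next_source = 0  # counter for synthetic source nodes (assumed numbering)
    for line in lines:
        fields = line.split()  # split on whitespace into labels
        if not fields:
            continue  # skip blank lines
        if auto_increment:
            # Every label is a destination; the source is created per line.
            source, targets = next_source, fields
            next_source += 1
        else:
            # The first label is the source, the rest are destinations.
            source, targets = fields[0], fields[1:]
        for field in targets:
            if expect_values and ":" in field:
                label, _, weight = field.rpartition(":")
                weight = float(weight)
            else:
                label, weight = field, 1.0  # default weight is an assumption
            edges.append((source, label, weight))
    return edges


weighted = parse_etc(["a b:0.5 c:2"], expect_values=True)
clusters = parse_etc(["x y", "z"], auto_increment=True)
```

<p style="margin-bottom:0" class="asd_par">
The second call mimics reading a clustering, where each line is one cluster
and receives its own source node.
</p>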
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-packed"></a><b>-packed</b> &lt;fname&gt; (<i>file/stream in binary format</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-pack-cnum"></a><b>-pack-cnum</b> &lt;num&gt; (<i>set column range</i>)</div><div class=" item_cascade item_leftalign nowrap" ><a name="opt-pack-rnum"></a><b>-pack-rnum</b> &lt;num&gt; (<i>set row range</i>)</div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
The <b>-packed</b> option allows machine-readable (binary) data to be read
directly. The data must correspond to the data types for indices
and values with which MCL was compiled. The use of <b>-pack-cnum</b>
and <b>-pack-rnum</b> is required to set the limits of
the ranges of indices that will be read.
</p>
<p style="margin-bottom:0" class="asd_par">
The <tt>/scripts</tt> directory of the MCL software contains scripts
<tt>packed-example.sh</tt> and <tt>packed-example2.sh</tt>. The first shows the simple
binary format that is accepted by <b>-packed</b>. It also documents the
required include files and library and the method by which they can be
referenced and linked to. The second expands on the first example by
multiplexing binary output onto multiple output streams. Each output stream
is read and loaded by an independent <i>mcxload</i> instance. The final result
is obtained by summing the individual matrices. This can be used to speed up
the loading of large data by parallelisation.
</p>
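<p style="margin-bottom:0" class="asd_par">
The idea behind the second script, splitting the input over several loaders
and summing the partial matrices, can be sketched in Python. This does not
reproduce the binary format or the scripts themselves: the names
<tt>load_part</tt> and <tt>sum_parts</tt>, the round-robin split, and the
choice to add duplicate entries within a part are assumptions made for
illustration.
</p>

```python
from collections import defaultdict


def load_part(entries):
    """Build one partial sparse matrix from (column, row, value) triples."""
    part = defaultdict(float)
    for col, row, val in entries:
        part[(col, row)] += val  # duplicates are added (assumption)
    return part


def sum_parts(parts):
    """Sum partial matrices entry by entry into one matrix."""
    total = defaultdict(float)
    for part in parts:
        for key, val in part.items():
            total[key] += val
    return dict(total)


entries = [(0, 1, 1.0), (1, 2, 2.0), (0, 1, 0.5), (2, 0, 3.0)]
# Multiplex the entries over two streams in round-robin fashion.
streams = [entries[i::2] for i in range(2)]
# Each stream would be loaded by an independent process; here it is sequential.
parts = [load_part(stream) for stream in streams]
total = sum_parts(parts)
```

<p style="margin-bottom:0" class="asd_par">
Because addition is associative and commutative, the summed result does not
depend on how the entries were distributed over the streams.
</p>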
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--write-binary"></a><b>--write-binary</b> (<i>output binary format</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
The output matrix is written in native binary format &mdash; refer to
<a class="local sibling" href="mcxio.html">mcxio</a>.</p>
</div>
<div style="margin-top:0em">&nbsp;</div><div class=" item_cascade"><div class=" item_leftalign nowrap " ><a name="opt--debug"></a><b>--debug</b> (<i>debug</i>)</div></div>
<div class=" item_text " style="margin-left:2em">
<p style="margin-top:0em; margin-bottom:0em">
Among other things, this turns on warnings when <b>restrict</b> tab
files are used and labels are found to be missing.</p>
</div>
</div>

<a name="author"></a>
<h2>AUTHOR</h2>
<p style="margin-top:0em; margin-bottom:0em">
Stijn van Dongen.</p>

<a name="seealso"></a>
<h2>SEE ALSO</h2>
<p style="margin-top:0em; margin-bottom:0em">
<a class="local sibling" href="mcxio.html">mcxio</a>,
<a class="local sibling" href="mcxdump.html">mcxdump</a>,
<a class="local sibling" href="mcl.html">mcl</a>,
<a class="local sibling" href="mclfaq.html">mclfaq</a>,
and <a class="local sibling" href="mclfamily.html">mclfamily</a> for an overview of all the documentation
and the utilities in the mcl family.</p>
</body>
</html>