<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<meta name="generator" content="Doxygen 1.8.13"/>
<meta name="viewport" content="width=device-width, initial-scale=1"/>
<title>Doxygen: Doxygen's Internals</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="navtree.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="resize.js"></script>
<script type="text/javascript" src="navtreedata.js"></script>
<script type="text/javascript" src="navtree.js"></script>
<script type="text/javascript">
$(document).ready(initResizable);
</script>
<link href="doxygen_manual.css" rel="stylesheet" type="text/css" />
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 56px;">
<td id="projectalign" style="padding-left: 0.5em;">
<div id="projectname">Doxygen
</div>
</td>
</tr>
</tbody>
</table>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.8.13 -->
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
<div id="nav-tree">
<div id="nav-tree-contents">
<div id="nav-sync" class="sync"></div>
</div>
</div>
<div id="splitbar" style="-moz-user-select:none;"
class="ui-resizable-handle">
</div>
</div>
<script type="text/javascript">
$(document).ready(function(){initNavTree('arch.html','');});
</script>
<div id="doc-content">
<div class="header">
<div class="headertitle">
<div class="title">Doxygen's Internals </div> </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><h3>Doxygen's internals</h3>
<p><b>Note that this section is still under construction!</b></p>
<p>The following picture shows how source files are processed by doxygen.</p>
<div class="image">
<img src="archoverview.gif" alt="archoverview.gif"/>
<div class="caption">
Data flow overview</div></div>
<p>The following sections explain the steps above in more detail.</p>
<h3>Config parser</h3>
<p>The configuration file that controls the settings of a project is parsed and the settings are stored in the singleton class <code>Config</code> in <code>src/config.h</code>. The parser itself is written using <code>flex</code> and can be found in <code>src/config.l</code>. This parser is also used directly by <code>doxywizard</code>, so it is put in a separate library.</p>
<p>Each configuration option has one of 5 possible types: <code>String</code>, <code>List</code>, <code>Enum</code>, <code>Int</code>, or <code>Bool</code>. The values of these options are available through the global functions <code>Config_getXXX()</code>, where <code>XXX</code> is the type of the option. The argument of these functions is a string naming the option as it appears in the configuration file. For instance, <code>Config_getBool("GENERATE_TESTLIST")</code> returns a reference to a boolean value that is <code>TRUE</code> if the test list was enabled in the config file.</p>
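<p>The idea of typed accessors on a singleton can be sketched in a few lines of Python. This is only an illustrative model, not doxygen's C++ code: the <code>Config</code> class below, its <code>get_bool()</code>, <code>get_int()</code> and <code>get_string()</code> methods, and the stored values are assumptions made for the example.</p>

```python
class Config:
    """Toy singleton that stores parsed options and hands them out by type."""

    _instance = None

    def __init__(self):
        # Option name to value, as a config parser might have filled it in.
        # The values are invented for this sketch.
        self._options = {
            "GENERATE_TESTLIST": True,   # Bool option
            "TAB_SIZE": 4,               # Int option
            "PROJECT_NAME": "Demo",      # String option
        }

    @classmethod
    def instance(cls):
        # Create the single shared object on first use.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def _get(self, name, expected_type):
        value = self._options[name]
        # Reject lookups through the wrong typed accessor. type() is used
        # instead of isinstance() because bool is a subclass of int.
        if type(value) is not expected_type:
            raise TypeError(name + " is not of type " + expected_type.__name__)
        return value

    def get_bool(self, name):
        return self._get(name, bool)

    def get_int(self, name):
        return self._get(name, int)

    def get_string(self, name):
        return self._get(name, str)


config = Config.instance()
print(config.get_bool("GENERATE_TESTLIST"))
print(config.get_int("TAB_SIZE"))
```

<p>Asking for an option through the wrong accessor, for example <code>config.get_int("GENERATE_TESTLIST")</code>, raises a <code>TypeError</code> in this sketch, which mirrors the point of having one accessor per option type.</p>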
<p>The function <code>readConfiguration()</code> in <code>src/doxygen.cpp</code> reads the command line options and then calls the configuration parser.</p>
<h3>C Preprocessor</h3>
<p>The input files mentioned in the config file are (by default) fed to the C Preprocessor (after being piped through a user-defined filter, if available).</p>
<p>The way the preprocessor works differs somewhat from a standard C Preprocessor. By default it does not do macro expansion, although it can be configured to expand all macros. Typical usage is to expand only a user-specified set of macros. This allows macro names to appear, for instance, in the types of function parameters.</p>
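<p>Expanding only a chosen set of macros can be illustrated with a small Python sketch. It is a simplification under stated assumptions and not the logic of <code>src/pre.l</code>: it handles object-like macros only, does not skip strings or comments, and the macro table <code>EXPAND_ONLY</code> and the function <code>expand_selected()</code> are hypothetical names.</p>

```python
import re

# Hypothetical set of macros the user asked to expand (name to replacement text).
EXPAND_ONLY = {"DECLARE_API": "", "MY_INT": "int"}

# Matches one whole C identifier at a time.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def expand_selected(line, macros):
    """Replace only identifiers listed in macros; leave all others untouched."""
    def substitute(match):
        name = match.group(0)
        return macros.get(name, name)
    return IDENTIFIER.sub(substitute, line)


source = "DECLARE_API MY_INT compute(OTHER_MACRO value);"
print(expand_selected(source, EXPAND_ONLY))
```

<p>In this sketch <code>OTHER_MACRO</code> is left as it is because it is not in the table, so it still appears in the parameter type of the result.</p>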
<p>Another difference is that the preprocessor parses, but does not actually include, code when it encounters a <code>#include</code> (with the exception of a <code>#include</code> found inside { ... } blocks). The reason for this deviation from the standard is to prevent feeding multiple definitions of the same functions/classes to doxygen's parser. If all source files included a common header file, for instance, the class and type definitions (and their documentation) would be present in each translation unit.</p>
<p>The preprocessor is written using <code>flex</code> and can be found in <code>src/pre.l</code>. Conditional blocks (<code>#if</code>) require the evaluation of constant expressions. For this a <code>yacc</code>-based parser is used, which can be found in <code>src/constexp.y</code> and <code>src/constexp.l</code>.</p>
<p>The preprocessor is invoked for each file using the <code>preprocessFile()</code> function declared in <code>src/pre.h</code>, and will append the preprocessed result to a character buffer. The format of the character buffer is</p>
<pre class="fragment">0x06 file name 1
0x06 preprocessed contents of file 1
...
0x06 file name n
0x06 preprocessed contents of file n
</pre><h3>Language parser</h3>
<p>The preprocessed input buffer is fed to the language parser, which is implemented as a big state machine using <code>flex</code>. It can be found in the file <code>src/scanner.l</code>. There is one parser for all languages (C/C++/Java/IDL). The state variables <code>insideIDL</code> and <code>insideJava</code> are used in some places for language-specific choices.</p>
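<p>The buffer handed to the parser uses the 0x06-delimited layout described in the preprocessor section. The following Python sketch shows one way such a buffer could be built and taken apart again. It assumes that every field is introduced by a single 0x06 byte and that file contents never contain that byte; the helper names <code>build_buffer()</code> and <code>split_buffer()</code> are invented for the example and do not exist in doxygen.</p>

```python
MARKER = b"\x06"  # the 0x06 byte that introduces each field


def build_buffer(files):
    """Concatenate (name, contents) pairs into one marker-delimited buffer."""
    buffer = bytearray()
    for name, contents in files:
        buffer += MARKER + name.encode("utf-8")
        buffer += MARKER + contents.encode("utf-8")
    return bytes(buffer)


def split_buffer(buffer):
    """Recover the (name, contents) pairs from a buffer built as above."""
    fields = buffer.split(MARKER)[1:]  # drop the empty piece before the first marker
    names = fields[0::2]
    bodies = fields[1::2]
    return [(n.decode("utf-8"), b.decode("utf-8")) for n, b in zip(names, bodies)]


files = [("a.h", "struct A {};\n"), ("b.cpp", "int f();\n")]
assert split_buffer(build_buffer(files)) == files
```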
<p>The task of the parser is to convert the input buffer into a tree of entries (basically an abstract syntax tree). An entry is defined in <code>src/entry.h</code> and is a blob of loosely structured information. The most important field is <code>section</code> which specifies the kind of information contained in the entry.</p>
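<p>A tree of loosely structured entries can be pictured with the following Python sketch. It is a model for illustration only: the real <code>Entry</code> class in <code>src/entry.h</code> is C++ and has many more fields, and the string section labels, the <code>add()</code> method and the <code>walk()</code> helper used here are assumptions.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Loosely structured node; 'section' says what kind of thing it describes."""
    section: str                  # e.g. "class" or "function" (labels invented here)
    name: str = ""
    brief: str = ""               # brief documentation, if any
    children: list = field(default_factory=list)

    def add(self, child):
        # Attach a child entry and return it so calls can be chained.
        self.children.append(child)
        return child


def walk(entry, depth=0):
    """Yield (depth, entry) pairs in depth-first order."""
    yield depth, entry
    for child in entry.children:
        yield from walk(child, depth + 1)


root = Entry(section="file", name="shape.h")
shape = root.add(Entry(section="class", name="Shape", brief="Base class."))
shape.add(Entry(section="function", name="area"))

for depth, entry in walk(root):
    print("  " * depth + entry.section + " " + entry.name)
```

<p>Later steps would look at <code>section</code> first to decide how to interpret the remaining fields of each entry.</p>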
<p>Possible improvements for future versions:</p><ul>
<li>Use one scanner/parser per language instead of one big scanner.</li>
<li>Move the first pass parsing of documentation blocks to a separate module.</li>
<li>Parse defines (these are currently gathered by the preprocessor, and ignored by the language parser).</li>
</ul>
<h3>Data organizer</h3>
<p>This step consists of many smaller steps that build dictionaries of the extracted classes, files, namespaces, variables, functions, packages, pages, and groups. Besides building dictionaries, this step also computes relations (such as inheritance relations) between the extracted entities.</p>
<p>Each step has a function defined in <code>src/doxygen.cpp</code>, which operates on the tree of entries built during language parsing. Look at the "Gathering information" part of <code>parseInput()</code> for details.</p>
<p>The result of this step is a number of dictionaries, which can be found in the doxygen "namespace" defined in <code>src/doxygen.h</code>. Most elements of these dictionaries are derived from the class <code>Definition</code>. The class <code>MemberDef</code>, for instance, holds all information for a member. An instance of such a class can be part of a file (class <code>FileDef</code>), a class (class <code>ClassDef</code>), a namespace (class <code>NamespaceDef</code>), a group (class <code>GroupDef</code>), or a Java package (class <code>PackageDef</code>).</p>
<h3>Tag file parser</h3>
<p>If tag files are specified in the configuration file, these are parsed by a SAX-based XML parser, which can be found in <code>src/tagreader.cpp</code>. The result of parsing a tag file is the insertion of <code>Entry</code> objects in the entry tree. The field <code>Entry::tagInfo</code> is used to mark the entry as external, and holds information about the tag file.</p>
<h3>Documentation parser</h3>
<p>Special comment blocks are stored as strings in the entities that they document. There is a string for the brief description and a string for the detailed description. The documentation parser reads these strings and executes the commands it finds in them (this is the second pass in parsing the documentation). It writes the result directly to the output generators.</p>
<p>The parser is written in C++ and can be found in <code>src/docparser.cpp</code>. The tokens consumed by the parser come from <code>src/doctokenizer.l</code>. Code fragments found in the comment blocks are passed on to the source parser.</p>
<p>The main entry point for the documentation parser is <code>validatingParseDoc()</code> declared in <code>src/docparser.h</code>. For simple texts with special commands <code>validatingParseText()</code> is used.</p>
<h3>Source parser</h3>
<p>If source browsing is enabled or if code fragments are encountered in the documentation, the source parser is invoked.</p>
<p>The code parser tries to cross-reference the source code it parses with documented entities. It also does syntax highlighting of the sources. The output is written directly to the output generators.</p>
<p>The main entry point for the code parser is <code>parseCode()</code> declared in <code>src/code.h</code>.</p>
<h3>Output generators</h3>
<p>After data is gathered and cross-referenced, doxygen generates output in various formats. For this it uses the methods provided by the abstract class <code>OutputGenerator</code>. In order to generate output for multiple formats at once, the methods of <code>OutputList</code> are called instead. This class maintains a list of concrete output generators, where each method called is delegated to all generators in the list.</p>
<p>To allow small deviations in what is written to the output for each concrete output generator, it is possible to temporarily disable certain generators. The <code>OutputList</code> class contains various <code>disable()</code> and <code>enable()</code> methods for this. The methods <code>OutputList::pushGeneratorState()</code> and <code>OutputList::popGeneratorState()</code> are used to temporarily save the set of enabled/disabled output generators on a stack.</p>
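<p>The delegation and the state stack can be modelled with a short Python sketch. It only mirrors the behaviour described above and is not doxygen's implementation: <code>TextGenerator</code>, <code>write_string()</code> and the string keys used to select a generator are assumptions, and the method names are Python-style renderings of the C++ ones.</p>

```python
class TextGenerator:
    """Stand-in for a concrete output generator; it just records what it is told."""

    def __init__(self, kind):
        self.kind = kind
        self.enabled = True
        self.written = []

    def write_string(self, text):
        self.written.append(text)


class OutputList:
    """Forwards each call to every enabled generator in the list."""

    def __init__(self, generators):
        self._generators = list(generators)
        self._saved_states = []  # stack of enabled-flag snapshots

    def write_string(self, text):
        for gen in self._generators:
            if gen.enabled:
                gen.write_string(text)

    def disable(self, kind):
        for gen in self._generators:
            if gen.kind == kind:
                gen.enabled = False

    def enable(self, kind):
        for gen in self._generators:
            if gen.kind == kind:
                gen.enabled = True

    def push_generator_state(self):
        # Remember which generators are currently enabled.
        self._saved_states.append([gen.enabled for gen in self._generators])

    def pop_generator_state(self):
        # Restore the most recently saved set of enabled flags.
        for gen, enabled in zip(self._generators, self._saved_states.pop()):
            gen.enabled = enabled


html = TextGenerator("html")
latex = TextGenerator("latex")
out = OutputList([html, latex])

out.write_string("shared text")
out.push_generator_state()
out.disable("latex")
out.write_string("HTML-only text")
out.pop_generator_state()
out.write_string("shared again")

print(html.written)
print(latex.written)
```

<p>The push/pop pair lets a caller switch generators off for a few calls without having to remember which ones were enabled beforehand.</p>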
<p>The XML is generated directly from the gathered data structures. In the future XML will be used as an intermediate language (IL). The output generators will then use this IL as a starting point to generate the specific output formats. The advantage of having an IL is that independently developed tools, written in various languages, could extract information from the XML output. Possible tools could be:</p><ul>
<li>an interactive source browser</li>
<li>a class diagram generator</li>
<li>a tool for computing code metrics</li>
</ul>
<h3>Debugging</h3>
<p>Since doxygen uses a lot of <code>flex</code> code, it is important to understand how <code>flex</code> works (the <code>man</code> page is a good starting point) and what it is doing while it parses some input. Fortunately, when <code>flex</code> is used with the <code>-d</code> option, it outputs which rules matched. This makes it quite easy to follow what is going on for a particular input fragment.</p>
<p>To make it easier to toggle debug information for a given flex file, I wrote the following Perl script, which automatically adds or removes <code>-d</code> on the correct line of the generated makefile (<code>build.make</code>):</p>
<pre class="fragment">#!/usr/bin/perl
use strict;
use warnings;

# Toggle the flex -d flag for one scanner, e.g. "scanner" for ../src/scanner.l.
# Run from the build directory, so that ../src holds the sources.
my $file = shift @ARGV or die "Usage: $0 name (for ../src/name.l)\n";
my $lexfile = "../src/${file}.l";
my $make    = "src/CMakeFiles/_doxygen.dir/build.make";

print "Toggle debugging mode for $file\n";
if (!-e $lexfile) {
  print STDERR "Error: file $lexfile does not exist!\n";
  exit 1;
}
system("touch $lexfile");

# Keep the original build file as build.make.old while rewriting it.
unless (rename $make, "$make.old") {
  print STDERR "Error: cannot rename $make!\n";
  exit 1;
}
if (open(F, "&lt;$make.old")) {
  unless (open(G, "&gt;$make")) {
    print STDERR "Error: opening file build.make for writing\n";
    exit 1;
  }
  print "Processing build.make...\n";
  while (&lt;F&gt;) {
    # Add -d if it is absent, remove it if it is present.
    if ( s/flex \$\(LEX_FLAGS\) -P${file}YY/flex \$(LEX_FLAGS) -d -P${file}YY/ ) {
      print "Enabling debug info for $file.l\n";
    }
    elsif ( s/flex \$\(LEX_FLAGS\) -d -P${file}YY/flex \$(LEX_FLAGS) -P${file}YY/ ) {
      print "Disabling debug info for $file.l\n";
    }
    print G "$_";
  }
  close F;
  close G;
  unlink "$make.old";
}
else {
  print STDERR "Warning: file $make.old does not exist!\n";
}
# touch the flex file so that it is regenerated on the next build
my $now = time;
utime $now, $now, $lexfile;
</pre><p>Another way to get rule-matching and debugging information from the <code>flex</code> code is to set <code>LEX_FLAGS</code> when running <code>make</code> (<code>make LEX_FLAGS=-d</code>).</p>
<p>Note that running doxygen with <code>-d lex</code> reports which <code>flex</code> code file is used.</p>
<p>
Return to the <a href="index.html">index</a>.
</p>
</div></div><!-- contents -->
</div><!-- doc-content -->
<!-- start footer part -->
<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
<ul>
<li class="footer">Generated by
<a href="http://www.doxygen.org/index.html">
<img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.13 </li>
</ul>
</div>
</body>
</html>