/usr/share/gretl/gretlgui.hlp.pt

# add Tests "Add variables to model"

The selected variables are added to the previous model and the new model is estimated. A test statistic for the joint significance of the added variables is shown, along with its p-value. The test statistic is F in the case of OLS estimation, or an asymptotic Wald chi-square value otherwise. A p-value below 0.05 means that the coefficients are jointly significant at the 5 percent level. 
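
In a script, the same steps might look like the following minimal sketch. The series names <@lit="y">, <@lit="x1">, <@lit="x2"> and <@lit="x3"> are hypothetical.

<code>
	# baseline model, then add two candidate regressors and test them jointly
	ols y 0 x1
	add x2 x3
</code>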

Menu path: Model window, /Tests/Add variables

Script command: <@ref="add">

# adf Tests "Augmented Dickey-Fuller test"

This command requires an integer lag order; if the order is 0, a standard (non-augmented) Dickey–Fuller test is carried out. 

Computes statistics for a set of Dickey–Fuller tests on the specified variable, the null hypothesis being that the variable has a unit root. (But if the differencing option is selected, the variable is first differenced, and the discussion below should be read as referring to the transformed variable.) 

By default, three variants of the test are shown: one based on a regression containing a constant, one using a constant and a linear trend, and one using a constant and a quadratic trend. You can control which variants are shown by selecting one or more of the options. 

In all cases the dependent variable is the first difference of the specified variable, y, and the key independent variable is the first lag of y. The model is constructed so that the coefficient on lagged y equals the root in question minus 1. For example, the model with a constant may be written as 

  <@fig="adf1">

If the lag order, k, is greater than 0, then k lags of the dependent variable are included on the right-hand side of the test regressions, subject to the following qualification. If the option "test down from maximum lag order" is selected, the given lag order is taken as the maximum lag and the lag order actually used is obtained by testing down, using this algorithm: 

<indent>
1. Estimate the Dickey–Fuller regression with k lags of the dependent variable. 
</indent>

<indent>
2. Is the last lag significant? If so, execute the test with lag order k. Otherwise, let k = k – 1; if k equals 0, execute the test with lag order 0, else go to step 1. 
</indent>

In the context of step 2 above, "significant" means that the t-statistic for the last lag has an asymptotic two-sided <@itl="p">-value, against the normal distribution, of 0.10 or less. 

The <@itl="p">-values for the Dickey–Fuller tests are based on MacKinnon (1996). The relevant code is included by kind permission of the author. 
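
As a rough script counterpart to the dialog, the sketch below uses the <@lit="adf"> command with a hypothetical series <@lit="y"> and a maximum lag of 4. The criterion applied by <@opt="--test-down"> may differ between gretl versions, so treat this as an illustration rather than an exact reproduction of the algorithm above.

<code>
	# constant only, and constant plus trend; test down from 4 lags
	adf 4 y --c --ct --test-down
	# the same test applied to the first difference of y
	adf 4 y --c --difference
</code>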

Menu path: /Variable/Augmented Dickey-Fuller test

Script command: <@ref="adf">

# anova Statistics "ANOVA"

Analysis of Variance: <@var="response"> is a series measuring some effect of interest and <@var="treatment"> must be a discrete variable that codes for two or more types of treatment (or non-treatment). For two-way ANOVA, the <@var="block"> variable (which should also be discrete) codes for the values of some control variable. 

The null hypothesis for the <@itl="F">-test is that the mean response is invariant with respect to the treatment type or, in other words, that the treatment has no effect. Strictly speaking, the test is valid only if the variance of the response is the same for all treatment types. 

Note that the results shown by this command are in fact a subset of the information given by the following procedure, which is easily implemented in gretl. Create a set of dummy variables coding for all but one of the treatment types. For two-way ANOVA, in addition create a set of dummies coding for all but one of the "blocks". Then regress <@var="response"> on a constant and the dummies using <@ref="ols">. For a one-way design the ANOVA table is printed via the <@opt="--anova"> option to <@lit="ols">. In the two-way case the relevant <@itl="F">-test is found by using the <@ref="omit"> command. For example (assuming <@lit="y"> is the response, <@lit="xt"> codes for the treatment, and <@lit="xb"> codes for blocks): 

<code>          
	# one-way
	list dxt = dummify(xt)
	ols y 0 dxt --anova
	# two-way
	list dxb = dummify(xb)
	ols y 0 dxt dxb
	# test joint significance of dxt
	omit dxt --quiet
</code>
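
For comparison, the direct script form is shorter. This sketch assumes the same hypothetical series <@lit="y">, <@lit="xt"> and <@lit="xb"> as the example above.

<code>
	# one-way
	anova y xt
	# two-way, with xb as the block variable
	anova y xt xb
</code>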

Menu path: /Model/Other linear models/ANOVA

Script command: <@ref="anova">

# ar Estimation "Autoregressive estimation"

Computes parameter estimates using the generalized Cochrane–Orcutt iterative procedure (see Section 9.5 of Ramanathan, 2002). Iteration is terminated when successive error sums of squares do not differ by more than 0.005 percent or after 20 iterations. 

The "list of AR lags" specifies the structure of the error process. For example, the entry "1 3 4" corresponds to the process: 

  <@fig="arlags">
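
In a script the lag list comes first, separated from the regression list by a semicolon. A minimal sketch, assuming hypothetical series <@lit="y">, <@lit="x1"> and <@lit="x2">:

<code>
	# errors follow an AR process with lags 1, 3 and 4
	ar 1 3 4 ; y 0 x1 x2
</code>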

Menu path: /Model/Time series/Autoregressive estimation

Script command: <@ref="ar">

# ar1 Estimation "AR(1) estimation"

Computes feasible GLS estimates for a model in which the error term is assumed to follow a first-order autoregressive process. 

The default method is the Cochrane–Orcutt iterative procedure (see, for example, Section 9.4 of Ramanathan, 2002). Iteration is terminated when successive estimates of the autocorrelation coefficient do not differ by more than 0.001 or after 20 iterations. 

If the <@lit="--hilu"> option is given, the Hildreth–Lu search procedure is used. The results are then fine-tuned using the Cochrane–Orcutt method, unless the <@lit="--no-corc"> flag is specified. (The latter option is ignored if <@lit="--hilu"> is not specified.) 

If the <@lit="--pwe"> option is given, the Prais–Winsten estimator is used. This involves an iteration similar to Cochrane–Orcutt; the difference is that while Cochrane–Orcutt discards the first observation, Prais–Winsten makes use of it. See, for example, Chapter 13 of Greene's <@itl="Econometric Analysis"> (2000) for details. 
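
The three variants can be put side by side in a short script. The series names <@lit="y"> and <@lit="x"> are hypothetical.

<code>
	# default: Cochrane-Orcutt
	ar1 y 0 x
	# Hildreth-Lu search, without the Cochrane-Orcutt fine-tuning
	ar1 y 0 x --hilu --no-corc
	# Prais-Winsten
	ar1 y 0 x --pwe
</code>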

Menu path: /Model/Time series/Cochrane-Orcutt
Menu path: /Model/Time series/Hildreth-Lu
Menu path: /Model/Time series/Prais-Winsten

Script command: <@ref="ar1">

# arbond Estimation "Dynamic panel models"

Carries out estimation of dynamic panel data models (that is, panel models including one or more lags of the dependent variable) using the Generalized Method of Moments (GMM) approach developed by Arellano and Bond (1991). 

The dependent variable should be given in levels form; it will be differenced automatically (since this estimator uses differencing to cancel out the individual effects). The independent variables are not differenced automatically; if you want to use differences (which will generally be the case for ordinary quantitative variables, though not for, say, time dummy variables) you should first create these differences and then specify them as the regressors. 

As regards the handling of instruments, please see the documentation for the script version of this command. Currently you cannot specify instruments explicitly in the GUI: all the independent variables are taken to be strictly exogenous. 

By default the results of 1-step estimation are reported (with robust standard errors). You may select 2-step estimation as an option. In both cases tests for autocorrelation of orders 1 and 2 are provided, as well as the Sargan overidentification test and a Wald test for the joint significance of the regressors. Note that in this differenced model first-order autocorrelation is not a threat to the validity of the model, but second-order autocorrelation violates the maintained statistical assumptions. 
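
The differencing step described above can be sketched in a script as follows. The series names are hypothetical, and the <@lit="d_"> prefix is the naming that the <@lit="diff"> command is assumed to apply to the new series.

<code>
	# create first differences d_x1 and d_x2 of the regressors
	diff x1 x2
	# one lag of the dependent variable, 1-step estimation
	arbond 1 ; y d_x1 d_x2
	# the same model with 2-step estimation
	arbond 1 ; y d_x1 d_x2 --two-step
</code>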

Menu path: /Model/Panel/Arellano-Bond

Script command: <@ref="arbond">

# arch Tests "ARCH test"

This command requires an integer lag order. 

Tests the model for ARCH (Autoregressive Conditional Heteroskedasticity) of the specified lag order. If the LM test statistic has a p-value below 0.10, then ARCH estimation is also carried out. If the predicted variance of any observation in the auxiliary regression is not positive, then the corresponding squared residual is used instead. Weighted least squares estimation is then performed on the original model. 
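
A minimal script sketch with lag order 4 follows; the series names are hypothetical. Exactly what the command reports (test, estimation, or both) may depend on the gretl version in use.

<code>
	# ARCH of order 4 for a regression of y on a constant, x1 and x2
	arch 4 y 0 x1 x2
</code>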

See also <@ref="garch">. 

Menu path: Model window, /Tests/ARCH

Script command: <@ref="arch">

# arima Estimation "ARMA model"

Estimates an ARMA model, with or without exogenous regressors. If the order of differencing is greater than zero the model becomes ARIMA. If the data have a frequency greater than 1, you are offered the option of including a seasonal component. 

The default is to use the "native" gretl ARMA functionality, with estimation by exact Maximum Likelihood (ML), using the Kalman filter. Other options are: native code, conditional ML; X-12-ARIMA, exact ML; and X-12-ARIMA, conditional ML. (The latter two options are available only if the X-12-ARIMA program is installed.) For details regarding these options, please see <@pdf="the Gretl User's Guide">. 

The AIC value given in connection with ARIMA models is calculated according to the definition used in X-12-ARIMA, namely 

  <@fig="aic">

where <@fig="ell"> is the log-likelihood and k is the total number of parameters estimated. Note that X-12-ARIMA does not produce information criteria such as AIC when estimation is by conditional ML. 

The "frequency" figure printed in connection with AR and MA roots is the λ value that solves 

  <@fig="lambda">

where z is the root in question and r is its modulus. 
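
The dialog choices map onto the <@lit="arima"> script command roughly as in this sketch; <@lit="y"> and <@lit="x"> are hypothetical series.

<code>
	# ARMA(1,1), native code, exact ML
	arima 1 0 1 ; y
	# ARIMA(1,1,1) with a constant and an exogenous regressor, conditional ML
	arima 1 1 1 ; y 0 x --conditional
	# AR(1) plus a seasonal ARMA(1,1) component
	arima 1 0 0 ; 1 0 1 ; y
</code>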

Menu path: /Model/Time series/ARIMA
Other access: Main window pop-up menu (single selection)

Script command: <@ref="arima">

# bootstrap Tests "Bootstrap options"

In this dialog you get to choose: 

<indent>
• The variable/coefficient to examine. (You can test only one coefficient at a time using this method.) 
</indent>

<indent>
• The sort of analysis to perform. The default (95 percent) confidence interval is based directly on the quantiles of the bootstrap coefficient estimates. The "studentized" version is as per Davidson and MacKinnon's <@itl="Econometric Theory and Methods"> (ETM), chapter 5: at each bootstrap replication a t-ratio is formed as (a) the difference between the current and the baseline coefficient estimate, divided by (b) the baseline estimated standard error. Then the confidence interval is formed based on the quantiles of this t-ratio, as explained in ETM. The P-value option is based on the distribution of the bootstrap t-ratio: it is the proportion of the replications where the absolute value of this statistic exceeds the absolute value of the baseline t-ratio. 
</indent>

<indent>
• Resampled residuals versus simulated normal errors. In the first case the original residuals (rescaled as suggested in ETM) are resampled with replacement. In the second case pseudo-random normal values are generated with the original residual variance. 
</indent>

<indent>
• The number of replications to perform. Note that when you're constructing a 95 percent confidence interval it is desirable that 0.05(B + 1)/2 is an integer (where B is the number of replications). So gretl may adjust the chosen number of replications to ensure this is the case. 
</indent>

<indent>
• Whether or not to produce a graph of the bootstrap distribution. This option employs gretl's kernel density estimation facility. 
</indent>
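
The condition on the number of replications can be checked by hand. The sketch below simply evaluates 0.05(B + 1)/2 for a candidate value of B; it is an illustration of the arithmetic, not the code gretl uses to adjust B.

<code>
	# B = 999 gives 0.05 * 1000 / 2 = 25, an integer;
	# B = 1000 would give 25.025, so it would need adjusting
	scalar B = 999
	scalar q = 0.05 * (B + 1) / 2
	printf "B = %d: 0.05(B + 1)/2 = %g\n", B, q
</code>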

# boxplot Graphs "Boxplots"

These plots (after Tukey and Chambers) display the distribution of a variable. The central box encloses the middle 50 percent of the data, i.e. it is bounded by the first and third quartiles. The "whiskers" extend to the minimum and maximum values. A line is drawn across the box at the median. 

In the case of notched boxes, the notches show the limits of an approximate 90 percent confidence interval for the median. This is obtained by the bootstrap method. 

After each variable specified in the boxplot command, a parenthesized Boolean expression may be added to restrict the sample for the variable in question. A space must be inserted between the variable name or number and the expression. Suppose you have salary figures (<@lit="salary">) for men and women, and a dummy variable <@lit="GENDER"> with value 1 for men and 0 for women. In that case you could draw comparative boxplots with the following line in the command window: 

<code>          
	salary (GENDER=1) salary (GENDER=0)
</code>

Some details of gretl's boxplots can be controlled via a (plain text) file named <@lit=".boxplotrc">. For details on this see <@pdf="the Gretl User's Guide">. 

Menu path: /View/Graph specified vars/Boxplots

Script command: <@ref="boxplot">

# chow Tests "Chow test"

This command requires an observation number (or a date, with dated data). 

Must follow an OLS regression. Creates a dummy variable which equals 1 from the split point specified by <@var="obs"> to the end of the sample, 0 otherwise, and also creates interaction terms between this dummy and the original independent variables. An augmented regression is run including these terms and an F statistic is calculated, taking the augmented regression as the unrestricted model and the original as the restricted. This statistic is appropriate for testing the null hypothesis of no structural break at the given split point. 

Menu path: Model window, /Tests/Chow test

Script command: <@ref="chow">

# chow Tests "Chow test"

This command needs either an observation number (or date, with dated data), or the name of a dummy variable. 

Must follow an OLS regression. If an observation number or date is given, provides a test for the null hypothesis of no structural break at the given split point. The procedure is to create a dummy variable which equals 1 from the split point specified by <@var="obs"> to the end of the sample, 0 otherwise, and also interaction terms between this dummy and the original regressors. If a dummy variable is given, tests the null hypothesis of structural homogeneity with respect to that dummy. Again, interaction terms are added. In either case an augmented regression is run including the additional terms. 

By default an <@itl="F"> statistic is calculated, taking the augmented regression as the unrestricted model and the original as the restricted. But if the original model used a robust estimator for the covariance matrix, the test statistic is a Wald chi-square value based on a robust estimator of the covariance matrix for the augmented regression. 
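
Both forms of the test can be sketched in a script as follows. The series names, the split date <@lit="1985:1"> and the dummy <@lit="d"> are hypothetical.

<code>
	ols y 0 x1 x2
	# test for a break at a given observation
	chow 1985:1
	# test for homogeneity with respect to an existing dummy variable
	chow d --dummy
</code>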

Menu path: Model window, /Tests/Chow test

Script command: <@ref="chow">

# coeffsum Tests "Sum of coefficients"

This command requires a list of variables, selected from among the independent variables in a given model. 

Calculates the sum of the coefficients on the variables in the list. Prints this sum along with its standard error and the p-value for the null hypothesis that the sum is zero. 

Note the difference between this test and <@ref="omit">, which tests the null hypothesis that the coefficients on a specified subset of independent variables are <@itl="all"> equal to zero. 
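
A short script makes the contrast concrete; the series names are hypothetical.

<code>
	ols y 0 x1 x2 x3
	# H0: the coefficients on x1 and x2 sum to zero
	coeffsum x1 x2
	# H0: the coefficients on x1 and x2 are both zero
	omit x1 x2 --quiet
</code>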

Menu path: Model window, /Tests/Sum of coefficients

Script command: <@ref="coeffsum">

# coint Tests "Engle-Granger cointegration test"

The Engle–Granger cointegration test. The default procedure is: (1) carry out Dickey–Fuller (DF) tests on the null hypothesis that each of the listed variables has a unit root; (2) estimate the cointegrating regression; and (3) run a DF test on the residuals from the cointegrating regression. If the "skip initial DF tests" option is selected, however, step (1) is omitted. 

If the lag order, k, is greater than zero, then k lags are included on the right-hand side of each test regression, unless the option "test down from maximum lag order" is selected: in that case the given order is taken as the maximum lag and the order actually used in each case is obtained by testing down. See the <@ref="adf"> command for details of this procedure. 

By default, the cointegrating regression includes a constant. If you wish to suppress the constant, or to add a linear or quadratic trend, select the appropriate option from the radio buttons in the cointegration test dialog. 

The <@itl="P">-values for this test are based on MacKinnon (1996). The relevant code is included by kind permission of the author. 
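
A script sketch of the same choices, assuming hypothetical series <@lit="y">, <@lit="x1"> and <@lit="x2"> and a maximum lag order of 4:

<code>
	# default procedure with a constant in the cointegrating regression
	coint 4 y x1 x2
	# skip the initial DF tests, add a trend, and test down from 4 lags
	coint 4 y x1 x2 --skip-df --ct --test-down
</code>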

Menu path: /Model/Time series/Cointegration tests/Engle-Granger

Script command: <@ref="coint">

# coint2 Tests "Johansen cointegration test"

Carries out the Johansen test for cointegration among the listed variables for the given lag order. Critical values are computed via J. Doornik's gamma approximation (Doornik, 1998). For details of this test see Chapter 20 of Hamilton's <@itl="Time Series Analysis"> (1994). 

The inclusion of deterministic terms in the model is controlled by the option buttons. By default, if no option is selected, an "unrestricted constant" is included, which allows for the presence of a non-zero intercept in the cointegrating relations as well as a trend in the levels of the endogenous variables. In the literature stemming from the work of Johansen (see for example his 1995 book) this is often referred to as "case 3". The other four option buttons produce cases 1, 2, 4 and 5 respectively. The meaning of these cases and the criteria for selecting a case are explained in <@pdf="the Gretl User's Guide">. 

If the data are quarterly or monthly, an option is available that lets you include a set of centered seasonal dummy variables. In all cases, the "Show details" option prints the auxiliary regressions that form the starting point for Johansen's maximum likelihood estimation procedure. 

The following table is offered as a guide to the interpretation of the results shown for the test, for the 3-variable case. <@lit="H0"> denotes the null hypothesis, <@lit="H1"> the alternative hypothesis, and <@lit="c"> the number of cointegrating relations. 

<code>          
                 Rank     Trace test         Lmax test
                          H0     H1          H0     H1
                 ---------------------------------------
                  0      c = 0  c = 3       c = 0  c = 1
                  1      c = 1  c = 3       c = 1  c = 2
                  2      c = 2  c = 3       c = 2  c = 3
                 ---------------------------------------
</code>

See also the <@ref="vecm"> command. 
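
In a script, the deterministic-term cases are selected with option flags. The sketch below assumes three hypothetical series and a lag order of 2.

<code>
	# default: unrestricted constant (case 3)
	coint2 2 y1 y2 y3
	# restricted constant (case 2), with seasonal dummies and full details
	coint2 2 y1 y2 y3 --rc --seasonals --verbose
</code>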

Menu path: /Model/Time series/Cointegration tests/Johansen

Script command: <@ref="coint2">

# compact Dataset "Compact data"

When you add to a dataset a series that is of higher frequency, it is necessary to "compact" the new series. For instance, a monthly series will have to be compacted to fit into a quarterly dataset. 

In addition, you may sometimes want to compact an entire dataset to a lower frequency (perhaps, prior to adding a lower-frequency variable to the dataset). 

Gretl offers four options for compacting: 

<indent>
• Averaging: The value written to the dataset will be the arithmetic mean of the relevant values of the series. For example, the value written for the first quarter of 1990 will be the mean of the values for January, February and March of 1990. 
</indent>

<indent>
• Sum: The value written to the dataset will be the sum of the relevant higher-frequency values. For example, the value written for the first quarter will be the sum of the values for January, February and March. 
</indent>

<indent>
• End-of-period values: The value written to the dataset will be the last relevant value from the higher-frequency data. For example, the value written for the first quarter of 1990 will be the March 1990 value. 
</indent>

<indent>
• Start-of-period values: The value written to the dataset will be the first relevant value from the higher-frequency data. For example, the value written for the first quarter of 1990 will be the January 1990 value. 
</indent>

If you are compacting an entire dataset, the choice you make in this dialog sets the default compaction method. But if you have set a compaction method for an individual variable (under the menu item "Variable/Edit attributes"), that method will be used instead of the default. If the compaction method has already been set for all variables, you are not offered a choice of method. 
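
As a worked illustration of the four methods, suppose the three monthly values for one quarter are 10, 20 and 60 (made-up numbers). The sketch below computes by hand the value each method would write for that quarter; it does not call gretl's own compaction code.

<code>
	# hypothetical values for January, February and March
	matrix m = {10; 20; 60}
	scalar v_avg = meanc(m)    # averaging: 30
	scalar v_sum = sumc(m)     # sum: 90
	scalar v_end = m[3]        # end-of-period: 60
	scalar v_start = m[1]      # start-of-period: 10
	printf "%g %g %g %g\n", v_avg, v_sum, v_end, v_start
</code>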

Menu path: /Data/Compact data

# controlled Graphs "Scatterplot with control"

This command requires the selection of three variables: one for the X axis, one for the Y axis, and one for which you wish to control (call it Z). The plot shows adjusted Y against adjusted X, where the adjusted version of each variable is the residual from an OLS regression of that variable on Z. 

Example: You have data on wages, experience and education level for a sample of people. You wish to plot wages against education, controlling for experience. In that case you select wages for the Y axis, education for the X axis, and experience as the control. The plot shows wages against education, with both variables "purged" of the effect of experience. 
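
The adjustment can be reproduced by hand in a script. This is a sketch under the assumption that the series are named <@lit="wage">, <@lit="educ"> and <@lit="exper">, and that each auxiliary regression includes a constant.

<code>
	# residuals of wage after regressing on the control
	ols wage 0 exper --quiet
	series wage_adj = $uhat
	# residuals of educ after regressing on the control
	ols educ 0 exper --quiet
	series educ_adj = $uhat
	# adjusted Y against adjusted X
	gnuplot wage_adj educ_adj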

# copy-formats Utilities "Data export formats"

Besides plain text, you have two options for copying data from this window. 

Tab separated: The data are copied as plain text, using tabs to separate the columns. This is a good choice for pasting data into a word processor. In MS Word or OpenOffice.org Writer, for example, you can select the copied material and convert it to a properly formatted table, using a menu item such as "Table autoformat" or "Convert text to table". 

Comma separated: The data are placed onto the clipboard as comma-separated values (CSV). Use this format if you want to paste the data into a spreadsheet. 

Menu path: /File/Export data

# corrgm Statistics "Correlogram"

Prints the values of the autocorrelation function for the <@var="variable">, which may be specified by name or number. The values are defined by the equation <@fig="autocorr"> where u<@sub="t"> is the t-th observation of the variable u and s is the number of lags. 

The partial autocorrelations (obtained via the Durbin–Levinson algorithm) are also shown: these are net of the effects of intervening lags. The command also graphs the correlogram and prints the Box–Pierce Q statistic for testing the null hypothesis that the series is "white noise": this statistic is asymptotically distributed as chi-square with degrees of freedom equal to the number of lags used. 

If a <@var="maxlag"> value is specified, the length of the correlogram is limited to at most that number of lags; otherwise the length is determined automatically, as a function of the frequency of the data and the number of observations. 
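
For example, the following script line (with a hypothetical series <@lit="y">) limits the correlogram to 12 lags:

<code>
	corrgm y 12
</code>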

Menu path: /Variable/Correlogram
Other access: Main window pop-up menu (single selection)

Script command: <@ref="corrgm">

# cusum Tests "CUSUM test"

Must follow the estimation of a model via OLS. Performs the CUSUM test (or, if the <@lit="--squares"> option is given, the CUSUMSQ test) for parameter stability. A series of one-step-ahead forecast errors is obtained by running a sequence of regressions: the first regression uses the first k observations and is used to generate a prediction of the dependent variable at observation k + 1; the second uses the first k + 1 observations and generates a prediction for observation k + 2, and so on (where k is the number of parameters in the original model). 

The cumulated sum of the scaled forecast errors, or of the squares of these errors, is printed and graphed. The null hypothesis of parameter stability is rejected at the 5 percent significance level if the cumulated sum strays outside the 95 percent confidence band. 

In the case of the CUSUM test, the Harvey–Collier t-statistic for testing the null hypothesis of parameter stability is also printed. See Greene's <@itl="Econometric Analysis"> for details. For the CUSUMSQ test, the 95 percent confidence band is calculated using the algorithm given in Edgerton and Wells (1994). 
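
Both tests can be run from a script after an OLS model; the series names here are hypothetical.

<code>
	ols y 0 x1 x2
	# CUSUM test, with the Harvey-Collier statistic
	cusum
	# CUSUMSQ test
	cusum --squares
</code>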

Menu path: Model window, /Tests/CUSUM(SQ) test

Script command: <@ref="cusum">

# datasort Dataset "Sorting data"

The selected variable is used as a sort key for the entire data set. The observations on all variables are re-ordered by increasing value of the key variable, or by decreasing value if you select the "Descending" option. 
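
The script counterpart is the <@lit="dataset"> command. A minimal sketch, assuming a hypothetical key series <@lit="x">:

<code>
	# sort all observations by increasing x
	dataset sortby x
	# sort by decreasing x
	dataset dsortby x
</code>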

# density Statistics "Kernel density estimation"

Kernel density estimation proceeds by defining a set of evenly spaced reference points, over a suitable range in relation to the range of the data, and attributing a density to each reference point based on the actual observations in the vicinity. 

The formula used to compute the estimated density at each reference point, <@itl="x">, is 

  <@fig="kernel1">

where <@itl="n"> denotes the number of data points, <@itl="h"> is a "bandwidth" parameter, and <@itl="k">() is the kernel function. The larger the value of the bandwidth parameter, the smoother the estimated density. 

You are given the choice of using a Gaussian kernel (the standard normal density) or the Epanechnikov kernel. By default, the bandwidth is that suggested as a rule of thumb by Silverman (1986), namely 

  <@fig="kernel2">

where <@itl="s"> denotes the standard deviation of the data and IQR denotes the inter-quartile range. You can widen or shrink the bandwidth via the "bandwidth adjustment factor": the actual bandwidth used is obtained by multiplying the Silverman value by the adjustment factor. 
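
To make the default bandwidth concrete, the sketch below computes it by hand for a hypothetical series <@lit="x">. It assumes the commonly quoted form of Silverman's rule of thumb, h = 0.9 min(s, IQR/1.349) n^(-1/5), and an illustrative adjustment factor of 1.5; check the formula above before relying on the exact constants.

<code>
	scalar n = nobs(x)
	scalar s = sd(x)
	# inter-quartile range
	scalar q_range = quantile(x, 0.75) - quantile(x, 0.25)
	# assumed rule-of-thumb bandwidth
	scalar h = 0.9 * xmin(s, q_range/1.349) * n^(-0.2)
	# bandwidth actually used with an adjustment factor of 1.5
	scalar h_used = 1.5 * h
	printf "Silverman h = %g, adjusted h = %g\n", h, h_used
</code>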

For a good introductory discussion of kernel density estimation see Chapter 15 of Davidson and MacKinnon's <@itl="Econometric Theory and Methods">. 

# dfgls Tests "The ADF-GLS test"

The ADF-GLS test is a variant of the Dickey–Fuller test for a unit root, for the case where the variable to be tested is assumed to have a non-zero mean or to exhibit a linear trend. The difference is that the de-meaning or de-trending of the variable is done using the GLS procedure suggested by Elliott, Rothenberg and Stock (1996). This gives a test of greater power than the standard Dickey–Fuller approach. 

See also the <@ref="adf"> command and the <@opt="--gls"> option. 
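
In a script the test is requested through the <@lit="adf"> command. A minimal sketch for a hypothetical series <@lit="y"> with lag order 4:

<code>
	# GLS de-meaning (non-zero mean assumed)
	adf 4 y --c --gls
	# GLS de-trending (linear trend assumed)
	adf 4 y --ct --gls
</code>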

Menu path: /Variable/ADF-GLS test

# dialog Estimation "Model dialog box"

To select the dependent variable, highlight a variable in the list on the left and press the "Choose" button pointing to the Dependent variable slot. If you check the "Set as default" box, the selected variable will be pre-selected as dependent when the model dialog is next opened. Short-cut: double-click on a variable on the left to select it as the dependent variable and also set it as the default. 

To select independent variables, highlight them on the left and press the "Add" button (or click the right mouse button). You can highlight several contiguous variables by dragging with the mouse. You can highlight a group of non-contiguous variables by clicking on them with the <@lit="Ctrl"> key pressed. 

# dpanel Estimation "Dynamic panel models"

Carries out estimation of dynamic panel data models (that is, panel models including one or more lags of the dependent variable) using either the GMM-DIF or GMM-SYS method. 

The dependent variable and regressors should be given in levels form; they will be differenced automatically (since this estimator uses differencing to cancel out the individual effects). 

As regards the handling of instruments, please see the documentation for the script version of this command. Currently you cannot specify instruments explicitly in the GUI: all the independent variables are taken to be strictly exogenous. 

By default the results of 1-step estimation are reported (with robust standard errors). You may select 2-step estimation as an option. In both cases tests for autocorrelation of orders 1 and 2 are provided, as well as the Sargan overidentification test and a Wald test for the joint significance of the regressors. Note that in this differenced model first-order autocorrelation is not a threat to the validity of the model, but second-order autocorrelation violates the maintained statistical assumptions. 

For further details and examples, please see <@pdf="the Gretl User's Guide">. 
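
The two estimators and the 2-step option might be invoked from a script as follows. The series names are hypothetical, and the variables are given in levels as described above.

<code>
	# GMM-DIF, one lag of the dependent variable, 1-step
	dpanel 1 ; y x1 x2
	# GMM-SYS with 2-step estimation
	dpanel 1 ; y x1 x2 --system --two-step
</code>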

Menu path: /Model/Panel/Dynamic panel model

Script command: <@ref="dpanel">

# expand Dataset "Expand data"

If you wish to add to a dataset a series that is of lower frequency, it is necessary to "expand" the new series. For instance, a quarterly series will have to be expanded to fit into a monthly dataset. In addition, you may sometimes want to expand an entire dataset to a higher frequency (perhaps, prior to adding a higher-frequency variable to the dataset). 

Expansion of data should be considered an "expert" option; you need to know what you are doing. When combining series of differing original frequencies within one dataset, you should probably consider compacting the higher-frequency data rather than expanding the lower-frequency series. 

That said, gretl offers two options: higher-frequency values can be interpolated using the method of Chow and Lin (1971), or the values of the lower-frequency series can be repeated as many times as required. 

The Chow-Lin method is regression-based, using a constant and quadratic trend and assuming a first-order autoregressive process for the disturbances. Four degrees of freedom are used up by this procedure. As for the repetition of values, suppose we have a quarterly series with the value 35.5 in 1990:1, the first quarter of 1990. On expansion to monthly, the value 35.5 will be assigned to the observations for January, February and March of 1990. The expanded variable is therefore useless for fine-grained time-series analysis, outside of the special case where you know that the variable in question does in fact remain constant over the sub-periods. 

# export Dataset "Export data"

You may export data in Comma-Separated Values (CSV) format: such data may be opened in spreadsheets and many other application programs. If you select this option you will get some further options regarding the specific format of the CSV file. 

You also have the option of exporting data in the form of a "native" gretl datafile, or (if the data are suitable) exporting to a gretl database. See <@url="gretl.sourceforge.net/gretl_data.html"> for an account of gretl databases. 

You may also export data in a format suitable for use with the following programs: 

<indent>
• GNU R (<@url="www.r-project.org">) 
</indent>

<indent>
• GNU octave (<@url="www.gnu.org/software/octave">) 
</indent>

<indent>
• JMulTi (<@url="www.jmulti.de">) 
</indent>

<indent>
• PcGive (<@url="www.pcgive.com">) 
</indent>

If you wish to export data by copying to the clipboard rather than writing to a file on disk, select the series you want to copy in the main window, right-click, and select "Copy to clipboard". (Only CSV format is supported in this context.) 
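
From a script, exporting is done with the <@lit="store"> command. The file and series names below are hypothetical, and the sketch assumes a gretl version that infers CSV and native formats from the filename suffix.

<code>
	# CSV file containing two series
	store mydata.csv x y
	# native gretl datafile containing all series
	store mydata.gdt
	# GNU octave format
	store mydata.m x y --gnu-octave
</code>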

# factorized Graphs "Factorized plot"

This command requires the selection of three variables, the last of which must be a dummy variable (values 1 or 0). The Y variable is plotted against the X variable, with the data points colored differently depending on the value of the third. 

Example: You have data on wages and educational attainment for a sample of people; you also have a dummy variable with value 1 for men and 0 for women (as in Ramanathan's <@lit="data7-2">). A "factorized plot" of <@lit="WAGE"> against <@lit="EDUC"> using the <@lit="GENDER"> dummy as factor will show the data points for men in one color and those for women in another (with a legend to identify them). 

# fcast Prediction "Generate forecasts"

Must follow an estimation command. Forecasts are generated for the specified range of observations. Depending on the nature of the model, standard errors may also be generated (see below). 

The choice between a static and a dynamic forecast applies only in the case of dynamic models, with an autoregressive error process or including one or more lagged values of the dependent variable as regressors. Static forecasts are one step ahead, based on realized values from the previous period, while dynamic forecasts employ the chain rule of forecasting. For example, if a forecast for <@itl="y"> in 2008 requires as input a value of <@itl="y"> for 2007, a static forecast is impossible without actual data for 2007. A dynamic forecast for 2008 is possible if a prior forecast can be substituted for <@itl="y"> in 2007. 

The default is to give a static forecast for any portion of the forecast range that lies within the sample range over which the model was estimated, and a dynamic forecast (if relevant) out of sample. The <@lit="dynamic"> option requests a dynamic forecast from the earliest possible date, and the <@opt="static"> option requests a static forecast even out of sample. 

<code>          
	fcast --plot=fc.pdf
</code>

will generate a graphic in PDF format. Absolute pathnames are respected, otherwise files are written to the gretl working directory. 

The nature of the forecast standard errors (if available) depends on the nature of the model and the forecast. For static linear models standard errors are computed using the method outlined by Davidson and MacKinnon (2004); they incorporate both uncertainty due to the error process and parameter uncertainty (summarized in the covariance matrix of the parameter estimates). For dynamic models, forecast standard errors are computed only in the case of a dynamic forecast, and they do not incorporate parameter uncertainty. For nonlinear models, forecast standard errors are not presently available. 

Menu path: Model window, /Analysis/Forecasts

Script command: <@ref="fcast">

# fractint Statistics "Fractional integration"

Tests the specified series for fractional integration ("long memory"). The null hypothesis is that the integration order of the series is zero. By default the local Whittle estimator (Robinson, 1995) is used, but if the <@opt="--gph"> option is given the GPH test (Geweke and Porter-Hudak, 1983) is performed instead. If the <@opt="--all"> flag is given then the results of both tests are printed. 

For details on this sort of test, see Phillips and Shimotsu (2004). 

If the optional <@var="order"> argument is not given the order for the test(s) is set automatically as the lesser of <@itl="T">/2 and <@itl="T"><@sup="0.6">. 
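As a rough illustration of the automatic order rule, the following Python sketch (not gretl code) evaluates the lesser of T/2 and T to the power 0.6. Truncating the result to an integer is an assumption of this sketch; the function name `default_order` is hypothetical.

```python
def default_order(T):
    """Lesser of T/2 and T**0.6, truncated to an integer (truncation is assumed)."""
    return int(min(T / 2, T ** 0.6))

for T in (4, 50, 200):
    print(T, default_order(T))
```

For all but very short series the T to the power 0.6 term is the binding one, so the order grows much more slowly than the sample size.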

The results can be retrieved using the accessors <@lit="$test"> and <@lit="$pvalue">. These values are based on the local Whittle estimator unless the <@opt="--gph"> option is given. 

Menu path: /Variable/Unit root tests/Fractional integration

Script command: <@ref="fractint">

# freq Statistics "Frequency distribution"

In the frequency plot dialog box you can control the characteristics of the plot in either of two ways. 

First, you may choose the number of bins. In this case the width and placement of the bins are calculated automatically. 

Alternatively, you may specify the lower limit of the left-most bin, and the width of the bins. In this case the number of bins is calculated automatically. 

If you wish to align the bins on round numbers, here is one way to proceed: start by specifying the number of bins you want, and take a look at the plot that is produced. If it's not to your liking, take note of the modification that is required (for example, make the left-most bin start at 100 and impose a bin width of 200). Then make a second pass where you specify the left-hand limit and bin width. 
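The second approach (left-hand limit plus bin width) can be sketched as follows. This is an illustrative Python fragment, not gretl code: the half-open bin convention, the function names and the data values are all assumptions, and gretl's own treatment of boundary values may differ.

```python
import math

def bin_edges(data, left, width):
    """Edges of equal-width bins starting at `left` that cover max(data)."""
    if width <= 0:
        raise ValueError("bin width must be positive")
    if min(data) < left:
        raise ValueError("left limit must not exceed the smallest observation")
    # Half-open bins [lo, hi): floor(...) + 1 guarantees the maximum gets a bin.
    n_bins = math.floor((max(data) - left) / width) + 1
    return [left + i * width for i in range(n_bins + 1)]

def count_bins(data, edges):
    """Number of observations falling in each bin."""
    width = edges[1] - edges[0]
    counts = [0] * (len(edges) - 1)
    for x in data:
        counts[int((x - edges[0]) // width)] += 1
    return counts

data = [120, 250, 310, 480, 505, 690]          # hypothetical observations
edges = bin_edges(data, left=100, width=200)   # left-most bin starts at 100
counts = count_bins(data, edges)
print(edges, counts)
```

Here the number of bins is not chosen directly; it follows from the left-hand limit, the width and the largest observation.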

This dialog also allows you to select a theoretical distribution to be plotted against the data: either the normal or the gamma. If the normal option is selected the Doornik–Hansen test for normality is computed. If the gamma option is selected, gretl computes Locke's nonparametric test for the null hypothesis that the variable follows the gamma distribution. Note that the parameterization of the gamma distribution used in gretl is (shape, scale). 

Menu path: /Variable/Frequency distribution

Script command: <@ref="freq">

# garch Estimation "GARCH model"

Estimates a GARCH model (GARCH = Generalized Autoregressive Conditional Heteroskedasticity), either a univariate model or, if independent variables are selected, including the given exogenous variables. The conditional variance equation is shown below. 

  <@fig="garch_h">

The parameter <@var="p"> therefore represents the Generalized (or "AR") order, while <@var="q"> represents the regular ARCH (or "MA") order. If <@var="p"> is non-zero, <@var="q"> must also be non-zero; otherwise the model is unidentified. However, you can estimate a regular ARCH model by setting <@var="q"> to a positive value and <@var="p"> to zero. The sum of <@var="p"> and <@var="q"> must be no greater than 5. 

By default native gretl code is used in estimation of GARCH models, but you also have the option of using the algorithm of Fiorentini, Calzolari and Panattoni (1996). The former uses the BFGS maximizer while the latter uses the information matrix to maximize the likelihood, with fine-tuning via the Hessian. 

Several variant estimates of the coefficient covariance matrix are available with this command. By default, the Hessian is used unless the "Robust standard errors" box is checked, in which case the QML (White) covariance matrix is used. Other possibilities (e.g. the information matrix, or the Bollerslev–Wooldridge estimator) can be specified using the <@ref="set"> command. 

The estimated conditional variance, along with the residuals and various other model statistics, can be accessed and added to the dataset using the "Model data" menu in the window where the model is displayed. If the box marked "Standardize the residuals" is checked, the residuals are divided by the square root of the conditional variance. 

Menu path: /Model/Time series/GARCH

Script command: <@ref="garch">

# genr Dataset "Generate a new variable"

Use this box to define a new variable, on the pattern <@var="name"> = <@var="formula">. The formula should be a well-formed combination of variable names, constants, operators and functions (details below). To ensure you get the type of variable you want, you can prefix the formula with a type-name, e.g. <@lit="scalar">, <@lit="series"> or <@lit="matrix">. For example, to create a series that has a constant value of 10, you can type 

<code>          
	series c = 10
</code>

(otherwise <@lit="c = 10"> would create a scalar variable). 

Supported <@itl="arithmetical operators"> are, in order of precedence: <@lit="^"> (exponentiation); <@lit="*">, <@lit="/"> and <@lit="%"> (modulus or remainder); <@lit="+"> and <@lit="-">. 

The available <@itl="Boolean operators"> are (again, in order of precedence): <@lit="!"> (negation), <@lit="&&"> (logical AND), <@lit="||"> (logical OR), <@lit=">">, <@lit="<">, <@lit="=">, <@lit=">="> (greater than or equal), <@lit="<="> (less than or equal) and <@lit="!="> (not equal). The Boolean operators can be used in constructing dummy variables: for instance <@lit="(x > 10)"> returns 1 if <@lit="x"> > 10, 0 otherwise. 

Built-in constants are <@lit="pi"> and <@lit="NA">. The latter is the missing value code: you can initialize a variable to the missing value with <@lit="scalar x = NA">. 

The <@lit="genr"> command supports a wide range of mathematical and statistical functions, including all the common ones plus several that are special to econometrics. In addition it offers access to numerous internal variables that are defined in the course of running regressions, doing hypothesis tests, and so on. For a listing of functions and accessors, see <@gfr="the Gretl function reference">. 

Besides the operators and functions noted above there are some special uses of <@lit="genr">: 

<indent>
• <@lit="genr time"> creates a time trend variable (1,2,3,…) called <@lit="time">. <@lit="genr index"> does the same thing except that the variable is called <@lit="index">. 
</indent>

<indent>
• <@lit="genr dummy"> creates dummy variables up to the periodicity of the data. In the case of quarterly data (periodicity 4), the program creates <@lit="dq1"> = 1 for first quarter and 0 in other quarters, <@lit="dq2"> = 1 for the second quarter and 0 in other quarters, and so on. With monthly data the dummies are named <@lit="dm1">, <@lit="dm2">, and so on. With other frequencies the names are <@lit="dummy_1">, <@lit="dummy_2">, etc. 
</indent>

<indent>
• <@lit="genr unitdum"> and <@lit="genr timedum"> create sets of special dummy variables for use with panel data. The first codes for the cross-sectional units and the second for the time period of the observations. 
</indent>

<@itl="Note">: In the command-line program, <@lit="genr"> commands that retrieve model-related data always reference the model that was estimated most recently. This is also true in the GUI program, if one uses <@lit="genr"> in the "gretl console" or enters a formula using the "Define new variable" option under the Variable menu in the main window. With the GUI, however, you have the option of retrieving data from any model currently displayed in a window (whether or not it's the most recent model). You do this under the "Model data" menu in the model's window. 

The special variable <@lit="obs"> serves as an index of the observations. For instance <@lit="genr dum = (obs=15)"> will generate a dummy variable that has value 1 for observation 15, 0 otherwise. You can also use this variable to pick out particular observations by date or name. For example, <@lit="genr d = (obs>1986:4)">, <@lit="genr d = (obs>"2008/04/01")">, or <@lit="genr d = (obs="CA")">. If daily dates or observation labels are used in this context, they should be enclosed in double quotes. Quarterly and monthly dates (with a colon) may be used unquoted. Note that in the case of annual time series data, the year is not distinguishable syntactically from a plain integer; therefore if you wish to compare observations against <@lit="obs"> by year you must use the function <@lit="obsnum"> to convert the year to a 1-based index value, as in <@lit="genr d = (obs>obsnum(1986))">. 

Scalar values can be pulled from a series in the context of a <@lit="genr"> formula, using the syntax <@var="varname"><@lit="["><@var="obs"><@lit="]">. The <@var="obs"> value can be given by number or date. Examples: <@lit="x[5]">, <@lit="CPI[1996:01]">. For daily data, the form <@var="YYYY/MM/DD"> should be used, e.g. <@lit="ibm[1970/01/23]">. 

An individual observation in a series can be modified via <@lit="genr">. To do this, a valid observation number or date, in square brackets, must be appended to the name of the variable on the left-hand side of the formula. For example, <@lit="genr x[3] = 30"> or <@lit="genr x[1950:04] = 303.7">. 

Menu path: /Add/Define new variable
Other access: Main window pop-up menu

Script command: <@ref="genr">

# genrand Programming "Generating random variables"

In this dialog you must give a name for the variable to be created, plus some additional information depending on the distribution. 

<indent>
• Uniform: the lower and upper bounds for the distribution. 
</indent>

<indent>
• Normal: the mean and (positive) standard deviation. 
</indent>

<indent>
• Chi-square and Student's t: the degrees of freedom, which must be positive. 
</indent>

<indent>
• F: both numerator and denominator degrees of freedom. 
</indent>

<indent>
• Gamma: shape and scale parameters (both positive). 
</indent>

<indent>
• Binomial: the "success" probability and the integer number of trials. 
</indent>

<indent>
• Poisson: the positive mean (which also equals the variance). 
</indent>

If you want to generate repeatable sequences of pseudo-random numbers, you can set the seed, under the Tools menu. 

# genseed Programming "Setting the seed for random numbers"

The "seed" controls the starting point for the sequence of pseudo-random numbers generated in a given gretl session. By default the seed is set when the program is started, using the system time. This ensures that you get a different sequence of random numbers each time you run the program. If you want to obtain repeatable sequences, you need to set the seed manually (and take note of the value you used). 

Note that whenever you click "OK" in this dialog box, the generator is re-started, using the given seed. So, for example, if you (a) set the seed to (say) 147; (b) generate a series from the standard normal distribution; (c) revisit this dialog and click "OK" again with the seed still at 147; then (d) generate a second series from the standard normal distribution, the two generated series will be identical. 
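The effect of re-using a seed can be illustrated with Python's standard generator. This is only an analogy: Python's generator is not the one used by gretl, so the numbers themselves will differ, and the helper name `draw_normals` is hypothetical.

```python
import random

def draw_normals(seed, n=5):
    """Restart a generator from `seed` and draw n standard normal values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

first = draw_normals(147)    # steps (a) and (b)
second = draw_normals(147)   # steps (c) and (d): same seed, generator restarted
third = draw_normals(148)    # a different seed

print(first == second)  # identical sequences
print(first == third)   # different sequences
```

The point is that the sequence is fully determined by the seed, so restarting from the same seed reproduces the same draws.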

# gmm Estimation "GMM estimation"

Performs Generalized Method of Moments (GMM) estimation using the BFGS (Broyden, Fletcher, Goldfarb, Shanno) algorithm. You must specify one or more commands for updating the relevant quantities (typically GMM residuals), one or more sets of orthogonality conditions, an initial matrix of weights, and a listing of the parameters to be estimated, all enclosed between the tags <@lit="gmm"> and <@lit="end gmm">. Any options should be appended to the <@lit="end gmm"> line. 

Please see <@pdf="the Gretl User's Guide"> for details on this command. Here we just illustrate with a simple example. 

<code>          
	gmm e = y - X*b
	  orthog e ; W
	  weights V
	  params b
	end gmm
</code>

In the example above we assume that <@lit="y"> and <@lit="X"> are data matrices, <@lit="b"> is an appropriately sized vector of parameter values, <@lit="W"> is a matrix of instruments, and <@lit="V"> is a suitable matrix of weights. The statement 

<code>          
	orthog e ; W
</code>

indicates that the residual vector <@lit="e"> is in principle orthogonal to each of the instruments composing the columns of <@lit="W">. 

Menu path: /Model/GMM

Script command: <@ref="gmm">

# graphing Graphs "Graphing"

Gretl calls a separate program, namely gnuplot, to generate graphs. Gnuplot is a very full-featured graphing program with myriad options. Gretl gives you direct access, via a graphical interface, to a subset of these options and it tries to choose sensible values for you; it also allows you to take complete control over graph details if you wish. 

With a graph displayed, you can click on the graph window for a pop-up menu with the following options: 

<indent>
• Save as postscript: save the graph in encapsulated postscript (EPS) format 
</indent>

<indent>
• Save as PNG: save in Portable Network Graphics format 
</indent>

<indent>
• Save to session as icon: the graph will appear in iconic form when you select "Icon view" from the Session menu 
</indent>

<indent>
• Zoom: lets you select an area within the graph for closer inspection 
</indent>

<indent>
• Print: (on the Gnome desktop and MS Windows only) lets you print the graph directly 
</indent>

<indent>
• Copy to clipboard: (MS Windows only) lets you paste the graph into Windows applications such as MS Word 
</indent>

<indent>
• Edit: opens a controller for the plot which lets you adjust various aspects of its appearance 
</indent>

<indent>
• Close: closes the graph window 
</indent>

If you know something about gnuplot and wish to get finer control over the appearance of a graph than is available via the graphical controller ("Edit" option), you have two further options: 

<indent>
• Once the graph is saved as a session icon, you can right-click on its icon for a further pop-up menu. One of the options here is "Edit plot commands", which opens an editing window with the actual gnuplot commands displayed. You can edit these commands and either save them for future processing or send them to gnuplot (with the execute toolbar icon in the plot commands editing window). 
</indent>

<indent>
• Another way to save the plot commands (or to save the displayed plot in formats other than EPS or PNG) is to use the "Edit" item on a graph's pop-up menu to invoke the graphical controller, then click on the "Output to file" tab in the controller. You are then presented with a drop-down menu of formats in which to save the graph. 
</indent>

To find out more about gnuplot, see http://www.gnuplot.info 

# graphpg Graphs "Gretl graph page"

The session "graph page" will work only if you have the LaTeX typesetting system installed, and are able to generate and view PDF or PostScript output. 

In the session icon window, you can drag up to eight graphs onto the graph page icon. When you double-click on the graph page (or right-click and select "Display"), a page containing the selected graphs will be composed and opened in a suitable viewer. From there you should be able to print the page. 

To clear the graph page, right-click on its icon and select "Clear". 

In script (or console) mode, you can add a graph to the graph page by issuing the command <@lit="graphpg add"> after saving a named graph, as in 

<code>          
	grf1 <- gnuplot Y X
	graphpg add
</code>

Also in script mode you can call for display of the graph page using the command <@lit="graphpg show">, and can clear the page via <@lit="graphpg free">. 

Note that on systems other than MS Windows, you may have to adjust the setting for the program used to view PDF or PostScript files. Find that under the "Programs" tab in the gretl Preferences dialog box (under the Tools menu in the main window). 

Script command: <@ref="graphpg">

# 3-D Graphs "3-dimensional plots"

This feature works best if you have gnuplot 3.8 or higher installed. In that case you can manipulate the 3-D plot with the mouse (rotate it, and expand or shrink the axes). 

In composing a 3-D plot, note that the Z-axis will be shown as the vertical axis. Thus if you have some dependent variable that you think may be influenced by two independent variables, you should put the dependent variable on the Z-axis, and the independent variables on the X and Y axes. 

Unlike most other gretl graphs, 3-D plots are controlled by gnuplot rather than gretl itself. The gretl graph-editing menu is not available. 

# gui-htest Tests "Test statistic calculator"

Gretl's test calculator computes test statistics and p-values for various common hypothesis tests concerning one or two populations. The required input takes the form of sample statistics derived from one or two samples, depending on the test chosen. These statistics can be typed in as numerical values. Alternatively, if you have a data file open, you can get gretl to calculate sample statistics for a selected variable or variables (in the case of means and variances, but not in the case of proportions). 

If you want to base your test on a variable in the data set, first activate this option by checking the box titled "Use variable from dataset". Then the drop-down list of variables will become active and you can select a variable. When you select a variable from the list, the relevant statistics are automatically entered in the boxes below. 

In addition to the simple selection of a variable, you have the option of specifying a restriction on the selected variable (that is, defining a sub-sample). For example, suppose you have wage data in a variable called "wage" and you also have a dummy variable called "gender" that equals 1 for males and 0 for females (or vice versa). Then, in the test for the difference of two means, you could select "wage" in both slots, but add to the top slot "(gender=0)" and to the bottom "(gender=1)". This would then give you a test for the difference between the mean wage of males and the mean wage of females. Note that when you type a restriction in this way, you must then press the Enter key to have the sample statistics calculated. 

The sub-sampling restriction must be placed in parentheses following the selected variable, and in general the restriction takes the form "var2 op value", where var2 is the name of a variable in the current data set, value is a numerical value, and op is a comparison operator chosen from =, !=, <, >, <= or >= (respectively equality, inequality, less than, greater than, less than or equal, and greater than or equal). The spaces around the operator are optional. 

# gui-htest-np Tests "Nonparametric tests"

Under the "Difference test" tab you can carry out a nonparametric test for a difference between two populations or groups, the specific test depending on the option selected. 

Sign test: This test is based on the fact that if two samples, <@itl="x"> and <@itl="y">, are drawn randomly from the same distribution, the probability that <@itl="x"><@sub="i"> > <@itl="y"><@sub="i">, for each observation <@itl="i">, should equal 0.5. The test statistic is <@itl="w">, the number of observations for which <@itl="x"><@sub="i"> > <@itl="y"><@sub="i">. Under the null hypothesis this follows the Binomial distribution with parameters (<@itl="n">, 0.5), where <@itl="n"> is the number of observations. 

Rank sum test: The Wilcoxon rank-sum test is performed. This test proceeds by ranking the observations from both samples jointly, from smallest to largest, then finding the sum of the ranks of the observations from one of the samples. The two samples do not have to be of the same size, and if they differ the smaller sample is used in calculating the rank-sum. Under the null hypothesis that the samples are drawn from populations with the same median, the probability distribution of the rank-sum can be computed for any given sample sizes; and for reasonably large samples a close Normal approximation exists. 

Signed rank test: The Wilcoxon signed-rank test is performed. This is designed for matched data pairs such as, for example, the values of a variable for a sample of individuals before and after some treatment. The test proceeds by finding the differences between the paired observations, <@itl="x"><@sub="i"> – <@itl="y"><@sub="i">, ranking these differences by absolute value, then assigning to each pair a signed rank, the sign agreeing with the sign of the difference. One then calculates <@itl="W"><@sub="+">, the sum of the positive signed ranks. As with the rank-sum test, this statistic has a well-defined distribution under the null that the median difference is zero, which converges to the Normal for samples of reasonable size. 
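The statistics for the sign test and the signed-rank test can be sketched in a few lines of Python (not gretl code). The data are hypothetical, and the handling of zero differences and tied ranks shown here (dropping zeros, averaging tied ranks) is an assumption of the sketch rather than a statement of what gretl does.

```python
from math import comb

def sign_test(x, y):
    """Return (w, n, P(W >= w)) with W ~ Binomial(n, 0.5) under the null."""
    n = len(x)
    w = sum(1 for xi, yi in zip(x, y) if xi > yi)
    # Upper-tail probability of observing w or more cases with x_i > y_i
    p_upper = sum(comb(n, k) for k in range(w, n + 1)) / 2 ** n
    return w, n, p_upper

def signed_rank_w_plus(x, y):
    """Sum of the positive signed ranks of the differences x_i - y_i."""
    diffs = [xi - yi for xi, yi in zip(x, y) if xi != yi]  # zeros dropped (assumed)
    ordered = sorted(diffs, key=abs)
    w_plus = 0.0
    i = 0
    while i < len(ordered):
        # Locate the group of tied absolute differences starting at position i
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the ranks i+1, ..., j
        w_plus += avg_rank * sum(1 for d in ordered[i:j] if d > 0)
        i = j
    return w_plus

# Hypothetical matched pairs, e.g. before and after some treatment
before = [12.0, 15.0, 9.0, 14.0, 11.0, 13.0]
after = [10.0, 15.5, 7.0, 10.0, 12.0, 8.0]

w, n, p_upper = sign_test(before, after)
w_plus = signed_rank_w_plus(before, after)
print(w, n, p_upper, w_plus)
```

The sign test uses only the direction of each difference, while the signed-rank statistic also takes account of the relative size of the differences.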

Under the "Runs test" tab you can carry out a test for the randomness of a given variable, based on the number of runs of consecutive positive or negative values. If you select the option "Use first difference", the variable is differenced prior to the analysis and hence the runs are interpreted as runs of increasing or decreasing values of the original variable. The test statistic is based on a normal approximation to the distribution of the number of runs under the null of randomness. 
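The idea behind the runs test can be sketched as follows in Python (not gretl code). The mean and variance used are the textbook Wald-Wolfowitz expressions; whether gretl uses exactly this form is an assumption, as is the dropping of zero values, and the data are hypothetical.

```python
from math import sqrt

def first_difference(values):
    """Differences between consecutive observations."""
    return [b - a for a, b in zip(values, values[1:])]

def runs_test(values):
    """Return (number of runs, z statistic) using a normal approximation."""
    signs = [v > 0 for v in values if v != 0]  # zeros dropped (assumed)
    n1 = sum(signs)                 # positive values
    n2 = len(signs) - n1            # negative values
    n = n1 + n2
    if n1 == 0 or n2 == 0:
        raise ValueError("need both positive and negative values")
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return runs, (runs - mean) / sqrt(var)

series = [3, 5, 4, 2, 6, 7, 5, 4, 8, 9, 10]   # hypothetical data
# With differencing, runs are runs of increases or decreases in the series
runs, z = runs_test(first_difference(series))
print(runs, z)
```

A z value far from zero in either direction (too few or too many runs) counts against the null of randomness.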

# hausman Tests "Panel diagnostics"

This test is available only after estimating an OLS model using panel data (see also <@lit="setobs">). It tests the simple pooled model against the principal alternatives, the fixed effects and random effects models. 

The fixed effects model allows the intercept of the regression to vary across the cross-sectional units. An <@itl="F">-test is reported for the null hypothesis that the intercepts do not differ. The random effects model decomposes the residual variance into two parts, one part specific to the cross-sectional unit and the other specific to the particular observation. (This estimator can be computed only if the number of cross-sectional units in the data set exceeds the number of parameters to be estimated.) The Breusch–Pagan LM statistic tests the null hypothesis that the pooled OLS estimator is adequate against the random effects alternative. 

The pooled OLS model may be rejected against both of the alternatives, fixed effects and random effects. Provided the unit- or group-specific error is uncorrelated with the independent variables, the random effects estimator is more efficient than the fixed effects estimator; otherwise the random effects estimator is inconsistent and the fixed effects estimator is to be preferred. The null hypothesis for the Hausman test is that the group-specific error is not so correlated (and therefore the random effects model is preferable). A low p-value for this test counts against the random effects model and in favor of fixed effects. 

Menu path: Model window, /Tests/Panel diagnostics

Script command: <@ref="hausman">

# hccme Estimation "Robust standard errors"

You are offered several variant calculations for standard errors that are robust in the presence of heteroskedasticity (and, in the case of the HAC estimator, autocorrelation). 

HC0 produces the original "White's standard errors"; HC1, HC2, HC3 and HC3a are subsequent variations that are generally reckoned to produce superior (more reliable) results. For details of the estimators, see MacKinnon and White (Journal of Econometrics, 1985) or Davidson and MacKinnon, Econometric Theory and Methods (Oxford, 2004). The labels given here are those used by Davidson and MacKinnon. Variant "HC3a" is the jackknife, as described in MacKinnon and White; HC3 is a close approximation to the jackknife. 

If you use the HAC estimator for OLS on time-series data, you are able to fine-tune the lag-length using the <@lit="set"> command. Please see the gretl manual or the script commands help file for details. 

When estimating a model via OLS using panel data, the default robust estimator of the covariance matrix is that given by Arellano. The alternative is Beck and Katz's Panel Corrected Standard Errors (PCSE). The latter take into account heteroskedasticity but not autocorrelation. 

Two robust estimators of the covariance matrix are offered for GARCH models: QML is the Quasi-Maximum Likelihood Estimator, and BW is the Bollerslev–Wooldridge estimator. 

# hsk Estimation "Heteroskedasticity-corrected estimates"

This command is applicable where heteroskedasticity is present in the form of an unknown function of the regressors which can be approximated by a quadratic relationship. In that context it offers the possibility of consistent standard errors and more efficient parameter estimates as compared with OLS. 

The procedure involves (a) OLS estimation of the model of interest, followed by (b) an auxiliary regression to generate an estimate of the error variance, then finally (c) weighted least squares, using as weight the reciprocal of the estimated variance. 

In the auxiliary regression (b) we regress the log of the squared residuals from the first OLS on the original regressors and their squares. The log transformation is performed to ensure that the estimated variances are non-negative. Call the fitted values from this regression <@itl="u"><@sup="*">. The weight series for the final WLS is then formed as 1/exp(<@itl="u"><@sup="*">). 
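The role of the log transformation can be seen in a tiny Python sketch (not gretl code). The fitted values in `u_star` are hypothetical, and the function name `wls_weights` is an assumption.

```python
from math import exp

def wls_weights(u_star):
    """Weights 1/exp(u*) formed from the auxiliary regression's fitted values."""
    return [1.0 / exp(u) for u in u_star]

u_star = [-1.2, 0.0, 0.8]       # hypothetical fitted values of log squared residuals
weights = wls_weights(u_star)
print(weights)
```

Because exp(u*) is positive for any fitted value, including negative ones, the implied variance estimates and the resulting weights are always positive.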

Menu path: /Model/Other linear models/Heteroskedasticity corrected

Script command: <@ref="hsk">

# hurst Statistics "Hurst exponent"

Calculates the Hurst exponent (a measure of persistence or long memory) for a time-series variable having at least 128 observations. 

The Hurst exponent is discussed by Mandelbrot. In theoretical terms it is the exponent, <@itl="H">, in the relationship 

  <@fig="hurst">

where RS is the "rescaled range" of the variable <@itl="x"> in samples of size <@itl="n"> and <@itl="a"> is a constant. The rescaled range is the range (maximum minus minimum) of the cumulated value or partial sum of <@itl="x"> over the sample period (after subtraction of the sample mean), divided by the sample standard deviation. 

As a reference point, if <@itl="x"> is white noise (zero mean, zero persistence) then the range of its cumulated "wandering" (which forms a random walk), scaled by the standard deviation, grows as the square root of the sample size, giving an expected Hurst exponent of 0.5. Values of the exponent significantly in excess of 0.5 indicate persistence, and values less than 0.5 indicate anti-persistence (negative autocorrelation). In principle the exponent is bounded by 0 and 1, although in finite samples it is possible to get an estimated exponent greater than 1. 

In gretl, the exponent is estimated using binary sub-sampling: we start with the entire data range, then the two halves of the range, then the four quarters, and so on. For sample sizes smaller than the data range, the RS value is the mean across the available samples. The exponent is then estimated as the slope coefficient in a regression of the log of RS on the log of sample size. 
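The binary sub-sampling scheme can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not gretl's implementation: the smallest sub-sample size (`min_size=8`), the use of divisor n in the standard deviation, and the function names are all choices made for this sketch.

```python
import random
from math import log, sqrt

def rescaled_range(x):
    """Range of the mean-adjusted partial sums of x, divided by its standard deviation."""
    n = len(x)
    mean = sum(x) / n
    partial, running = [], 0.0
    for v in x:
        running += v - mean
        partial.append(running)
    # Divisor n for the standard deviation is an assumption of this sketch.
    sd = sqrt(sum((v - mean) ** 2 for v in x) / n)
    return (max(partial) - min(partial)) / sd

def hurst_exponent(x, min_size=8):
    """Slope of log(RS) on log(sample size) under binary sub-sampling."""
    log_n, log_rs = [], []
    size, pieces = len(x), 1
    while size >= min_size:
        rs = [rescaled_range(x[i * size:(i + 1) * size]) for i in range(pieces)]
        log_n.append(log(size))
        log_rs.append(log(sum(rs) / pieces))  # mean RS across the sub-samples
        size //= 2      # halve the sub-sample size ...
        pieces *= 2     # ... and double the number of sub-samples
    mean_n = sum(log_n) / len(log_n)
    mean_rs = sum(log_rs) / len(log_rs)
    sxy = sum((a - mean_n) * (b - mean_rs) for a, b in zip(log_n, log_rs))
    sxx = sum((a - mean_n) ** 2 for a in log_n)
    return sxy / sxx    # OLS slope coefficient

rng = random.Random(42)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(256)]
h = hurst_exponent(white_noise)
print(round(h, 3))
```

For simulated white noise the estimate should lie in the general neighbourhood of 0.5, although in samples of this size it is typically somewhat biased.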

Menu path: /Variable/Hurst exponent

Script command: <@ref="hurst">

# intreg Estimation "Interval regression model"

Estimates an interval regression model. This model arises when the dependent variable is imperfectly observed for some (possibly all) observations. In other words, the data generating process is assumed to be 

  <@itl="y* = x b + u">

but we only observe <@itl="m <= y* <= M"> (the interval may be left- or right-unbounded). Note that for some observations <@itl="m"> may equal <@itl="M">. The variables <@var="minvar"> and <@var="maxvar"> must contain <@lit="NA">s for left- and right-unbounded observations, respectively. 

In the model specification dialog, <@var="minvar"> and <@var="maxvar"> are identified as the Lower bound variable and the Upper bound variable respectively. 

The model is estimated by maximum likelihood, assuming normality of the disturbance term. 

By default, standard errors are computed using the negative inverse of the Hessian. If the "Robust standard errors" box is checked, then QML or Huber–White standard errors are calculated instead. In this case the estimated covariance matrix is a "sandwich" of the inverse of the estimated Hessian and the outer product of the gradient. 

Menu path: /Model/Nonlinear models/Interval regression

Script command: <@ref="intreg">

# irfboot Graphs "Impulse response plots"

If you select the bootstrap option when plotting impulse responses, gretl computes a confidence interval for the responses using the bootstrap method. The residuals from the original VAR (or VECM) are resampled with replacement; an artificial dataset is constructed based on the original parameter estimates and the resampled residuals; the system is re-estimated and the impulse responses are re-evaluated. This is repeated 999 times and the α/2 and 1 – α/2 quantiles for the responses are found and plotted along with the point estimates. This option is not currently available for restricted VECMs. 
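The final step, finding the α/2 and 1 – α/2 quantiles of the bootstrap replications, can be sketched in Python (not gretl code). The re-estimation of the VAR is omitted; the replications are stand-in numbers, and the value α = 0.10, the rounding rule and the function name `percentile_interval` are assumptions of this sketch.

```python
import random

def percentile_interval(draws, alpha=0.10):
    """Return the alpha/2 and 1 - alpha/2 order statistics of the draws."""
    ordered = sorted(draws)
    b = len(ordered)
    # Positions (1-based) in the sorted replications, kept within 1..b
    lower_pos = max(1, round((b + 1) * alpha / 2))
    upper_pos = min(b, round((b + 1) * (1 - alpha / 2)))
    return ordered[lower_pos - 1], ordered[upper_pos - 1]

# Stand-in for 999 bootstrap values of one response at one horizon
rng = random.Random(0)
draws = list(range(1, 1000))
rng.shuffle(draws)

low, high = percentile_interval(draws)
print(low, high)
```

With 999 replications the quantile positions are whole numbers for the usual choices of α, which is one reason a replication count of this form is convenient.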

This dialog also supports reordering of the variables for the Cholesky decomposition of the cross-equation covariance matrix. The default is given by the order in which the variables are entered into the model specification, but the up and down arrows can be used to promote or demote a selected variable. 

# kalman Estimation "Kalman filter"

Opens a block of statements to set up a Kalman filter. This block should end with the line <@lit="end kalman">, to which any options may be appended. The intervening lines specify the matrices that compose the filter. For example, 

<code>          
	kalman 
	  obsy y
	  obsymat H
	  statemat F
	  statevar Q
	end kalman
</code>

Please see <@pdf="the Gretl User's Guide"> for details. 

See also <@xrf="kfilter">, <@xrf="ksimul">, <@xrf="ksmooth">. 

Script command: <@ref="kalman">

# kpss Tests "KPSS stationarity test"

Computes the KPSS test (Kwiatkowski, Phillips, Schmidt and Shin, Journal of Econometrics, 1992) for stationarity of the given variable (or its first difference, if the differencing option is selected). The null hypothesis is that the variable in question is stationary, either around a level or, if the "include a trend" box is checked, around a deterministic linear trend. 

The selected lag order determines the size of the window used for Bartlett smoothing. If the "show regression results" box is checked the results of the auxiliary regression are printed, along with the estimated variance of the random walk component of the variable. 

The critical values shown for the test statistic are based on the response surfaces estimated by Sephton (Economics Letters, 1995), which are more accurate for small samples than the values given in the original KPSS article. When the test statistic lies between the 10 percent and 1 percent critical values a p-value is shown; this is obtained by linear interpolation and should not be taken too literally. 
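
A minimal script sketch of the same test, assuming a hypothetical time series named <@lit="y"> and a lag order of 4 chosen purely for illustration: 

<code>
	# y is a hypothetical series; 4 is the lag order (Bartlett window size)
	# --trend tests stationarity around a linear trend
	# --verbose prints the auxiliary regression
	kpss 4 y --trend --verbose
</code>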

Menu path: /Variable/Unit root tests/KPSS test

Script command: <@ref="kpss">

# lad Estimation "Least Absolute Deviation estimation"

Calculates a regression that minimizes the sum of the absolute deviations of the observed from the fitted values of the dependent variable. Coefficient estimates are derived using the Barrodale–Roberts simplex algorithm; a warning is printed if the solution is not unique. 

Standard errors are derived using the bootstrap procedure with 500 drawings. The covariance matrix for the parameter estimates, printed when the <@opt="--vcv"> flag is given, is based on the same bootstrap. 
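
In a script the equivalent call might look like the following, where <@lit="y">, <@lit="x1"> and <@lit="x2"> are hypothetical series names: 

<code>
	# LAD regression, also printing the bootstrap covariance matrix
	lad y const x1 x2 --vcv
</code>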

Menu path: /Model/Robust estimation/Least Absolute Deviation

Script command: <@ref="lad">

# lags-dialog Estimation "Lag selection box"

In this dialog you can select the lag order for the independent variables in a time-series model, and in some cases for the dependent variable also. (But note that the common lag order for vector models such as VARs and VECMs is handled separately, via a selection spinner in the main model dialog box.) 

The spinners on the left let you select a range of consecutive lags for any given variable. To specify non-consecutive lags, click the check box next to the entry field titled "specific lags". This activates the entry box, into which you can type a list of lags, separated by spaces. 

The row marked "default" offers a quick way to set a common lag specification for all the independent variables: values set in that row are copied to all the others (apart from the dependent variable, if present). 

The dependent variable is treated specially: the minimum lag must be zero, which places the current value of the variable on the left-hand side of the model. Any higher lags appear with the independent variables on the right-hand side of the model. 

Values selected in this dialog are remembered for the duration of your session with a given dataset. 

# leverage Tests "Influential observations"

Must immediately follow an <@lit="ols"> command. Calculates the leverage (<@itl="h">, which must lie in the range 0 to 1) for each data point in the sample on which the previous model was estimated. Displays the residual (<@itl="u">) for each observation along with its leverage and a measure of its influence on the estimates, <@fig="influence">. "Leverage points" for which the value of <@itl="h"> exceeds 2<@itl="k">/<@itl="n"> (where <@itl="k"> is the number of parameters being estimated and <@itl="n"> is the sample size) are flagged with an asterisk. For details on the concepts of leverage and influence see Davidson and MacKinnon (1993), Chapter 2. 

DFFITS values are also shown: these are "studentized residuals" (predicted residuals divided by their standard errors) multiplied by <@fig="dffit">. For discussions of studentized residuals and DFFITS see chapter 12 of Maddala's Introduction to Econometrics or Belsley, Kuh and Welsch (1980). 

Briefly, a "predicted residual" is the difference between the observed value of the dependent variable at observation <@itl="t">, and the fitted value for observation <@itl="t"> obtained from a regression in which that observation is omitted (or a dummy variable with value 1 for observation <@itl="t"> alone has been added); the studentized residual is obtained by dividing the predicted residual by its standard error. 

The "+" icon at the top of the leverage test window brings up a dialog box that allows you to save one or more of the test variables to the current data set. 
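
A short script sketch with hypothetical series names follows. The <@opt="--save"> flag is understood here to play the role of the "+" icon, adding the computed series to the dataset. 

<code>
	# leverage must follow an OLS model
	ols y const x1 x2
	leverage --save
</code>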

Menu path: Model window, /Tests/Influential observations

Script command: <@ref="leverage">

# levinlin Tests "Levin-Lin-Chu test"

Carries out the panel unit-root test described by Levin, Lin and Chu (2002). The null hypothesis is that all of the individual time series exhibit a unit root, and the alternative is that none of the series has a unit root. (That is, a common AR(1) coefficient is assumed, although in other respects the statistical properties of the series are allowed to vary across individuals.) 
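
For illustration, a script call could take the following form. It assumes a panel dataset is already in place and that <@lit="y"> is a hypothetical series; the lag order of 1 is arbitrary. 

<code>
	# default: test regressions include a constant
	levinlin 1 y
	# variant including a constant and a trend
	levinlin 1 y --ct
</code>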

Menu path: /Variable/Unit root tests/Levin-Lin-Chu test

Script command: <@ref="levinlin">

# logistic Estimation "Logistic regression"

Logistic regression: carries out an OLS regression using the logistic transformation of the dependent variable, 

  <@fig="logistic1">

The dialog box allows you to replace the default asymptotic maximum of the dependent variable, <@itl="y"><@sup="*">, with a value of your own. The supplied <@itl="y"><@sup="*"> value must be greater than all of the observed values of the dependent variable. 

The fitted values and residuals from the regression are automatically transformed using 

  <@fig="logistic2">

where <@itl="x"> represents either a fitted value or a residual from the OLS regression using the transformed dependent variable. The reported values are therefore comparable with the original dependent variable. 
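
The sketch below shows the command form and then a by-hand version of the same idea. It assumes hypothetical series <@lit="y"> (positive, with all values below 50) and <@lit="x">, and it assumes the standard logistic transformation log(y/(y* - y)) with inverse y*/(1 + exp(-x)). 

<code>
	# command form, with a user-supplied maximum
	logistic y const x --ymax=50

	# by-hand version under the stated assumptions
	scalar ymax = 50
	series ly = log(y / (ymax - y))
	ols ly const x
	# map the OLS fitted values back to the scale of y
	series yfit = ymax / (1 + exp(-$yhat))
</code>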

Note that if the dependent variable is binary, you should use the <@ref="logit"> command instead. 

Menu path: /Model/Nonlinear models/Logistic

Script command: <@ref="logistic">

# logit Estimation "Logit regression"

If the dependent variable is a binary variable (all values are 0 or 1) maximum likelihood estimates of the coefficients on <@var="indepvars"> are obtained via the "binary response model regression" (BRMR) method outlined by Davidson and MacKinnon (2004). As the model is nonlinear the slopes depend on the values of the independent variables. By default the slopes with respect to each of the independent variables are calculated (at the means of those variables) and these slopes replace the usual p-values in the regression output. This behavior can be suppressed by giving the <@opt="--p-values"> option. The chi-square statistic tests the null hypothesis that all coefficients are zero apart from the constant. 

By default, standard errors are computed using the negative inverse of the Hessian. If the "Robust standard errors" box is checked, then QML or Huber–White standard errors are calculated instead. In this case the estimated covariance matrix is a "sandwich" of the inverse of the estimated Hessian and the outer product of the gradient. See chapter 10 of Davidson and MacKinnon for details. 

If the dependent variable is not binary but is discrete, then by default it is interpreted as an ordinal response, and Ordered Logit estimates are obtained. However, if the <@opt="--multinomial"> option is given, the dependent variable is interpreted as an unordered response, and Multinomial Logit estimates are produced. (In either case, if the variable selected as dependent is not discrete an error is flagged.) In the multinomial case, the accessor <@lit="$mnlprobs"> is available after estimation, to get a matrix containing the estimated probabilities of the outcomes at each observation (observations in rows, outcomes in columns). 
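
The cases described above might be scripted as follows. The series names <@lit="d"> (binary), <@lit="choice"> (discrete, unordered), <@lit="x1"> and <@lit="x2"> are hypothetical. 

<code>
	# binary response, showing p-values instead of slopes
	logit d const x1 x2 --p-values

	# unordered discrete response: multinomial logit
	logit choice const x1 x2 --multinomial
	# estimated outcome probabilities, observations in rows
	matrix P = $mnlprobs
</code>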

If you want to use logit for analysis of proportions (where the dependent variable is the proportion of cases having a certain characteristic, at each observation, rather than a 1 or 0 variable indicating whether the characteristic is present or not) you should not use the <@lit="logit"> command, but rather construct the logit variable, as in 

<code>          
	genr lgt_p = log(p/(1 - p))
</code>

and use this as the dependent variable in an OLS regression. See chapter 12 of Ramanathan (2002). 

Menu path: /Model/Nonlinear models/Logit

Script command: <@ref="logit">

# mahal Statistics "Mahalanobis distances"

The Mahalanobis distance is the distance between two points in a <@itl="k">-dimensional space, scaled by the statistical variation in each dimension of the space. For example, if <@itl="p"> and <@itl="q"> are two observations on a set of <@itl="k"> variables with covariance matrix <@itl="C">, then the Mahalanobis distance between the observations is given by 

  <@fig="mahal">

where (<@itl="p"> – <@itl="q">) is a <@itl="k">-vector. This reduces to Euclidean distance if the covariance matrix is the identity matrix. 

The space for which distances are computed is defined by the selected variables. For each observation in the current sample range, the distance is computed between the observation and the centroid of the selected variables. This distance is the multidimensional counterpart of a standard <@itl="z">-score, and can be used to judge whether a given observation "belongs" with a group of other observations. 

If the number of variables selected is 4 or fewer, the covariance matrix and its inverse are printed. Clicking the "+" button at the top of the window displaying the distances gives you the option of adding the distances to the dataset as a new variable. 
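
A script sketch with three hypothetical series is shown below. The <@opt="--save"> flag is taken to correspond to the "+" button, adding the distances to the dataset. 

<code>
	# distances of each observation from the centroid of x1, x2, x3
	mahal x1 x2 x3 --save
</code>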

Menu path: /View/Mahalanobis distances

Script command: <@ref="mahal">

# markers Dataset "Add case markers"

This command needs the name of a file containing "case markers", that is, short identifying strings for the individual observations in the data set (for example, country or city names or codes). These marker strings should be no more than 8 characters long. The file should contain one marker per line, and there should be just as many markers as observations in the current dataset. If these conditions are met and the specified file is found, the case markers will be added; they will be visible when you choose "Display values" under gretl's Data menu. 

# meantest Tests "Difference of means"

Calculates the t statistic for the null hypothesis that the population means are equal for two selected variables, and shows its p-value. 

By default the test statistic is calculated on the assumption that the variances are equal for the two variables; with the <@opt="--unequal-vars"> option the variances are assumed to be different. This will make a difference to the test statistic only if there are different numbers of non-missing observations for the two variables. 
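
For example, with two hypothetical series <@lit="x1"> and <@lit="x2">, the two variants of the test could be run as: 

<code>
	# equal variances assumed (default)
	meantest x1 x2
	# variances allowed to differ
	meantest x1 x2 --unequal-vars
</code>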

Menu path: /Model/Bivariate tests/Difference of means

Script command: <@ref="meantest">

# missing Dataset "Missing data values"

Set a numerical value that will be interpreted as "missing" or "not available", either for a particular data series (under the Variable menu) or globally for the entire data set (under the Sample menu). 

Gretl has its own internal coding for missing values, but sometimes imported data may employ a different code. For example, if a particular series is coded such that a value of -1 indicates "not applicable", you can select "Set missing value code" under the Variable menu and type in the value "-1" (without the quotes). Gretl will then read the -1s as missing observations. 

# mle Estimation "Maximum likelihood estimation"

Performs Maximum Likelihood (ML) estimation using the BFGS (Broyden, Fletcher, Goldfarb, Shanno) algorithm. You must specify the log-likelihood function; it is recommended that you also supply expressions for the derivatives of this function with respect to each of the parameters if possible. 

Simple example: Suppose we have a series <@lit="X"> with values 0 or 1 and we wish to obtain the maximum likelihood estimate of the probability, <@lit="p">, that <@lit="X"> = 1. (In this simple case we can guess in advance that the ML estimate of <@lit="p"> will simply equal the proportion of Xs equal to 1 in the sample.) 

The parameter <@lit="p"> must first be added to the dataset and given an initial value. This can be done using the genr command or via menu choices. Appropriate "genr" lines may be typed into the MLE specification window prior to the specification of the log-likelihood function. 

In the MLE window we type the following lines: 

<code>          
	loglik = X*log(p) + (1-X)*log(1-p)
	deriv p = X/p - (1-X)/(1-p)
</code>

The first line specifies the log-likelihood function, and the next line supplies the derivative of that function with respect to the parameter p. If no "deriv" lines are given, a numerical approximation to the derivatives is computed. 

If the parameter p was not previously declared we could preface the above lines with something like the following: 

<code>          
	genr p = 0.5
</code>

By default, standard errors are based on the Outer Product of the Gradient. If the robust standard errors box is checked, a QML estimator is used (namely, a sandwich of the negative inverse of the Hessian and the covariance matrix of the gradient). The Hessian is approximated numerically. 
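
In a script, the same specification would be wrapped in an <@lit="mle"> block. The following is a sketch of the example above, with <@lit="X"> assumed to be a 0/1 series already present in the dataset: 

<code>
	# starting value for the parameter
	scalar p = 0.5
	mle loglik = X*log(p) + (1-X)*log(1-p)
	  deriv p = X/p - (1-X)/(1-p)
	end mle --robust
</code>

Dropping <@opt="--robust"> gives the default standard errors. 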

Menu path: /Model/Maximum likelihood

Script command: <@ref="mle">

# modeltab Utilities "The model table"

In econometric research it is common to estimate several models with a common dependent variable—the models differing in respect of which independent variables are included, or perhaps in respect of the estimator used. In this situation it is convenient to present the regression results in the form of a table, where each column contains the results (coefficient estimates and standard errors) for a given model, and each row contains the estimates for a given variable across the models. 

Gretl provides a means of constructing such a table (and copying it in plain text, LaTeX or Rich Text Format). Here is how to do it: 

<indent>
1. Estimate a model which you wish to include in the table, and in the model display window, under the File menu, select "Save to session as icon" or "Save as icon and close". 
</indent>

<indent>
2. Repeat step 1 for the other models to be included in the table (up to a total of six models). 
</indent>

<indent>
3. When you are done estimating the models, open the icon view of your gretl session (by selecting "icon view" under the Session menu in the main gretl window, or by clicking the "session icon view" icon on the gretl toolbar). 
</indent>

<indent>
4. In session icon view, there is an icon labeled "Model table". Decide which model you wish to appear in the left-most column of the model table and add it to the table, either by dragging its icon onto the Model table icon, or by right-clicking on the model icon and selecting "Add to model table" from the pop-up menu. 
</indent>

<indent>
5. Repeat step 4 for the other models you wish to include in the table. The second model selected will appear in the second column from the left, and so on. 
</indent>

<indent>
6. When you are finished composing the model table, display it by double-clicking on its icon. Under the Edit menu in the window which appears, you have the option of copying the table to the clipboard in various formats. 
</indent>

<indent>
7. If the ordering of the models in the table is not what you wanted, right-click on the model table icon and select "Clear table". Then go back to step 4 above and try again. 
</indent>
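
The steps above also have a script form. A minimal sketch, assuming hypothetical series <@lit="y">, <@lit="x1"> and <@lit="x2">, in which each <@lit="modeltab add"> appends the model estimated most recently: 

<code>
	ols y const x1
	modeltab add
	ols y const x1 x2
	modeltab add
	# display the table
	modeltab show
	# clear the table when it is no longer needed
	modeltab free
</code>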

Menu path: Session window, Model table icon

Script command: <@ref="modeltab">

# mpols Estimation "Multiple-precision OLS"

Computes OLS estimates for the specified model using multiple precision floating-point arithmetic. This command is available only if gretl is compiled with support for the GNU Multiple Precision (GMP) library. By default 256 bits of precision are used for the calculations, but this can be increased via the environment variable <@lit="GRETL_MP_BITS">. For example, when using the bash shell one could issue the following command, before starting gretl, to set a precision of 1024 bits. 

<code>          
	export GRETL_MP_BITS=1024
</code>

Menu path: /Model/Other linear models/High precision OLS

Script command: <@ref="mpols">

# negbin Estimation "Negative Binomial regression"

Estimates a Negative Binomial model. The dependent variable is taken to represent a count of the occurrence of events of some sort, and must have only non-negative integer values. By default the model NegBin 2 is used, in which the conditional variance of the count is given by μ(1 + αμ), where μ denotes the conditional mean. But if the <@opt="--model1"> option is given the conditional variance is μ(1 + α). 

The optional <@lit="offset"> series works in the same way as for the <@ref="poisson"> command. The Poisson model is a restricted form of the Negative Binomial in which α = 0 by construction. 

By default, standard errors are computed using a numerical approximation to the Hessian at convergence. But if the <@opt="--opg"> option is given the covariance matrix is based on the Outer Product of the Gradient (OPG), or if the <@opt="--robust"> option is given QML standard errors are calculated, using a "sandwich" of the inverse of the Hessian and the OPG. 
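
The variants described above might be scripted as follows. The series names <@lit="n_events">, <@lit="x1">, <@lit="x2"> and <@lit="exposure"> are hypothetical. 

<code>
	# NegBin 2 (default), Hessian-based standard errors
	negbin n_events const x1 x2
	# NegBin 1 with QML standard errors
	negbin n_events const x1 x2 --model1 --robust
	# with an offset series, given after a semicolon
	negbin n_events const x1 x2 ; exposure
</code>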

Menu path: /Model/Nonlinear models/Count data...

Script command: <@ref="negbin">

# nls Estimation "Nonlinear Least Squares"

Performs Nonlinear Least Squares (NLS) estimation using a modified version of the Levenberg–Marquardt algorithm. You must supply a function specification; it is recommended, though not required, that you also supply expressions for the derivatives of this function with respect to each of the parameters. If you do not supply derivatives you should instead give a list of the parameters to be estimated (separated by spaces or commas), preceded by the keyword <@lit="params">; these can be either scalars, or vectors, or any combination of the two. 

Example: Suppose we have a data set with variables <@itl="C"> and <@itl="Y"> (e.g. <@lit="greene11_3.gdt">) and we wish to estimate a nonlinear consumption function of the form 

  <@fig="greene_Cfunc">

The parameters alpha, beta and gamma must first be added to the dataset and given initial values. This can be done using the genr command or via menu choices. Appropriate "genr" lines may be typed into the NLS specification window prior to the function specification. 

In the NLS window we type the following lines: 

<code>          
	C = alpha + beta * Y^gamma
	deriv alpha = 1
	deriv beta = Y^gamma
	deriv gamma = beta * Y^gamma * log(Y)
</code>

The first line specifies the regression function, and the next three lines supply the derivatives of that function with respect to each of the parameters in turn. If the "deriv" lines are not given, a numerical approximation to the Jacobian is computed. 

If the parameters alpha, beta and gamma were not previously declared we could preface the above lines with something like the following: 

<code>          
	genr alpha = 1
	genr beta = 1
	genr gamma = 1
</code>

For further details on NLS estimation please see <@pdf="the Gretl User's Guide">. 
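
As a script sketch, the consumption function example can also be estimated without analytical derivatives by listing the parameters after <@lit="params">. The starting values of 1 are arbitrary and may need adjusting for convergence. 

<code>
	scalar alpha = 1
	scalar beta = 1
	scalar gamma = 1
	nls C = alpha + beta * Y^gamma
	  params alpha beta gamma
	end nls
</code>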

Menu path: /Model/Nonlinear models/Nonlinear Least Squares

Script command: <@ref="nls">

# normtest Tests "Normality test"

Carries out a test for normality for the given <@var="series">. The specific test is controlled by the option flags (but if no flag is given, the Doornik–Hansen test is performed). Note: the Doornik–Hansen and Shapiro–Wilk tests are recommended over the others, on account of their superior small-sample properties. 

The test statistic and its p-value may be retrieved using the accessors <@lit="$test"> and <@lit="$pvalue">. Please note that if the <@opt="--all"> option is given, the result recorded is that from the Doornik–Hansen test. 
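
A brief script sketch, assuming a hypothetical series <@lit="x">, showing how the result might be captured: 

<code>
	# run all the available tests
	normtest x --all
	# with --all, these hold the Doornik-Hansen result
	scalar dh = $test
	scalar pv = $pvalue
</code>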

Menu path: /Variable/Normality test

Script command: <@ref="normtest">

# nulldata Dataset "Creating a blank dataset"

Establishes a "blank" data set, containing only a constant and an index variable, with periodicity 1 and the specified number of observations. This may be used for simulation purposes: some of the <@lit="genr"> commands (e.g. <@lit="genr uniform()">, <@lit="genr normal()">) will generate dummy data from scratch to fill out the data set. This command may be useful in conjunction with <@lit="loop">. See also the "seed" option to the <@ref="set"> command. 

By default, this command cleans out all data in gretl's current workspace. If you give the <@opt="--preserve"> option, however, any currently defined matrices are retained. 
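
A small simulation setup along these lines could look like the following sketch; the sample size and seed are arbitrary. 

<code>
	# 200 observations; add --preserve to keep existing matrices
	nulldata 200
	# fix the seed so that the draws are reproducible
	set seed 12345
	series u = uniform()
	series e = normal()
</code>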

Menu path: /File/New data set

Script command: <@ref="nulldata">

# ols Estimation "Ordinary Least Squares"

Computes ordinary least squares (OLS) estimates for the specified model. 

Besides coefficient estimates and standard errors, the program also prints p-values for <@itl="t"> (two-tailed) and <@itl="F">-statistics. A p-value below 0.01 indicates statistical significance at the 1 percent level and is marked with <@lit="***">. <@lit="**"> indicates significance between 1 and 5 percent and <@lit="*"> indicates significance between the 5 and 10 percent levels. Model selection statistics (the Akaike Information Criterion or AIC and Schwarz's Bayesian Information Criterion) are also printed. The formula used for the AIC is that given by Akaike (1974), namely minus two times the maximized log-likelihood plus two times the number of parameters estimated. 

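Various internal variables may be retrieved after estimation by means of accessors. For example, <@lit="series uh = $uhat"> saves the residuals under the name <@lit="uh">. See the "accessors" section of the gretl function reference for details. 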
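
The sketch below, with hypothetical series names, saves the residuals and recomputes the AIC from the formula quoted above. Treating <@lit="$ncoeff"> as the parameter count used by gretl is an assumption here, so the two printed values are meant for comparison rather than guaranteed to agree. 

<code>
	ols y const x1 x2
	# residuals from the model just estimated
	series uh = $uhat
	# minus two times the log-likelihood plus two times the parameter count
	scalar aic_check = -2 * $lnl + 2 * $ncoeff
	printf "AIC reported: %g, recomputed: %g\n", $aic, aic_check
</code>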

Menu path: /Model/Ordinary Least Squares
Other access: Beta-hat button on toolbar

Script command: <@ref="ols">

# omit Tests "Omit variables"

This command re-estimates the given model after omitting the specified variables, or after sequentially omitting insignificant variables if the relevant box is available and is checked. Besides the usual model output, it prints a test for the joint significance of the omitted variables. The null hypothesis is that the true coefficients on all the omitted variables equal zero. 

Sequential elimination works as follows: at each step the variable with the highest p-value is omitted, until all remaining variables have a p-value no greater than some cutoff. The default cutoff is 10 percent (two-sided); this can be adjusted via the spin button. By default this process operates on all variables in the model (apart from the constant). If you want to confine it to a subset of the variables, check the box labeled "Test only selected variables" and make a selection. 
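
Both modes can be sketched in script form as follows, with hypothetical series names and a 5 percent cutoff chosen for illustration: 

<code>
	ols y const x1 x2 x3
	# joint test on two named regressors
	omit x2 x3

	# sequential elimination, starting again from the full model
	ols y const x1 x2 x3
	omit --auto=0.05
</code>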

Menu path: Model window, /Tests/Omit variables

Script command: <@ref="omit">

# online Dataset "Access online databases"

Gretl is able to access databases at Wake Forest University (your computer must be connected to the internet for this to work). 

Under the "File, Browse databases" menu, select the item "on database server". A window should appear, showing a listing of the gretl databases available at Wake Forest. (Depending on your location and the speed of your internet connection, this may take a few seconds.) Along with the name of the database and a short description, there will appear a "Local status" entry: this shows whether you have the database installed locally (on the hard drive of your computer) and if so, whether or not it is up to date with the version on the server. 

If you have a given database installed locally, and it is up to date, there is no advantage in accessing it via the server. But for a database that is not already installed and up to date, you may wish to get a listing of the data series: click on "Get series listing". This brings up a further window, from which you can display the values of a chosen data series, graph those values, or import them into gretl's workspace. These tasks can be accomplished using the "Series" menu, or via the popup menu that appears when you click the right mouse button on a given series. You can also search the listing for a variable of interest (the "Find" menu item). 

If you want faster access to the data, or wish to access the database offline, then select the line showing the database you want, in the initial database window, and press the "Install" button. This will download the database in compressed format, then uncompress it and install it on your hard drive. Thereafter you should be able to find it under the "File, Browse databases, gretl native" menu. 

# panel Estimation "Panel models"

Estimates a panel model. By default the fixed effects estimator is used; this is implemented by subtracting the group or unit means from the original data. 

If the "Random effects" button is checked, random effects GLS estimates are computed, using the method of Swamy and Arora. 
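
In script form the two estimators might be invoked as shown below. This assumes a panel dataset is already defined; the series names are hypothetical. 

<code>
	# fixed effects (default)
	panel y const x1 x2
	# random effects
	panel y const x1 x2 --random-effects
</code>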

For more details on panel estimation, please see <@pdf="the Gretl User's Guide">. 

Menu path: /Model/Panel

Script command: <@ref="panel">

# panel-between Estimation "Between groups model"

This dialog allows you to enter a specification for the "between model" in the context of panel data. This regression uses the group-means of the data, thereby ignoring the variation within the groups. This model is rarely of great interest in its own right, but may be useful for purposes of comparison (for example, with the fixed effects model). 

# panel-mode Dataset "Panel data organization"

This dialog offers up to three options with regard to defining a data set as a panel. The first two options require that the data set is already organized in a panel format (although this may not yet be recognized by gretl). The third option requires that the data set contains variables that represent the panel structure. 

<@itl="Stacked time series">: Let there be <@var="N"> cross-sectional units in the data set, and let <@var="T"> = the number of time-series observations per unit. By selecting this option you are telling gretl that the data set is currently composed of <@var="N"> consecutive blocks of <@var="T"> time-series observations, one for each cross-sectional unit. The next step will be to specify the value of <@var="N">. 

<@itl="Stacked cross sections">: You are telling gretl that the data set is currently composed of <@var="T"> consecutive blocks of <@var="N"> cross-sectional observations, one for each time period. The next step, again, will be to specify the value of <@var="N">. 

If the total number of observations in the current dataset is prime, the above options are not available. 

<@itl="Use index variables">: You are saying that the data set is currently organized any old way (it doesn't matter how), but that it contains two variables that index the cross-sectional units and the time periods respectively. The next step will be to select those two variables. Panel index variables must have nothing but non-negative integer values, with no missing values. If there are no such variables in the dataset this option is not available. 
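
For reference, the corresponding script command is <@lit="setobs">. The sketch below uses hypothetical index series <@lit="unit"> and <@lit="year">; in the stacked time series case the first argument is taken here to be the block size (observations per unit), which you should check against the command reference. 

<code>
	# panel structure given by two index variables
	setobs unit year --panel-vars

	# alternatively: stacked time series with 20 observations per unit
	setobs 20 1:1 --stacked-time-series
</code>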

# panel-wls Estimation "Groupwise weighted least squares"

Groupwise weighted least squares for panel data. Computes weighted least squares (WLS) estimates, with the weights based on the estimated error variances for the respective cross-sectional units in the sample. 

If the iteration option is selected, the procedure is iterated: at each round the residuals are re-computed using the current WLS parameter estimates, which gives rise to a new set of estimates of the error variances, and hence a new set of weights. Iteration stops when the maximum difference in the parameter estimates from one round to the next falls below 0.0001 or the number of iterations reaches 20. If the iteration converges, the resulting estimates are Maximum Likelihood. 
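
A script sketch of this estimator, assuming a panel dataset and hypothetical series names: 

<code>
	# groupwise WLS, single pass
	panel y const x1 x2 --unit-weights
	# iterated version
	panel y const x1 x2 --unit-weights --iterate
</code>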

# pca Statistics "Principal Components Analysis"

Principal Components Analysis. Prints the eigenvalues of the correlation matrix (or the covariance matrix if the option box is checked) for the variables in <@var="varlist">, along with the proportion of the joint variance accounted for by each component. Also prints the corresponding eigenvectors (or "component loadings"). 

In the window displaying the results, you have the option of saving the principal components to the dataset as series. 
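
In a script, the analysis might be run as follows on three hypothetical series. The <@opt="--save"> flag is understood to store the principal components as series. 

<code>
	# based on the correlation matrix (default)
	pca x1 x2 x3 --save
	# based on the covariance matrix instead
	pca x1 x2 x3 --covariance
</code>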

Menu path: /View/Principal components
Other access: Main window pop-up (multiple selection)

Script command: <@ref="pca">

# pergm Statistics "Periodogram"

Computes and displays (and if not in batch mode, graphs) the spectrum of the specified series. By default the sample periodogram is given, but optionally a Bartlett lag window is used in estimating the spectrum (see, for example, Greene's <@itl="Econometric Analysis"> for a discussion of this). The default width of the Bartlett window is twice the square root of the sample size but this can be set manually using the <@var="bandwidth"> parameter, up to a maximum of half the sample size. 

If the <@opt="--log"> option is given the spectrum is represented on a logarithmic scale. 

The (mutually exclusive) options <@opt="--radians"> and <@opt="--degrees"> influence the appearance of the frequency axis when the periodogram is graphed. By default the frequency is scaled by the number of periods in the sample, but these options cause the axis to be labeled from 0 to π radians or from 0 to 180°, respectively. 
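
Two illustrative calls, assuming a hypothetical series <@lit="y"> and an arbitrary bandwidth of 10: 

<code>
	# sample periodogram on a log scale
	pergm y --log
	# Bartlett window of width 10, frequency axis in radians
	pergm y 10 --bartlett --radians
</code>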

Menu path: /Variable/Periodogram
Other access: Main window pop-up menu (single selection)

Script command: <@ref="pergm">

# polyweights Transformations "Polynomial trend fitting"

In fitting a polynomial trend to a time series it may be desirable to give extra weight to the observations at the start and end of the sample. (Points in the middle of the sample range have neighbours on both sides that are likely to be pulling the fit in the same general direction.) 

The weighting schemes offered here (quadratic, cosine-bell and steps) can be used to this effect. If you select one of these schemes two additional settings must be chosen: first, what maximum weight should be used (the minimum, baseline weight is 1.0)? Second, what central fraction of the sample should be given a uniform (minimal) weighting? 

Suppose, for example, you select a maximum weight of 3.0 and a central fraction of 0.4. This means that the middle 40 percent of the data get a weight of 1.0. If the steps shape is selected the first and last 30 percent of the observations get a weight of 3.0; otherwise, for the first 30 percent of observations the weights decline gradually from 3.0 to 1.0; and for the last 30 percent the weights increase from 1.0 to 3.0. 

# poisson Estimation "Poisson estimation"

Estimates a Poisson regression. The dependent variable is taken to represent the occurrence of events of some sort, and must take on only non-negative integer values. 

If a discrete random variable <@itl="Y"> follows the Poisson distribution, then 

  <@fig="poisson1">

for <@itl="y"> = 0, 1, 2,…. The mean and variance of the distribution are both equal to <@itl="v">. In the Poisson regression model, the parameter <@itl="v"> is represented as a function of one or more independent variables. The most common version (and the only one supported by gretl) has 

  <@fig="poisson2">

or in other words the log of <@itl="v"> is a linear function of the independent variables. 

Optionally, you may add an "offset" variable to the specification. This is a scale variable, the log of which is added to the linear regression function (implicitly, with a coefficient of 1.0). This makes sense if you expect the number of occurrences of the event in question to be proportional, other things equal, to some known factor. For example, the number of traffic accidents might be supposed to be proportional to traffic volume, other things equal, and in that case traffic volume could be specified as an "offset" in a Poisson model of the accident rate. The offset variable must be strictly positive. 

By default, standard errors are computed using the negative inverse of the Hessian. If the <@opt="--robust"> flag is given, then QML or Huber–White standard errors are calculated instead. In this case the estimated covariance matrix is a "sandwich" of the inverse of the estimated Hessian and the outer product of the gradient. 

See also <@ref="negbin">. 

Menu path: /Model/Nonlinear models/Count data...

Script command: <@ref="poisson">

# probit Estimation "Probit model"

If the dependent variable is a binary variable (all values are 0 or 1) maximum likelihood estimates of the coefficients on <@var="indepvars"> are obtained via the "binary response model regression" (BRMR) method outlined by Davidson and MacKinnon (2004). As the model is nonlinear the slopes depend on the values of the independent variables. By default the slopes with respect to each of the independent variables are calculated (at the means of those variables) and these slopes replace the usual p-values in the regression output. This behavior can be suppressed by giving the <@opt="--p-values"> option. The chi-square statistic tests the null hypothesis that all coefficients are zero apart from the constant. 

By default, standard errors are computed using the negative inverse of the Hessian. If the "Robust standard errors" box is checked, then QML or Huber–White standard errors are calculated instead. In this case the estimated covariance matrix is a "sandwich" of the inverse of the estimated Hessian and the outer product of the gradient. See chapter 10 of Davidson and MacKinnon for details. 

If the dependent variable is not binary but is discrete, then Ordered Probit estimates are obtained. (If the variable selected as dependent is not discrete, an error is flagged.) 

Probit for analysis of proportions is not implemented in gretl at this point. 

Menu path: /Model/Nonlinear models/Probit

Script command: <@ref="probit">

# qlrtest Tests "Quandt likelihood ratio test"

For a model estimated on time-series data via OLS, performs the Quandt likelihood ratio (QLR) test for a structural break at an unknown point in time, with 15 percent trimming at the beginning and end of the sample period. 

For each potential break point within the central 70 percent of the observations, a Chow test is performed (see <@ref="chow">). The QLR test statistic is the maximum of the <@itl="F"> values from these tests. It follows a non-standard distribution, the critical values of which are taken from Stock and Watson's <@itl="Introduction to Econometrics"> (2003). If the QLR statistic exceeds the critical value at the chosen level of significance, one can infer that the parameters of the model are not constant. This statistic can be used to detect forms of instability other than a single discrete break (such as multiple breaks or a slow drifting of the parameters). 
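The procedure can be sketched in Python for the simplest possible case, a model containing only a constant. This is an illustration of the idea (a Chow F statistic at each untrimmed break point, then the maximum), not gretl's implementation; the function names, the rounding of the trimmed count and the data are all assumptions.

```python
def rss_about_mean(values):
    """Residual sum of squares from a constant-only regression."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def qlr_constant_only(y, trim=0.15):
    """Return (max F, break index) over the untrimmed candidate break points."""
    T = len(y)
    k = 1  # number of parameters in the constant-only model
    first = int(trim * T)  # assumption: the trimmed count is rounded down
    last = T - first
    rss_restricted = rss_about_mean(y)
    best_f, best_t = float("-inf"), None
    for t in range(first, last + 1):
        # Chow test: one regime versus separate regimes before and after t.
        rss_split = rss_about_mean(y[:t]) + rss_about_mean(y[t:])
        f_stat = ((rss_restricted - rss_split) / k) / (rss_split / (T - 2 * k))
        if f_stat > best_f:
            best_f, best_t = f_stat, t
    return best_f, best_t


# Made-up series whose mean shifts from about 1 to about 3 after 10 observations.
y = [0.9, 1.1] * 5 + [2.9, 3.1] * 5
qlr, break_at = qlr_constant_only(y)
print(qlr, break_at)
```

The maximum F value found this way must be compared with the non-standard critical values mentioned above, not with ordinary F tables.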

Menu path: Model window, /Tests/QLR test

Script command: <@ref="qlrtest">

# qqplot Graphs "Q-Q plot"

With just one series selected, displays a plot of the empirical quantiles of the given series against the quantiles of the normal distribution. The series must include at least 20 valid observations in the current sample range. By default the empirical quantiles are plotted against quantiles of the normal distribution having the same mean and variance as the sample data, but two alternatives are available: the data may be standardized (converted to z-scores) before plotting, or the "raw" empirical quantiles may be plotted against the quantiles of the standard normal distribution. 

Given two series arguments, <@var="y"> and <@var="x">, displays a plot of the empirical quantiles of <@var="y"> against those of <@var="x">. The data values are not standardized. 

Menu path: /Variable/Normal Q-Q plot
Menu path: /View/Graph specified vars/Q-Q plot

Script command: <@ref="qqplot">

# quantreg Estimation "Quantile regression"

Quantile regression. By default standard errors are computed according to the asymptotic formula given by Koenker and Bassett (<@itl="Econometrica">, 1978), but if the "robust" box is checked we use the heteroskedasticity-robust variant from Koenker and Zhao (<@itl="Journal of Nonparametric Statistics">, 1994). 

If the "Compute confidence intervals" option is checked gretl will calculate confidence intervals for the coefficients, in place of standard errors. The "robust" check-box still has an effect: if it is not checked, the intervals are computed on the assumption of IID errors; with it, gretl uses the robust estimator developed by Koenker and Machado (<@itl="Journal of the American Statistical Association">, 1999). Note that these intervals are not just "plus or minus so many standard errors"; in general, they are asymmetrical about the point estimates of the coefficients. 

You may give a list of quantiles (see the drop-down list for some pre-defined possibilities). In that case gretl will calculate quantile estimates and either standard errors or confidence intervals for each of the specified values. 

To follow up on the references given above, please see <@pdf="the Gretl User's Guide">. 

Menu path: /Model/Robust estimation/Quantile regression

Script command: <@ref="quantreg">

# reset Tests "Ramsey's RESET"

Must follow the estimation of a model via OLS. Carries out Ramsey's RESET test for model specification (non-linearity) by adding the square and/or the cube of the fitted values to the regression and calculating the <@itl="F"> statistic for the null hypothesis that the parameters on the added terms are zero. 

Menu path: Model window, /Tests/Ramsey's RESET

Script command: <@ref="reset">

# restrict-model Tests "Restrictions on a model"

Each restriction in the set should be expressed as an equation, with a linear combination of parameters on the left and a numeric value to the right of the equals sign. Parameters may be referenced in the form <@lit="b["><@var="i"><@lit="]">, where <@var="i"> represents the position in the list of regressors (starting at 1), or <@lit="b["><@var="varname"><@lit="]">, where <@var="varname"> is the name of the regressor in question. 

The <@lit="b"> terms in the equation representing a restriction may be prefixed with a numeric multiplier, using <@lit="*"> to represent multiplication, for example <@lit="3.5*b[4]">. 

Here is an example of a set of restrictions: 

<code>          
	b[1] = 0
	b[2] - b[3] = 0
	b[4] + 2*b[5] = 1
</code>
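One way to read such a set is as a matrix equation R b = q, with one row per restriction. The Python sketch below encodes the three restrictions above for a model assumed to have five regressors and checks whether a candidate coefficient vector satisfies them. The helper name `satisfies` and the coefficient values are hypothetical.

```python
# Each row of R holds the multipliers on b[1]..b[5]; q holds the right-hand sides.
R = [
    [1, 0, 0, 0, 0],   # b[1] = 0
    [0, 1, -1, 0, 0],  # b[2] - b[3] = 0
    [0, 0, 0, 1, 2],   # b[4] + 2*b[5] = 1
]
q = [0, 0, 1]


def satisfies(b, R, q, tol=1e-12):
    """True if the coefficient vector b meets every restriction R b = q."""
    for row, rhs in zip(R, q):
        lhs = sum(r * bj for r, bj in zip(row, b))
        if abs(lhs - rhs) > tol:
            return False
    return True


# Hypothetical coefficient vectors, chosen only to exercise the check.
ok = satisfies([0.0, 0.7, 0.7, 0.5, 0.25], R, q)
bad = satisfies([0.1, 0.7, 0.7, 0.5, 0.25], R, q)
print(ok, bad)
```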

# restrict-system Tests "Restrictions on a system of equations"

Each restriction in the set should be expressed as an equation, with a linear combination of parameters on the left and a numeric value to the right of the equals sign. Parameters are referenced using <@lit="b"> plus two numbers in square brackets. The leading number represents the position of the equation within the system and the second number indicates position in the list of regressors, starting at 1 in both cases. For example <@lit="b[2,1]"> denotes the first parameter in the second equation, and <@lit="b[3,2]"> the second parameter in the third equation. 

The <@lit="b"> terms in the equation representing a restriction may be prefixed with a numeric multiplier, using <@lit="*"> to represent multiplication, for example <@lit="3.5*b[1,4]">. 

Here is an example of a set of restrictions: 

<code>          
	b[1,1] = 0
	b[1,2] - b[2,2] = 0
	b[3,4] + 2*b[3,5] = 1
</code>

# restrict-vecm Tests "Restrictions on a VECM"

Use this command to place linear restrictions on the cointegrating relations (beta) and/or adjustment coefficients (alpha) in a vector error-correction model (VECM). 

Each restriction should be expressed as an equation, with a linear combination of parameters to the left of the equals sign and a numerical value on the right. Restrictions on beta may be non-homogeneous (non-zero on the right), but alpha restrictions must be homogeneous (zero on the right). 

If the VECM is of rank 1, the elements of beta are referenced in the form <@lit="b["><@var="i"><@lit="]">, where <@var="i"> represents position in the cointegrating vector, starting at 1. For example, <@lit="b[2]"> denotes the second element in beta. If the rank is greater than 1, use <@lit="b"> plus two numbers in square brackets. For example, <@lit="b[2,1]"> denotes the first element in the second cointegrating vector. 

To reference elements of alpha, use <@lit="a"> instead of <@lit="b">. 

The parameter identifiers in the equation representing a restriction may be prefixed with a numeric multiplier, using <@lit="*"> to represent multiplication, for example <@lit="3.5*b[4]">. 

Here is an example of a set of restrictions on a VECM of rank 1. 

<code>          
	b[1] + b[2] = 0
	b[1] + b[3] = 0
</code>

See also <@pdf="the Gretl User's Guide">. 

# rmplot Graphs "Range-mean plot"

Range–mean plot: this command creates a simple graph to help in deciding whether a time series, <@itl="y">(t), has constant variance or not. We take the full sample t=1,...,T and divide it into small subsamples of arbitrary size <@itl="k">. The first subsample is formed by <@itl="y">(1),...,<@itl="y">(k), the second is <@itl="y">(k+1), ..., <@itl="y">(2k), and so on. For each subsample we calculate the sample mean and range (= maximum minus minimum), and we construct a graph with the means on the horizontal axis and the ranges on the vertical. So each subsample is represented by a point in this plane. If the variance of the series is constant we would expect the subsample range to be independent of the subsample mean; if we see the points approximate an upward-sloping line this suggests the variance of the series is increasing in its mean; and if the points approximate a downward-sloping line this suggests the variance is decreasing in the mean. 

Besides the graph, gretl displays the means and ranges for each subsample, along with the slope coefficient for an OLS regression of the range on the mean and the p-value for the null hypothesis that this slope is zero. If the slope coefficient is significant at the 10 percent significance level then the fitted line from the regression of range on mean is shown on the graph. The <@itl="t">-statistic for the null, and the corresponding p-value, are recorded and may be retrieved using the accessors <@lit="$test"> and <@lit="$pvalue"> respectively. 
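The calculation behind the plot can be sketched in a few lines of Python. The function names are hypothetical, the data are made up, and dropping an incomplete final subsample is an assumption rather than a statement about what gretl does.

```python
def range_mean_points(y, k):
    """Split y into consecutive subsamples of size k; return (mean, range) pairs."""
    points = []
    # Assumption: a trailing subsample shorter than k is simply dropped.
    for start in range(0, len(y) - k + 1, k):
        chunk = y[start:start + k]
        points.append((sum(chunk) / k, max(chunk) - min(chunk)))
    return points


def ols_slope(points):
    """Slope from an OLS regression of range (vertical) on mean (horizontal)."""
    n = len(points)
    mean_x = sum(m for m, _ in points) / n
    mean_y = sum(r for _, r in points) / n
    sxy = sum((m - mean_x) * (r - mean_y) for m, r in points)
    sxx = sum((m - mean_x) ** 2 for m, _ in points)
    return sxy / sxx


# Made-up series whose spread grows with its level.
y = [1, 2, 3, 10, 14, 18, 100, 130, 160]
points = range_mean_points(y, 3)
print(points)
print(ols_slope(points))
```

A clearly positive slope, as in this made-up series, is the pattern that suggests the variance is increasing in the mean.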

Menu path: /Variable/Range-mean graph

Script command: <@ref="rmplot">

# runs Tests "Runs test"

Carries out the nonparametric "runs" test for randomness of the specified <@var="series">, where runs are defined as sequences of consecutive positive or negative values. If you want to test for randomness of deviations from the median, for a variable named <@lit="x1"> with a non-zero median, you can do the following: 

<code>          
	genr signx1 = x1 - median(x1)
	runs signx1
</code>

If the <@opt="--difference"> option is given, the variable is differenced prior to the analysis, hence the runs are interpreted as sequences of consecutive increases or decreases in the value of the variable. 

If the <@opt="--equal"> option is given, the null hypothesis incorporates the assumption that positive and negative values are equiprobable, otherwise the test statistic is invariant with respect to the "fairness" of the process generating the sequence, and the test focuses on independence alone. 
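For readers who want to see what is being counted, the following Python sketch counts runs of signs and computes the textbook Wald-Wolfowitz z statistic, conditional on the observed numbers of positive and negative values. It is a generic illustration under stated assumptions (zeros are grouped with the negatives; the equiprobable variant is not shown) and is not claimed to reproduce gretl's output.

```python
import math


def count_runs(signs):
    """Number of maximal blocks of identical consecutive values."""
    runs = 1
    for prev, cur in zip(signs, signs[1:]):
        if cur != prev:
            runs += 1
    return runs


def runs_z(x):
    """Textbook Wald-Wolfowitz z statistic, conditional on the observed counts."""
    # Assumption: zeros are grouped with the negative values.
    signs = [v > 0 for v in x]
    n1 = sum(signs)
    n2 = len(signs) - n1
    n = n1 + n2
    runs = count_runs(signs)
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    return runs, (runs - expected) / math.sqrt(variance)


# Made-up deviations from the median with perfectly alternating signs.
x = [1.5, -0.5, 2.0, -1.0, 0.5, -2.5, 1.0, -0.5]
runs, z = runs_z(x)
print(runs, z)
```

A series that alternates in sign at every observation has more runs than independence would lead one to expect, so the statistic is positive.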

Menu path: /Tools/Nonparametric tests

Script command: <@ref="runs">

# sampling Dataset "Setting the sample"

The Sample menu offers several ways of selecting a sub-sample from the current dataset. 

If you choose "Sample/Restrict based on criterion..." you need to supply a Boolean (logical) expression, of the same sort that you would use to define a dummy variable. For example the expression "sqft > 1400" will select only cases for which the variable sqft has a value greater than 1400. Conditions may be concatenated using the logical operators "&&" (AND) and "||" (OR), and may be negated using "!" (NOT). If the dataset already contains dummy variables, you are also given the option of selecting one of these to define the sample (observations with a value of 1 for the selected dummy will be included, and others excluded). 

The menu item "Sample/Drop all obs with missing values" redefines the sample to exclude all observations for which values of one or more variables are missing (leaving only complete cases). 

To select observations for which a particular variable has no missing values, use "Restrict based on criterion..." and supply the Boolean condition "!missing(varname)" (replace "varname" with the name of the variable you want to use). 

If the observations are labeled, you can exclude particular observations using, for example, <@lit="obs!="France""> as the Boolean criterion. The observation name must be enclosed in double quotes. 

One point should be noted about defining a sample based on a dummy variable, a Boolean expression, or on the missing values criterion: Any "structural" information in the data header file (regarding the time series or panel nature of the data) is lost. You may reimpose structure with "Sample/Set frequency, startobs...". 

Please see <@pdf="the Gretl User's Guide"> for further details. 

# save-labels Utilities "Save or remove series labels"

If you choose Export here, gretl will write a file containing the descriptive labels of any series in the current dataset that have such labels. This is a plain text file with one line per variable. The line will be empty for variables that have no descriptive label. 

If you choose Remove, the descriptive labels will be removed for all series that have such labels. This would be appropriate only if the current labels have somehow been added in error. 

# add-labels Utilities "Add series labels"

If you choose Yes here, you are offered a file-open dialog box to select a plain text file containing descriptive labels for the series in the current dataset. The file should contain one label per line; a blank line means no label. Gretl will attempt to read as many labels as there are series in the dataset, excluding the constant. 

# save-script Utilities "Save commands?"

If you choose Yes here, gretl will write a file containing a record of the commands you executed in the current session. Most commands that you execute via "point and click" have a "script" counterpart, and it is these script commands that will be saved. You could take the file as the basis for writing a gretl command script. 

If you don't care to be prompted to save a record of commands on exit, uncheck the tick box in the save commands dialog. 

# save-session Utilities "Save this gretl session?"

If you choose Yes here, gretl will write a file containing a "snapshot" of the current session, including a copy of the working dataset along with any models, graphs or other objects that you have saved "as icons". You can re-open this file later to recreate the state of gretl as of the time you quit the session (see the "File/Session files" menu). 

If you mostly work with gretl using command scripts (which we recommend for "serious" econometric work) you probably don't need to save the session, but you should be sure to save any changes to your script that you wish to keep. You may also want to save any changes to your dataset, unless these are of a sort that can easily be recreated by running a script. 

If you work with scripts and don't care to be prompted to save your session on exit, uncheck the tick box in the save session dialog. 

# scatters Graphs "Multiple pairwise graphs"

Generates pairwise graphs of the selected "Y-axis variable" against each of the selected "X-axis variables" in turn. (Or you can select several variables for the Y-axis and one for the X-axis.) Scanning a set of such plots can be a useful step in exploratory data analysis. The maximum number of plots is six; any extra variables will be ignored. 

Menu path: /View/Multiple graphs

Script command: <@ref="scatters">

# setinfo Dataset "Edit attributes of variable"

In this dialog box you can: 

* Rename a (series) variable. 

* Add or edit a description of the variable: this appears next to the variable name in the gretl main window. 

* Add or edit the "display name" for the variable (if the variable is a series, not a scalar). This string (maximum 19 characters) is shown in place of the variable name when the variable is displayed in a graph. Thus for instance you can associate a more comprehensible string such as "T-bill rate" with a cryptically named variable such as "tb3". 

* (For time-series data) set the compaction method for the variable. This method will be used if you decide to reduce the frequency of the dataset, or if you update the variable by importing from a database where the variable is at a higher frequency than in the working dataset. 

* Mark a variable as discrete (for series with integer values only). This affects the way the variable is handled when you ask for a frequency plot. 

Menu path: /Variable/Edit attributes
Other access: Main window pop-up menu

Script command: <@ref="setinfo">

# setmiss Dataset "Missing value code"

Set a numerical value that will be interpreted as "missing" or "not applicable", either for a particular data series (under the Variable menu) or globally for the entire data set (under the Sample menu). 

Gretl has its own internal coding for missing values, but sometimes imported data may employ a different code. For example, if a particular series is coded such that a value of -1 indicates "not applicable", you can select "Set missing value code" under the Variable menu and type in the value "-1" (without the quotes). Gretl will then read the -1s as missing observations. 

Menu path: /Data/Set missing value code

Script command: <@ref="setmiss">

# spearman Statistics "Spearman's rank correlation"

Prints Spearman's rank correlation coefficient for a specified pair of variables. The variables do not have to be ranked manually in advance; the command takes care of this. 

The automatic ranking is from largest to smallest (i.e. the largest data value gets rank 1). If you need to invert this ranking, create a new variable which is the negative of the original. For example: 

<code>          
	genr altx = -x
	spearman altx y
</code>
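The effect of the ranking direction can be checked with a short Python sketch. It ranks from largest to smallest, as described above, and computes the coefficient as the ordinary correlation between the two rank vectors. The function names and data are hypothetical, and giving tied values their average rank is an assumption about tie handling.

```python
def ranks_descending(values):
    """Rank so that the largest value gets rank 1; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2.0 + 1.0
        for j in range(pos, end + 1):
            ranks[order[j]] = average
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the two rank vectors."""
    rx, ry = ranks_descending(x), ranks_descending(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


# Made-up data; negating x reverses its ranking and flips the sign of rho.
x = [3.0, 1.0, 4.0, 2.0]
y = [30.0, 10.0, 45.0, 20.0]
print(ranks_descending(x))
print(spearman_rho(x, y), spearman_rho([-v for v in x], y))
```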

Menu path: /Model/Robust estimation/Rank correlation

Script command: <@ref="spearman">

# store Dataset "Save data"

Saves either the entire dataset or, if a <@var="varlist"> is supplied, a specified subset of the series in the current dataset, to the file given by <@var="filename">. 

By default the data are saved in "native" gretl format, but the option flags permit saving in several alternative formats. CSV (Comma-Separated Values) data may be read into spreadsheet programs, and can also be manipulated using a text editor. The formats of Octave, R and PcGive are designed for use with the respective programs. Gzip compression may be useful for large datasets. See <@pdf="the Gretl User's Guide"> for details on the various formats. 

The option flags <@opt="--omit-obs"> and <@opt="--no-header"> are applicable only when saving data in CSV format. By default, if the data are time series or panel, or if the dataset includes specific observation markers, the CSV file includes a first column identifying the observations (e.g. by date). If the <@opt="--omit-obs"> flag is given this column is omitted. The <@opt="--no-header"> flag suppresses the usual printing of the names of the variables at the top of the columns. 

The option of saving in gretl database format is intended to help with the construction of large sets of series, possibly having mixed frequencies and ranges of observations. At present this option is available only for annual, quarterly or monthly time-series data. If you save to a file that already exists, the default action is to append the newly saved series to the existing content of the database. In this context it is an error if one or more of the variables to be saved has the same name as a variable that is already present in the database. The <@opt="--overwrite"> flag has the effect that, if there are variable names in common, the newly saved variable replaces the variable of the same name in the existing database. 

The <@opt="--comment"> option is available when saving data as a database or in CSV format. The required parameter is a double-quoted one-line string, attached to the option flag with an equals sign. The string is inserted as a comment into the database index file or at the top of the CSV output. 

Menu path: /File/Save data; /File/Export data

Script command: <@ref="store">

# system Estimation "Systems of equations"

In this window you can define a system of equations and choose an estimator for the system. Four sorts of statement may be given here, as follows: 

<indent>
• <@ref="equation">: specify an equation within the system. At least two such statements must be provided. 
</indent>

<indent>
• <@lit="instr">: for a system to be estimated via Three-Stage Least Squares, a list of instruments (by variable name or number). Alternatively, you can put this information into the <@lit="equation"> line using the same syntax as in the <@ref="tsls"> command. 
</indent>

<indent>
• <@lit="endog">: for a system of simultaneous equations, a list of endogenous variables. This is primarily intended for use with FIML estimation, but with Three-Stage Least Squares this approach may be used instead of giving an <@lit="instr"> list; then all the variables not identified as endogenous will be used as instruments. 
</indent>

<indent>
• <@lit="identity">: for use with FIML, an identity linking two or more of the variables in the system. This sort of statement is ignored when an estimator other than FIML is used. 
</indent>

Menu path: /Model/Simultaneous equations

Script command: <@ref="system">

# tobit Estimation "Tobit model"

Estimates a Tobit model, which may be appropriate when the dependent variable is "censored". For example, positive and zero values of purchases of durable goods on the part of individual households are observed, and no negative values, yet decisions on such purchases may be thought of as outcomes of an underlying, unobserved disposition to purchase that may be negative in some cases. 

By default it is assumed that the dependent variable is censored at zero on the left and is uncensored on the right. However you can use the entry boxes marked "left bound" and "right bound" to specify a different pattern of censoring. Enter either a numerical value or <@lit="NA"> for no censoring. 

The Tobit model is a special case of interval regression, which is supported via the <@ref="intreg"> command. 

Menu path: /Model/Nonlinear models/Tobit

Script command: <@ref="tobit">

# transpos Dataset "Transpose data"

Transposes the current data set. That is, each observation (row) in the current data set will be treated as a variable (column), and each variable as an observation. This command may be useful if data have been read from some external source in which the rows of the data table represent variables. 

See also <@ref="dataset">. 

Menu path: /Data/Transpose data

# tsls Estimation "Instrumental variables regression"

This command requires the selection of two lists of variables: the independent variables to appear in the given model and a set of instruments. Note that any exogenous regressors should appear in both lists. 

Output for two-stage least squares estimates includes the Hausman test and, if the model is over-identified, the Sargan over-identification test. In the Hausman test, the null hypothesis is that OLS estimates are consistent, or in other words estimation by means of instrumental variables is not really required. A model of this sort is over-identified if there are more instruments than are strictly required. The Sargan test is based on an auxiliary regression of the residuals from the two-stage least squares model on the full list of instruments. The null hypothesis is that all the instruments are valid, and suspicion is thrown on this hypothesis if the auxiliary regression has a significant degree of explanatory power. For a good explanation of both tests see chapter 8 of Davidson and MacKinnon (2004). 

For both TSLS and LIML estimation, an additional test result is shown provided that the model is estimated under the assumption of i.i.d. errors (that is, the <@opt="--robust"> option is not selected). This is a test for weakness of the instruments. Weak instruments can lead to serious problems in IV regression: biased estimates and/or incorrect size of hypothesis tests based on the covariance matrix, with rejection rates well in excess of the nominal significance level (Stock, Wright and Yogo, 2002). The test statistic is the first-stage <@itl="F">-test if the model contains just one endogenous regressor, otherwise it is the smallest eigenvalue of the matrix counterpart of the first stage <@itl="F">. Critical values based on the Monte Carlo analysis of Stock and Yogo (2003) are shown when available. 

The R-squared value printed for models estimated via two-stage least squares is the square of the correlation between the dependent variable and the fitted values. 

Menu path: /Model/Other linear models/Two-Stage Least Squares

Script command: <@ref="tsls">

# var Estimation "Vector Autoregression"

This command requires specification of: 

<indent>
• the lag order, that is, the number of lags of each variable that should be included in the system; 
</indent>

<indent>
• any exogenous variables (but note that a constant is included automatically unless you specify otherwise, a trend can be added using the trend checkbox, and seasonal dummy variables can be added using the seasonals checkbox); and 
</indent>

<indent>
• a list of endogenous variables, lags of which will be included on the right-hand side of each equation (note: do not include lagged variables in this list -- they will be added automatically). 
</indent>

A separate regression will be run for each variable in the system. Output for each equation includes F-tests for zero restrictions on all lags of each of the variables and an F-test for the maximum lag, along with (optionally) forecast variance decompositions and impulse response functions. 

Forecast variance decompositions and impulse responses are based on the Cholesky decomposition of the contemporaneous covariance matrix, and in this context the order in which the (stochastic) variables are given matters. The first variable in the list is assumed to be "most exogenous" within-period. The horizon for variance decompositions and impulse responses can be set using the <@ref="set"> command. 

Menu path: /Model/Time series/Vector autoregression

Script command: <@ref="var">

# VAR-lagselect Tests "VAR lag-length selection"

In this dialog box you specify a VAR as usual, but use the lag order spin button to set the maximum number of lags to test. 

Output will consist of a table showing the values of the Akaike (AIC), Schwarz (BIC) and Hannan–Quinn (HQC) information criteria computed from VARs of order 1 to the chosen maximum. This is intended to help with the selection of the optimal lag order. 
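As a sketch of how such a table is used, the Python fragment below computes the three criteria in their standard textbook forms and picks the lag order that minimizes each one. The log-likelihood values are hypothetical, and gretl may scale the criteria differently in its printed table (for instance per observation), which does not affect which order is smallest.

```python
import math


def info_criteria(loglik, k, T):
    """Standard AIC, BIC and HQC for log-likelihood loglik, k parameters, T obs."""
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * math.log(T)
    hqc = -2.0 * loglik + 2.0 * k * math.log(math.log(T))
    return aic, bic, hqc


# Hypothetical log-likelihoods for a 2-variable VAR with a constant, T = 100.
T, n_vars = 100, 2
logliks = {1: -250.0, 2: -240.0, 3: -239.0, 4: -238.5}

table = {}
for p, ll in logliks.items():
    k = n_vars * (n_vars * p + 1)  # coefficients across all equations
    table[p] = info_criteria(ll, k, T)

# For each criterion (AIC, BIC, HQC) find the lag order with the smallest value.
best = [min(table, key=lambda p: table[p][i]) for i in range(3)]
print(table)
print(best)
```

In this made-up case all three criteria agree; in practice they can disagree, with BIC tending to favor shorter lag lengths because its penalty per parameter is larger.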

# VAR-omit Tests "Test exogenous variables in VAR"

Use this dialog box to specify a subset of exogenous variables in a VAR. These variables will be omitted from the original VAR, and the system re-estimated. 

A Likelihood Ratio test is reported, where the null hypothesis is that the true parameter values are zero, in all equations of the VAR, for the omitted variables. The test is based on the difference between the log-determinant of the variance matrix for the unrestricted system, and that for the restricted system with the selected variables omitted. 

# vartest Tests "Difference of variances"

Calculates the <@itl="F"> statistic for the null hypothesis that the population variances are equal for the two selected variables, and shows its p-value. 

Menu path: /Model/Bivariate tests/Difference of variances

Script command: <@ref="vartest">

# vecm Estimation "Vector Error Correction Model"

A VECM is a form of vector autoregression or VAR (see <@ref="var">), applicable where the variables in the model are individually integrated of order 1 (that is, are random walks, with or without drift), but exhibit cointegration. This command is closely related to the Johansen test for cointegration (see <@ref="coint2">). 

The lag order selected in the VECM dialog box is that of the VAR system. The number of lags in the VECM itself (where the dependent variable is given as a first difference) is one less than this number. 

The "cointegration rank" represents the number of cointegrating vectors. This must be greater than zero and less than or equal to (generally, less than) the number of endogenous variables selected. 

In the "Endogenous variables" box you select the vector of endogenous variables, in levels. The inclusion of deterministic terms in the model is controlled by the option buttons. The default is to include an "unrestricted constant", which allows for the presence of a non-zero intercept in the cointegrating relations as well as a trend in the levels of the endogenous variables. In the literature stemming from the work of Johansen (see for example his 1995 book) this is often referred to as "case 3". The other four options produce cases 1, 2, 4 and 5 respectively. The meaning of these cases and the criteria for selecting a case are explained in <@pdf="the Gretl User's Guide">. 

In the "Exogenous variables" box you may add specific exogenous variables. By default these enter the model in unrestricted form (indicated by a <@lit="U"> next to the name of the variable). If you want a certain exogenous variable to be restricted to the cointegrating space, right-click on it and select "Restricted" from the pop-up menu. The symbol next to the variable will change to R. 

If the data are quarterly or monthly, a check box is shown that allows you to include a set of centered seasonal dummy variables. In all cases, an additional check box ("Show details") allows for the printing of the auxiliary regressions that form the starting point of the Johansen maximum likelihood estimation procedure. 

Menu path: /Model/Time series/VECM

Script command: <@ref="vecm">

# wls Estimation "Weighted Least Squares"

Let "wtvar" denote the variable selected in the "Weight variable" box. An OLS regression is run, where the dependent variable is the product of the positive square root of wtvar and the selected dependent variable, and the independent variables are also multiplied by the square root of wtvar. Statistics such as <@itl="R">-squared are based on the weighted data. If wtvar is a dummy variable, weighted least squares estimation is equivalent to eliminating all observations with value zero for wtvar. 
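The transformation described above can be illustrated with a small Python sketch for a model with a constant and one regressor. The function names and data are made up for illustration. The sketch scales every variable, including the constant, by the square root of the weight and then applies OLS, and it confirms that 0/1 weights give the same estimates as dropping the zero-weight observations.

```python
def ols(rows, y):
    """OLS for two regressors via the 2x2 normal equations; rows are (x1, x2)."""
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    c1 = sum(r[0] * v for r, v in zip(rows, y))
    c2 = sum(r[1] * v for r, v in zip(rows, y))
    det = a11 * a22 - a12 * a12
    return ((a22 * c1 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det)


def wls(x, y, w):
    """Scale the constant, x and y by sqrt(w), then run OLS on the scaled data."""
    roots = [wi ** 0.5 for wi in w]
    rows = [(s * 1.0, s * xi) for s, xi in zip(roots, x)]
    scaled_y = [s * yi for s, yi in zip(roots, y)]
    return ols(rows, scaled_y)


# Made-up data; the 0/1 weights switch off the third observation.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 9.0, 8.1, 9.9]
w = [1.0, 1.0, 0.0, 1.0, 1.0]
weighted = wls(x, y, w)
subset = wls([1.0, 2.0, 4.0, 5.0], [2.1, 3.9, 8.1, 9.9], [1.0] * 4)
print(weighted, subset)
```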

Menu path: /Model/Other linear models/Weighted Least Squares

Script command: <@ref="wls">

# working-dir Utilities "Working directory"

The "working directory" is where gretl looks by default when reading or writing data files or scripts via the file Open and Save dialogs. 

In addition the working directory is the default location for 

<indent>
• reading files via the script commands <@lit="append">, <@lit="open">, <@lit="run"> and <@lit="include">; and 
</indent>

<indent>
• writing files via the commands <@lit="eqnprint">, <@lit="tabprint">, <@lit="gnuplot">, <@lit="outfile"> and <@lit="store">. 
</indent>

The option of having gretl use the current directory (as determined via the shell) at start-up may be useful to people who are in the habit of launching gretl from a command prompt rather than a menu or icon. 

This dialog also allows you to set the behavior of the GUI file selector: when you open or save a file in a given folder, should the selector remember and return to the same folder on the next invocation? Or should the selector always visit the chosen working directory? 

Menu path: /File/Working directory

# x12a Utilities "X-12-ARIMA"

There are two procedural options here, controlled by the lower set of radio-buttons. 

If you select "Execute X-12-ARIMA directly" then gretl writes a command file for X-12-ARIMA and calls the x12a program to execute the commands. In this case you have the option of producing a graph and/or saving selected output series to the gretl dataset. 

If you select "Make X-12-ARIMA command file" gretl writes a command file for X-12-ARIMA, as above, but then opens this file in an editor window. In that window you are able to make changes and to save the file under a chosen name. You are also able to send the file for execution by x12a (by clicking the "Run" button on the editor window toolbar) and view the output. But in this case you do not have the option of saving data as gretl series or producing a gretl graph. 

# xcorrgm Statistics "Cross-correlogram"

Prints and graphs the cross-correlogram for variables <@var="var1"> and <@var="var2">, which may be specified by name or number. The values are the sample correlation coefficients between the current value of <@var="var1"> and successive leads and lags of <@var="var2">. 

If an <@var="order"> value is specified, the length of the cross-correlogram is limited to at most that number of leads and lags; otherwise the length is determined automatically, as a function of the frequency of the data and the number of observations. 
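
As a rough illustration of what is being computed, the following Python sketch evaluates sample cross-correlations for lags and leads up to a given order. It is not gretl's implementation: the function name <@lit="cross_correlogram">, the sample data, the use of full-sample means and variances in the normalization, and the sign convention (positive k meaning that the second series is lagged k periods) are all assumptions made for the example.

```python
import math

def cross_correlogram(x, y, order):
    """Sample correlations between x[t] and y[t-k] for k = -order..order.

    Assumed convention: k > 0 pairs x[t] with a lag of y, k < 0 with a lead.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [v - xbar for v in x]
    dy = [v - ybar for v in y]
    # Normalize by the full-sample sums of squares of both series.
    denom = math.sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    result = {}
    for k in range(-order, order + 1):
        # Use only the pairs for which both observations exist.
        total = sum(dx[t] * dy[t - k] for t in range(n) if 0 <= t - k < n)
        result[k] = total / denom
    return result

# Hypothetical series.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 7.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 7.0, 6.0]
ccf = cross_correlogram(x, y, 2)
for k in sorted(ccf):
    print(f"{k:+d}  {ccf[k]: .4f}")
```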

Menu path: /View/Cross-correlogram
Other access: Main window pop-up menu (multiple selection)

Script command: <@ref="xcorrgm">

# xtab Statistics "Cross-tabulate variables"

Displays a contingency table or cross-tabulation for each combination of the selected variables. Note that all the variables must be discrete. 

By default, frequency count values are shown in the cells and on the margins of the table. However, you can choose to display either row or column percentages instead. 

By default, cells with a zero count are shown as empty, but you can choose to show zero values explicitly. 

Pearson's chi-square test for independence is displayed if the expected frequency under independence is at least 1.0e-7 for all cells. A common rule of thumb for the validity of this statistic is that at least 80 percent of cells should have expected frequencies of 5 or greater; if this criterion is not met a warning is printed. 
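
The calculation behind this test can be sketched in Python as follows. The function name <@lit="pearson_chi_square"> and the example table are hypothetical; the 1.0e-7 threshold and the 80 percent rule of thumb are taken from the text above, and the sketch makes no claim to match gretl's internal code.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a table given as a list of rows.

    Returns (stat, df, share_ge_5); stat is None if any expected
    frequency under independence falls below 1.0e-7.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    cells = [e for row in expected for e in row]
    share_ge_5 = sum(e >= 5 for e in cells) / len(cells)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    if min(cells) < 1.0e-7:
        return None, df, share_ge_5
    stat = sum((obs - exp) ** 2 / exp
               for obs_row, exp_row in zip(table, expected)
               for obs, exp in zip(obs_row, exp_row))
    return stat, df, share_ge_5

# Hypothetical 2 by 2 table of counts.
table = [[10, 20], [30, 40]]
stat, df, share_ge_5 = pearson_chi_square(table)
if share_ge_5 < 0.8:
    print("Warning: fewer than 80 percent of cells have expected frequency >= 5")
print(stat, df)
```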

If the contingency table is 2 by 2, Fisher's Exact Test for independence is computed. Note that this test is based on the assumption that the row and column totals are fixed, which may or may not be appropriate depending on how the data were generated. The left p-value should be used when the alternative to independence is negative association (values tend to cluster in the lower left and upper right cells); the right p-value should be used if the alternative is positive association. The two-tailed p-value for this test is calculated by method (b) in section 2.1 of Agresti (1992): it is the sum of the probabilities of all possible tables having the given row and column totals and having a probability less than or equal to that of the observed table. 
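
The three p-values can be illustrated with a short Python sketch. It assumes that the tails are defined in terms of the top-left cell of the table (small values corresponding to negative association), and it uses a small relative tolerance when comparing probabilities so that ties are not lost to rounding. The function name <@lit="fisher_exact_2x2"> and the example counts are hypothetical, and this is not gretl's own code.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Left, right and two-tailed p-values for the table [[a, b], [c, d]]."""
    r1, r2 = a + b, c + d   # row totals, treated as fixed
    c1 = a + c              # first column total, treated as fixed
    n = r1 + r2
    total = comb(n, c1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # given the row and column totals.
        return comb(r1, k) * comb(r2, c1 - k) / total

    lo = max(0, c1 - r2)    # smallest feasible top-left count
    hi = min(r1, c1)        # largest feasible top-left count
    p_obs = prob(a)
    left = sum(prob(k) for k in range(lo, a + 1))
    right = sum(prob(k) for k in range(a, hi + 1))
    # Two-tailed: all tables no more probable than the observed one.
    two_tailed = sum(prob(k) for k in range(lo, hi + 1)
                     if prob(k) <= p_obs * (1 + 1e-7))
    return left, right, two_tailed

# Hypothetical counts.
left, right, two_tailed = fisher_exact_2x2(3, 1, 1, 3)
print(left, right, two_tailed)
```

For these counts there are five tables compatible with the margins, with probabilities 1/70, 16/70, 36/70, 16/70 and 1/70. The observed table has probability 16/70, so the two-tailed p-value sums every table except the most probable one.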

Script command: <@ref="xtab">