An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact 'F-tests' mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor, in honour of Sir Ronald A. Fisher. Fisher initially developed the statistic as the variance ratio in the 1920s.[1]
Common examples
Common examples of the use of F-tests include the study of the following cases:
The hypothesis that the means of a given set of normally distributed populations, all having the same standard deviation, are equal. This is perhaps the best-known F-test, and plays an important role in the analysis of variance (ANOVA).
The hypothesis that a proposed regression model fits the data well. See Lack-of-fit sum of squares.
The hypothesis that a data set in a regression analysis follows the simpler of two proposed linear models that are nested within each other.
In addition, some statistical procedures, such as Scheffé's method for multiple comparisons adjustment in linear models, also use F-tests.
F-test of the equality of two variances
The F-test is sensitive to non-normality.[2][3] In the analysis of variance (ANOVA), alternative tests include Levene's test, Bartlett's test, and the Brown–Forsythe test. However, when any of these tests are conducted to test the underlying assumption of homoscedasticity (i.e. homogeneity of variance), as a preliminary step to testing for mean effects, there is an increase in the experiment-wise Type I error rate.[4]
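For the two-sample case this section's title refers to, the test statistic is simply the ratio of the two sample variances, referred to an F-distribution whose degrees of freedom are the two sample sizes minus one. The sketch below (in Python, using NumPy and SciPy; the function name variance_ratio_test is ours, not a library routine) illustrates this, assuming two independent, normally distributed samples x and y:

    import numpy as np
    from scipy import stats

    def variance_ratio_test(x, y):
        # Two-sided F-test of H0: Var(X) = Var(Y) for independent normal samples.
        s2_x = np.var(x, ddof=1)   # unbiased sample variances
        s2_y = np.var(y, ddof=1)
        f = s2_x / s2_y
        dfn, dfd = len(x) - 1, len(y) - 1
        # Two-sided p-value: double the smaller tail probability.
        p = 2 * min(stats.f.cdf(f, dfn, dfd), stats.f.sf(f, dfn, dfd))
        return f, min(p, 1.0)

As the surrounding text notes, this procedure is sensitive to departures from normality, which is why Levene's test or the Brown–Forsythe test is often preferred in practice.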
Formula and calculation
Most F-tests arise by considering a decomposition of the variability in a collection of data in terms of sums of squares. The test statistic in an F-test is the ratio of two scaled sums of squares reflecting different sources of variability. These sums of squares are constructed so that the statistic tends to be greater when the null hypothesis is not true. In order for the statistic to follow the F-distribution under the null hypothesis, the sums of squares should be statistically independent, and each should follow a scaled χ²-distribution. The latter condition is guaranteed if the data values are independent and normally distributed with a common variance.
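Schematically, if S_1 and S_2 denote the two sums of squares and d_1 and d_2 their degrees of freedom (generic symbols introduced here only for illustration), this construction can be written as

    F = \frac{S_1 / d_1}{S_2 / d_2}, \qquad S_1 \sim \sigma^2 \chi^2_{d_1}, \quad S_2 \sim \sigma^2 \chi^2_{d_2} \quad \text{independently under } H_0,

so that F follows the F-distribution with (d_1, d_2) degrees of freedom when the null hypothesis holds; the common variance σ² cancels in the ratio.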
Multiple-comparison ANOVA problems
The F-test in one-way analysis of variance is used to assess whether the expected values of a quantitative variable within several pre-defined groups differ from each other. For example, suppose that a medical trial compares four treatments. The ANOVA F-test can be used to assess whether any of the treatments is on average superior, or inferior, to the others versus the null hypothesis that all four treatments yield the same mean response. This is an example of an 'omnibus' test, meaning that a single test is performed to detect any of several possible differences. Alternatively, we could carry out pairwise tests among the treatments (for instance, in the medical trial example with four treatments we could carry out six tests among pairs of treatments). The advantage of the ANOVA F-test is that we do not need to pre-specify which treatments are to be compared, and we do not need to adjust for making multiple comparisons. The disadvantage of the ANOVA F-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others, nor, if the F-test is performed at level α, can we state that the treatment pair with the greatest mean difference is significantly different at level α.
The formula for the one-way ANOVA F-test statistic is

    F = \frac{\text{explained variance}}{\text{unexplained variance}},

or

    F = \frac{\text{between-group variability}}{\text{within-group variability}}.

The 'explained variance', or 'between-group variability', is

    \sum_{i=1}^{K} n_i (\bar{Y}_{i\cdot} - \bar{Y})^2 / (K - 1),

where \bar{Y}_{i\cdot} denotes the sample mean in the i-th group, n_i is the number of observations in the i-th group, \bar{Y} denotes the overall mean of the data, and K denotes the number of groups.

The 'unexplained variance', or 'within-group variability', is

    \sum_{i=1}^{K} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 / (N - K),

where Y_{ij} is the j-th observation in the i-th out of K groups and N is the overall sample size. This F-statistic follows the F-distribution with degrees of freedom d_1 = K - 1 and d_2 = N - K under the null hypothesis. The statistic will be large if the between-group variability is large relative to the within-group variability, which is unlikely to happen if the population means of the groups all have the same value.

Note that when there are only two groups for the one-way ANOVA F-test, F = t^2, where t is the Student's t statistic.
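As a concrete illustration of the formulas above, the following sketch (in Python with NumPy and SciPy; the function name one_way_anova_f is ours, not a library routine) computes the between-group and within-group mean squares for a list of samples and returns the F-statistic with its upper-tail p-value:

    import numpy as np
    from scipy import stats

    def one_way_anova_f(groups):
        # groups: list of 1-D NumPy arrays, one array of observations per group.
        K = len(groups)                                  # number of groups
        N = sum(len(g) for g in groups)                  # overall sample size
        grand_mean = np.concatenate(groups).mean()

        # Between-group mean square: the 'explained variance'.
        between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups) / (K - 1)
        # Within-group mean square: the 'unexplained variance'.
        within = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - K)

        f = between / within
        p = stats.f.sf(f, K - 1, N - K)                  # upper-tail p-value
        return f, p

The result should agree with scipy.stats.f_oneway applied to the same groups.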
Regression problems
Consider two models, 1 and 2, where model 1 is 'nested' within model 2. Model 1 is the restricted model, and model 2 is the unrestricted one. That is, model 1 has p1 parameters, and model 2 has p2 parameters, where p1 < p2, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2.
One common context in this regard is that of deciding whether a model fits the data significantly better than does a naive model, in which the only explanatory term is the intercept term, so that all predicted values for the dependent variable are set equal to that variable's sample mean. The naive model is the restricted model, since the coefficients of all potential explanatory variables are restricted to equal zero.
Another common context is deciding whether there is a structural break in the data: here the restricted model uses all data in one regression, while the unrestricted model uses separate regressions for two different subsets of the data. This use of the F-test is known as the Chow test.
The model with more parameters will always be able to fit the data at least as well as the model with fewer parameters. Thus typically model 2 will give a better (i.e. lower error) fit to the data than model 1. But one often wants to determine whether model 2 gives a significantly better fit to the data. One approach to this problem is to use an F-test.
If there are n data points to estimate parameters of both models from, then one can calculate the F statistic, given by

    F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2) / (p_2 - p_1)}{\mathrm{RSS}_2 / (n - p_2)},

where RSS_i is the residual sum of squares of model i. If the regression model has been calculated with weights, then replace RSS_i with χ², the weighted sum of squared residuals. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1, F will have an F distribution, with (p_2 − p_1, n − p_2) degrees of freedom. The null hypothesis is rejected if the F calculated from the data is greater than the critical value of the F-distribution for some desired false-rejection probability (e.g. 0.05). The F-test is a Wald test.
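As a brief illustration (a sketch in Python using SciPy; the function name nested_f_test is ours, and the residual sums of squares are assumed to come from models that have already been fitted):

    from scipy import stats

    def nested_f_test(rss1, rss2, p1, p2, n):
        # Model 1 (restricted, p1 parameters, RSS1) is nested in model 2
        # (unrestricted, p2 parameters, RSS2); n is the number of data points.
        f = ((rss1 - rss2) / (p2 - p1)) / (rss2 / (n - p2))
        p_value = stats.f.sf(f, p2 - p1, n - p2)   # upper-tail probability
        return f, p_value

Rejecting when the p-value falls below the chosen level (e.g. 0.05) is equivalent to comparing F with the corresponding critical value of the F-distribution.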
References
^ Lomax, Richard G. (2007). Statistical Concepts: A Second Course. p. 10. ISBN 0-8058-5850-4.
^ Box, G. E. P. (1953). 'Non-Normality and Tests on Variances'. Biometrika. 40 (3/4): 318–335. doi:10.1093/biomet/40.3-4.318. JSTOR 2333350.
^ Markowski, Carol A.; Markowski, Edward P. (1990). 'Conditions for the Effectiveness of a Preliminary Test of Variance'. The American Statistician. 44 (4): 322–326. doi:10.2307/2684360. JSTOR 2684360.
^ Sawilowsky, S. (2002). 'Fermat, Schubert, Einstein, and Behrens–Fisher: The Probable Difference Between Two Means When σ₁² ≠ σ₂²'. Journal of Modern Applied Statistical Methods. 1 (2): 461–472. Archived from the original on 2015-04-03. Retrieved 2015-03-30.
Further reading
Fox, Karl A. (1980). Intermediate Economic Statistics (Second ed.). New York: John Wiley & Sons. pp. 290–310. ISBN 0-88275-521-8.
Johnston, John (1972). Econometric Methods (Second ed.). New York: McGraw-Hill. pp. 35–38.
Kmenta, Jan (1986). Elements of Econometrics (Second ed.). New York: Macmillan. pp. 147–148. ISBN 0-02-365070-2.
Maddala, G. S.; Lahiri, Kajal (2009). Introduction to Econometrics (Fourth ed.). Chichester: Wiley. pp. 155–160. ISBN 978-0-470-01512-4.
External links
Econometrics lecture (topic: hypothesis testing) on YouTube by Mark Thoma
In statistics, the Breusch–Godfrey test, named after Trevor S. Breusch and Leslie G. Godfrey,[1][2] is used to assess the validity of some of the modelling assumptions inherent in applying regression-like models to observed data series. In particular, it tests for the presence of serial correlation that has not been included in a proposed model structure and which, if present, would mean that incorrect conclusions would be drawn from other tests, or that sub-optimal estimates of model parameters are obtained if it is not taken into account. The regression models to which the test can be applied include cases where lagged values of the dependent variables are used as independent variables in the model's representation for later observations. This type of structure is common in econometric models.
Because the test is based on the idea of Lagrange multiplier testing, it is sometimes referred to as the LM test for serial correlation.[3]
A similar assessment can also be carried out with the Durbin–Watson test and the Ljung–Box test.
Background
The Breusch–Godfrey serial correlation LM test is a test for autocorrelation in the errors in a regression model. It makes use of the residuals from the model being considered in a regression analysis, and a test statistic is derived from these. The null hypothesis is that there is no serial correlation of any order up to p.[4]
The test is more general than the Durbin–Watson statistic (or Durbin's h statistic), which is only valid for nonstochastic regressors and for testing the possibility of a first-order autoregressive model (e.g. AR(1)) for the regression errors.[citation needed] The BG test has none of these restrictions, and is statistically more powerful than Durbin's h statistic.[citation needed]
Procedure
Consider a linear regression of any form, for example

    y_t = \beta_1 + \beta_2 x_{t,1} + \beta_3 x_{t,2} + u_t,

where the errors might follow an AR(p) autoregressive scheme, as follows:

    u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \cdots + \rho_p u_{t-p} + \varepsilon_t.

The simple regression model is first fitted by ordinary least squares to obtain a set of sample residuals \hat{u}_t.

Breusch and Godfrey[citation needed] proved that, if the following auxiliary regression model is fitted

    \hat{u}_t = \beta_1 + \beta_2 x_{t,1} + \beta_3 x_{t,2} + \rho_1 \hat{u}_{t-1} + \rho_2 \hat{u}_{t-2} + \cdots + \rho_p \hat{u}_{t-p} + \varepsilon_t,

and if the usual R² statistic is calculated for this model, then the following asymptotic approximation can be used for the distribution of the test statistic

    n R^2 \sim \chi^2_p,

when the null hypothesis H_0: \rho_1 = \cdots = \rho_p = 0 holds (that is, there is no serial correlation of any order up to p). Here n is the number of data points available for the auxiliary regression on the residuals \hat{u}_t,

    n = T - p,

where T is the number of observations in the basic series. Note that the value of n depends on the number of lags of the error term (p).
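The procedure can be written down compactly in code. The following sketch (Python with NumPy; the function name breusch_godfrey_lm is ours, and the regressor matrix X is assumed to contain a constant column) fits the original model, runs the auxiliary regression on the n = T − p usable observations, and returns the LM statistic n·R²:

    import numpy as np

    def breusch_godfrey_lm(y, X, p):
        # y: (T,) response; X: (T, k) regressors including a constant; p: lag order.
        T = len(y)

        # Step 1: fit the original model by OLS and keep the residuals.
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        u = y - X @ beta

        # Step 2: auxiliary regression of u_t on X_t and p lagged residuals,
        # using the n = T - p observations for which all lags exist.
        lags = np.column_stack([u[p - j - 1 : T - j - 1] for j in range(p)])
        Z = np.column_stack([X[p:], lags])
        u_p = u[p:]
        gamma, *_ = np.linalg.lstsq(Z, u_p, rcond=None)
        e = u_p - Z @ gamma
        r2 = 1.0 - (e @ e) / ((u_p - u_p.mean()) @ (u_p - u_p.mean()))

        # Step 3: the LM statistic n * R^2, asymptotically chi-squared with p df under H0.
        return (T - p) * r2

The statistic is then compared with the upper tail of the χ²_p distribution, for example via scipy.stats.chi2.sf(stat, p).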
Software
In R, this test is performed by the function bgtest, available in the package lmtest.[5][6]
In Stata, this test is performed by the command estat bgodfrey.[7][8]
In SAS, the GODFREY option of the MODEL statement in PROC AUTOREG provides a version of this test.
In Python's statsmodels package, the test is provided by the acorr_breush_godfrey function in the module statsmodels.stats.diagnostic.[9]
In EViews, this test is available after estimating a regression, via 'View' → 'Residual Diagnostics' → 'Serial Correlation LM Test'.
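For illustration, a minimal usage sketch with statsmodels (the data here are simulated and purely illustrative; note that recent statsmodels releases spell the function acorr_breusch_godfrey, while the older name cited above was acorr_breush_godfrey):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import acorr_breusch_godfrey

    # Simulated example data.
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = 1.0 + 2.0 * x + rng.normal(size=200)

    # Fit the regression and test its residuals for serial correlation up to 4 lags.
    results = sm.OLS(y, sm.add_constant(x)).fit()
    lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(results, nlags=4)
    print(lm_stat, lm_pvalue)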
References
^ Breusch, T. S. (1978). 'Testing for Autocorrelation in Dynamic Linear Models'. Australian Economic Papers. 17: 334–355. doi:10.1111/j.1467-8454.1978.tb00635.x.
^ Godfrey, L. G. (1978). 'Testing Against General Autoregressive and Moving Average Error Models when the Regressors Include Lagged Dependent Variables'. Econometrica. 46: 1293–1301. JSTOR 1913829.
^ Asteriou, Dimitrios; Hall, Stephen G. (2011). 'The Breusch–Godfrey LM test for serial correlation'. Applied Econometrics (Second ed.). New York: Palgrave Macmillan. pp. 159–161. ISBN 978-0-230-27182-1.
^ Macrodados 6.3 Help – Econometric Tools [permanent dead link]
^ 'lmtest: Testing Linear Regression Models'. CRAN.
^ Kleiber, Christian; Zeileis, Achim (2008). 'Testing for autocorrelation'. Applied Econometrics with R. New York: Springer. pp. 104–106. ISBN 978-0-387-77318-6.
^ 'Postestimation tools for regress with time series' (PDF). Stata Manual.
^ Baum, Christopher F. (2006). 'Testing for serial correlation'. An Introduction to Modern Econometrics Using Stata. Stata Press. pp. 155–158. ISBN 1-59718-013-0.
^ Breusch–Godfrey test in Python: http://statsmodels.sourceforge.net/devel/generated/statsmodels.stats.diagnostic.acorr_breush_godfrey.html?highlight=autocorrelation Archived 2014-02-28 at the Wayback Machine.
Further reading
Godfrey, L. G. (1988). Misspecification Tests in Econometrics. Cambridge, UK: Cambridge. ISBN 0-521-26616-5.
Godfrey, L. G. (1996). 'Misspecification Tests and Their Uses in Econometrics'. Journal of Statistical Planning and Inference. 49 (2): 241–260. doi:10.1016/0378-3758(95)00039-9.
Maddala, G. S.; Lahiri, Kajal (2009). Introduction to Econometrics (Fourth ed.). Chichester: Wiley. pp. 259–260.