Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*The author of this computation has been verified*

R Software Module

rwasp_boxcoxlin.wasp

Title produced by software

Box-Cox Linearity Plot

Date of computation

Tue, 11 Nov 2008 12:37:42 -0700

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2008/Nov/11/t1226432384uzjbp698cjvt4eo.htm/, Retrieved Sun, 19 May 2024 10:51:12 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=23887, Retrieved Sun, 19 May 2024 10:51:12 +0000

Original text written by user:

IsPrivate?

No (this computation is public)

User-defined keywords

Estimated Impact

166

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

F       [Box-Cox Linearity Plot] [Various EDA topic...] [2008-11-11 19:37:42] [3bb0537fcae9c337e49b9ce75ff3d4da] [Current]
- R  D    [Box-Cox Linearity Plot] [Various EDA Topic...] [2008-11-18 19:19:09] [82970caad4b026be9dd352fdec547fe4] 
- RM D    [Box-Cox Normality Plot] [Various EDA Topic...] [2008-11-18 19:49:49] [82970caad4b026be9dd352fdec547fe4] 

Feedback Forum

2008-11-12 16:17:49 [Veerle Jackers] [reply] 
This is a borderline case. It looks as if the curve has not quite reached a maximum yet, but it is still best to say that the optimal lambda is 2, since that is where the correlation is highest.
2008-11-18 19:24:36 [Ruben Jacobs] [reply] 
The Box-Cox transformation creates a new variable from X: X is raised to a power, called lambda, then 1 is subtracted and the result is divided by lambda.

The aim is to choose a suitable lambda so that the transformed X variable shows a linear relationship where the original one did not.

The suitable lambda is given in the table, and you can also read it from your first graph as the point where the curve reaches its maximum. In your computation the curve has in fact not yet reached its maximum, which is why you get lambda = 2: the computation is bounded there. You can change this in the R code, which gives a slightly better result with lambda = 3.11.

http://www.freestatistics.org/blog/index.php?v=date/2008/Nov/18/t1227036011nvcu7klv9u4jtnh.htm

Note that in your computation little changes between graph 2 (original data) and graph 3 (transformed with lambda). The residual values are almost identical.
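
For reference, the transformation described in this post, written so that it agrees with the R code of the module (including the logarithm used when lambda is 0), can be stated as follows. The symbol x^{(\lambda)} for the transformed variable is notation chosen here, not taken from the page.

```latex
x^{(\lambda)} =
\begin{cases}
  \dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1ex]
  \log x, & \lambda = 0.
\end{cases}
```

The module then picks the lambda on its grid for which the absolute correlation between x^{(\lambda)} and y is largest.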
 
2008-11-20 15:54:34 [Bas van Keken] [reply] 
The change to the R code described above should be made as follows:
l[i] <- (i-201)/100 becomes l[i] <- (i-50)/100
(l = lambda). The number 50 is just an example.
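
A small sketch of what this change does to the lambda grid may help. It only recomputes the two grids from the expressions quoted in the post; the variable names l_original and l_shifted are made up for illustration.

```r
# Loop index used by the module: i runs from 1 to 401
i <- 1:401

# Grid in the original R code: lambda from -2 to 2 in steps of 0.01
l_original <- (i - 201) / 100
range(l_original)

# Grid after the proposed change (50 is only an example offset):
# the same 401 steps of 0.01, shifted to the right
l_shifted <- (i - 50) / 100
range(l_shifted)
```

With an offset of 50 the grid covers roughly -0.49 to 3.51, which includes the value 3.11 reported in the earlier post. A different offset shifts the window without changing its width of 4.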
2008-11-20 22:07:05 [Olivier Uyttendaele] [reply] 
In general, the Box-Cox Linearity Plot is a quick and effective way to transform time series. The aim is to find out whether there is a lambda parameter that gives a linear relationship.

In the R code a new variable, x1, is created. x1 is the original variable x raised to the power lambda, minus 1, with the result divided by lambda. This transforms the time series. You have to look for the optimal lambda for the transformation.

You can see this graphically: take the lambda at which the curve in the graph reaches its maximum. If you are not sure of the maximum, you can adjust the R code (as the students above have explained) so that the maximum becomes visible. If you see no maximum, you cannot draw a conclusion.

In conclusion, a Box-Cox plot can answer two questions: 1) Is a transformation of the series advisable? 2) What is the best value of the transformation parameter?
2008-11-24 19:46:59 [Steven Hulsmans] [reply] 
This technique transforms the variables. However, we have to adjust the R code.

Dataseries X:

Dataseries Y:

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	3 seconds
R Server	'Herman Ole Andreas Wold' @ 193.190.124.10:1001

Box-Cox Linearity Plot
# observations x	67
maximum correlation	0.750022684575245
optimal lambda(x)	2
Residual SD (original)	12.0722315297328
Residual SD (transformed)	12.0393246304236

Figure 1: Box-Cox Linearity Plot (correlation against lambda)

Figure 2: Linear Fit of Original Data (y against x)

Figure 3: Linear Fit of Transformed Data (y against transformed x)

Parameters (Session):

Parameters (R input):

R code (references can be found in the software module):

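# x and y are supplied by the server (Dataseries X and Dataseries Y)
n <- length(x)
c <- array(NA,dim=c(401))   # correlation for each lambda
l <- array(NA,dim=c(401))   # grid of lambda values
mx <- 0                     # largest absolute correlation found so far
mxli <- -999                # lambda at which it occurs
for (i in 1:401)
{
l[i] <- (i-201)/100         # lambda runs from -2 to 2 in steps of 0.01
if (l[i] != 0)
{
x1 <- (x^l[i] - 1) / l[i]   # Box-Cox transform of x
} else {
x1 <- log(x)                # limiting case for lambda = 0
}
c[i] <- cor(x1,y)
if (mx < abs(c[i]))
{
mx <- abs(c[i])
mxli <- l[i]
}
}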
c
mx
mxli
if (mxli != 0)
{
x1 <- (x^mxli - 1) / mxli
} else {
x1 <- log(x)
}
r<-lm(y~x)
se <- sqrt(var(r$residuals))
r1 <- lm(y~x1)
se1 <- sqrt(var(r1$residuals))
bitmap(file='test1.png')
plot(l,c,main='Box-Cox Linearity Plot',xlab='Lambda',ylab='correlation')
grid()
dev.off()
bitmap(file='test2.png')
plot(x,y,main='Linear Fit of Original Data',xlab='x',ylab='y')
abline(r)
grid()
mtext(paste('Residual Standard Deviation = ',se))
dev.off()
bitmap(file='test3.png')
plot(x1,y,main='Linear Fit of Transformed Data',xlab='x',ylab='y')
abline(r1)
grid()
mtext(paste('Residual Standard Deviation = ',se1))
dev.off()
load(file='createtable')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Box-Cox Linearity Plot',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'# observations x',header=TRUE)
a<-table.element(a,n)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'maximum correlation',header=TRUE)
a<-table.element(a,mx)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'optimal lambda(x)',header=TRUE)
a<-table.element(a,mxli)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Residual SD (original)',header=TRUE)
a<-table.element(a,se)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Residual SD (transformed)',header=TRUE)
a<-table.element(a,se1)
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable.tab')
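
The script above relies on x and y being supplied by the server and on the site's table helpers, so it does not run on its own. A minimal self-contained sketch of the same lambda scan is given below. The function name boxcox_linearity, its lambdas argument, and the simulated data are all assumptions made for illustration; none of them come from the module.

```r
# Hypothetical helper (not part of the module): scan a grid of lambda values
# and return the one whose Box-Cox transform of x correlates best with y.
boxcox_linearity <- function(x, y, lambdas = seq(-2, 2, by = 0.01)) {
  stopifnot(all(x > 0), length(x) == length(y))
  transform <- function(lambda) {
    # Use log(x) at lambda = 0 (with a small tolerance for rounding error)
    if (abs(lambda) < 1e-12) log(x) else (x^lambda - 1) / lambda
  }
  cors <- vapply(lambdas, function(l) cor(transform(l), y), numeric(1))
  best <- which.max(abs(cors))
  list(lambda = lambdas[best], correlation = cors[best],
       lambdas = lambdas, cors = cors)
}

# Made-up data: y is roughly linear in x^3, so the best lambda should be near 3
set.seed(1)
x <- runif(67, min = 1, max = 10)
y <- 2 * x^3 + rnorm(67, sd = 20)

narrow <- boxcox_linearity(x, y)                         # grid stops at 2
wide   <- boxcox_linearity(x, y, seq(-2, 5, by = 0.01))  # grid extends to 5

narrow$lambda  # a value on the grid boundary suggests the grid is too narrow
wide$lambda    # an interior maximum is easier to interpret
```

Passing the grid as an argument avoids editing the loop index by hand when the maximum appears to lie outside the default range, which is the situation discussed in the feedback forum.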
