Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*Unverified author*

R Software Module

rwasp_boxcoxlin.wasp

Title produced by software

Box-Cox Linearity Plot

Date of computation

Thu, 13 Nov 2008 15:25:05 -0700

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2008/Nov/13/t1226615174fq8fgkwhplkh5yt.htm/, Retrieved Sun, 19 May 2024 10:51:54 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=24870, Retrieved Sun, 19 May 2024 10:51:54 +0000

Original text written by user:

IsPrivate?

No (this computation is public)

User-defined keywords

Estimated Impact

160

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

F       [Box-Cox Linearity Plot] [] [2008-11-13 22:25:05] [0655940460a4fd80d3d4d54548b75d49] [Current]

Feedback Forum

2008-11-19 14:51:25 [Sam De Cuyper] [reply] 
Once again, no interpretation of the calculations was given. The Box-Cox linearity plot shows the relationship between two related variables. The result is a rising or falling straight line (the regularity under study) with the points concentrated around it. The aim is to transform the variables (the X variable) so that the scatterplot becomes more linear. The question, however, is whether the transformation is useful. If the plot shows a maximum, the value at that maximum is chosen as lambda. After the transformation there is little visible difference, so the transformation is not useful. It has no effect, or at most a very small one.
2008-11-24 19:03:39 [Birgit Van Dyck] [reply] 
The student gives no interpretation of the calculations. A Box-Cox linearity plot transforms the variables; the transformation is useful only if it makes the scatterplot more linear. The plot is expected to show a maximum, and the lambda at that maximum is then used. After the transformation there turns out to be little difference. The transformation was not useful.
2008-11-24 19:41:37 [Jasmine Hendrikx] [reply] 
Evaluation Q3:
A calculation was made here, but nothing was pasted into the document or discussed. With a Box-Cox linearity plot one tries to transform the variable X so that the relationship between the two variables (X and Y) becomes stronger, that is, so that the points in the scatterplot of X against Y lie as close as possible to the straight line. The aim is to make the relationship more linear so that a straight line approximates it better. The Y-axis of the Box-Cox linearity plot shows the correlation coefficient between the transformed X and Y. The horizontal axis shows the value of lambda (the transformation parameter), which is varied from -2 to 2. The value of lambda that corresponds to the maximum correlation is the optimal choice. The optimal lambda here is 0.49 (see output), since the correlation coefficient reaches its highest value there (38.55%). The correlation relates to the scatterplot. However, when the transformation is carried out with lambda equal to 0.49, the plots show that the linear relationship in the transformed data is the same as in the original data. There is no improvement. The standard deviation has also remained practically the same.

The Box-Cox normality plot (which should have been computed in Q4) was not computed, so I will briefly describe what it does. A Box-Cox normality plot tries to make the distribution of the time series look more like a normal distribution. The y-axis of the Box-Cox normality plot shows the correlation coefficient. This correlation relates to the normal QQ plot. The x-axis shows lambda, which varies from -2 to +2.
An important difference between these two transformations is that a Box-Cox linearity plot concerns 2 variables, whereas a Box-Cox normality plot concerns 1 variable.
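
The Box-Cox normality plot described in the last comment was not computed on this page. A minimal R sketch of the idea follows, assuming a grid of lambda values from -2 to 2 in steps of 0.01 and strictly positive data. The function names `box_cox` and `boxcox_normality` and the simulated series `z` are hypothetical and are not part of the FreeStatistics module.

```r
# Box-Cox transform of a positive vector; lambda = 0 is the log case.
box_cox <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

# For each lambda, correlate the sorted transformed data with theoretical
# normal quantiles, i.e. the correlation of the normal QQ plot.
# (Illustrative helper, not the site's own implementation.)
boxcox_normality <- function(x, lambdas = (-200:200) / 100) {
  stopifnot(all(x > 0))
  theo <- qnorm(ppoints(length(x)))   # theoretical normal quantiles
  corr <- sapply(lambdas, function(lam) cor(theo, sort(box_cox(x, lam))))
  best <- which.max(corr)
  list(lambda = lambdas,
       correlation = corr,
       optimal_lambda = lambdas[best],
       max_correlation = corr[best])
}

# Made-up example: log-normal data, for which a lambda near 0 is expected.
set.seed(1)
z <- exp(rnorm(200))
res <- boxcox_normality(z)
res$optimal_lambda
res$max_correlation
```

Calling `plot(res$lambda, res$correlation)` would draw the normality plot itself. Unlike the linearity plot, only one series enters the computation.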

Dataseries X:

3.3
2.86
2.27
1.95
2.98
1.71
1.31
1.37
1.8
2.14
2.05
2.43
5.28
4.07
3.24
1.22
1.18
1
1.18
1.86
2.38
1.48
1.62
2.44
3.91
3.83
2.9
1.67
1.19
1.26
1.6
2.61
2.19
1.46
2.17
2.6
4.33
2.9
2.05
1.51
1.19
1.08
1.1
1.39
1.35
1.69
2.35
3.7
3.55
3.75
4.23
2.13
1.33
1.46
2.1
1.76
1.28
1.26
1.99
3.06
3.33
4.02
2.43
1.39
1.52
1.75
2.22
2.57
2.37
1.69
2.71
3.06
4.64
3.22
2.35
2.01
1.49
1.31
1.29
1.33
1.33
1.39
2.39
3.04

Dataseries Y:

2.36
1.95
2.16
2.76
2.09
1.49
1.17
1.3
1.26
2.17
2.03
2.18
2.61
2.58
3.86
3.81
2.41
1.47
1.33
1.38
1.57
2.6
2.18
2.36
2.24
2.41
2.51
2.98
1.87
1.9
1.47
1.45
2.71
2.9
2.11
2.18
2.24
2.05
2.42
2.77
1.99
1.47
1.09
0.93
1.32
2.03
2.04
2.78
2.8
3.03
3.11
2.75
2.78
1.76
1.29
1.28
1.43
1.71
1.89
1.84
2.08
2.09
2.36
2.99
2.75
1.58
1.69
1.3
1.97
1.84
1.96
1.86
2.75
2.62
2.41
3.61
2.03
1.45
1.4
1.3
1.58
2.1
2.27
2.54

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	3 seconds
R Server	'Herman Ole Andreas Wold' @ 193.190.124.10:1001

Box-Cox Linearity Plot
# observations x	84
maximum correlation	0.385500318587828
optimal lambda(x)	0.49
Residual SD (original)	0.576474508717216
Residual SD (transformed)	0.575980442213128
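
To put these numbers in perspective, the short R sketch below takes the values from the table above and derives two quantities that the module does not report: the squared correlation and the relative reduction in residual standard deviation. The variable names are chosen for illustration only.

```r
# Values copied from the results table above.
r_max  <- 0.385500318587828   # maximum correlation (at lambda = 0.49)
sd_raw <- 0.576474508717216   # residual SD, original x
sd_bc  <- 0.575980442213128   # residual SD, transformed x

r_squared     <- r_max^2                    # share of the variance in y explained
rel_reduction <- (sd_raw - sd_bc) / sd_raw  # relative drop in residual SD

round(100 * r_squared, 1)       # percent of variance explained
round(100 * rel_reduction, 2)   # percent reduction in residual SD
```

Even at the optimal lambda the fit explains only about 15% of the variance in y, and the residual standard deviation falls by less than 0.1%, which agrees with the reviewers' remark that the transformation has practically no effect.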

Figure 1: Box-Cox Linearity Plot (correlation against lambda)

Figure 2: Linear Fit of Original Data (y against x)

Figure 3: Linear Fit of Transformed Data (y against transformed x)

Parameters (Session):

Parameters (R input):

R code (references can be found in the software module):

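# Box-Cox linearity plot: search lambda in [-2, 2] for the transformation of x
# that maximises the absolute correlation with y.
# Expects numeric vectors x (all values > 0) and y of equal length.
n <- length(x)
c <- array(NA,dim=c(401))   # correlation for each lambda (masks the name c; calls to c() still work)
l <- array(NA,dim=c(401))   # grid of lambda values
mx <- 0                     # largest absolute correlation found so far
mxli <- -999                # lambda at which it occurs (sentinel until set)
for (i in 1:401)
{
  l[i] <- (i-201)/100       # -2.00, -1.99, ..., 2.00
  if (l[i] != 0)
  {
    x1 <- (x^l[i] - 1) / l[i]
  } else {
    x1 <- log(x)            # limiting case lambda = 0
  }
  c[i] <- cor(x1,y)
  if (mx < abs(c[i]))
  {
    mx <- abs(c[i])
    mxli <- l[i]
  }
}
c
mx
mxli
# Transform x with the optimal lambda
if (mxli != 0)
{
  x1 <- (x^mxli - 1) / mxli
} else {
  x1 <- log(x)
}
# Residual standard deviation of the linear fit before and after the transformation
r <- lm(y~x)
se <- sqrt(var(r$residuals))
r1 <- lm(y~x1)
se1 <- sqrt(var(r1$residuals))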
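# Figure 1: correlation between transformed x and y for every lambda
bitmap(file='test1.png')
plot(l,c,main='Box-Cox Linearity Plot',xlab='Lambda',ylab='correlation')
grid()
dev.off()
# Figure 2: scatterplot and least-squares line for the original x
bitmap(file='test2.png')
plot(x,y,main='Linear Fit of Original Data',xlab='x',ylab='y')
abline(r)
grid()
mtext(paste('Residual Standard Deviation = ',se))
dev.off()
# Figure 3: scatterplot and least-squares line for x transformed with the optimal lambda
bitmap(file='test3.png')
plot(x1,y,main='Linear Fit of Transformed Data',xlab='transformed x',ylab='y')
abline(r1)
grid()
mtext(paste('Residual Standard Deviation = ',se1))
dev.off()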
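# Build the results table; the table.* helpers are not base R and are presumably provided by the server-side 'createtable' file
load(file='createtable')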
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Box-Cox Linearity Plot',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'# observations x',header=TRUE)
a<-table.element(a,n)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'maximum correlation',header=TRUE)
a<-table.element(a,mx)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'optimal lambda(x)',header=TRUE)
a<-table.element(a,mxli)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Residual SD (original)',header=TRUE)
a<-table.element(a,se)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Residual SD (transformed)',header=TRUE)
a<-table.element(a,se1)
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable.tab')
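
The module code above expects `x` and `y` to exist already and relies on the server-side table helpers, so it does not run as is outside FreeStatistics.org. A self-contained sketch of the same search is given below, under the assumption that only base R is available. The function name `boxcox_linearity`, its return structure and the simulated data are hypothetical and are not the site's own API.

```r
# Box-Cox linearity search over a grid of lambda values (illustrative helper).
boxcox_linearity <- function(x, y, lambdas = (-200:200) / 100) {
  stopifnot(length(x) == length(y), all(x > 0))
  transform_x <- function(lam) if (lam == 0) log(x) else (x^lam - 1) / lam
  corr <- sapply(lambdas, function(lam) cor(transform_x(lam), y))
  best <- which.max(abs(corr))   # first maximum, like the strict '<' in the loop
  x1 <- transform_x(lambdas[best])
  list(lambda = lambdas,
       correlation = corr,
       optimal_lambda = lambdas[best],
       max_correlation = abs(corr[best]),
       sd_original = sd(residuals(lm(y ~ x))),
       sd_transformed = sd(residuals(lm(y ~ x1))))
}

# Made-up data in which y is linear in sqrt(x), so an optimum near 0.5 is expected.
set.seed(42)
x <- runif(100, min = 1, max = 10)
y <- 2 * sqrt(x) + rnorm(100, sd = 0.1)
fit <- boxcox_linearity(x, y)
fit$optimal_lambda
c(fit$sd_original, fit$sd_transformed)
```

Because lambda = 1 is on the grid and leaves the correlation with the original x unchanged, the transformed fit can never have a larger residual standard deviation than the original one, apart from rounding. The useful question is how much smaller it is.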
