Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*Unverified author*

R Software Module

rwasp_cloud.wasp

Title produced by software

Trivariate Scatterplots

Date of computation

Tue, 11 Nov 2008 12:40:32 -0700

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2008/Nov/11/t1226432527j2el78xlena2eaw.htm/, Retrieved Sun, 19 May 2024 12:38:49 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=23889, Retrieved Sun, 19 May 2024 12:38:49 +0000

Original text written by user:

IsPrivate?

No (this computation is public)

User-defined keywords

Estimated Impact

135

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

F     [Partial Correlation] [Partial correlation] [2008-11-04 19:15:33] [077ffec662d24c06be4c491541a44245]
F   P   [Partial Correlation] [Partial correlation] [2008-11-11 16:11:52] [73d6180dc45497329efd1b6934a84aba]
F   P     [Partial Correlation] [Partial Correlation] [2008-11-11 17:31:20] [6816386b1f3c2f6c0c9f2aa1e5bc9362]
F RMP         [Trivariate Scatterplots] [] [2008-11-11 19:40:32] [81dc0ee785f23261ccd6abf7aef76c2a] [Current]

Feedback Forum

2008-11-20 17:17:49 [Olivier Uyttendaele] [reply] 
This is partly correct: you do indeed compute the association between three variables, but you are also looking for the partial correlation.
The correlation between the two variables X and Y from above (bivariate density) is called, for example, the simple correlation r(X,Y). In this model you then introduce a third variable Z. The aim is to check whether or not Z influences the relationship between X and Y.

By computing the partial correlation between X and Y you can check whether Z is a factor that has an influence. This is written r(X,Y|Z). If r(X,Y) is relatively large and r(X,Y|Z) is much smaller, you can assume that Z is an influential variable. Z can thus partly explain the relationship between X and Y. It will explain the relationship, but we will not find out what causes it.

With three variables you can therefore compute three simple correlations, namely r(X,Y), r(X,Z) and r(Y,Z). Once you know these three correlations, you can easily compute a partial correlation, e.g. r(X,Z|Y).
Each time, you remove the links with the third variable in this way. When the partial correlation r(X,Y|Z) is close to the simple correlation, Z can be said to have little influence on the correlation between X and Y.
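
As an illustration of the step described in this comment, the standard first-order partial correlation formula shows how r(X,Y|Z) follows from the three simple correlations. The notation matches the comment; the formula is the textbook definition and is not taken from the software module itself.

```latex
\[
r(X,Y \mid Z) \;=\;
\frac{r(X,Y) - r(X,Z)\, r(Y,Z)}
     {\sqrt{\bigl(1 - r(X,Z)^{2}\bigr)\bigl(1 - r(Y,Z)^{2}\bigr)}}
\]
% The other partial correlations follow by permuting the roles of the variables, e.g.
\[
r(X,Z \mid Y) \;=\;
\frac{r(X,Z) - r(X,Y)\, r(Y,Z)}
     {\sqrt{\bigl(1 - r(X,Y)^{2}\bigr)\bigl(1 - r(Y,Z)^{2}\bigr)}}
\]
```

If r(X,Z) or r(Y,Z) is zero, the numerator reduces to r(X,Y) and the partial correlation is at least as large in absolute value as the simple one. Controlling for Z then does not shrink the correlation, which corresponds to the case where Z has little influence.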
2008-11-20 17:22:49 [Olivier Uyttendaele] [reply] 
The explanation above belongs to partial correlation; this was a misunderstanding on
2008-11-20 17:23:48 [Olivier Uyttendaele] [reply] 
my part. Below is the feedback that belongs to this model:
 
 
The commentary you give in the document is fairly correct, but there are still some points of attention and improvements.

As for the cubes, you must be careful with the interpretation. They give a misleading picture; no really clear pattern is visible. This is logical, since a 3D figure is being projected onto a 2D screen. You cannot see how the distances between the points relate to one another. The points tend to be projected onto a single line (2D screen).

In the matrix below the cubes you get a projection of them (scatterplots). These can again give a misleading picture, since you would still have to add a dimension. The histograms are on the main diagonal.

A bivariate kernel density plot is then drawn from the scatterplots above.
Here, as in Q1, contour lines are used that connect points of equal density. Since you can clearly see clusters here, this plot gives more and clearer information than a scatterplot and a correlation.
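
To make the notion of "contour lines connecting points of equal density" concrete, here is a minimal sketch of a bivariate kernel density estimate in its simplest product-kernel form. This form is an assumption for illustration: the module's `KernSur` call also passes a correlation, so its kernel is presumably not this exact product form. The symbols n, h_x, h_y, phi and c are introduced here and do not appear in the original.

```latex
% Product Gaussian kernel estimate from n observations (x_i, y_i),
% with bandwidths h_x, h_y and standard normal density \phi:
\[
\hat f(x,y) \;=\; \frac{1}{n\,h_x h_y}\sum_{i=1}^{n}
\phi\!\left(\frac{x - x_i}{h_x}\right)
\phi\!\left(\frac{y - y_i}{h_y}\right)
\]
% A contour line at level c is the set of points with equal estimated density:
\[
\{(x,y) : \hat f(x,y) = c\}
\]
```

Under this reading, clusters appear as closed contours around local maxima of the estimated density, which is why they are easier to spot here than in a raw scatterplot.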
2008-11-23 10:09:35 [Inge Meelberghs] [reply] 
You must be careful when interpreting the cube! However you look at it, there will always be distortion. This is because the cube is a 3D figure displayed on a 2D screen. It is therefore very difficult to gain insight from it.

The trivariate scatterplot matrix shows a projection of the cubes above. Like the cube, this graph gives a distorted picture, because the scatterplots are two-dimensional and the third variable is therefore not taken into account.

So it is best to use the bivariate scatter plot. With this graph you can easily compare two variables with each other. This technique uses contour lines that connect points of equal density. On the graph you can see that different zones appear through the change in colour. The red zone indicates a strong correlation; the green and yellow zones indicate a rather weak correlation. Here too we can infer that there is a strong linear relationship between Belgium's exports to countries outside the EU and the exports of Flanders.
2008-11-23 10:17:35 [Bonifer Spillemaeckers] [reply] 
I did not really interpret this question well. With these plots you have to be careful with the cubes. They actually give a distorted picture because they are presented in three dimensions. It is hard to see how the distances between the points relate to one another. It is better to look at plots that are presented in two dimensions. That way you can find out whether or not there is a relationship between certain data.

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	5 seconds
R Server	'Sir Ronald Aylmer Fisher' @ 193.190.124.24

Figures 1 to 7 (each available as PNG, PostScript and PDF; images not reproduced here)

Parameters (Session):

par1 = 50 ; par2 = 50 ; par3 = Y ; par4 = Y ; par5 = Variable X ; par6 = Variable Y ; par7 = Variable Z ;

Parameters (R input):

par1 = 50 ; par2 = 50 ; par3 = Y ; par4 = Y ; par5 = Variable X ; par6 = Variable Y ; par7 = Variable Z ;

R code (references can be found in the software module):

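# Reshape each input series into a one-column matrix labelled with its user-supplied name
x <- array(x,dim=c(length(x),1))
colnames(x) <- par5
y <- array(y,dim=c(length(y),1))
colnames(y) <- par6
z <- array(z,dim=c(length(z),1))
colnames(z) <- par7
# Combine the three series (in the order z, y, x) for the scatterplot matrix
d <- data.frame(cbind(z,y,x))
colnames(d) <- c(par7,par6,par5)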
# Grid sizes for the density estimates, clamped to the range 10..500
par1 <- as.numeric(par1)
par2 <- as.numeric(par2)
if (par1>500) par1 <- 500
if (par2>500) par2 <- 500
if (par1<10) par1 <- 10
if (par2<10) par2 <- 10
library(GenKern)
library(lattice)
# Diagonal panel for pairs(): histogram scaled so the tallest bar has height 1
panel.hist <- function(x, ...)
{
  usr <- par('usr'); on.exit(par(usr))
  par(usr = c(usr[1:2], 0, 1.5) )
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nB <- length(breaks)
  y <- h$counts; y <- y/max(y)
  rect(breaks[-nB], 0, breaks[-1], y, col='black', ...)
}
# Three 3D point clouds of z against x and y, each from a different viewing angle
bitmap(file='cloud1.png')
cloud(z~x*y, screen = list(x=-45, y=45, z=35),xlab=par5,ylab=par6,zlab=par7)
dev.off()
bitmap(file='cloud2.png')
cloud(z~x*y, screen = list(x=35, y=45, z=25),xlab=par5,ylab=par6,zlab=par7)
dev.off()
bitmap(file='cloud3.png')
cloud(z~x*y, screen = list(x=35, y=-25, z=90),xlab=par5,ylab=par6,zlab=par7)
dev.off()
# Scatterplot matrix of all pairs, with histograms on the diagonal
bitmap(file='pairs.png')
pairs(d,diag.panel=panel.hist)
dev.off()
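# Drop the matrix shape so the series are plain vectors again
x <- as.vector(x)
y <- as.vector(y)
z <- as.vector(z)
# Bivariate kernel density of (x,y): bandwidths from dpik, correlation set to the sample correlation
bitmap(file='bidensity1.png')
op <- KernSur(x,y, xgridsize=par1, ygridsize=par2, correlation=cor(x,y), xbandwidth=dpik(x), ybandwidth=dpik(y))
image(op$xords, op$yords, op$zden, col=terrain.colors(100), axes=TRUE,main='Bivariate Kernel Density Plot (x,y)',xlab=par5,ylab=par6)
# Optionally overlay contour lines (par3) and the raw data points (par4)
if (par3=='Y') contour(op$xords, op$yords, op$zden, add=TRUE)
if (par4=='Y') points(x,y)
# Fit, print and draw the least-squares line of y on x
(r<-lm(y ~ x))
abline(r)
box()
dev.off()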
bitmap(file='bidensity2.png')
op <- KernSur(y,z, xgridsize=par1, ygridsize=par2, correlation=cor(y,z), xbandwidth=dpik(y), ybandwidth=dpik(z))
op
image(op$xords, op$yords, op$zden, col=terrain.colors(100), axes=TRUE,main='Bivariate Kernel Density Plot (y,z)',xlab=par6,ylab=par7)
if (par3=='Y') contour(op$xords, op$yords, op$zden, add=TRUE)
if (par4=='Y') points(y,z)
(r<-lm(z ~ y))
abline(r)
box()
dev.off()
bitmap(file='bidensity3.png')
op <- KernSur(x,z, xgridsize=par1, ygridsize=par2, correlation=cor(x,z), xbandwidth=dpik(x), ybandwidth=dpik(z))
op
image(op$xords, op$yords, op$zden, col=terrain.colors(100), axes=TRUE,main='Bivariate Kernel Density Plot (x,z)',xlab=par5,ylab=par7)
if (par3=='Y') contour(op$xords, op$yords, op$zden, add=TRUE)
if (par4=='Y') points(x,z)
(r<-lm(z ~ x))
abline(r)
box()
dev.off()
