Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*The author of this computation has been verified*

R Software Module

rwasp_regression_trees1.wasp

Title produced by software

Recursive Partitioning (Regression Trees)

Date of computation

Fri, 10 Dec 2010 12:54:47 +0000

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2010/Dec/10/t129198557414vs5cn9zsm610q.htm/, Retrieved Mon, 29 Apr 2024 09:46:59 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=107621, Retrieved Mon, 29 Apr 2024 09:46:59 +0000

Original text written by user:

IsPrivate?

No (this computation is public)

User-defined keywords

Estimated Impact

180

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 19:35:21] [b98453cac15ba1066b407e146608df68]
-   PD    [Recursive Partitioning (Regression Trees)] [Brutoloonindex] [2010-12-10 12:54:47] [8e16b01a5be2b3f7f3ad6418d9d6fd5b] [Current]
-   P       [Recursive Partitioning (Regression Trees)] [Brutoloonindex] [2010-12-10 12:56:41] [3074aa973ede76ac75d398946b01602f] 

Dataseries X:

2148	77.405	82.145	315.4
2118	85.056	78.213	329.3
1603	90.088	88.099	308.2
2066	99.285	106.25	335.8
2095	80.428	80.487	343.7
2210	88.017	80.336	349.2
1609	93.489	90.065	312.4
1964	103.961	108.888	337.6
2114	82.591	82.747	360.2
2054	90.913	82.213	372.1
1424	96.787	93.41	341.8
2025	106.045	109.465	377.4
2003	84.752	84.373	337.2
2017	94.173	98.715	384.6
1528	97.733	99.646	358.6
2130	108.499	115.239	383.4
2017	87.972	89.082	384.4
2260	96.091	89.934	402.7
1805	101.846	99.957	372.1
2394	115.652	122.717	364.9
2586	91.269	95.895	314.9
2429	100.911	97.085	320.7
1910	105.248	109.414	308.6
2515	118.681	126.945	328.7

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	3 seconds
R Server	'RServer@AstonUniversity' @ vre.aston.ac.uk

Goodness of Fit
Correlation	0.7958
R-squared	0.6333
RMSE	6.318
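
The three values in this table can be recomputed from the Actuals and Forecasts columns of the "Actuals, Predictions, and Residuals" table. The following is a minimal sketch in base R, not the module's own code: the names `actuals`, `high_node`, `forecasts`, and `resid_vals` are chosen here for illustration, and the forecasts are rebuilt as the two node means that appear in the Forecasts column.

```r
# Actuals: column 2 of the data series (the response selected by par1 = 2)
actuals <- c(77.405, 85.056, 90.088, 99.285, 80.428, 88.017, 93.489, 103.961,
             82.591, 90.913, 96.787, 106.045, 84.752, 94.173, 97.733, 108.499,
             87.972, 96.091, 101.846, 115.652, 91.269, 100.911, 105.248, 118.681)

# Rows whose reported forecast is 104.730363636364 (read off the table)
high_node <- c(4, 8, 12, 14, 15, 16, 19, 20, 22, 23, 24)

# Each forecast is the mean of the actuals that share its terminal node
in_high <- seq_along(actuals) %in% high_node
forecasts <- ifelse(in_high, mean(actuals[in_high]), mean(actuals[!in_high]))
resid_vals <- actuals - forecasts

# Same three statistics the module writes to the Goodness of Fit table
correlation <- cor(forecasts, actuals)
r_squared <- correlation^2
rmse <- sqrt(mean(resid_vals^2))

print(round(c(Correlation = correlation, R.squared = r_squared, RMSE = rmse), 4))
```

Note that R-squared here is the squared correlation between forecasts and actuals, and RMSE is computed on the same 24 observations the tree was fitted to, so both describe in-sample fit.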

Actuals, Predictions, and Residuals
#	Actuals	Forecasts	Residuals
1	77.405	88.066	-10.661
2	85.056	88.066	-3.010
3	90.088	88.066	2.022
4	99.285	104.730363636364	-5.44536363636364
5	80.428	88.066	-7.638
6	88.017	88.066	-0.049
7	93.489	88.066	5.423
8	103.961	104.730363636364	-0.769363636363636
9	82.591	88.066	-5.475
10	90.913	88.066	2.847
11	96.787	88.066	8.721
12	106.045	104.730363636364	1.31463636363637
13	84.752	88.066	-3.314
14	94.173	104.730363636364	-10.5573636363636
15	97.733	104.730363636364	-6.99736363636363
16	108.499	104.730363636364	3.76863636363636
17	87.972	88.066	-0.094
18	96.091	88.066	8.025
19	101.846	104.730363636364	-2.88436363636363
20	115.652	104.730363636364	10.9216363636364
21	91.269	88.066	3.203
22	100.911	104.730363636364	-3.81936363636363
23	105.248	104.730363636364	0.51763636363637
24	118.681	104.730363636364	13.9506363636364
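
The Forecasts column takes only two values, so the fitted tree has two terminal nodes. Comparing this table with the data series suggests that the rows separate on the third data column at 95.895. That cutpoint is an inference from the numbers shown here, not a value read from the tree printout, which this page does not reproduce. A minimal base R sketch that checks the inference (the names `third_col`, `cutpoint`, and `node` are illustrative):

```r
# Response (column 2) and the third column of the data series
actuals <- c(77.405, 85.056, 90.088, 99.285, 80.428, 88.017, 93.489, 103.961,
             82.591, 90.913, 96.787, 106.045, 84.752, 94.173, 97.733, 108.499,
             87.972, 96.091, 101.846, 115.652, 91.269, 100.911, 105.248, 118.681)
third_col <- c(82.145, 78.213, 88.099, 106.25, 80.487, 80.336, 90.065, 108.888,
               82.747, 82.213, 93.41, 109.465, 84.373, 98.715, 99.646, 115.239,
               89.082, 89.934, 99.957, 122.717, 95.895, 97.085, 109.414, 126.945)

# Assumed cutpoint: the largest third-column value among rows forecast at 88.066
cutpoint <- 95.895
node <- ifelse(third_col <= cutpoint, "low", "high")

# A regression tree predicts the mean response within each terminal node
node_means <- tapply(actuals, node, mean)
forecasts <- as.numeric(node_means[node])
resid_vals <- actuals - forecasts

print(node_means)
print(table(node))
```

Under this assumption the "low" node holds 13 rows and the "high" node 11, and the node means match the two values in the Forecasts column. Any cutpoint from 95.895 up to, but not including, 97.085 gives the same partition.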

Figure 1

Figure 2

Figure 3

Parameters (Session):

par1 = 2 ; par2 = none ; par4 = no ;

Parameters (R input):

par1 = 2 ; par2 = none ; par3 = ; par4 = no ;

R code (references can be found in the software module):

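# party supplies ctree() and where(); Hmisc supplies cut2()
library(party)
library(Hmisc)
# par1 and par3 are converted to numbers; par3 is empty in this run, which yields NA
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
# y is supplied by the server; transposing it puts the variables in columns
x <- data.frame(t(y))
is.data.frame(x)
# drop rows where the response column (par1) is missing
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]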
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
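# createtable is a file on the R server; the table.* helpers used below are not defined in this listing and presumably come from it
load(file='createtable')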
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
for (i in 1:10) {
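# each pass draws a fresh random split (about 90% training, 10% testing); the test sets are not disjoint folds
ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))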
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
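# detcoef holds the correlation between forecasts and actuals; it is squared below to give R-squared
detcoef <- cor(result$Forecasts,result$Actuals)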
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
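
The cross-validation branch of the listing (run only when par4 is 'yes', which is not the case in this computation) is labelled '10-Fold Cross Validation', but each of its ten passes draws an independent random split with `sample(2, nrow(x), ...)`. An observation can therefore land in the test set several times or never. If disjoint folds are wanted, one possible approach is to assign every row to exactly one of k folds. The sketch below is in base R and is not the module's method; `make_folds`, `k`, and `seed` are hypothetical names introduced for illustration.

```r
# Assign each of n rows to exactly one of k folds, in random order
make_folds <- function(n, k = 10, seed = 1) {
  set.seed(seed)                           # fixed seed so the split is repeatable
  sample(rep(seq_len(k), length.out = n))  # fold sizes differ by at most one
}

n <- 24                                    # number of rows in this data series
folds <- make_folds(n)

for (f in sort(unique(folds))) {
  test_rows <- which(folds == f)           # rows held out in this pass
  train_rows <- which(folds != f)          # rows used to fit the tree
  # a model would be fitted on train_rows and evaluated on test_rows here
  cat("fold", f, "- test rows:", test_rows, "\n")
}
```

With 24 rows and 10 folds, each fold holds two or three test rows, so per-fold accuracy would be very noisy on a series this short.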
