Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

*The author of this computation has been verified*

R Software Module

rwasp_regression_trees1.wasp

Title produced by software

Recursive Partitioning (Regression Trees)

Date of computation

Tue, 14 Dec 2010 16:16:56 +0000

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2010/Dec/14/t12923433471lo7j0oibqlljre.htm/, Retrieved Fri, 03 May 2024 01:58:41 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=109825, Retrieved Fri, 03 May 2024 01:58:41 +0000

IsPrivate?

No (this computation is public)

Estimated Impact

112

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 19:50:12] [b98453cac15ba1066b407e146608df68]
-   PD    [Recursive Partitioning (Regression Trees)] [] [2010-12-14 16:16:56] [6b67b7c8c7d0a997c30f007387afbdb8] [Current]

Dataseries X:

1579	0	4,0	45,7	17.0
2146	0	5,9	81,9	21.0
2462	0	7,1	56,8	21.0
3695	0	10,5	65,1	18.0
4831	0	15,1	86,2	20.0
5134	0	16,8	35,1	11.0
6250	0	15,3	133,8	20.0
5760	0	18,4	34,5	13.0
6249	0	16,1	69,9	14.0
2917	0	11,3	98,3	23.0
1741	0	7,9	86,7	24.0
2359	0	5,6	58,2	22.0
1511	1	3,4	83,6	17.0
2059	0	4,8	83,5	18.0
2635	0	6,5	112,3	24.0
2867	0	8,5	134,3	23.0
4403	0	15,1	30,0	8.0
5720	0	15,7	44,5	10.0
4502	0	18,7	120,1	18.0
5749	0	19,2	43,4	13.0
5627	0	12,9	199,4	23.0
2846	0	14,4	68,1	14.0
1762	0	6,2	99,8	15.0
2429	0	3,3	69,5	18.0
1169	0	4,6	71,3	18.0
2154	1	7,2	167,8	20.0
2249	0	7,8	66,3	14.0
2687	0	9,9	41,9	12.0
4359	0	13,6	57,2	20.0
5382	0	17,1	72,3	14.0
4459	0	17,8	96,5	16.0
6398	0	18,6	172,1	19.0
4596	0	14,7	25,8	12.0
3024	0	10,5	105,1	17.0
1887	0	8,6	92,2	16.0
2070	0	4,4	109,3	18.0
1351	0	2,3	101,7	19.0
2218	0	2,8	29,1	8.0
2461	1	8,8	34,6	10.0
3028	0	10,7	46,7	10.0
4784	0	13,9	82,0	19.0
4975	0	19,3	34,4	8.0
4607	0	19,5	72,7	13.0
6249	0	20,4	44,4	8.0
4809	0	15,3	31,0	12.0
3157	0	7,9	64,0	15.0
1910	0	8,3	65,4	18.0
2228	0	4,5	64,5	17.0
1594	0	3,2	153,8	24.0
2467	0	5,0	48,8	14.0
2222	0	6,6	25,0	15.0
3607	1	11,1	37,2	15.0
4685	0	12,8	40,8	11.0
4962	0	16,3	78,4	18.0
5770	0	17,4	112,4	18.0
5480	0	18,9	122,7	21.0
5000	0	15,8	82,9	13.0
3228	0	11,7	67,6	15.0
1993	0	6,4	78,4	17.0
2288	0	2,9	65,7	17.0
1580	0	4,7	44,9	22.0
2111	0	2,4	80,9	19.0
2192	0	7,2	38,8	17.0
3601	0	10,7	46,1	17.0
4665	1	13,4	60,0	19.0
4876	0	18,5	53,9	11.0
5813	0	18,3	123,5	16.0
5589	0	16,8	69,5	15.0
5331	0	16,6	74,2	11.0
3075	0	14,1	47,0	13.0
2002	0	6,1	60,9	18.0
2306	0	3,5	51,4	22.0
1507	0	1,7	18,7	9.0
1992	0	2,3	88,1	19.0
2487	0	4,5	65,3	16.0
3490	0	9,3	46,0	16.0
4647	0	14,2	115,6	20.0
5594	1	17,3	25,8	7.0
5611	0	23,0	48,1	8.0
5788	0	16,3	202,3	21.0
6204	0	18,4	9,2	8.0
3013	0	14,2	56,3	17.0
1931	0	9,1	71,6	20.0
2549	0	5,9	93,0	18.0
1504	0	7,2	82,3	26.0
2090	0	6,8	95,4	18.0
2702	0	8,0	61,9	20.0
2939	0	14,3	0,0	0.0
4500	0	14,6	103,4	22.0
6208	0	17,5	99,2	19.0
6415	1	17,2	96,7	18.0
5657	0	17,2	56,9	13.0
5964	0	14,1	57,6	16.0
3163	0	10,5	65,2	11.0
1997	0	6,8	71,7	22.0
2422	0	4,1	89,2	19.0
1376	0	6,5	70,7	23.0
2202	0	6,1	35,4	11.0
2683	0	6,3	140,5	24.0
3303	0	9,3	45,4	14.0
5202	0	16,4	53,9	11.0
5231	0	16,1	69,9	17.0
4880	0	18,0	101,9	20.0
7998	1	17,6	89,3	19.0
4977	0	14,0	70,7	12.0
3531	0	10,5	72,4	19.0
2025	0	6,9	67,6	26.0
2205	0	2,8	43,3	13.0
1442	0	0,7	62,9	12.0
2238	0	3,6	57,1	20.0
2179	0	6,7	68,2	15.0
3218	0	12,5	47,1	15.0
5139	0	14,4	43,1	17.0
4990	0	16,5	64,5	11.0
4914	0	18,7	73,1	20.0
6084	0	19,4	37,7	9.0
5672	1	15,8	29,1	10.0
3548	0	11,3	105,0	17.0
1793	0	9,7	98,0	25.0
2086	0	2,9	80,8	19.0

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	5 seconds
R Server	'Sir Ronald Aylmer Fisher' @ 193.190.124.24
R Framework error message	The field 'Names of X columns' contains a hard return which cannot be interpreted. Please, resubmit your request without hard returns in the 'Names of X columns'.


10-Fold Cross Validation
	Prediction (training)			Prediction (testing)
Actual	C1	C2	CV	C1	C2	CV
C1	470	77	0.8592	47	6	0.8868
C2	3	540	0.9945	5	52	0.9123
Overall	-	-	0.9266	-	-	0.9
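
The CV columns can be checked by hand: each per-class value is the diagonal count divided by its row total, and the overall value is the sum of the diagonal divided by the grand total. The sketch below redoes that arithmetic from the counts in the table. The object and function names (`train`, `test`, `per_class`, `overall`) are illustrative and do not come from the archived code.

```r
# Counts copied from the table above (actuals in rows, predictions in columns).
labs  <- list(actual = c("C1", "C2"), predicted = c("C1", "C2"))
train <- matrix(c(470,  77,
                    3, 540), nrow = 2, byrow = TRUE, dimnames = labs)
test  <- matrix(c(47,  6,
                   5, 52), nrow = 2, byrow = TRUE, dimnames = labs)

# Per-class rate: correctly predicted cases divided by the row total.
per_class <- function(tab) round(diag(tab) / rowSums(tab), 4)
# Overall rate: all correctly predicted cases divided by the grand total.
overall   <- function(tab) round(sum(diag(tab)) / sum(tab), 4)

print(per_class(train)); print(overall(train))
print(per_class(test));  print(overall(test))
```

The training counts sum to 1090 and the testing counts to 110, so together they cover 1200 predictions, which is 10 passes over the 120 observations.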


Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	51	9
C2	0	60


Figure 1

Figure 2

Figure 3

Parameters (Session):

par1 = 1 ; par2 = quantiles ; par3 = 2 ; par4 = yes ;

Parameters (R input):

par1 = 1 ; par2 = quantiles ; par3 = 2 ; par4 = yes ;
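
With par2 = quantiles and par3 = 2, the R code below takes the branch that calls `cut2(x[,par1], g=par3)`, so the first column is split into two groups of roughly equal size. The sketch below approximates that step with base R only, which is an assumption made to avoid the Hmisc dependency. The interval conventions of `cut()` and `cut2()` differ at the boundaries, and the labels C1 and C2 are chosen here to match the output tables.

```r
# First twelve values of the first column of the data series above.
y <- c(1579, 2146, 2462, 3695, 4831, 5134,
       6250, 5760, 6249, 2917, 1741, 2359)
g <- 2  # number of quantile groups (par3)

# Break points at the 0%, 50% and 100% quantiles.
breaks <- quantile(y, probs = seq(0, 1, length.out = g + 1))

# Assign each value to its quantile group; include.lowest keeps the minimum.
grp <- cut(y, breaks = breaks, include.lowest = TRUE,
           labels = paste("C", 1:g, sep = ""))
print(table(grp))
```

Splitting at the median is consistent with the confusion matrix above, where each actual class holds 60 of the 120 observations.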

R code (references can be found in the software module):

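library(party) # ctree(), where()
library(Hmisc) # cut2()
par1 <- as.numeric(par1) # column index of the response variable
par3 <- as.numeric(par3) # number of response categories
# y is not defined in this script; it is presumably supplied by the R framework
x <- data.frame(t(y))
is.data.frame(x)
# drop rows with a missing response
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]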
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
# presumably loads the table.* helper functions used below (not defined in this script)
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
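# Each of the 10 iterations draws a new random split (about 90% training, 10% testing);
# the test sets are therefore not disjoint folds.
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))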
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
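
The validation loop in the code above draws a fresh random 90/10 split in each of its 10 iterations, so an observation can be tested several times or not at all. For comparison, the sketch below shows one common way to build disjoint folds, in which every observation is held out exactly once. This is a different technique from the archived code, not a reproduction of it. The seed and the variable names (`fold`, `times_tested`) are assumptions, and the model fit is left as a comment so that the example needs no extra packages.

```r
set.seed(1)   # arbitrary seed, only to make the sketch repeatable
n <- 120      # number of rows in the data series above
k <- 10       # number of folds

# Shuffled fold labels: each label 1..k occurs n/k times.
fold <- sample(rep(1:k, length.out = n))

times_tested <- integer(n)
for (i in 1:k) {
  test_idx  <- which(fold == i)   # rows held out in this fold
  train_idx <- which(fold != i)   # rows available for fitting
  # a model would be fitted on train_idx and evaluated on test_idx here
  times_tested[test_idx] <- times_tested[test_idx] + 1L
}

print(table(fold))
print(all(times_tested == 1L))
```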
