Free Statistics

of Irreproducible Research!

Author's title

Author*The author of this computation has been verified*
R Software Module: rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Fri, 10 Dec 2010 21:40:26 +0000
Cite this page as follows: Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2010/Dec/10/t12920171252f13h4nyux1iecx.htm/, Retrieved Mon, 29 Apr 2024 16:10:08 +0000
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=107961, Retrieved Mon, 29 Apr 2024 16:10:08 +0000

Original text written by user:
IsPrivate? No (this computation is public)
User-defined keywords
Estimated Impact: 194
Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)
-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 19:35:21] [b98453cac15ba1066b407e146608df68]
-   PD  [Recursive Partitioning (Regression Trees)] [Recursive Partiti...] [2010-12-10 21:32:17] [1429a1a14191a86916b95357f6de790b]
F           [Recursive Partitioning (Regression Trees)] [Recursive Partiti...] [2010-12-10 21:40:26] [e192c8164fa91adb027f71579ac0a49a] [Current]
-   PD        [Recursive Partitioning (Regression Trees)] [] [2011-12-13 18:32:30] [e21b9c93af4eb9605ecfaf58a559e5ab]
- RMP         [Recursive Partitioning (Regression Trees)] [] [2011-12-13 20:18:05] [74be16979710d4c4e7c6647856088456]
Feedback Forum
2010-12-19 10:44:48 [48eb36e2c01435ad7e4ea7854a9d98fe]
The student applies the 'recursive partitioning' technique here, and does so correctly and clearly. The interpretation given, of the tree structure as well as of the 'confusion matrix' and the 'cross validation', is also correct and clearly worded.

If the student wishes to include this in his or her final paper, I would add one tip: he or she could compare the tree structure without categorization with the tree structure as it was built here.

Dataseries X:
13	13	14	13	3
12	12	8	13	5
15	10	12	16	6
12	9	7	12	6
10	10	10	11	5
12	12	7	12	3
15	13	16	18	8
9	12	11	11	4
11	5	16	14	6
12	12	14	14	4
11	6	6	9	4
15	11	16	11	5
11	12	11	12	6
7	14	12	12	4
11	14	7	13	6
11	12	13	11	4
10	12	11	12	6
6	7	9	11	4
11	9	7	13	2
15	11	14	15	7
11	11	15	10	5
12	12	7	11	4
14	12	15	13	6
15	11	17	16	6
13	8	14	14	5
13	9	14	14	6
16	12	8	14	4
13	10	8	8	4
12	10	14	13	7
11	8	8	13	4
9	12	11	11	4
16	11	16	15	6
12	12	10	15	6
10	7	8	9	5
13	11	14	13	6
16	11	16	16	7
14	12	13	13	6
15	9	5	11	3
5	15	8	12	3
8	11	10	12	4
11	11	8	12	6
16	11	13	14	7
9	15	6	8	4
9	11	12	13	5
13	12	16	16	6
10	12	5	13	6
6	9	15	11	6
12	12	12	14	5
8	12	8	13	4
14	13	13	13	5
12	11	14	13	5
11	9	12	12	4
16	9	16	16	6
8	11	10	15	2
15	11	15	15	8
7	12	8	12	3
16	12	16	14	6
14	9	19	12	6
9	9	6	12	5
14	12	13	13	5
11	12	15	12	6
13	12	7	12	5
15	12	13	13	6
5	14	4	5	2
15	11	14	13	5
13	12	13	13	5
11	11	11	14	5
11	6	14	17	6
12	10	12	13	6
12	12	15	13	6
12	13	14	12	5
12	8	13	13	5
14	12	8	14	4
6	12	6	11	2
7	12	7	12	4
14	6	13	12	6
14	11	13	16	6
10	10	11	12	5
13	12	5	12	3
12	13	12	12	6
9	11	8	10	4
12	7	11	15	5
16	11	14	15	8
10	11	9	12	4
14	11	10	16	6
10	11	13	15	6
16	12	16	16	7
15	10	16	13	6
12	11	11	12	5
10	12	8	11	4
8	7	4	13	6
8	13	7	10	3
11	8	14	15	5
13	12	11	13	6
16	11	17	16	7
16	12	15	15	7
14	14	17	18	6
11	10	5	13	3
4	10	4	10	2
14	13	10	16	8
9	10	11	13	3
8	10	10	14	3
8	7	9	15	4
11	10	12	14	5
12	8	15	13	7
11	12	7	13	6
14	12	13	15	6
16	11	14	14	6
15	12	12	16	7
16	12	14	14	6
14	12	15	14	6
11	12	8	16	6
14	11	12	12	4
12	12	12	13	4
14	11	16	12	5
8	11	9	12	4
13	13	15	14	6
16	12	15	14	6
12	12	6	14	5
16	12	14	16	8
12	12	15	13	6
11	8	10	14	5
4	8	6	4	4
16	12	14	16	8
15	11	12	13	6
10	12	8	16	4
13	13	11	15	6
15	12	13	14	6
12	12	9	13	4
14	11	15	14	6
7	12	13	12	3
19	12	15	15	6
12	10	14	14	5
12	11	16	13	4
13	12	14	14	6
15	12	14	16	4
8	10	10	6	4
12	12	10	13	4
10	13	4	13	6
8	12	8	14	5
16	12	12	15	8
13	11	12	13	7
9	11	9	12	4
14	10	12	15	6
14	11	14	12	6
12	11	11	14	2




Summary of computational transaction
Raw Input: view raw input (R code)
Raw Output: view raw output of R engine
Computing time: 6 seconds
R Server: 'Gwilym Jenkins' @ 72.249.127.135
R Framework error message:
The field 'Names of X columns' contains a hard return which cannot be interpreted.
Please, resubmit your request without hard returns in the 'Names of X columns'.








10-Fold Cross Validation
          Prediction (training)       Prediction (testing)
Actual    C1     C2     CV            C1     C2     CV
C1        410    120    0.7736        40     10     0.8
C2        194    599    0.7554        16     71     0.8161
Overall   -      -      0.7627        -      -      0.8102
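
The CV columns can be reproduced from the counts in the table: each per-class value is the diagonal count divided by its row total, and the overall value is the sum of the diagonal divided by the grand total. A minimal sketch in base R, with the counts copied from the table; the names `train`, `test`, `class_rate` and `overall_rate` are hypothetical and do not appear in the module.

```r
# Counts from the 10-Fold Cross Validation table (rows = actual, columns = predicted)
labs  <- list(actual = c('C1', 'C2'), predicted = c('C1', 'C2'))
train <- matrix(c(410, 120,
                  194, 599), nrow = 2, byrow = TRUE, dimnames = labs)
test  <- matrix(c(40, 10,
                  16, 71), nrow = 2, byrow = TRUE, dimnames = labs)

# Share of each actual class that was predicted correctly
class_rate <- function(tab) diag(tab) / rowSums(tab)
# Share of all cases that were predicted correctly
overall_rate <- function(tab) sum(diag(tab)) / sum(tab)

print(round(class_rate(train), 4))    # C1 0.7736, C2 0.7554
print(round(overall_rate(train), 4))  # 0.7627
print(round(class_rate(test), 4))     # C1 0.8, C2 0.8161
print(round(overall_rate(test), 4))   # 0.8102
```

The training and testing counts add up to 1323 + 137 = 1460, which is ten passes over the 146 rows of the data series.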








Confusion Matrix (predicted in columns / actuals in rows)
      C1    C2
C1    45    13
C2    21    67
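
To put the in-sample confusion matrix in context, the following sketch computes the overall accuracy, the share of correct predictions per predicted class (column-wise), and the accuracy of always predicting the larger class. It uses base R only; the names `conf_mat`, `accuracy`, `precision` and `baseline` are hypothetical and are not part of the module.

```r
# Counts from the confusion matrix (rows = actual, columns = predicted)
conf_mat <- matrix(c(45, 13,
                     21, 67), nrow = 2, byrow = TRUE,
                   dimnames = list(actual = c('C1', 'C2'), predicted = c('C1', 'C2')))

# Overall share of correctly classified cases
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
# For each predicted class, the share of predictions that were correct
precision <- diag(conf_mat) / colSums(conf_mat)
# Accuracy obtained by always predicting the most frequent actual class
baseline <- max(rowSums(conf_mat)) / sum(conf_mat)

print(round(accuracy, 4))   # 0.7671
print(round(precision, 4))  # C1 0.6818, C2 0.8375
print(round(baseline, 4))   # 0.6027
```

On these counts the tree classifies 112 of the 146 cases correctly, compared with 88 of 146 for the majority-class rule.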




Parameters (Session):
par1 = 1 ; par2 = none ; par3 = 4 ; par4 = no ;
Parameters (R input):
par1 = 1 ; par2 = equal ; par3 = 2 ; par4 = yes ;
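
With par2 = equal and par3 = 2, the response (column 1) is split into two equal-width intervals using base R's cut(). A minimal sketch of that step on a handful of values taken from the first column of Dataseries X, including its minimum (4) and maximum (19) so that the cut point is the same as for the full series; the names `resp`, `n_bins` and `bins` are hypothetical.

```r
# A few response values from column 1 of Dataseries X, including min (4) and max (19)
resp <- c(4, 9, 11, 12, 15, 19)
n_bins <- 2  # plays the role of par3 in the script

# Equal-width binning, mirroring the script's par2 == 'equal' branch
bins <- cut(as.numeric(resp), n_bins, labels = paste('C', 1:n_bins, sep = ''))

print(data.frame(resp, bins))
print(table(bins))
```

The cut point falls at (4 + 19) / 2 = 11.5, so values up to 11 are labelled C1 and values from 12 upward are labelled C2. Applied to the full series this gives 58 cases in C1 and 88 in C2, which agrees with the row totals of the confusion matrix.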
R code (references can be found in the software module):
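# party provides ctree(); Hmisc provides cut2()
library(party)
library(Hmisc)
# par1 = column index of the response, par3 = number of categories
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
# y is not defined in this script; it is presumably supplied by the R framework
x <- data.frame(t(y))
is.data.frame(x)
# drop rows where the response is missing
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]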
# Discretize the response by k-means; clusters are relabelled C1..Cpar3 in order of their centers
if (par2 == 'kmeans') {
  cl <- kmeans(x[,par1], par3)
  print(cl)
  clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
  clm <- clm[sort.list(clm[,1]),]
  for (i in 1:par3) {
    cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
  }
  cl$cluster <- as.factor(cl$cluster)
  print(cl$cluster)
  x[,par1] <- cl$cluster
}
# Discretize the response into par3 quantile groups
if (par2 == 'quantiles') {
  x[,par1] <- cut2(x[,par1],g=par3)
}
# Discretize the response by centroid-linkage hierarchical clustering; groups are relabelled in order of their means
if (par2 == 'hclust') {
  hc <- hclust(dist(x[,par1])^2, 'cen')
  print(hc)
  memb <- cutree(hc, k = par3)
  dum <- c(mean(x[memb==1,par1]))
  for (i in 2:par3) {
    dum <- c(dum, mean(x[memb==i,par1]))
  }
  hcm <- matrix(cbind(dum,1:par3),ncol=2)
  hcm <- hcm[sort.list(hcm[,1]),]
  for (i in 1:par3) {
    memb[memb==hcm[i,2]] <- paste('C',i,sep='')
  }
  memb <- as.factor(memb)
  print(memb)
  x[,par1] <- memb
}
# Discretize the response into par3 equal-width intervals labelled C1..Cpar3
if (par2=='equal') {
  ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
  x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
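# Ten repetitions of a random split: each row goes to the training set with
# probability 0.9 and to the test set with probability 0.1. The splits are drawn
# independently, so this is repeated random hold-out rather than a strict 10-fold partition.
for (i in 1:10) {
  ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
  m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
  if (i==1) {
    m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
    m.ct.i.actu <- x[ind==1,par1]
    m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
    m.ct.x.actu <- x[ind==2,par1]
  } else {
    # accumulate predictions and actuals over the repetitions
    m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
    m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
    m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
    m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
  }
}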
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
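
The cross-validation loop in the script draws ten independent random splits of roughly 90% training and 10% testing rows, so a row can fall into several test sets or into none (the testing counts in the table sum to 137 rather than 146). For comparison, here is a minimal sketch of a strict 10-fold assignment in base R, in which every row is held out exactly once. The names `make_folds`, `n_rows` and `folds` are hypothetical and not part of the original module, and the model fit itself is left as a comment.

```r
# Assign each of n rows to exactly one of k folds; fold sizes differ by at most 1
make_folds <- function(n, k = 10) {
  sample(rep(seq_len(k), length.out = n))
}

set.seed(1)      # only to make the sketch repeatable
n_rows <- 146    # number of rows in Dataseries X
folds <- make_folds(n_rows, k = 10)

for (i in 1:10) {
  test_idx  <- which(folds == i)   # rows held out in fold i
  train_idx <- which(folds != i)   # rows available for fitting in fold i
  # a tree would be fitted on x[train_idx, ] and evaluated on x[test_idx, ]
  cat('fold', i, ': train', length(train_idx), 'test', length(test_idx), '\n')
}
```

With this assignment the test sets are disjoint and together cover all 146 rows, so the pooled test confusion matrix would sum to 146.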