Author: *The author of this computation has been verified*
R Software Module: rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Fri, 24 Dec 2010 21:59:23 +0000
Cite this page as follows: Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2010/Dec/24/t1293227869hmzqfqsth89kmh6.htm/, Retrieved Tue, 30 Apr 2024 07:15:44 +0000
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=115297, Retrieved Tue, 30 Apr 2024 07:15:44 +0000
Original text written by user:
IsPrivate? No (this computation is public)
User-defined keywords:
Estimated Impact: 107
Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)
- [Recursive Partitioning (Regression Trees)] [] [2010-12-24 21:59:23] [0956ee981dded61b2e7128dae94e5715] [Current]
Dataseries X:
1221.53	2617.2	10168.52	6957.61	23448.78
1180.55	2506.13	9937.04	6688.49	23007.99
1183.26	2679.07	9202.45	6601.37	23096.32
1141.2	2589.73	9369.35	6229.02	22358.17
1049.33	2457.46	8824.06	5925.22	20536.49
1101.6	2517.3	9537.3	6147.97	21029.81
1030.71	2386.53	9382.64	5965.52	20128.99
1089.41	2453.37	9768.7	5964.33	19765.19
1186.69	2529.66	11057.4	6135.7	21108.59
1169.43	2475.14	11089.94	6153.55	21239.35
1104.49	2525.93	10126.03	5598.46	20608.7
1073.87	2480.93	10198.04	5608.79	20121.99
1115.1	2229.85	10546.44	5957.43	21872.5
1095.63	2169.14	9345.55	5625.95	21821.5
1036.19	2030.98	10034.74	5414.96	21752.87
1057.08	2071.37	10133.23	5675.16	20955.25
1020.62	1953.35	10492.53	5458.04	19724.19
987.48	1748.74	10356.83	5332.14	20573.33
919.32	1696.58	9958.44	4808.64	18378.73
919.14	1900.09	9522.5	4940.82	18171
872.81	1908.64	8828.26	4769.45	15520.99
797.87	1881.46	8109.53	4084.76	13576.02
735.09	2100.18	7568.42	3843.74	12811.57
825.88	2672.2	7994.05	4338.35	13278.21
903.25	3136	8859.56	4810.2	14387.48
896.24	2994.38	8512.27	4669.44	13888.24
968.75	3168.22	8576.98	4987.97	13968.67
1166.36	3751.41	11259.86	5831.02	18016.21
1282.83	3925.43	13072.87	6422.3	21261.89
1267.38	3719.52	13376.81	6479.56	22731.1
1280	3757.12	13481.38	6418.32	22102.01
1400.38	3722.23	14338.54	7096.79	24533.12
1385.59	4127.47	13849.99	6948.82	25755.35
1322.7	4162.5	12525.54	6534.97	22849.2
1330.63	4441.82	13603.02	6748.13	24331.67
1378.55	4325.29	13592.47	6851.75	23455.74
1468.36	4350.83	15307.78	8067.32	27812.65
1481.14	4384.47	15680.67	7870.52	28643.61
1549.38	4639.4	16737.63	8019.22	31352.58
1526.75	4697.86	16785.69	7861.51	27142.47
1473.99	4614.76	16569.09	7638.17	23984.14
1455.27	4471.65	17248.89	7584.14	23184.94
1503.35	4305.23	18138.36	8007.32	21772.73
1530.62	4433.57	17875.75	7883.04	20634.47
1482.37	4388.53	17400.41	7408.87	20318.98
1420.86	4140.3	17287.65	6917.03	19800.93
1406.82	4144.38	17604.12	6715.44	19651.51
1438.24	4070.78	17383.42	6789.11	20106.42
1418.3	3906.01	17225.83	6596.92	19964.72
1400.63	3795.91	16274.33	6309.19	18960.48
1377.94	3703.32	16399.39	6268.92	18324.35
1335.85	3675.8	16127.58	6004.33	17543.05
1303.82	3911.06	16140.76	5859.57	17392.27
1276.66	3912.28	15456.81	5681.97	16971.34
1270.2	3839.25	15505.18	5683.31	16267.62
1270.09	3744.63	15467.33	5692.86	15857.89
1310.61	3549.25	16906.23	6009.89	16661.3
1294.87	3394.14	17059.66	5970.08	15805.04
1280.66	3264.26	16205.43	5796.04	15918.48
1280.08	3328.8	16649.82	5674.15	15753.14
1248.29	3223.98	16111.43	5408.26	14876.43
1249.48	3228.01	14872.15	5193.4	14937.14
1207.01	3112.83	13606.5	4929.07	14386.37
1228.81	3051.67	13574.3	5044.12	15428.52
1220.33	3039.71	12413.6	4829.69	14903.55
1234.18	3125.67	11899.6	4886.5	14880.98
1191.33	3106.54	11584.01	4586.28	14201.06
1191.5		11276.59	4460.63	13867.07
1156.85		11008.9	4184.84	13908.97
1180.59		11668.95	4348.77	13516.88
1203.6		11740.6	4350.49	14195.35
1181.27		11387.59	4254.85	13721.69




Summary of computational transaction
Raw Input: view raw input (R code)
Raw Output: view raw output of R engine
Computing time: 5 seconds
R Server: 'RServer@AstonUniversity' @ vre.aston.ac.uk
R Framework error message: Warning: there are blank lines in the 'Data X' field.
Please, use NA for missing data - blank lines are simply deleted and are NOT treated as missing values.

\begin{tabular}{lllllllll}
\hline
Summary of computational transaction \tabularnewline
Raw Input & view raw input (R code)  \tabularnewline
Raw Output & view raw output of R engine  \tabularnewline
Computing time & 5 seconds \tabularnewline
R Server & 'RServer@AstonUniversity' @ vre.aston.ac.uk \tabularnewline
R Framework error message & 
Warning: there are blank lines in the 'Data X' field.
Please, use NA for missing data - blank lines are simply
 deleted and are NOT treated as missing values.
\tabularnewline \hline \end{tabular} %Source: https://freestatistics.org/blog/index.php?pk=115297&T=0

[TABLE]
[ROW][C]Summary of computational transaction[/C][/ROW]
[ROW][C]Raw Input[/C][C]view raw input (R code) [/C][/ROW]
[ROW][C]Raw Output[/C][C]view raw output of R engine [/C][/ROW]
[ROW][C]Computing time[/C][C]5 seconds[/C][/ROW]
[ROW][C]R Server[/C][C]'RServer@AstonUniversity' @ vre.aston.ac.uk[/C][/ROW]
[ROW][C]R Framework error message[/C][C]
Warning: there are blank lines in the 'Data X' field.
Please, use NA for missing data - blank lines are simply
 deleted and are NOT treated as missing values.
[/C][/ROW] [/TABLE] Source: https://freestatistics.org/blog/index.php?pk=115297&T=0

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=115297&T=0
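A note on the warning above: in base R, read.table() skips entirely blank lines by default (blank.lines.skip = TRUE), whereas an empty field inside a row of a numeric column is read as a missing value. The last five rows of the 'Dataseries X' block (which have an empty second field) show the difference; a minimal check, assuming tab-separated input like the block above:

# two rows copied from the data above: the empty second field becomes NA, fully blank lines are skipped
txt <- "1191.5\t\t11276.59\t4460.63\t13867.07
1156.85\t\t11008.9\t4184.84\t13908.97"
read.table(text = txt, sep = "\t")   # column V2 is NA in both rows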




10-Fold Cross Validation
                 Prediction (training)                    Prediction (testing)
Actual     C1    C2    C3    C4    C5      CV       C1    C2    C3    C4    C5      CV
C1        121    13     1     0     0  0.8963       11     3     1     0     0  0.7333
C2         63    53     7     1     0  0.4274        9     4     3     0     0  0.25
C3         12    15    89    20     0  0.6544        0     1     6     7     0  0.4286
C4          0     0    29    57    46  0.4318        0     0     1     5     2  0.625
C5         10     7     0     1   100  0.8475        0     1     2     1    18  0.8182
Overall     -     -     -     -     -  0.6512        -     -     -     -     -  0.5867
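The CV column in each half of the table is the per-class hit rate computed in the R code below (m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]) for training, and likewise for testing): for class C1 in the training half, 121 / (121 + 13 + 1 + 0 + 0) = 121/135 ≈ 0.8963. The Overall row is the pooled accuracy, i.e. the sum of the diagonal divided by the grand total:

\[ \mathrm{CV}_i = \frac{n_{ii}}{\sum_j n_{ij}}, \qquad \mathrm{Overall} = \frac{\sum_i n_{ii}}{\sum_{i,j} n_{ij}} = \frac{121+53+89+57+100}{645} \approx 0.6512 \ \text{(training)}. \]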

\begin{tabular}{lllllllll}
\hline
10-Fold Cross Validation \tabularnewline
 & Prediction (training) & Prediction (testing) \tabularnewline
Actual & C1 & C2 & C3 & C4 & C5 & CV & C1 & C2 & C3 & C4 & C5 & CV \tabularnewline
C1 & 121 & 13 & 1 & 0 & 0 & 0.8963 & 11 & 3 & 1 & 0 & 0 & 0.7333 \tabularnewline
C2 & 63 & 53 & 7 & 1 & 0 & 0.4274 & 9 & 4 & 3 & 0 & 0 & 0.25 \tabularnewline
C3 & 12 & 15 & 89 & 20 & 0 & 0.6544 & 0 & 1 & 6 & 7 & 0 & 0.4286 \tabularnewline
C4 & 0 & 0 & 29 & 57 & 46 & 0.4318 & 0 & 0 & 1 & 5 & 2 & 0.625 \tabularnewline
C5 & 10 & 7 & 0 & 1 & 100 & 0.8475 & 0 & 1 & 2 & 1 & 18 & 0.8182 \tabularnewline
Overall & - & - & - & - & - & 0.6512 & - & - & - & - & - & 0.5867 \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=115297&T=1

[TABLE]
[ROW][C]10-Fold Cross Validation[/C][/ROW]
[ROW][C][/C][C]Prediction (training)[/C][C]Prediction (testing)[/C][/ROW]
[ROW][C]Actual[/C][C]C1[/C][C]C2[/C][C]C3[/C][C]C4[/C][C]C5[/C][C]CV[/C][C]C1[/C][C]C2[/C][C]C3[/C][C]C4[/C][C]C5[/C][C]CV[/C][/ROW]
[ROW][C]C1[/C][C]121[/C][C]13[/C][C]1[/C][C]0[/C][C]0[/C][C]0.8963[/C][C]11[/C][C]3[/C][C]1[/C][C]0[/C][C]0[/C][C]0.7333[/C][/ROW]
[ROW][C]C2[/C][C]63[/C][C]53[/C][C]7[/C][C]1[/C][C]0[/C][C]0.4274[/C][C]9[/C][C]4[/C][C]3[/C][C]0[/C][C]0[/C][C]0.25[/C][/ROW]
[ROW][C]C3[/C][C]12[/C][C]15[/C][C]89[/C][C]20[/C][C]0[/C][C]0.6544[/C][C]0[/C][C]1[/C][C]6[/C][C]7[/C][C]0[/C][C]0.4286[/C][/ROW]
[ROW][C]C4[/C][C]0[/C][C]0[/C][C]29[/C][C]57[/C][C]46[/C][C]0.4318[/C][C]0[/C][C]0[/C][C]1[/C][C]5[/C][C]2[/C][C]0.625[/C][/ROW]
[ROW][C]C5[/C][C]10[/C][C]7[/C][C]0[/C][C]1[/C][C]100[/C][C]0.8475[/C][C]0[/C][C]1[/C][C]2[/C][C]1[/C][C]18[/C][C]0.8182[/C][/ROW]
[ROW][C]Overall[/C][C]-[/C][C]-[/C][C]-[/C][C]-[/C][C]-[/C][C]0.6512[/C][C]-[/C][C]-[/C][C]-[/C][C]-[/C][C]-[/C][C]0.5867[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=115297&T=1

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=115297&T=1




Confusion Matrix (predicted in columns / actuals in rows)
         C1    C2    C3    C4    C5
C1       11     4     0     0     0
C2        3    11     0     0     0
C3        2     1    10     2     0
C4        0     0     0    12     2
C5        1     1     0     2    10
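This matrix cross-tabulates the final tree fitted to the full data set (myt in the R code below): rows are the actual classes, columns the predicted classes. For reference, the share of observations on the diagonal is (11 + 11 + 10 + 12 + 10) / 72 = 54/72 ≈ 0.75, which can be recovered in R with sum(diag(myt)) / sum(myt).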

\begin{tabular}{lllllllll}
\hline
Confusion Matrix (predicted in columns / actuals in rows) \tabularnewline
 & C1 & C2 & C3 & C4 & C5 \tabularnewline
C1 & 11 & 4 & 0 & 0 & 0 \tabularnewline
C2 & 3 & 11 & 0 & 0 & 0 \tabularnewline
C3 & 2 & 1 & 10 & 2 & 0 \tabularnewline
C4 & 0 & 0 & 0 & 12 & 2 \tabularnewline
C5 & 1 & 1 & 0 & 2 & 10 \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=115297&T=2

[TABLE]
[ROW][C]Confusion Matrix (predicted in columns / actuals in rows)[/C][/ROW]
[ROW][C][/C][C]C1[/C][C]C2[/C][C]C3[/C][C]C4[/C][C]C5[/C][/ROW]
[ROW][C]C1[/C][C]11[/C][C]4[/C][C]0[/C][C]0[/C][C]0[/C][/ROW]
[ROW][C]C2[/C][C]3[/C][C]11[/C][C]0[/C][C]0[/C][C]0[/C][/ROW]
[ROW][C]C3[/C][C]2[/C][C]1[/C][C]10[/C][C]2[/C][C]0[/C][/ROW]
[ROW][C]C4[/C][C]0[/C][C]0[/C][C]0[/C][C]12[/C][C]2[/C][/ROW]
[ROW][C]C5[/C][C]1[/C][C]1[/C][C]0[/C][C]2[/C][C]10[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=115297&T=2

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=115297&T=2




Parameters (Session):
par1 = 1 ; par2 = quantiles ; par3 = 5 ; par4 = yes ;
Parameters (R input):
par1 = 1 ; par2 = quantiles ; par3 = 5 ; par4 = yes ;
R code (references can be found in the software module):
library(party)
library(Hmisc)
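# Inputs supplied by the calling framework (see 'Parameters (R input)' above); they are not defined in this listing:
#   y    : data matrix built from the 'Dataseries X' field; t(y) has one observation per row
#   par1 : column number of the response variable (1 in this run)
#   par2 : discretization method for the response: 'none', 'kmeans', 'quantiles', 'hclust' or 'equal' ('quantiles' here)
#   par3 : number of classes/clusters (5 here)
#   par4 : 'yes' to run the cross-validation block ('yes' here)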
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
x <- data.frame(t(y))
is.data.frame(x)
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]
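# Discretize the response (column par1) into par3 classes; the branch taken depends on par2.
# 'kmeans': k-means clustering on the response, with clusters relabelled C1..Cpar3 in order of their centres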
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
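# 'quantiles': bin the response into par3 groups of roughly equal size with Hmisc::cut2 (the method used in this run)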
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
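# 'hclust': centroid-linkage hierarchical clustering on the squared distances of the response, cut into par3 groups relabelled by group mean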
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
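# 'equal': split the range of the response into par3 equally wide intervals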
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
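# par2 == 'none': fit a conditional inference regression tree (party::ctree) on the untransformed numeric response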
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
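# 'createtable' restores the table.start/table.row.*/table.element/table.save helpers used to build the output tables below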
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
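# cross-validation loop: 10 repetitions, each drawing a random ~90%/10% train/test split
# (sample() with prob = c(0.9, 0.1)), refitting the tree on the training part and pooling the predictions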
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=T, prob=c(0.9,0.1))
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
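# pooled confusion tables and per-class / overall hit rates, for the training (i) and testing (x) predictions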
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
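# plot the fitted tree, and the response broken down by terminal node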
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
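# diagnostic plots: density and actual-vs-predicted plots for the regression case, a mosaic plot of the confusion matrix otherwise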
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
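# regression case ('none'): goodness-of-fit summary (correlation, R-squared, RMSE) plus the full actuals/forecasts/residuals table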
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
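# classification case: write out the confusion matrix shown above (rows = actual class, columns = predicted class)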
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
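To rerun the core of this analysis outside the FreeStatistics.org framework (i.e. without the table.* helpers and the bitmap devices), the following minimal sketch can be used. It assumes the 'Dataseries X' block above has been saved to a tab-separated file, hypothetically named dataseries_x.txt, and mirrors the parameters of this run (par1 = 1, par2 = 'quantiles', par3 = 5); it is an illustration, not the module itself.

library(party)   # ctree()
library(Hmisc)   # cut2()

# read the five-column data block; empty fields become NA ('dataseries_x.txt' is a placeholder name)
dat <- read.table('dataseries_x.txt', sep = '\t', na.strings = c('', 'NA'))

# drop rows with a missing response (column 1) and bin the response into 5 quantile classes
dat <- dat[!is.na(dat[, 1]), ]
dat[, 1] <- cut2(dat[, 1], g = 5)   # factor levels are the cut2 intervals, shown as C1..C5 in the tables above

# conditional inference classification tree of the binned response on the remaining columns
m <- ctree(V1 ~ ., data = dat)
print(m)
plot(m)

# confusion matrix: actual classes in rows, predicted classes in columns
print(table(dat[, 1], predict(m)))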