Free Statistics

of Irreproducible Research!

Author's title

Author*The author of this computation has been verified*
R Software Modulerwasp_regression_trees1.wasp
Title produced by softwareRecursive Partitioning (Regression Trees)
Date of computationTue, 14 Dec 2010 17:44:43 +0000
Cite this page as followsStatistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2010/Dec/14/t1292348662vxqvrerpd7m645o.htm/, Retrieved Thu, 02 May 2024 19:28:29 +0000
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=109946, Retrieved Thu, 02 May 2024 19:28:29 +0000
QR Codes:

Original text written by user:
IsPrivate?No (this computation is public)
User-defined keywords
Estimated Impact122
Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)
-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 18:59:57] [b98453cac15ba1066b407e146608df68]
F   PD  [Recursive Partitioning (Regression Trees)] [] [2010-12-14 14:48:39] [de55ccbf69577500a5f46ed42a101114]
-   P     [Recursive Partitioning (Regression Trees)] [] [2010-12-14 15:13:52] [de55ccbf69577500a5f46ed42a101114]
F   P         [Recursive Partitioning (Regression Trees)] [] [2010-12-14 17:44:43] [6b31f806e9ccc1f74a26091056f791cb] [Current]
Feedback Forum
2010-12-19 08:46:15 [48eb36e2c01435ad7e4ea7854a9d98fe] [reply
De student stelt hier een boomstructuur met categorisatie en de interpretatie ervan is correct.

We zien echter ook de zogenaamde 'confusion matrix', maar deze werd niet behandeld door de student. Een interpretatie kan er als volgt uitzien: In categorie 1 zitten er (89 + 33 =) 122 waarden. Hiervan werden er 89 correct voorspeld. In categorie 2 zitten er (10 + 108 = ) 118 waarden. Hiervan werden er 108 correct voorspeld. Hieruit kunnen we afleiden dat vooral voor de waarden in de tweede categorie, de voorspellingen zeer accuraat kunnen gebeuren, maar ook wat betreft categorie 1 is de voorspellingsgraad vrij hoog.

Post a new message
Dataseries X:
8	350	165	3693	11.5
8	318	150	3436	11
8	302	140	3449	10.5
8	429	198	4341	10
8	440	215	4312	8.5
8	455	225	4425	10
8	383	170	3563	10
8	340	160	3609	8
8	455	225	3086	10
4	113	95	2372	15
6	199	97	2774	15.5
4	97	46	1835	20.5
4	110	87	2672	17.5
4	104	95	2375	17.5
4	121	113	2234	12.5
8	360	215	4615	14
8	307	200	4376	15
8	304	193	4732	18.5
4	97	88	2130	14.5
4	113	95	2228	14
6	250	100	3329	15.5
6	232	100	3288	15.5
8	350	165	4209	12
8	318	150	4096	13
8	400	170	4746	12
8	400	175	5140	12
4	140	72	2408	19
6	250	100	3282	15
4	122	86	2220	14
4	116	90	2123	14
4	88	76	2065	14.5
4	71	65	1773	19
4	97	60	1834	19
4	91	70	1955	20.5
4	97.5	80	2126	17
4	122	86	2226	16.5
8	350	165	4274	12
8	318	150	4135	13.5
8	351	153	4129	13
8	429	208	4633	11
8	350	155	4502	13.5
8	400	190	4422	12.5
3	70	97	2330	13.5
8	307	130	4098	14
8	302	140	4294	16
4	121	112	2933	14.5
4	121	76	2511	18
4	122	86	2395	16
4	120	97	2506	14.5
4	98	80	2164	15
8	350	175	4100	13
8	304	150	3672	11.5
8	302	137	4042	14.5
8	318	150	3777	12.5
8	400	150	4464	12
8	351	158	4363	13
8	440	215	4735	11
8	455	225	4951	11
6	225	105	3121	16.5
6	250	100	3278	18
6	250	88	3021	16.5
6	198	95	2904	16
8	400	150	4997	14
8	350	180	4499	12.5
6	232	100	2789	15
4	140	72	2401	19.5
4	108	94	2379	16.5
4	122	85	2310	18.5
6	155	107	2472	14
8	350	145	4082	13
8	400	230	4278	9.5
4	116	75	2158	15.5
4	114	91	2582	14
8	318	150	3399	11
4	121	110	2660	14
8	350	180	3664	11
6	198	95	3102	16.5
6	232	100	2901	16
4	122	80	2451	16.5
4	71	65	1836	21
6	250	100	3781	17
6	258	110	3632	18
8	302	140	4141	14
8	350	150	4699	14.5
8	302	140	4638	16
8	304	150	4257	15.5
4	79	67	1963	15.5
4	97	78	2300	14.5
4	83	61	2003	19
4	90	75	2125	14.5
4	116	75	2246	14
4	120	97	2489	15
4	79	67	2000	16
6	225	95	3264	16
6	250	72	3158	19.5
8	400	170	4668	11.5
8	350	145	4440	14
8	351	148	4657	13.5
6	231	110	3907	21
6	258	110	3730	19
6	225	95	3785	19
8	262	110	3221	13.5
8	302	129	3169	12
4	140	83	2639	17
6	232	100	2914	16
4	134	96	2702	13.5
4	90	71	2223	16.5
6	171	97	2984	14.5
4	115	95	2694	15
4	120	88	2957	17
4	121	115	2671	13.5
4	91	53	1795	17.5
4	116	81	2220	16.9
4	140	92	2572	14.9
4	101	83	2202	15.3
8	305	140	4215	13
8	304	120	3962	13.9
8	351	152	4215	12.8
6	250	105	3353	14.5
6	200	81	3012	17.6
4	85	52	2035	22.2
4	98	60	2164	22.1
6	225	100	3651	17.7
6	250	110	3645	16.2
6	258	95	3193	17.8
4	85	70	1990	17
4	97	75	2155	16.4
4	130	102	3150	15.7
8	318	150	3940	13.2
6	168	120	3820	16.7
8	350	180	4380	12.1
8	302	130	3870	15
8	318	150	3755	14
4	111	80	2155	14.8
4	79	58	1825	18.6
4	85	70	1945	16.8
8	305	145	3880	12.5
8	318	145	4140	13.7
6	231	105	3425	16.9
6	225	100	3630	17.7
8	400	180	4220	11.1
8	350	170	4165	11.4
8	351	149	4335	14.5
4	97	78	1940	14.5
4	97	75	2265	18.2
4	140	89	2755	15.8
4	98	83	2075	15.9
4	97	67	1985	16.4
6	146	97	2815	14.5
4	121	110	2600	12.8
4	90	48	1985	21.5
4	98	66	1800	14.4
4	85	70	2070	18.6
8	318	140	3735	13.2
8	302	139	3570	12.8
6	200	95	3155	18.2
6	200	85	2965	15.8
6	225	100	3430	17.2
6	232	90	3210	17.2
6	200	85	3070	16.7
6	225	110	3620	18.7
8	305	145	3425	13.2
6	231	165	3445	13.4
8	318	140	4080	13.7
4	98	68	2155	16.5
4	119	97	2300	14.7
4	105	75	2230	14.5
4	151	85	2855	17.6
5	131	103	2830	15.9
6	163	125	3140	13.6
6	163	133	3410	15.8
4	89	71	1990	14.9
4	98	68	2135	16.6
6	200	85	2990	18.2
4	140	88	2890	17.3
6	225	110	3360	16.6
8	305	130	3840	15.4
8	351	138	3955	13.2
8	318	135	3830	15.2
8	351	142	4054	14.3
8	267	125	3605	15
4	89	71	1925	14
4	86	65	1975	15.2
4	121	80	2670	15
4	141	71	3190	24.8
8	260	90	3420	22.2
4	105	70	2150	14.9
4	85	65	2020	19.2
4	151	90	2670	16
6	173	115	2595	11.3
4	151	90	2556	13.2
4	98	76	2144	14.7
4	98	70	2120	15.5
4	86	65	2019	16.4
4	140	88	2870	18.1
4	151	90	3003	20.1
4	97	78	2188	15.8
4	134	90	2711	15.5
4	119	92	2434	15
4	108	75	2265	15.2
4	156	105	2800	14.4
4	85	65	2110	19.2
5	121	67	2950	19.9
4	91	67	1850	13.8
4	89	62	1845	15.3
4	122	88	2500	15.1
4	135	84	2490	15.7
4	151	84	2635	16.4
6	173	110	2725	12.6
4	135	84	2385	12.9
4	86	64	1875	16.4
4	81	60	1760	16.1
4	85	65	1975	19.4
4	89	62	2050	17.3
4	105	63	2215	14.9
4	98	65	2045	16.2
4	105	74	2190	14.2
4	119	100	2615	14.8
4	141	80	3230	20.4
6	146	120	2930	13.8
6	231	110	3415	15.8
6	200	88	3060	17.1
6	225	85	3465	16.6
4	112	88	2640	18.6
4	112	88	2395	18
4	135	84	2525	16
4	151	90	2735	18
4	105	74	1980	15.3
4	91	68	1970	17.6
4	105	63	2125	14.7
4	120	88	2160	14.5
4	107	75	2205	14.5
4	91	67	1965	15.7
6	181	110	2945	16.4
6	262	85	3015	17
4	144	96	2665	13.9
4	151	90	2950	17.3
4	140	86	2790	15.6
4	135	84	2295	11.6
4	120	79	2625	18.6




Summary of computational transaction
Raw Inputview raw input (R code)
Raw Outputview raw output of R engine
Computing time5 seconds
R Server'Sir Ronald Aylmer Fisher' @ 193.190.124.24
R Framework error message
The field 'Names of X columns' contains a hard return which cannot be interpreted.
Please, resubmit your request without hard returns in the 'Names of X columns'.

\begin{tabular}{lllllllll}
\hline
Summary of computational transaction \tabularnewline
Raw Input & view raw input (R code)  \tabularnewline
Raw Output & view raw output of R engine  \tabularnewline
Computing time & 5 seconds \tabularnewline
R Server & 'Sir Ronald Aylmer Fisher' @ 193.190.124.24 \tabularnewline
R Framework error message & 
The field 'Names of X columns' contains a hard return which cannot be interpreted.
Please, resubmit your request without hard returns in the 'Names of X columns'.
\tabularnewline \hline \end{tabular} %Source: https://freestatistics.org/blog/index.php?pk=109946&T=0

[TABLE]
[ROW][C]Summary of computational transaction[/C][/ROW]
[ROW][C]Raw Input[/C][C]view raw input (R code) [/C][/ROW]
[ROW][C]Raw Output[/C][C]view raw output of R engine [/C][/ROW]
[ROW][C]Computing time[/C][C]5 seconds[/C][/ROW]
[ROW][C]R Server[/C][C]'Sir Ronald Aylmer Fisher' @ 193.190.124.24[/C][/ROW]
[ROW][C]R Framework error message[/C][C]
The field 'Names of X columns' contains a hard return which cannot be interpreted.
Please, resubmit your request without hard returns in the 'Names of X columns'.
[/C][/ROW] [/TABLE] Source: https://freestatistics.org/blog/index.php?pk=109946&T=0

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=109946&T=0

As an alternative you can also use a QR Code:  

The GUIDs for individual cells are displayed in the table below:

Summary of computational transaction
Raw Inputview raw input (R code)
Raw Outputview raw output of R engine
Computing time5 seconds
R Server'Sir Ronald Aylmer Fisher' @ 193.190.124.24
R Framework error message
The field 'Names of X columns' contains a hard return which cannot be interpreted.
Please, resubmit your request without hard returns in the 'Names of X columns'.







Confusion Matrix (predicted in columns / actuals in rows)
C1C2
C18933
C210108

\begin{tabular}{lllllllll}
\hline
Confusion Matrix (predicted in columns / actuals in rows) \tabularnewline
 & C1 & C2 \tabularnewline
C1 & 89 & 33 \tabularnewline
C2 & 10 & 108 \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=109946&T=1

[TABLE]
[ROW][C]Confusion Matrix (predicted in columns / actuals in rows)[/C][/ROW]
[ROW][C][/C][C]C1[/C][C]C2[/C][/ROW]
[ROW][C]C1[/C][C]89[/C][C]33[/C][/ROW]
[ROW][C]C2[/C][C]10[/C][C]108[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=109946&T=1

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=109946&T=1

As an alternative you can also use a QR Code:  

The GUIDs for individual cells are displayed in the table below:

Confusion Matrix (predicted in columns / actuals in rows)
C1C2
C18933
C210108



Parameters (Session):
par1 = 5 ; par2 = quantiles ; par3 = 2 ; par4 = no ;
Parameters (R input):
par1 = 5 ; par2 = quantiles ; par3 = 2 ; par4 = no ;
R code (references can be found in the software module):
library(party)
library(Hmisc)
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
x <- data.frame(t(y))
is.data.frame(x)
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=T, prob=c(0.9,0.1))
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}