Free Statistics ... of Irreproducible Research!

Author's title:
Author: *The author of this computation has been verified*
R Software Module: rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Sun, 12 Dec 2010 13:35:04 +0000
Cite this page as follows: Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2010/Dec/12/t12921608167wrhugvu3sbivud.htm/, Retrieved Tue, 07 May 2024 10:05:36 +0000
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=108436, Retrieved Tue, 07 May 2024 10:05:36 +0000
Original text written by user:
IsPrivate? No (this computation is public)
User-defined keywords:
Estimated Impact: 120
Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)
-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 18:59:57] [b98453cac15ba1066b407e146608df68]
F   PD    [Recursive Partitioning (Regression Trees)] [workshop 10e] [2010-12-12 13:35:04] [3f56c8f677e988de577e4e00a8180a48] [Current]
Feedback Forum
2010-12-19 14:24:50 [48eb36e2c01435ad7e4ea7854a9d98fe]
The student divides the data into two groups here; he or she could perhaps have mentioned that categorization is being used, so that readers know clearly what to expect.

The student correctly states that 'mistakes' is now the only explanatory variable. However, even though he or she notes that a so-called "confusion matrix" is included, it is never discussed.

It can be interpreted as follows. Category 1 contains (68 + 12 =) 80 values, of which 68 were predicted correctly. Category 2 contains (36 + 34 =) 70 values, of which 34 were predicted correctly. For category 1 a fairly accurate prediction can therefore be made, whereas for category 2 the chance of a correct prediction is very small.
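
That reading can be checked directly in R from the counts in the confusion matrix reported further below (actuals in rows, predictions in columns); a minimal sketch:

cm <- matrix(c(68, 12,
               36, 34), nrow = 2, byrow = TRUE,
             dimnames = list(actual = c('C1', 'C2'), predicted = c('C1', 'C2')))
diag(cm) / rowSums(cm)   # per-class accuracy: 68/80 = 0.85 and 34/70 = 0.486
sum(diag(cm)) / sum(cm)  # overall accuracy: (68 + 34) / 150 = 0.68

So about 85% of category 1 but only about 49% of category 2 is classified correctly, for an overall accuracy of 68%.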

Dataseries X:
11	6	6	4	15	16	2	40	37	15	10	77
26	16	5	4	23	24	1	29	31	9	20	63
26	13	20	10	26	22	1	37	35	12	16	73
15	7	12	6	19	21	1	32	36	15	10	76
10	10	11	5	19	23	1	39	32	17	8	90
21	10	12	8	16	23	1	32	30	14	14	67
27	15	11	9	23	21	2	35	34	9	19	69
21	9	9	9	22	20	2	35	34	12	15	70
21	12	13	8	19	22	1	28	22	11	23	54
21	8	9	11	24	20	1	37	27	13	9	54
22	9	14	6	19	12	2	32	27	16	12	76
29	10	12	8	25	23	2	34	33	16	14	75
29	15	18	11	23	23	2	37	38	15	13	76
29	11	9	5	31	30	1	35	37	10	11	80
30	12	15	10	29	22	2	40	31	16	11	89
19	9	12	7	18	21	1	37	36	12	10	73
19	10	12	7	17	21	1	37	38	15	12	74
22	13	12	13	22	15	2	33	31	13	18	78
18	8	15	10	21	22	2	37	34	18	12	76
28	14	11	8	24	24	1	35	33	13	10	69
17	9	13	6	22	23	2	36	38	17	15	74
18	12	10	8	16	15	2	32	28	14	15	82
20	8	17	7	22	24	1	38	34	13	12	77
16	8	13	5	21	24	2	34	32	13	9	84
17	9	17	9	25	21	2	33	34	15	11	75
25	14	15	11	22	21	2	33	39	15	16	79
22	11	13	11	24	18	2	42	37	13	17	79
34	16	18	11	21	20	2	33	34	14	12	69
31	9	17	9	25	19	2	32	41	13	11	88
38	11	21	7	29	29	2	32	32	16	13	57
18	13	12	6	19	20	2	33	35	14	9	69
25	12	12	7	29	23	1	35	33	12	14	52
20	9	15	6	25	24	2	39	32	18	11	86
23	14	8	5	19	27	1	28	32	9	20	66
12	4	15	4	27	28	1	38	32	16	8	54
20	8	16	10	25	24	2	36	37	16	12	85
15	14	9	8	23	29	1	38	31	17	10	79
21	10	13	6	24	24	1	34	27	13	11	84
21	13	17	11	23	22	2	33	31	15	11	73
20	10	11	4	25	25	2	37	37	17	13	70
30	14	9	9	23	14	2	34	31	15	13	54
22	13	15	10	22	22	2	34	40	14	13	70
33	14	9	6	32	24	1	36	35	10	15	54
25	14	15	9	22	24	1	31	35	13	12	69
20	14	14	10	18	24	2	37	35	11	13	68
10	5	8	6	19	24	1	36	35	16	11	76
15	11	11	8	23	22	1	34	38	16	9	71
21	9	14	13	19	21	2	30	35	11	14	66
16	9	12	8	16	21	2	29	34	15	9	67
23	10	15	10	23	21	2	35	37	15	9	71
25	14	11	5	17	15	2	33	37	12	15	54
18	6	11	8	17	26	2	29	31	17	10	76
33	11	9	6	28	22	1	28	31	15	13	77
18	13	8	9	24	24	1	32	33	16	8	71
18	12	13	9	21	13	2	33	37	14	15	69
13	8	12	7	14	19	2	31	36	17	13	73
24	14	24	20	21	10	2	43	42	10	24	46
19	11	11	8	20	28	1	32	28	11	11	66
20	11	11	8	25	25	2	35	41	15	13	77
21	11	16	7	20	24	1	31	23	15	12	77
18	16	12	7	17	22	2	33	33	7	22	70
29	14	18	10	26	30	1	39	32	17	11	86
13	16	12	5	17	22	1	32	33	14	15	38
26	14	14	8	17	24	1	32	33	18	7	66
22	9	16	9	24	23	1	36	32	14	14	75
28	8	24	20	30	20	1	39	38	14	10	64
28	11	13	6	25	22	2	41	32	9	9	80
23	8	11	10	15	22	2	30	35	14	12	86
22	14	14	11	25	19	2	30	35	11	16	54
28	8	16	12	18	24	2	32	34	15	10	54
28	8	12	7	20	22	2	39	34	16	13	74
31	10	21	12	32	26	2	38	38	17	11	88
15	8	11	8	14	12	2	38	39	16	12	85
15	8	6	6	20	25	1	32	32	12	11	63
24	10	9	6	25	29	2	34	39	15	13	81
22	9	14	9	25	23	2	36	35	15	10	74
17	9	16	5	25	23	2	39	36	16	11	80
25	7	18	11	35	17	2	31	28	16	9	80
32	16	9	6	29	26	1	36	36	11	13	60
23	14	13	10	25	27	2	34	38	12	14	62
20	11	17	8	21	23	1	34	35	14	14	63
20	9	11	7	21	20	2	38	39	15	11	89
28	16	16	8	24	24	2	38	36	17	10	76
20	7	11	9	26	22	2	33	36	19	11	81
20	11	11	8	24	26	2	32	34	15	12	72
23	14	11	10	20	29	1	30	34	16	14	84
20	11	20	13	24	20	2	31	27	14	14	76
21	8	10	7	18	17	2	34	37	16	21	76
14	11	12	7	17	16	2	35	33	15	13	72
31	8	11	8	22	24	1	37	34	17	11	81
21	12	14	9	22	24	2	35	39	12	12	72
18	8	12	9	22	19	2	35	29	18	12	78
26	13	12	8	24	29	2	31	33	13	11	79
25	8	12	7	32	25	2	31	35	14	14	52
9	13	10	6	19	25	1	38	36	14	13	67
18	9	12	8	21	24	1	34	30	14	13	74
19	12	10	8	23	29	1	30	27	12	12	73
29	11	7	4	26	22	2	32	37	14	14	69
31	14	10	8	18	23	1	31	33	12	12	67
24	9	13	10	19	15	2	37	32	15	12	76
16	10	12	7	22	29	2	34	35	11	12	77
19	9	13	8	27	21	1	32	33	11	18	63
19	9	9	7	21	23	2	34	37	15	11	84
22	8	14	10	20	20	2	38	36	14	15	90
31	16	14	9	21	25	1	38	39	15	13	75
20	10	12	8	20	28	2	38	35	16	11	76
26	11	18	5	29	18	2	39	31	14	22	53
17	6	17	8	30	25	2	33	37	18	10	87
16	9	12	9	10	13	2	34	36	13	16	69
16	8	15	9	23	24	2	35	31	14	11	78
9	6	8	11	29	23	2	36	32	13	15	54
19	20	8	7	19	25	1	32	33	14	14	58
22	10	12	8	26	27	2	34	36	14	11	80
15	8	10	4	22	24	2	44	39	17	10	74
25	16	18	16	26	24	2	37	39	12	14	56
30	9	15	9	27	26	2	32	29	16	14	82
30	12	16	10	19	18	2	35	34	15	11	64
24	14	11	12	24	26	1	38	35	10	15	67
20	10	10	8	26	23	1	38	32	13	11	75
12	7	7	4	22	28	1	38	41	15	10	69
31	14	17	11	23	20	2	32	38	16	10	72
25	11	7	8	25	23	2	39	38	14	12	54
23	13	14	12	19	24	1	27	32	13	15	54
23	10	12	8	20	21	2	37	31	17	10	71
26	9	15	6	25	25	2	41	38	14	12	53
14	15	13	8	14	16	2	31	38	16	15	54
18	12	10	8	19	23	1	36	33	15	12	71
28	12	16	14	27	22	2	38	28	12	11	69
19	9	11	10	21	27	1	37	38	16	10	30
21	15	7	5	21	24	1	30	28	8	20	53
18	10	15	8	14	17	1	40	32	9	19	68
29	13	18	12	21	21	2	34	31	13	17	69
16	11	11	11	23	21	2	36	34	19	8	54
22	10	13	8	18	19	2	36	35	11	17	66
15	12	11	8	20	25	1	33	36	15	11	79
21	9	13	9	19	24	1	34	33	11	13	67
17	14	12	6	15	21	1	37	32	15	9	74
17	9	11	5	23	26	1	37	32	16	10	86
33	14	11	8	26	25	2	39	40	15	13	63
17	11	13	7	21	25	2	37	35	12	16	69
20	11	8	4	13	13	1	37	33	16	12	73
17	9	12	9	24	25	1	35	37	15	14	69
16	11	9	5	17	23	1	32	33	13	11	71
18	10	14	9	21	26	2	33	31	14	13	77
32	12	18	12	28	22	2	31	33	11	15	74
22	10	15	6	22	20	2	30	34	15	14	82
19	6	17	8	25	14	2	32	35	12	18	84
29	16	11	6	27	24	2	33	40	14	14	54
23	14	17	7	25	21	2	29	30	13	10	80
17	8	12	9	21	24	2	37	38	15	8	76
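
To rerun the analysis outside the FreeStatistics.org engine, the 150 rows above first have to be loaded into the object y that the R code further below expects; since that code starts with x <- data.frame(t(y)), y must hold the variables in rows. A minimal sketch, assuming the block above is saved as a whitespace-delimited text file (the file name is hypothetical):

dat <- read.table('dataseries.txt', header = FALSE)          # 150 observations, 12 unnamed columns
y <- t(as.matrix(dat))                                       # variables in rows, observations in columns
par1 <- '5'; par2 <- 'quantiles'; par3 <- '2'; par4 <- 'no'  # the 'R input' parameters listed below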




Summary of computational transaction
Raw Input: view raw input (R code)
Raw Output: view raw output of R engine
Computing time: 6 seconds
R Server: 'Sir Ronald Aylmer Fisher' @ 193.190.124.24

Source: https://freestatistics.org/blog/index.php?pk=108436&T=0

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=108436&T=0









Confusion Matrix (predicted in columns / actuals in rows)
     C1   C2
C1   68   12
C2   36   34

Source: https://freestatistics.org/blog/index.php?pk=108436&T=1

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=108436&T=1





Parameters (Session):
par1 = 5 ; par2 = quantiles ; par3 = 3 ; par4 = no ;
Parameters (R input):
par1 = 5 ; par2 = quantiles ; par3 = 2 ; par4 = no ;
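
With the parameters actually passed to R (par1 = 5, par2 = quantiles, par3 = 2), the dependent variable in column 5 is first cut into two roughly equal-sized quantile groups with Hmisc::cut2 and only then fed to the classification tree. A small illustration of that step on the first ten values of column 5 above (it shows the mechanism only; the archived cut point is based on all 150 values):

library(Hmisc)                 # provides cut2()
v <- c(15, 23, 26, 19, 19, 16, 23, 22, 19, 24)
cut2(v, g = 2)                 # two quantile groups, split near the median
table(cut2(v, g = 2))          # group sizes are as equal as possible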
R code (references can be found in the software module):
library(party)
library(Hmisc)
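# par1: column index of the dependent variable; par3: number of groups/classes.
# y holds the data series supplied above; t(y) turns it into a data frame x with
# one row per observation, after which rows with a missing response are dropped.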
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
x <- data.frame(t(y))
is.data.frame(x)
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]
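# par2 == 'kmeans': discretize the response by k-means into par3 clusters,
# relabelled C1..Cpar3 in order of increasing cluster center.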
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
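# par2 == 'quantiles': cut the response into par3 quantile groups of roughly
# equal size (the option used in this computation, with par3 = 2).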
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
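# par2 == 'hclust': centroid hierarchical clustering on squared distances,
# cut into par3 groups and relabelled C1..Cpar3 by increasing group mean.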
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
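# par2 == 'equal': cut the range of the response into par3 intervals of equal width.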
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
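# par2 == 'none': keep the numeric response and grow a regression tree
# (conditional inference tree, party::ctree).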
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
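# load(file='createtable') pulls in the table.start/table.row.start/table.element
# helpers used to write the output tables. For a categorized response a
# classification tree is grown; with par4 == 'yes' the module also reports ten
# rounds of random 90/10 train/test splits (labelled '10-Fold Cross Validation').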
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
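# each round: draw a random ~90/10 split, fit the tree on the training part and
# accumulate predictions for both the training rows and the held-out rows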
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=T, prob=c(0.9,0.1))
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
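# print the fitted tree, plot it, and plot the response against the terminal node
# to which each observation is assigned (where(m))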
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
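# diagnostics: density and scatter plots of actuals, predictions and residuals in the
# regression case, or a mosaic plot of the confusion matrix in the classification case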
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
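# regression case: goodness-of-fit table (correlation, R-squared, RMSE) plus a
# table of actuals, predictions and residuals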
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
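# classification case: write the confusion matrix (predicted in columns, actuals in rows)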
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}