Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*Unverified author*

R Software Module

rwasp_regression_trees.wasp

Title produced by software

Recursive Partitioning (Regression Trees)

Date of computation

Wed, 26 May 2010 10:48:45 +0000

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2010/May/26/t1274871010o6c0zjog8tlsfbm.htm/, Retrieved Fri, 03 May 2024 09:10:51 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=76449, Retrieved Fri, 03 May 2024 09:10:51 +0000

IsPrivate?

No (this computation is public)

User-defined keywords

B611,regression tree,steven,coomans,thesis,permaand

Estimated Impact

139

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-       [Recursive Partitioning (Regression Trees)] [B611,regression t...] [2010-05-26 10:48:45] [d41d8cd98f00b204e9800998ecf8427e] [Current]

Dataseries X:

10.65	NA	11.3447238275332	34.00466	0
34	10.65	34.0202876898303	34.00466	152
81.75	12.985	80.394008519981	34.00466	103
106.5	19.8615	104.461728571947	34.00466	98
0.525	28.52535	1.51024885747802	34.00466	24
24.025	25.725315	24.3449175147126	34.00466	24
5.25	25.5552835	6.10102813170674	34.00466	4
9	23.52475515	9.74297366894733	34.00466	0
12.8	22.072279635	13.4323962225627	34.00466	2
25.05	21.1450516715	25.3271480416495	34.00466	2
0.3	21.53554650435	1.29132499835047	34.00466	82
75.75	19.411991853915	74.5430992988187	34.00466	70
54.75	25.0457926685235	54.1692693680278	34.00466	76
1.526	28.0162134016712	2.48213334115502	34.00466	51
1.02	25.3671920615040	1.99046641401920	34.00466	NA
3.752	22.9324728553536	4.64256354643977	34.00466	0
17.25	21.0144255698183	17.7434661387314	34.00466	9
9.2	20.6379830128364	9.92884171115339	34.00466	12
50.25	19.4941847115528	49.7607775635948	34.00466	2
2.25	22.5697662403975	3.18356297838536	34.00466	51
3.95	20.5377896163578	4.83254096458096	34.00466	55
60	18.879010654722	59.2038839531516	34.00466	53
55.8	22.9911095892498	55.1395593527327	34.00466	17
6.75	26.2719986303248	7.55013353908689	34.00466	38
61.95	24.3197987672923	61.1045586779287	34.00466	29
7.025	28.0828188905631	7.817053763127	34.00466	32
85.75	25.9770370015068	84.1965045872622	34.00466	78
18.525	31.9543333013561	18.9797298623388	34.00466	26
6	30.6113999712205	6.8227878984193	34.00466	117
25.35	28.1502599740985	25.5966143213555	34.00466	29 
46.775	27.8702339766886	46.3822925793737	34.00466	5 
51.025	29.7607105790198	50.5102268038572	34.00466	45 
30	31.8871395211178	30.1128301008900	34.00466	13 
3	31.698425569006	3.91120762158714	34.00466	56 
30	28.8285830121054	30.1059458610932	34.00466	55 
44	28.9457247108949	43.6876743593134	34.00466	13 
80.75	30.4511522398054	79.3472312750332	34.00466	65 
27.5	35.4810370158248	27.6902669964837	34.00466	78 
39.725	34.6829333142424	39.5536267539573	34.00466	49 
29.25	35.1871399828181	29.3886144748858	34.00466	90 
32.725	34.5934259845363	32.7602726524212	34.00466	52 
56.25	34.4065833860827	55.591319459741	34.00466	28 
28.65	36.5909250474744	28.8093454557490	34.00466	82 
51.75	35.796832542727	51.2297172974473	34.00466	31 
32.26	37.3921492884543	32.3159796143125	34.00466	4 
72	36.8789343596088	70.8921850090827	34.00466	31 
65.4	40.391040923648	64.5012300391254	34.00466	84 
33.75	42.8919368312832	33.7767127288168	34.00466	56 
77.85	41.9777431481549	76.6044993211423	34.00466	54 
10.875	45.5649688333394	11.5642347595514	34.00466	84

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	2 seconds
R Server	'Gwilym Jenkins' @ 72.249.127.135
R Framework error message	Warning: there are blank lines in the 'Data X' field. Please, use NA for missing data - blank lines are simply deleted and are NOT treated as missing values.
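
The warning above distinguishes a blank line, which is dropped, from an explicit NA, which keeps its row. A minimal sketch of that difference in base R, assuming tab-separated input like the data series above. The string `raw` is made up for illustration, and `read.table` stands in for whatever parser the framework actually uses:

```r
# Hypothetical input: two data rows separated by a blank line.
raw <- "10.65\tNA\t0\n\n34\t10.65\t152\n"

# read.table() skips blank lines by default (blank.lines.skip = TRUE),
# so the blank line vanishes instead of becoming a row of missing values.
d <- read.table(text = raw, sep = "\t", header = FALSE)

n_rows <- nrow(d)           # the blank line is not counted as a row
n_missing <- sum(is.na(d))  # only the explicit NA is missing
print(n_rows)
print(n_missing)
```

A dropped row shifts every later observation up by one position, which matters when the columns are meant to stay aligned.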


Model Performance
#	Complexity	split	relative error	CV error	CV S.D.
1	0.743	0	1	1.019	0.177
2	0.113	1	0.257	0.278	0.049
3	0.01	2	0.144	0.178	0.052
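
The two pruning rules named in the R code further down ('Minimum CV' and 'Minimum CV Error plus 1 SD') can be checked by hand against the Model Performance table. The sketch below copies the table values into a data frame and applies both rules in base R. The variable names are chosen for illustration and this is not the module's own code:

```r
# Values copied from the Model Performance table.
cptable <- data.frame(
  CP     = c(0.743, 0.113, 0.010),
  nsplit = c(0, 1, 2),
  xerror = c(1.019, 0.278, 0.178),
  xstd   = c(0.177, 0.049, 0.052)
)

# Rule 1: row with the smallest cross-validated error.
min_id <- which.min(cptable$xerror)

# Rule 2: first (simplest) tree whose CV error is within one S.D.
# of that minimum.
threshold <- cptable$xerror[min_id] + cptable$xstd[min_id]
one_sd_id <- which(cptable$xerror <= threshold)[1]

print(cptable$CP[min_id])
print(threshold)
print(cptable$CP[one_sd_id])
```

With these values both rules select row 3 (complexity 0.01, two splits), because the CV error of row 2 (0.278) exceeds the threshold 0.178 + 0.052 = 0.230.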


Figure 1

Figure 2

Figure 3

Parameters (Session):

par1 = 1 ; par2 = No ;

Parameters (R input):

par1 = 1 ; par2 = No ;

R code (references can be found in the software module):

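library(rpart)     # recursive partitioning (regression trees)
library(partykit)  # as.party(), used for the tree plot
par1 <- as.numeric(par1)  # index of the column that is moved to the front below
# Return the complexity parameter (cp) at which to prune the tree.
#   'Minimum CV': cp of the row with the smallest cross-validated error.
#   'Minimum CV Error plus 1 SD': cp of the first row whose cross-validated
#   error is no larger than that minimum plus its standard deviation.
autoprune <- function(tree, method = 'Minimum CV') {
  xerr <- tree$cptable[, 'xerror']
  cpmin.id <- which.min(xerr)
  if (method == 'Minimum CV Error plus 1 SD') {
    xstd <- tree$cptable[, 'xstd']
    errt <- xerr[cpmin.id] + xstd[cpmin.id]
    # which.min() on a logical vector returns the first FALSE,
    # i.e. the first row with xerr <= errt
    cpSE1.min <- which.min(errt < xerr)
    mycp <- (tree$cptable[, 'CP'])[cpSE1.min]
  }
  if (method == 'Minimum CV') {
    mycp <- (tree$cptable[, 'CP'])[cpmin.id]
  }
  return(mycp)
}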
# One-vs-rest confusion matrices, one per level of 'true', each with a
# precision value. Note: this function is not called in this script and
# depends on conf.mat(), which is not defined here.
conf.multi.mat <- function(true, new) {
  if (all(is.na(match(levels(true), levels(new)))))
    stop('conflict of vector levels')
  multi.t <- list()
  for (mylev in levels(true)) {
    true.tmp <- true
    new.tmp <- new
    # collapse every level other than mylev into a single level
    left.lev <- levels(true.tmp)[-match(mylev, levels(true))]
    levels(true.tmp) <- list(mylev = mylev, all = left.lev)
    levels(new.tmp) <- list(mylev = mylev, all = left.lev)
    curr.t <- conf.mat(true.tmp, new.tmp)
    multi.t[[mylev]] <- curr.t
    multi.t[[mylev]]$precision <-
      round(curr.t$conf[1, 1] / sum(curr.t$conf[1, ]), 2)
  }
  return(multi.t)
}
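# y is not defined in this script (it is supplied by the framework);
# x is its transpose
x <- t(y)
k <- length(x[1,])  # number of columns
n <- length(x[,1])  # number of rows
# move column par1 to the front, keeping the remaining columns in order
x1 <- cbind(x[,par1], x[,1:k!=par1])
mycolnames <- c(colnames(x)[par1], colnames(x)[1:k!=par1])
colnames(x1) <- mycolnames #colnames(x)[par1]
# fit the tree; no explicit formula is given
m <- rpart(as.data.frame(x1))
par2
# prune only if a pruning method was requested
if (par2 != 'No') {
  mincp <- autoprune(m,method=par2)
  print(mincp)
  m <- prune(m,cp=mincp)
}
m$cptable
# plot of the fitted tree
bitmap(file='test1.png')
plot(as.party(m),tp_args=list(id=FALSE))
dev.off()
# plot of cross-validated error against the complexity parameter
bitmap(file='test2.png')
plotcp(m)
dev.off()
# observed values, predictions and residuals
cbind(y=m$y,pred=predict(m),res=residuals(m))
myr <- residuals(m)
myp <- predict(m)
# 2 x 2 panel of residual and prediction diagnostics
bitmap(file='test4.png')
op <- par(mfrow=c(2,2))
plot(myr,ylab='residuals')
plot(density(myr),main='Residual Kernel Density')
plot(myp,myr,xlab='predicted',ylab='residuals',main='Predicted vs Residuals')
plot(density(myp),main='Prediction Kernel Density')
par(op)
dev.off()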
# the table.* helpers used below are not defined in this script;
# presumably they are loaded from the framework file 'createtable'
load(file='createtable')
# build the 'Model Performance' table from m$cptable
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Model Performance',6,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Complexity',header=TRUE)
a<-table.element(a,'split',header=TRUE)
a<-table.element(a,'relative error',header=TRUE)
a<-table.element(a,'CV error',header=TRUE)
a<-table.element(a,'CV S.D.',header=TRUE)
a<-table.row.end(a)
# one table row per row of the complexity table
for (i in seq_len(nrow(m$cptable))) {
  a<-table.row.start(a)
  a<-table.element(a,i,header=TRUE)
  a<-table.element(a,round(m$cptable[i,'CP'],3))
  a<-table.element(a,m$cptable[i,'nsplit'])
  a<-table.element(a,round(m$cptable[i,'rel error'],3))
  a<-table.element(a,round(m$cptable[i,'xerror'],3))
  a<-table.element(a,round(m$cptable[i,'xstd'],3))
  a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
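
The call rpart(as.data.frame(x1)) in the code above supplies no explicit formula. It appears to rely on R's convention that a data frame converted to a formula uses its first column as the response and the remaining columns as predictors, which would explain why column par1 is moved to the front. A small base-R sketch of that reordering follows. The matrix `x` and its column names a, b and c are invented for illustration:

```r
# Hypothetical 3 x 3 data matrix with named columns.
x <- cbind(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))
par1 <- 2  # pretend the second column is the one to model

# Same reordering as in the module code: column par1 first, the rest after.
k <- ncol(x)
x1 <- cbind(x[, par1], x[, 1:k != par1])
colnames(x1) <- c(colnames(x)[par1], colnames(x)[1:k != par1])

# formula() on a data frame takes the first column as the left-hand side.
f <- formula(as.data.frame(x1))
print(f)
```

Here the formula has b on the left-hand side and a and c on the right, so changing par1 changes which column the tree tries to predict.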
