Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*Unverified author*

R Software Module

rwasp_regression_trees.wasp

Title produced by software

Recursive Partitioning (Regression Trees)

Date of computation

Wed, 26 May 2010 11:20:15 +0000

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2010/May/26/t1274872866fix8oufa1lxbjf8.htm/, Retrieved Fri, 03 May 2024 08:20:31 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=76456, Retrieved Fri, 03 May 2024 08:20:31 +0000


Original text written by user:

IsPrivate?

No (this computation is public)

User-defined keywords

B382,regression tree,steven,coomans,thesis,per2maand

Estimated Impact

132

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-       [Recursive Partitioning (Regression Trees)] [B382,regression t...] [2010-05-26 11:20:15] [d41d8cd98f00b204e9800998ecf8427e] [Current]


Dataseries X:


285	NA	263.074978758798	284.715000203273	254,775
215.375	285	251.150995958404	273.669961695715	224,125
313.725	278.0375	240.148887802906	251.121344746975	288,75
256.7625	281.60625	230.023039795889	274.792708736568	257,75
273.79	279.121875	220.685960664563	268.316598141974	274,15
173.0125	278.5886875	212.085468785266	269.853519048767	229,75
174.875	268.03106875	204.143913113455	235.891289968433	214,05
258.2625	258.715461875	196.82024483039	214.685176171976	92,75
222.65	258.6701656875	190.082767981171	229.822804324978	206
231.7375	255.06814911875	183.871806487569	227.333358752735	260,15
150.778	252.735084206875	178.151865687297	228.857278044667	239,15
144.1375	242.539375786188	172.868740338206	201.763796135814	178,4
136.15	232.699188207569	167.995011999447	181.769285845826	222,875
152.875	223.044269386812	163.498250062763	165.941563187890	160,75
238.375	216.027342448131	159.353396166068	161.40829991438	160
147.8	218.262108203318	155.548258305226	188.11152745539	130,025
35.425	211.215897382986	152.031367572698	174.125600162880	137,25
80.375	193.636807644687	148.767298136698	126.004166756322	77,3
143.375	182.310626880219	145.755625456846	110.173380876921	105,05
194.8875	178.417064192197	142.984891871717	121.692499793419	102,75
190.43	180.064107772977	140.440637576800	147.087096517147	190,375
122.525	181.100696995679	138.099808990679	162.124674010416	136,275
153.125	175.243127296111	135.934098031119	148.385789010579	155,525
79.6	173.031314566500	133.942182198235	150.030031637519	52,75
182.8625	163.688183109850	132.093720050010	125.594727004357	131,625
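The last column of the series is written with a comma where the other columns use a decimal point. A minimal sketch of how such rows could be parsed in R, assuming the comma is a decimal mark (the page does not say so explicitly) and that fields are tab-separated. `parse_row` is a hypothetical helper, not part of the software module:

```r
# Hypothetical helper: split one tab-separated row and convert each field
# to numeric, treating ',' as a decimal mark (an assumption) and the
# literal string 'NA' as a missing value.
parse_row <- function(line) {
  fields <- strsplit(line, "\t", fixed = TRUE)[[1]]
  fields <- sub(",", ".", fields, fixed = TRUE)
  fields[fields == "NA"] <- NA   # keep missing values explicit
  as.numeric(fields)
}

# First two rows of the series, copied from the page.
rows <- c("285\tNA\t263.074978758798\t284.715000203273\t254,775",
          "215.375\t285\t251.150995958404\t273.669961695715\t224,125")
dat <- do.call(rbind, lapply(rows, parse_row))
print(dat)
```

Marking missing values as NA rather than leaving blank lines matters here, because the framework warning on this page states that blank lines are deleted instead of being treated as missing.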

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	3 seconds
R Server	'Sir Ronald Aylmer Fisher' @ 193.190.124.24
R Framework error message	Warning: there are blank lines in the 'Data X' field. Please, use NA for missing data - blank lines are simply deleted and are NOT treated as missing values.


Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=76456&T=0


Model Performance
#	Complexity	split	relative error	CV error	CV S.D.
1	0.507	0	1	1.114	0.277
2	0.01	1	0.493	0.685	0.169
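The table can be read with a little arithmetic. The single split lowers the relative error from 1 to 0.493, a drop of 0.507, which matches the complexity value in row 1. A minimal sketch, in base R, of how the two pruning rules named in the module's `autoprune` function would act on these reported values. No pruning was applied in this computation because par2 = No, so this is illustration only:

```r
# Values copied from the Model Performance table.
cptable <- cbind(CP     = c(0.507, 0.01),
                 nsplit = c(0, 1),
                 xerror = c(1.114, 0.685),
                 xstd   = c(0.277, 0.169))

# 'Minimum CV': the row with the lowest cross-validated error.
best <- which.min(cptable[, "xerror"])

# 'Minimum CV Error plus 1 SD': the first (smallest) tree whose CV error
# is no larger than the minimum CV error plus its standard deviation.
threshold <- cptable[best, "xerror"] + cptable[best, "xstd"]
chosen <- which(cptable[, "xerror"] <= threshold)[1]

print(c(best = best, threshold = threshold, chosen = chosen))
```

Here the threshold is 0.685 + 0.169 = 0.854. The root-only tree has a CV error of 1.114, which exceeds it, so both rules point to row 2, the tree with one split.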


Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=76456&T=1


Figure 1, Figure 2, Figure 3 (each available on the original page as PNG, Postscript, and PDF)

Parameters (Session):

par1 = 1 ; par2 = No ;

Parameters (R input):

par1 = 1 ; par2 = No ;
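The R code listed on this page uses par1 as the index of the response column and moves that column to the front before fitting. par2 names the pruning method, and the value No skips pruning. A minimal sketch of the reordering idiom on a toy matrix (the matrix and the value par1 = 2 are made up for illustration):

```r
# Toy matrix standing in for the uploaded series (3 columns, 4 rows).
x <- cbind(A = 1:4, B = 5:8, C = 9:12)
par1 <- 2                      # hypothetical choice: column B is the response
k <- ncol(x)

# Same idiom as the module: the selected column first, then all the others.
# '1:k != par1' is a logical mask that is FALSE only at position par1.
x1 <- cbind(x[, par1], x[, 1:k != par1])
colnames(x1) <- c(colnames(x)[par1], colnames(x)[1:k != par1])

print(colnames(x1))
```

With par1 = 1, as in this computation, the column order is unchanged and the first series is the response.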

R code (references can be found in the software module):

library(rpart)
library(partykit)
par1 <- as.numeric(par1)
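# Return the complexity parameter (CP) at which the tree should be pruned.
# method is either 'Minimum CV' or 'Minimum CV Error plus 1 SD'.
autoprune <- function(tree, method = 'Minimum CV') {
  xerr <- tree$cptable[, 'xerror']
  cpmin.id <- which.min(xerr)   # row with the lowest cross-validated error
  if (method == 'Minimum CV Error plus 1 SD') {
    xstd <- tree$cptable[, 'xstd']
    errt <- xerr[cpmin.id] + xstd[cpmin.id]
    # which.min() on a logical vector returns the first FALSE, i.e. the
    # first (smallest) tree whose CV error does not exceed the threshold
    cpSE1.min <- which.min(errt < xerr)
    mycp <- (tree$cptable[, 'CP'])[cpSE1.min]
  } else if (method == 'Minimum CV') {
    mycp <- (tree$cptable[, 'CP'])[cpmin.id]
  } else {
    stop('unknown pruning method: ', method)
  }
  return(mycp)
}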
# One-vs-rest confusion matrices for a factor response.
# Note: this function is not called anywhere in this listing, and it relies
# on conf.mat(), which is not defined here.
conf.multi.mat <- function(true, new)
{
  if (all(is.na(match(levels(true), levels(new)))))
    stop('conflict of vector levels')
  multi.t <- list()
  for (mylev in levels(true)) {
    true.tmp <- true
    new.tmp <- new
    # collapse every level other than mylev into a single 'all' level
    left.lev <- levels(true.tmp)[-match(mylev, levels(true))]
    levels(true.tmp) <- list(mylev = mylev, all = left.lev)
    levels(new.tmp)  <- list(mylev = mylev, all = left.lev)
    curr.t <- conf.mat(true.tmp, new.tmp)
    multi.t[[mylev]] <- curr.t
    multi.t[[mylev]]$precision <-
      round(curr.t$conf[1, 1] / sum(curr.t$conf[1, ]), 2)
  }
  return(multi.t)
}
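# y is not defined in this listing; it is presumably the data matrix
# supplied by the R framework, with one series per row.
x <- t(y)
k <- length(x[1,])   # number of columns (series)
n <- length(x[,1])   # number of rows (observations)
# move the response column (par1) to the front
x1 <- cbind(x[,par1], x[,1:k!=par1])
mycolnames <- c(colnames(x)[par1], colnames(x)[1:k!=par1])
colnames(x1) <- mycolnames #colnames(x)[par1]
# fit the tree: the first column is the response, the rest are predictors
m <- rpart(as.data.frame(x1))
par2
# prune only when a pruning method was requested
if (par2 != 'No') {
  mincp <- autoprune(m,method=par2)
  print(mincp)
  m <- prune(m,cp=mincp)
}
m$cptable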
# plot of the fitted tree
bitmap(file='test1.png')
plot(as.party(m),tp_args=list(id=FALSE))
dev.off()
# cross-validated error against the complexity parameter
bitmap(file='test2.png')
plotcp(m)
dev.off()
cbind(y=m$y,pred=predict(m),res=residuals(m))
myr <- residuals(m)
myp <- predict(m)
# residual and prediction diagnostics in a 2 x 2 layout
bitmap(file='test4.png')
op <- par(mfrow=c(2,2))
plot(myr,ylab='residuals')
plot(density(myr),main='Residual Kernel Density')
plot(myp,myr,xlab='predicted',ylab='residuals',main='Predicted vs Residuals')
plot(density(myp),main='Prediction Kernel Density')
par(op)
dev.off()
# 'createtable' is a file loaded from the server; it evidently provides the
# table.* helpers used below, which are not defined in this listing.
load(file='createtable')
# build the 'Model Performance' table: title row, header row, one row per CP
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Model Performance',6,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Complexity',header=TRUE)
a<-table.element(a,'split',header=TRUE)
a<-table.element(a,'relative error',header=TRUE)
a<-table.element(a,'CV error',header=TRUE)
a<-table.element(a,'CV S.D.',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(m$cptable[,1])) {
  a<-table.row.start(a)
  a<-table.element(a,i,header=TRUE)
  a<-table.element(a,round(m$cptable[i,'CP'],3))
  a<-table.element(a,m$cptable[i,'nsplit'])
  a<-table.element(a,round(m$cptable[i,'rel error'],3))
  a<-table.element(a,round(m$cptable[i,'xerror'],3))
  a<-table.element(a,round(m$cptable[i,'xstd'],3))
  a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
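The module code depends on objects supplied by the server (`y`, `par1`, `par2`, and the `createtable` helpers), so it cannot be run on its own. A minimal self-contained sketch of the same fit, prune, and residual steps, assuming the rpart package is installed and using the built-in `mtcars` data as a stand-in for the uploaded series. The data set, the formula, and the seed are illustrative choices, not part of the original computation:

```r
library(rpart)   # assumed to be installed (a recommended R package)

set.seed(1)      # cross-validation is random; the seed is arbitrary

# Regression tree with mpg as the response and all other columns as predictors.
fit <- rpart(mpg ~ ., data = mtcars)

# 'Minimum CV' rule: prune at the CP with the lowest cross-validated error.
cp_tab  <- fit$cptable
best_cp <- cp_tab[which.min(cp_tab[, "xerror"]), "CP"]
pruned  <- prune(fit, cp = best_cp)

# Fitted values and residuals, as tabulated by the module.
pred <- predict(pruned)
res  <- residuals(pruned)

print(cp_tab)
print(head(cbind(y = mtcars$mpg, pred = pred, res = res)))
```

The columns of `cp_tab` correspond to the Model Performance table on this page: CP is the complexity, nsplit the number of splits, rel error the relative error, xerror the CV error, and xstd the CV S.D.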
