
*The author of this computation has been verified*
R Software Module: /rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Wed, 22 Dec 2010 13:40:54 +0000
 
Cite this page as follows:
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL http://www.freestatistics.org/blog/date/2010/Dec/22/t1293025136o1pefg5t5asvf6a.htm/, Retrieved Wed, 22 Dec 2010 14:38:57 +0100
 
BibTeX entries for LaTeX users:
@Manual{KEY,
    author = {{YOUR NAME}},
    publisher = {Office for Research Development and Education},
    title = {Statistical Computations at FreeStatistics.org, URL http://www.freestatistics.org/blog/date/2010/Dec/22/t1293025136o1pefg5t5asvf6a.htm/},
    year = {2010},
}
@Manual{R,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Development Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2010},
    note = {{ISBN} 3-900051-07-0},
    url = {http://www.R-project.org},
}
 
Original text written by user:
 
IsPrivate?
No (this computation is public)
 
User-defined keywords:
 
Dataseries X:
1,4562 8,1000 7,9000 8,7000 104,5000 2443,2700 16,2000 16,3000 3,0000 -12,0000 65,0000
1,4268 8,3000 8,1000 8,9000 89,1000 2293,4100 12,5000 13,6000 6,0000 -11,0000 55,0000
1,4088 8,1000 8,3000 8,9000 82,6000 2070,8300 14,8000 14,3000 7,0000 -11,0000 57,0000
1,4016 7,4000 8,1000 8,1000 102,7000 2029,6000 15,4000 15,5000 -4,0000 -17,0000 57,0000
1,3650 7,3000 7,4000 8,0000 91,8000 2052,0200 13,6000 13,9000 -5,0000 -18,0000 57,0000
1,3190 7,7000 7,3000 8,3000 94,1000 1864,4400 14,2000 14,3000 -7,0000 -19,0000 65,0000
1,3050 8,0000 7,7000 8,5000 103,1000 1670,0700 15,0000 15,8000 -10,0000 -22,0000 69,0000
1,2785 8,0000 8,0000 8,7000 93,2000 1810,9900 14,1000 14,5000 -21,0000 -24,0000 70,0000
1,3239 7,7000 8,0000 8,6000 91,0000 1905,4100 13,7000 15,1000 -22,0000 -24,0000 71,0000
1,3449 6,9000 7,7000 8,3000 94,3000 1862,8300 14,4000 15,8000 -16,0000 -20,0000 71,0000
1,2732 6,6000 6,9000 7,9000 99,4000 2014,4500 15,6000 17,2000 -25,0000 -25,0000 73,0000
1,3322 6,9000 6,6000 7,9000 115,7000 2197,8200 19,7 etc...
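
The preview above uses decimal commas and space-delimited values. The following is a minimal sketch of how such text could be read into R. It assumes each record holds 11 values, which is consistent with the second column matching the Actuals reported further down. The variable names `raw` and `response` are illustrative and do not come from the original module.

```r
# Two complete records copied from the "Dataseries X" preview
# (decimal commas, space-delimited; 11 values per record is an assumption)
raw <- "1,4562 8,1000 7,9000 8,7000 104,5000 2443,2700 16,2000 16,3000 3,0000 -12,0000 65,0000
1,4268 8,3000 8,1000 8,9000 89,1000 2293,4100 12,5000 13,6000 6,0000 -11,0000 55,0000"

# dec = "," tells read.table to treat the comma as the decimal mark;
# the default separator splits on any whitespace
x <- read.table(text = raw, header = FALSE, dec = ",")

par1 <- 2                # response column, as in the session parameters
response <- x[, par1]    # values of the response variable

print(dim(x))
print(response)
```

With `par1 = 2`, the second column is the series that the tree models, and the remaining columns act as candidate predictors.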
 
Output produced by software:

Enter (or paste) a matrix (table) containing all data (time) series. Every column represents a different variable and must be delimited by a space or Tab. Every row represents a period in time (or category) and must be delimited by hard returns. The easiest way to enter data is to copy and paste a block of spreadsheet cells. Please do not use commas or spaces to separate groups of digits!


Summary of computational transaction
Computing time: 5 seconds
R Server: 'RServer@AstonUniversity' @ vre.aston.ac.uk


Goodness of Fit
Correlation 0.8856
R-squared 0.7842
RMSE 0.3551
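
As a rough illustration of how these three figures relate, the sketch below mirrors the formulas in the R listing further down: correlation between forecasts and actuals, its square, and the root of the mean squared residual. It uses only the first five rows of the Actuals, Predictions, and Residuals table, so its output will not match the values reported above for all 105 rows. The variable names are illustrative.

```r
# First five rows of the "Actuals, Predictions, and Residuals" table
# (illustrative subset only, not the full 105 observations)
actuals   <- c(8.1, 8.3, 8.1, 7.4, 7.3)
forecasts <- c(7.99545454545455, 7.99545454545455, 8.45428571428571,
               7.99545454545455, 7.3764705882353)

# Residual = actual minus forecast, as in the listing
residuals <- actuals - forecasts

correlation <- cor(forecasts, actuals)     # reported as "Correlation"
r_squared   <- correlation^2               # reported as "R-squared"
rmse        <- sqrt(mean(residuals^2))     # reported as "RMSE"

print(round(c(Correlation = correlation, R.squared = r_squared, RMSE = rmse), 4))
```

Note that R-squared here is simply the squared correlation between forecasts and actuals, which is how the module computes it.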


Actuals, Predictions, and Residuals
# Actuals Forecasts Residuals
1 8.1 7.99545454545455 0.104545454545454
2 8.3 7.99545454545455 0.304545454545456
3 8.1 8.45428571428571 -0.354285714285714
4 7.4 7.99545454545455 -0.595454545454545
5 7.3 7.3764705882353 -0.0764705882352947
6 7.7 7.3764705882353 0.323529411764706
7 8 7.3764705882353 0.623529411764705
8 8 7.99545454545455 0.00454545454545485
9 7.7 7.99545454545455 -0.295454545454545
10 6.9 7.3764705882353 -0.476470588235294
11 6.6 7.3764705882353 -0.776470588235295
12 6.9 6.42142857142857 0.478571428571429
13 7.5 7.3764705882353 0.123529411764705
14 7.9 7.3764705882353 0.523529411764706
15 7.7 7.99545454545455 -0.295454545454545
16 6.5 7.3764705882353 -0.876470588235295
17 6.1 6.42142857142857 -0.321428571428572
18 6.4 6.42142857142857 -0.0214285714285714
19 6.8 6.42142857142857 0.378571428571428
20 7.1 6.42142857142857 0.678571428571428
21 7.3 7.3764705882353 -0.0764705882352947
22 7.2 7.3764705882353 -0.176470588235294
23 7 7.3764705882353 -0.376470588235295
24 7 7.3764705882353 -0.376470588235295
25 7 7.3764705882353 -0.376470588235295
26 7.3 7.3764705882353 -0.0764705882352947
27 7.5 7.3764705882353 0.123529411764705
28 7.2 7.3764705882353 -0.176470588235294
29 7.7 7.3764705882353 0.323529411764706
30 8 7.3764705882353 0.623529411764705
31 7.9 7.99545454545455 -0.0954545454545448
32 8 7.99545454545455 0.00454545454545485
33 8 7.99545454545455 0.00454545454545485
34 7.9 7.99545454545455 -0.0954545454545448
35 7.9 7.99545454545455 -0.0954545454545448
36 8 7.99545454545455 0.00454545454545485
37 8.1 7.99545454545455 0.104545454545454
38 8.1 7.99545454545455 0.104545454545454
39 8.2 7.99545454545455 0.204545454545454
40 8 8.45428571428571 -0.454285714285714
41 8.3 7.99545454545455 0.304545454545456
42 8.5 8.45428571428571 0.0457142857142863
43 8.6 8.45428571428571 0.145714285714286
44 8.7 8.45428571428571 0.245714285714286
45 8.7 8.45428571428571 0.245714285714286
46 8.5 8.45428571428571 0.0457142857142863
47 8.4 8.45428571428571 -0.0542857142857134
48 8.5 8.45428571428571 0.0457142857142863
49 8.7 8.45428571428571 0.245714285714286
50 8.7 8.45428571428571 0.245714285714286
51 8.6 8.45428571428571 0.145714285714286
52 7.9 8.45428571428571 -0.554285714285713
53 8.1 7.99545454545455 0.104545454545454
54 8.2 7.99545454545455 0.204545454545454
55 8.5 8.45428571428571 0.0457142857142863
56 8.6 8.45428571428571 0.145714285714286
57 8.5 8.45428571428571 0.0457142857142863
58 8.3 8.45428571428571 -0.154285714285713
59 8.2 8.45428571428571 -0.254285714285714
60 8.7 8.45428571428571 0.245714285714286
61 9.3 8.45428571428571 0.845714285714287
62 9.3 8.45428571428571 0.845714285714287
63 8.8 8.45428571428571 0.345714285714287
64 7.4 8.45428571428571 -1.05428571428571
65 7.2 7.3764705882353 -0.176470588235294
66 7.5 7.3764705882353 0.123529411764705
67 8.3 7.3764705882353 0.923529411764706
68 8.8 8.45428571428571 0.345714285714287
69 8.9 8.45428571428571 0.445714285714287
70 8.6 8.45428571428571 0.145714285714286
71 8.4 8.45428571428571 -0.0542857142857134
72 8.4 8.45428571428571 -0.0542857142857134
73 8.4 8.45428571428571 -0.0542857142857134
74 8.4 8.45428571428571 -0.0542857142857134
75 8.3 8.45428571428571 -0.154285714285713
76 7.6 8.45428571428571 -0.854285714285714
77 7.6 7.3764705882353 0.223529411764705
78 7.9 7.3764705882353 0.523529411764706
79 8 7.99545454545455 0.00454545454545485
80 8.2 7.99545454545455 0.204545454545454
81 8.3 8.45428571428571 -0.154285714285713
82 8.2 8.45428571428571 -0.254285714285714
83 8.1 8.45428571428571 -0.354285714285714
84 8 7.99545454545455 0.00454545454545485
85 7.8 7.99545454545455 -0.195454545454545
86 7.6 7.3764705882353 0.223529411764705
87 7.5 7.3764705882353 0.123529411764705
88 6.8 7.3764705882353 -0.576470588235295
89 6.9 6.42142857142857 0.478571428571429
90 7.1 7.3764705882353 -0.276470588235295
91 7.3 7.3764705882353 -0.0764705882352947
92 7.4 7.3764705882353 0.0235294117647058
93 7.6 7.3764705882353 0.223529411764705
94 7.6 7.3764705882353 0.223529411764705
95 7.5 7.3764705882353 0.123529411764705
96 7.5 7.3764705882353 0.123529411764705
97 6.8 7.3764705882353 -0.576470588235295
98 6.4 6.42142857142857 -0.0214285714285714
99 6.2 6.42142857142857 -0.221428571428572
100 6 6.42142857142857 -0.421428571428572
101 6.3 6.42142857142857 -0.121428571428572
102 6.3 6.42142857142857 -0.121428571428572
103 6.1 6.42142857142857 -0.321428571428572
104 6.1 6.42142857142857 -0.321428571428572
105 6.3 6.42142857142857 -0.121428571428572
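
Only four distinct forecast values appear in this table. That is what one would expect if the fitted tree has four terminal nodes and every row in a node receives the same prediction, although the tree printout itself is not reproduced here. The sketch below groups the first 12 rows by forecast value to make that pattern visible. It is an illustration on a subset, and the names `node` and `summary_by_node` are hypothetical.

```r
# First 12 rows of the table above (illustrative subset only)
actuals <- c(8.1, 8.3, 8.1, 7.4, 7.3, 7.7, 8.0, 8.0, 7.7, 6.9, 6.6, 6.9)
forecasts <- c(7.99545454545455, 7.99545454545455, 8.45428571428571,
               7.99545454545455, 7.3764705882353, 7.3764705882353,
               7.3764705882353, 7.99545454545455, 7.99545454545455,
               7.3764705882353, 7.3764705882353, 6.42142857142857)

# Treat each distinct forecast value as a stand-in for a terminal node
# (an assumption: the page does not list node membership directly)
node <- factor(round(forecasts, 4))

# Per group: the shared forecast, the row count, and the mean actual
summary_by_node <- data.frame(
  forecast    = as.numeric(levels(node)),
  n_rows      = as.vector(table(node)),
  mean_actual = as.vector(tapply(actuals, node, mean))
)

print(summary_by_node)
```

On this subset the mean actual per group will generally differ from the forecast, because each forecast was fitted on all rows of its node rather than on these 12 alone.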
 
Charts produced by software:
http://www.freestatistics.org/blog/date/2010/Dec/22/t1293025136o1pefg5t5asvf6a/2g1m21293025247.png
http://www.freestatistics.org/blog/date/2010/Dec/22/t1293025136o1pefg5t5asvf6a/2g1m21293025247.ps

http://www.freestatistics.org/blog/date/2010/Dec/22/t1293025136o1pefg5t5asvf6a/3g1m21293025247.png
http://www.freestatistics.org/blog/date/2010/Dec/22/t1293025136o1pefg5t5asvf6a/3g1m21293025247.ps

http://www.freestatistics.org/blog/date/2010/Dec/22/t1293025136o1pefg5t5asvf6a/4ra451293025247.png
http://www.freestatistics.org/blog/date/2010/Dec/22/t1293025136o1pefg5t5asvf6a/4ra451293025247.ps


 
Parameters (Session):
par1 = 2 ; par2 = none ; par3 = 3 ; par4 = no ;
 
Parameters (R input):
par1 = 2 ; par2 = none ; par3 = 3 ; par4 = no ;
 
R code (references can be found in the software module):
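library(party)  # provides ctree() and where()
library(Hmisc)  # provides cut2()
# par1..par4 and the data matrix y are not defined in this listing;
# they are presumably supplied by the server before the script runs
par1 <- as.numeric(par1)  # column index of the response variable
par3 <- as.numeric(par3)  # number of categories (used only when par2 != 'none')
x <- data.frame(t(y))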
is.data.frame(x)
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
# presumably loads the table.* helper functions used below (not defined in this listing)
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
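# Note: each of the 10 passes draws a fresh random split (about 90% training, 10% testing),
# so this is repeated random sub-sampling rather than a strict 10-fold partition.
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))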
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
 





Copyright

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

Software written by Ed van Stee & Patrick Wessa

