
Paper DMA Recursive Categorization (none)

*The author of this computation has been verified*
R Software Module: /rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Tue, 21 Dec 2010 13:23:07 +0000
 
Cite this page as follows:
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL http://www.freestatistics.org/blog/date/2010/Dec/21/t1292937668lc2qp7aiwrtkxz8.htm/, Retrieved Tue, 21 Dec 2010 14:21:08 +0100
 
BibTeX entries for LaTeX users:
@Manual{KEY,
    author = {{YOUR NAME}},
    publisher = {Office for Research Development and Education},
    title = {Statistical Computations at FreeStatistics.org, URL http://www.freestatistics.org/blog/date/2010/Dec/21/t1292937668lc2qp7aiwrtkxz8.htm/},
    year = {2010},
}
@Manual{R,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Development Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2010},
    note = {{ISBN} 3-900051-07-0},
    url = {http://www.R-project.org},
}
 
Original text written by user:
 
IsPrivate?
No (this computation is public)
 
User-defined keywords:
Paper DMA
 
Dataseries X:
3030.29 25.64 2803.47 27.97 2767.63 27.62 2882.6 23.31 2863.36 29.07 2897.06 29.58 3012.61 28.63 3142.95 29.92 3032.93 32.68 3045.78 31.54 3110.52 32.43 3013.24 26.54 2987.1 25.85 2995.55 27.6 2833.18 25.71 2848.96 25.38 2794.83 28.57 2845.26 27.64 2915.03 25.36 2892.63 25.9 2604.42 26.29 2641.65 21.74 2659.81 19.2 2638.53 19.32 2720.25 19.82 2745.88 20.36 2735.7 24.31 2811.7 25.97 2799.43 25.61 2555.28 24.67 2304.98 25.59 2214.95 26.09 2065.81 28.37 1940.49 27.34 2042 24.46 1995.37 27.46 1946.81 30.23 1765.9 32.33 1635.25 29.87 1833.42 24.87 1910.43 25.48 1959.67 27.28 1969.6 28.24 2061.41 29.58 2093.48 26.95 2120.88 29.08 2174.56 28.76 2196.72 29.59 2350.44 30.7 2440.25 30.52 2408.64 32.67 2472.81 33.19 2407.6 37.13 2454.62 35.54 2448.05 37.75 2497.84 41.84 2645.64 42.94 2756.76 49.14 2849.27 44.61 2921.44 40.22 2981.85 44.23 3080.58 45.85 3106.22 53.38 3119.31 53.26 3061.26 51.8 3097.31 55.3 3161.69 57.81 3257.16 63.96 3277.01 63.77 3295. etc...
 
Output produced by software:



Summary of computational transaction
Raw Input: view raw input (R code)
Raw Output: view raw output of R engine
Computing time: 5 seconds
R Server: 'Sir Ronald Aylmer Fisher' @ 193.190.124.24


Goodness of Fit
Correlation: 0.7633
R-squared: 0.5826
RMSE: 498.0036
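
The three figures above follow from the actual and fitted values in the way the module's R code computes them: the correlation between forecasts and actuals, the square of that correlation, and the square root of the mean squared residual. A minimal base-R sketch, assuming two short hypothetical vectors (`actuals` and `forecasts` hold illustrative values, not the series analyzed here):

```r
# Hypothetical toy data: four observations with piecewise-constant fitted values.
actuals   <- c(10, 14, 30, 34)
forecasts <- c(12, 12, 32, 32)
res       <- actuals - forecasts

correlation <- cor(forecasts, actuals)   # Pearson correlation
r_squared   <- correlation^2             # reported as 'R-squared'
rmse        <- sqrt(mean(res^2))         # root mean squared error

print(round(c(Correlation = correlation, R.squared = r_squared, RMSE = rmse), 4))
```

The same relation holds for the reported values: 0.7633 squared is 0.5826 after rounding to four decimals.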


Actuals, Predictions, and Residuals
#   Actuals  Forecasts         Residuals
1   3030.29  2561.64447761194  468.64552238806
2   2803.47  2561.64447761194  241.825522388060
3   2767.63  2561.64447761194  205.98552238806
4   2882.6   2561.64447761194  320.95552238806
5   2863.36  2561.64447761194  301.71552238806
6   2897.06  2561.64447761194  335.41552238806
7   3012.61  2561.64447761194  450.96552238806
8   3142.95  2561.64447761194  581.30552238806
9   3032.93  2561.64447761194  471.28552238806
10  3045.78  2561.64447761194  484.13552238806
11  3110.52  2561.64447761194  548.87552238806
12  3013.24  2561.64447761194  451.59552238806
13  2987.1   2561.64447761194  425.45552238806
14  2995.55  2561.64447761194  433.90552238806
15  2833.18  2561.64447761194  271.535522388060
16  2848.96  2561.64447761194  287.31552238806
17  2794.83  2561.64447761194  233.18552238806
18  2845.26  2561.64447761194  283.61552238806
19  2915.03  2561.64447761194  353.38552238806
20  2892.63  2561.64447761194  330.98552238806
21  2604.42  2561.64447761194  42.7755223880599
22  2641.65  2561.64447761194  80.00552238806
23  2659.81  2561.64447761194  98.1655223880598
24  2638.53  2561.64447761194  76.88552238806
25  2720.25  2561.64447761194  158.60552238806
26  2745.88  2561.64447761194  184.23552238806
27  2735.7   2561.64447761194  174.055522388060
28  2811.7   2561.64447761194  250.055522388060
29  2799.43  2561.64447761194  237.785522388060
30  2555.28  2561.64447761194  -6.36447761193995
31  2304.98  2561.64447761194  -256.66447761194
32  2214.95  2561.64447761194  -346.69447761194
33  2065.81  2561.64447761194  -495.83447761194
34  1940.49  2561.64447761194  -621.15447761194
35  2042     2561.64447761194  -519.64447761194
36  1995.37  2561.64447761194  -566.27447761194
37  1946.81  2561.64447761194  -614.83447761194
38  1765.9   2561.64447761194  -795.74447761194
39  1635.25  2561.64447761194  -926.39447761194
40  1833.42  2561.64447761194  -728.22447761194
41  1910.43  2561.64447761194  -651.21447761194
42  1959.67  2561.64447761194  -601.97447761194
43  1969.6   2561.64447761194  -592.04447761194
44  2061.41  2561.64447761194  -500.23447761194
45  2093.48  2561.64447761194  -468.16447761194
46  2120.88  2561.64447761194  -440.76447761194
47  2174.56  2561.64447761194  -387.08447761194
48  2196.72  2561.64447761194  -364.92447761194
49  2350.44  2561.64447761194  -211.20447761194
50  2440.25  2561.64447761194  -121.394477611940
51  2408.64  2561.64447761194  -153.004477611940
52  2472.81  2561.64447761194  -88.8344776119402
53  2407.6   2561.64447761194  -154.044477611940
54  2454.62  2561.64447761194  -107.024477611940
55  2448.05  2561.64447761194  -113.59447761194
56  2497.84  2561.64447761194  -63.80447761194
57  2645.64  2561.64447761194  83.9955223880597
58  2756.76  2561.64447761194  195.11552238806
59  2849.27  2561.64447761194  287.62552238806
60  2921.44  2561.64447761194  359.79552238806
61  2981.85  2561.64447761194  420.20552238806
62  3080.58  2561.64447761194  518.93552238806
63  3106.22  2561.64447761194  544.57552238806
64  3119.31  2561.64447761194  557.66552238806
65  3061.26  2561.64447761194  499.61552238806
66  3097.31  3770.56642857143  -673.256428571429
67  3161.69  3770.56642857143  -608.876428571429
68  3257.16  3770.56642857143  -513.406428571429
69  3277.01  3770.56642857143  -493.556428571429
70  3295.32  3770.56642857143  -475.246428571429
71  3363.99  3770.56642857143  -406.576428571429
72  3494.17  3770.56642857143  -276.396428571429
73  3667.03  3770.56642857143  -103.536428571429
74  3813.06  3770.56642857143  42.4935714285712
75  3917.96  3770.56642857143  147.393571428571
76  3895.51  3770.56642857143  124.943571428571
77  3801.06  3770.56642857143  30.4935714285712
78  3570.12  3770.56642857143  -200.446428571429
79  3701.61  3770.56642857143  -68.9564285714287
80  3862.27  3770.56642857143  91.7035714285712
81  3970.1   3770.56642857143  199.533571428571
82  4138.52  3770.56642857143  367.953571428572
83  4199.75  3770.56642857143  429.183571428571
84  4290.89  3770.56642857143  520.323571428572
85  4443.91  3770.56642857143  673.343571428571
86  4502.64  3770.56642857143  732.073571428572
87  4356.98  3770.56642857143  586.413571428571
88  4591.27  3770.56642857143  820.703571428572
89  4696.96  3770.56642857143  926.393571428571
90  4621.4   3770.56642857143  850.83357142857
91  4562.84  3770.56642857143  792.273571428571
92  4202.52  3770.56642857143  431.953571428572
93  4296.49  3770.56642857143  525.923571428571
94  4435.23  3770.56642857143  664.663571428571
95  4105.18  3770.56642857143  334.613571428572
96  4116.68  3770.56642857143  346.113571428572
97  3844.49  3770.56642857143  73.923571428571
98  3720.98  3770.56642857143  -49.5864285714288
99  3674.4   3770.56642857143  -96.1664285714287
100 3857.62  3770.56642857143  87.0535714285711
101 3801.06  3770.56642857143  30.4935714285712
102 3504.37  3770.56642857143  -266.196428571429
103 3032.6   3770.56642857143  -737.966428571429
104 3047.03  3770.56642857143  -723.536428571429
105 2962.34  3770.56642857143  -808.226428571429
106 2197.82  3770.56642857143  -1572.74642857143
107 2014.45  3770.56642857143  -1756.11642857143
108 1862.83  2561.64447761194  -698.81447761194
109 1905.41  2561.64447761194  -656.23447761194
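
The Forecasts column takes only two distinct values (one for rows 1 to 65 and 108 to 109, another for rows 66 to 107). That is the pattern a regression tree with two terminal nodes produces: every observation in a node receives the same fitted value, the mean of the actuals in that node, so the residuals within a node sum to zero. A minimal base-R sketch of that idea; the six values are taken from the series, but the `node` assignment is a hypothetical one chosen for illustration, not the split found by the fitted tree:

```r
actuals <- c(3030.29, 2803.47, 2767.63, 3813.06, 3917.96, 3895.51)
node    <- c(1, 1, 1, 2, 2, 2)        # assumed terminal-node membership

forecasts <- ave(actuals, node)       # mean of the actuals within each node
res       <- actuals - forecasts      # residual = actual - forecast

result <- data.frame(Actuals = actuals, Forecasts = forecasts, Residuals = res)
print(result)
print(tapply(res, node, sum))         # zero per node, up to rounding error
```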
 
Charts produced by software:
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292937668lc2qp7aiwrtkxz8/2jt1d1292937779.png
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292937668lc2qp7aiwrtkxz8/2jt1d1292937779.ps

http://www.freestatistics.org/blog/date/2010/Dec/21/t1292937668lc2qp7aiwrtkxz8/3jt1d1292937779.png
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292937668lc2qp7aiwrtkxz8/3jt1d1292937779.ps

http://www.freestatistics.org/blog/date/2010/Dec/21/t1292937668lc2qp7aiwrtkxz8/4uk1g1292937779.png
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292937668lc2qp7aiwrtkxz8/4uk1g1292937779.ps
 
Parameters (Session):
par1 = 1 ; par2 = none ; par4 = no ;
 
Parameters (R input):
par1 = 1 ; par2 = none ; par4 = no ;
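
In the R code that follows, par1 is the column index of the response, par2 selects how the response is categorized before the tree is fitted, and par4 switches the cross-validation table on. With par2 = none the response stays numeric and a regression tree is fitted; the other settings handled by the code (kmeans, quantiles, hclust, equal) first bin the response into par3 classes. A sketch of two of those binnings in base R: the 'equal' option as the module writes it, and a quantile-based binning that uses base `cut` with `quantile` breaks as a stand-in for `Hmisc::cut2`. The vector `resp` and the class count `k` are hypothetical:

```r
resp <- c(1635.25, 1940.49, 2304.98, 2604.42, 3030.29, 3494.17, 3917.96, 4696.96)
k <- 3   # hypothetical number of classes (plays the role of par3)

# Equal-width classes labelled C1..Ck (mirrors the 'equal' branch).
equal_w <- cut(resp, k, labels = paste('C', 1:k, sep = ''))

# Classes with roughly equal counts (base-R stand-in for cut2(..., g = k)).
brks <- quantile(resp, probs = seq(0, 1, length.out = k + 1))
equal_n <- cut(resp, breaks = brks, include.lowest = TRUE,
               labels = paste('C', 1:k, sep = ''))

print(table(equal_w))
print(table(equal_n))
```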
 
R code (references can be found in the software module):
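library(party)   # ctree(): conditional inference trees
library(Hmisc)   # cut2(): quantile-based binning
# par1..par4 and the data matrix y are not defined in this listing;
# they are assumed to be set by the server before the code runs.
par1 <- as.numeric(par1)   # column index of the response variable
par3 <- as.numeric(par3)   # number of categories (used when par2 != 'none')
x <- data.frame(t(y))      # one row per observation, one column per variable
is.data.frame(x)
x <- x[!is.na(x[,par1]),]  # drop rows with a missing response
k <- length(x[1,])         # number of variables
n <- length(x[,1])         # number of observations
colnames(x)[par1]
x[,par1]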
# Optionally discretize the response into par3 ordered classes C1..C<par3>.
if (par2 == 'kmeans') {
  cl <- kmeans(x[,par1], par3)
  print(cl)
  # Relabel the clusters so that C1 has the smallest center.
  clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
  clm <- clm[sort.list(clm[,1]),]
  for (i in 1:par3) {
    cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
  }
  cl$cluster <- as.factor(cl$cluster)
  print(cl$cluster)
  x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
  # Classes with approximately equal numbers of observations.
  x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
  # Centroid-linkage hierarchical clustering on squared distances.
  hc <- hclust(dist(x[,par1])^2, 'cen')
  print(hc)
  memb <- cutree(hc, k = par3)
  # Cluster means, used to order the labels from smallest to largest.
  dum <- c(mean(x[memb==1,par1]))
  for (i in 2:par3) {
    dum <- c(dum, mean(x[memb==i,par1]))
  }
  hcm <- matrix(cbind(dum,1:par3),ncol=2)
  hcm <- hcm[sort.list(hcm[,1]),]
  for (i in 1:par3) {
    memb[memb==hcm[i,2]] <- paste('C',i,sep='')
  }
  memb <- as.factor(memb)
  print(memb)
  x[,par1] <- memb
}
if (par2=='equal') {
  # Equal-width classes.
  ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
  x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
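# Numeric response: fit a regression tree on all remaining columns.
if (par2 == 'none') {
  m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
# table.start(), table.element(), etc. are not defined in this listing;
# presumably they are loaded from this server-side file.
load(file='createtable')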
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
# Ten independent random splits (about 90% training, 10% testing each).
# Note: rows are not partitioned into disjoint folds, so a row may be
# tested in several iterations or in none.
for (i in 1:10) {
  ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
  m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
  if (i==1) {
    m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
    m.ct.i.actu <- x[ind==1,par1]
    m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
    m.ct.x.actu <- x[ind==2,par1]
  } else {
    # Accumulate in-sample (.i.) and out-of-sample (.x.) predictions and actuals.
    m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
    m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
    m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
    m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
  }
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
# detcoef holds the Pearson correlation; its square is reported as R-squared below.
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
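
The table built in the code above is titled '10-Fold Cross Validation', but each of its ten iterations draws an independent random split with about 90% of the rows used for training, so a given row may be tested several times or never. For comparison, a base-R sketch of a partitioned 10-fold scheme in which every row is held out exactly once. The simulated data, the seed, and the use of `lm` as a placeholder model are assumptions made only to keep the example free of the party package; in the module's setting the `lm` call would be a `ctree` call:

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 109                           # same number of rows as the computation above
d <- data.frame(xv = rnorm(n))     # simulated predictor (hypothetical)
d$yv <- 2 * d$xv + rnorm(n)        # simulated response (hypothetical)

fold <- sample(rep(1:10, length.out = n))   # each row gets exactly one fold

pred <- numeric(n)
for (f in 1:10) {
  test <- fold == f
  fit <- lm(yv ~ xv, data = d[!test, ])             # placeholder for ctree(...)
  pred[test] <- predict(fit, newdata = d[test, ])   # out-of-fold predictions
}

cv_rmse <- sqrt(mean((d$yv - pred)^2))
print(table(fold))
print(cv_rmse)
```

Because the folds are disjoint, `pred` holds exactly one out-of-sample prediction per row, which makes the resulting error estimate easier to interpret than one pooled over overlapping random test sets.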
 





Copyright

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

Software written by Ed van Stee & Patrick Wessa

