Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*Unverified author*

R Software Module

rwasp_edauni.wasp

Title produced by software

Univariate Explorative Data Analysis

Date of computation

Thu, 23 Oct 2008 08:45:51 -0600

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2008/Oct/23/t1224773216heavqodxycqfkgi.htm/, Retrieved Sun, 19 May 2024 15:53:25 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=18531, Retrieved Sun, 19 May 2024 15:53:25 +0000


Original text written by user:

IsPrivate?

No (this computation is public)

User-defined keywords

Estimated Impact

167

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

F       [Univariate Explorative Data Analysis] [Univariate explor...] [2008-10-23 14:45:51] [4940af498c7c54f3992f17142bd40069] [Current]
-   PD    [Univariate Explorative Data Analysis] [] [2008-10-27 19:12:08] [888addc516c3b812dd7be4bd54caa358] 
-   P     [Univariate Explorative Data Analysis] [] [2008-10-27 19:24:14] [888addc516c3b812dd7be4bd54caa358] 

Feedback Forum

2008-10-30 12:54:03 [Tamara Witters] [reply] 
For each assumption the student looked at the wrong graph.
Here is the correct solution:

Assumption 1: Are the data autocorrelated? (The model assumes no autocorrelation)
*Graph: lag plot:
We can test this with the autocorrelation or the lag plot. Correlation is a measure of the extent to which the points lie on a straight line.
From the lag plot we can infer that the autocorrelation is very close to 0.
*Graph: autocorrelation function:
We can also look at the autocorrelation function; in that case it is best to set the number of lags to 36.
We see a first large correlation at lag 12 and a second large one at lag 24, from which we can conclude that there is seasonal correlation. On that basis we can also make a forecast for the future.
The horizontal dotted lines show the 95% confidence bounds. If the series were random, a sample autocorrelation would fall outside them with a probability of 5%.
Conclusion: the time series is not random and contains correlation, namely seasonal correlation.
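
A minimal sketch in R of the autocorrelation check described above. The original data series is not shown on this page, so the series `x` here is simulated (61 values with a 12-period seasonal pattern); the names `rho`, `bound` and `significant_lags` are illustrative only.

```r
# Hypothetical stand-in for the clothing production series: 61 values with a
# 12-period seasonal pattern plus noise (NOT the original data).
set.seed(1)
n <- 61
x <- as.ts(87 + 8 * sin(2 * pi * seq_len(n) / 12) + rnorm(n, sd = 3))

# Sample autocorrelations up to lag 36, computed without drawing the plot
r <- acf(x, lag.max = 36, plot = FALSE)
rho <- r$acf[-1]                      # drop lag 0, which is always 1

# Approximate 95% bounds under the hypothesis of no autocorrelation;
# these are the dotted lines that acf() draws
bound <- qnorm(0.975) / sqrt(n)

# Lags whose sample autocorrelation falls outside the bounds
significant_lags <- which(abs(rho) > bound)
print(significant_lags)
```

With a seasonal series, lags 12, 24 and 36 are the ones to inspect first in `significant_lags`.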
 
Assumption 2: Is the random component generated by a fixed distribution? (The model assumes a fixed distribution) 
 
For this we need to look at the histogram and the density plot.
The shape resembles a normal distribution (apart from an outlier on the left-hand side).
We also look at the Q-Q plot.
We draw an imaginary line through the points: do the points lie on this line?
The points lie fairly close to the line, hence: normal distribution.
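
The visual Q-Q judgement can be complemented with numbers. The sketch below uses simulated data in place of the original series, and the choice of the Q-Q correlation and the Shapiro-Wilk test is an illustrative assumption, not part of the original computation.

```r
# Hypothetical stand-in for the series (NOT the original data)
set.seed(2)
x <- rnorm(61, mean = 87, sd = 9)

# Coordinates of the normal Q-Q plot, without drawing it
qq <- qqnorm(x, plot.it = FALSE)

# Correlation between theoretical and sample quantiles:
# values close to 1 mean the points lie close to a straight line
ppcc <- cor(qq$x, qq$y)

# Shapiro-Wilk test; a small p-value is evidence against normality
sw <- shapiro.test(x)

print(ppcc)
print(sw$p.value)
```

Both figures assume independent observations, so for an autocorrelated series they should be read as rough indications only.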
 
Assumption 3: Is the deterministic component constant? (The model assumes that the distribution has a fixed location) 
 
Here we look at the run sequence plot.
In the long term the level of this series is not constant.
We need to ask whether the mean is constant, so we look at the central tendency. The robustness of central tendency does show a fairly constant course, BUT in the long term we suspect a downward trend.
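
One way to make the suspected trend concrete is to compare a few location estimates and fit a straight line against the time index. This is a sketch on simulated data with an assumed downward drift; `t_idx`, `fit`, `slope` and `p_value` are hypothetical names.

```r
# Hypothetical series with a downward drift (NOT the original data)
set.seed(3)
n <- 61
t_idx <- seq_len(n)
x <- 95 - 0.25 * t_idx + rnorm(n, sd = 5)

# Non-robust and robust estimates of the central tendency
print(c(mean = mean(x), median = median(x), trimmed = mean(x, trim = 0.1)))

# Linear trend: a clearly negative slope points to a non-constant location
fit <- lm(x ~ t_idx)
slope <- coef(fit)[["t_idx"]]
p_value <- summary(fit)$coefficients["t_idx", "Pr(>|t|)"]
print(c(slope = slope, p_value = p_value))
```

The p-value from `lm` assumes uncorrelated errors, so for a seasonally correlated series it is only indicative.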
 
 
Assumption 4: Does the random component have a fixed variation? (The model assumes a distribution with fixed variation)

For this we also use the run sequence plot. We look at the spread of the series over time. The left-hand part of the series fluctuates more strongly. Hence the spread changes over the years.
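
The comparison of spread between the earlier and later part of the series can be sketched as follows. The data are simulated with a more volatile first half, and the use of an F test (`var.test`) is an illustrative choice rather than the method used in the original computation.

```r
# Hypothetical series whose first half fluctuates more (NOT the original data)
set.seed(4)
n <- 61
x <- c(rnorm(30, mean = 87, sd = 10), rnorm(31, mean = 87, sd = 4))

# Label each observation as belonging to the first or second half
half <- factor(ifelse(seq_len(n) <= n %/% 2, "first", "second"))

# Standard deviation per half
sds <- tapply(x, half, sd)
print(sds)

# F test for equal variances in the two halves
ft <- var.test(x[half == "first"], x[half == "second"])
print(ft$p.value)
```

The F test assumes normal, independent observations; a rank-based alternative such as `fligner.test` is less sensitive to outliers.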
 
 
Conclusion:
Not all conditions are met, so the time series does not fully fit the model Clothing Production = constant + random component, because there is seasonal correlation.
 
2008-11-01 13:45:04 [66991d38d6a4b2d9fe97b6c889f3689c] [reply] 
Assumption 1:
The student uses the wrong graph here.
We should use the lag plot and the partial autocorrelation.
For the lag plot we can set the number of lags to 12 or 36.
It shows whether the previous observation tells us anything about the current observation.
With the partial autocorrelation set to 36 lags, we can clearly see that all autocorrelations before and after lag 12 are either not representative (in other words, attributable to chance) or negative.
Most of these values lie within the confidence interval; the few that exceed it are not representative, because their probability of occurring is much smaller than that of the value at lag 12.
For assumption 1 we conclude that the time series is not random but contains correlation, in this case a special kind: seasonal correlation.
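
A short sketch of the partial autocorrelation check with 36 lags. The series is simulated from an assumed seasonal autoregression at lag 12, since the original data are not shown on this page; `phi` and `bound` are illustrative names.

```r
# Hypothetical seasonal series: x[t] depends on x[t-12] (NOT the original data)
set.seed(5)
n <- 61
x <- arima.sim(model = list(ar = c(rep(0, 11), 0.7)), n = n)

# Sample partial autocorrelations for lags 1..36 (pacf has no lag 0 entry)
p <- pacf(x, lag.max = 36, plot = FALSE)
phi <- as.numeric(p$acf)

# Approximate 95% bounds under the hypothesis of no autocorrelation
bound <- qnorm(0.975) / sqrt(n)

# Lags whose partial autocorrelation falls outside the bounds
print(which(abs(phi) > bound))
```

With 36 lags, one or two values can be expected to fall outside the 95% bounds by chance alone, which is why isolated exceedances away from lag 12 carry little weight.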
 
Assumption 2:
The student uses the correct graphs.
However, what we read from these graphs is a normal distribution. In the density plot we look for a bell shape, in the histogram for a pyramid shape. Apart from one outlier these shapes are attained, and this outlier is not a cause for concern. We can therefore conclude that this is a normal distribution.
This conclusion is also confirmed by the Q-Q plot.
Remark: here there is a normal distribution in the presence of autocorrelation.
This supports the earlier rule that when there is no correlation a normal distribution always occurs, but that the converse does not hold (when there is autocorrelation, the distribution is not necessarily non-normal).
 
Assumption 3:
Here I fully agree with the previous student's blog post.

Assumption 4:
Here we use the run sequence plot.
We look at the spread of the series over time.
We split the graph into two parts. The spread of the first part is larger than that of the second part.
We conclude that the fluctuation changes, so there is no fixed variation.
 
 
Conclusion: the time series does not satisfy all validity conditions, so the model Clothing Production = constant + random component is not a valid model for it.


Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	2 seconds
R Server	'Gwilym Jenkins' @ 72.249.127.135


Descriptive Statistics
# observations	61
minimum	66.5
Q1	80.6
median	87.3
mean	86.8934426229508
Q3	94.1
maximum	109.7


Figures 1 to 7: plots produced by the R code below (images and their PNG, Postscript, and PDF links are not reproduced here).

Parameters (Session):

par1 = 0 ; par2 = 0 ;

Parameters (R input):

par1 = 0 ; par2 = 12 ;

R code (references can be found in the software module):

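# Convert the module parameters to numbers
par1 <- as.numeric(par1)  # density bandwidth (0 = lattice default)
par2 <- as.numeric(par2)  # lag used for the lag plots and ACF (0 = skip them)
x <- as.ts(x)
library(lattice)
# Run sequence plot
bitmap(file='pic1.png')
plot(x,type='l',main='Run Sequence Plot',xlab='time or index',ylab='value')
grid()
dev.off()
# Histogram
bitmap(file='pic2.png')
hist(x)
grid()
dev.off()
# Density plot; lattice objects are printed explicitly so that they are
# drawn even when the script is not evaluated at top level
bitmap(file='pic3.png')
if (par1 > 0)
{
print(densityplot(~x,col='black',main=paste('Density Plot   bw = ',par1),bw=par1))
} else {
print(densityplot(~x,col='black',main='Density Plot'))
}
dev.off()
# Normal Q-Q plot with reference line
bitmap(file='pic4.png')
qqnorm(x)
qqline(x)
grid()
dev.off()
if (par2 > 0)
{
# Lag plot for k=1: pair each observation with its neighbour
bitmap(file='lagplot1.png')
dum <- cbind(lag(x,k=1),x)
dum1 <- dum[2:length(x),]  # keep only rows where both series are observed
z <- as.data.frame(dum1)
plot(z,main='Lag plot (k=1), lowess, and regression line')
lines(lowess(z))
# Regress the plotted y variable (column 2) on the x variable (column 1);
# lm(z) would regress column 1 on column 2 and mismatch the plot axes
abline(lm(z[[2]] ~ z[[1]]))
dev.off()
if (par2 > 1) {
# Lag plot for k=par2
bitmap(file='lagplotpar2.png')
dum <- cbind(lag(x,k=par2),x)
dum1 <- dum[(par2+1):length(x),]  # keep only rows where both series are observed
z <- as.data.frame(dum1)
mylagtitle <- paste('Lag plot (k=',par2,'), and lowess',sep='')
plot(z,main=mylagtitle)
lines(lowess(z))
dev.off()
}
# Autocorrelation function up to lag par2
bitmap(file='pic5.png')
acf(x,lag.max=par2,main='Autocorrelation Function')
grid()
dev.off()
}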
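# Print the summary statistics to the raw R output
summary(x)
# Load the table helper functions used below (table.start, table.element, ...);
# they are assumed to be defined in this server-side file
load(file='createtable')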
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Descriptive Statistics',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'# observations',header=TRUE)
a<-table.element(a,length(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'minimum',header=TRUE)
a<-table.element(a,min(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Q1',header=TRUE)
a<-table.element(a,quantile(x,0.25))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'median',header=TRUE)
a<-table.element(a,median(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'mean',header=TRUE)
a<-table.element(a,mean(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Q3',header=TRUE)
a<-table.element(a,quantile(x,0.75))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'maximum',header=TRUE)
a<-table.element(a,max(x))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable.tab')
