Tuesday, 5 January 2016

Final test for statistical significance: bird aggression scores

Introduction

I'm doing a small pilot study on bird aggression at a colonisation frontier within the birds' breeding grounds.

Background

The study was conducted over multiple years. Collared flycatcher males at the colonising (south) and settled (north) sites were presented with conspecific and pied flycatcher male dummies, and their behaviour was scored on a quantifiable set of aggressive actions. Collared flycatchers arrived on this island 60 years ago and have been steadily spreading from one point across the breeding grounds, pushing their relative, the pied flycatcher, out of the more insect-rich territories. Previous studies showed that more aggressive males are found at the front of such a colonisation. The northern sites are now nearly 100% collared flycatcher, while the south still has a mixed population.

Hypotheses

In the south, male collared flycatchers will show higher aggression towards both species. In the north, males will react relatively more strongly to conspecifics than they do in the south.

Problem

Having scored all the interactions, I'm now at a loss as to which test to use to present the data. People give differing advice (lm, a simple ANOVA, etc.). I have been learning R and statistics at the same time, but many terms still confuse me, and I find the questions and answers on the internet difficult to apply to my data. (This is where I bother you.)

Question

Which of the following three tests is best suited to show whether or not there is a statistically significant effect?

  • Anova(lm(score~dummy_species*location))
  • summary(aov(score~dummy_species*location))
  • summary(lm(score~dummy_species*location))
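The three calls do not test the same hypotheses when the design is unbalanced. As a minimal sketch on simulated data (the data frame `sim`, the seed and the effect sizes are invented for illustration; only the cell counts come from this question), the following shows that the sequential (Type I) sums of squares reported by `aov` depend on the order of the terms, while the Type II test for a main effect, which is what `Anova` from the car package reports by default, can be reproduced in base R by comparing two nested additive models.

```r
set.seed(1)  # arbitrary seed for the illustration

# Hypothetical unbalanced design using the cell counts from the question
cells <- data.frame(
  dummy_species = c("CF", "CF", "PF", "PF"),
  location      = c("N",  "S",  "N",  "S"),
  n             = c(77,   27,   36,   14)
)
sim <- cells[rep(seq_len(nrow(cells)), cells$n), c("dummy_species", "location")]
sim$dummy_species <- factor(sim$dummy_species)
sim$location      <- factor(sim$location)

# Invented effects: purely illustrative, not the real bird scores
sim$score <- 3 - 1.5 * (sim$dummy_species == "PF") +
  1 * (sim$location == "S") + rnorm(nrow(sim), sd = 3)

# Type I (sequential): each term is adjusted only for the terms before it
seq_species_first  <- anova(lm(score ~ dummy_species * location, data = sim))
seq_location_first <- anova(lm(score ~ location * dummy_species, data = sim))

# Type II for dummy_species: adjusted for location, ignoring the interaction
type2_species <- anova(lm(score ~ location, data = sim),
                       lm(score ~ location + dummy_species, data = sim))

ss_first  <- seq_species_first["dummy_species", "Sum Sq"]
ss_second <- seq_location_first["dummy_species", "Sum Sq"]
ss_type2  <- type2_species[2, "Sum of Sq"]

print(c(species_entered_first  = ss_first,
        species_entered_second = ss_second,
        type2                  = ss_type2))
```

In this sketch the sum of squares for `dummy_species` changes with its position in the formula, and the value obtained when it is entered second equals the Type II value. The same pattern appears in the real output: `dummy_species` has 91.8 under `summary(aov())` but 93.91 under `Anova()`, while `location`, which is entered second, has 26.82 in both.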

Data structure

The data is unfortunately unbalanced.

There were 104 conspecific trials, of which 77 were in the northern test area and 27 in the south. Similarly, of the 50 pied flycatcher dummy trials, 36 were in the north and 14 in the south.
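A small sketch of what "unbalanced" means for these counts (the object names are made up for illustration). It checks whether all cells have the same number of trials, and whether the counts are at least proportional across rows, a weaker condition under which the two main effects would still be orthogonal.

```r
# Trial counts copied from the paragraph above
counts <- matrix(c(77, 27,
                   36, 14),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(dummy_species = c("CF", "PF"),
                                 location      = c("N", "S")))

print(rowSums(counts))  # trials per dummy species
print(colSums(counts))  # trials per location

# Balanced: every cell has the same number of trials
is_balanced <- length(unique(as.vector(counts))) == 1

# Proportional: each cell equals (row total * column total) / grand total
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
is_proportional <- isTRUE(all.equal(counts, expected, check.attributes = FALSE))

print(c(balanced = is_balanced, proportional = is_proportional))
```

Neither condition holds here, so the order in which the factors enter a sequential ANOVA can change the sums of squares.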

'data.frame':   154 obs. of  8 variables:
 $ location        : Factor w/ 2 levels "N","S": 1 1 1 1 1 1 1 1 2 1 ...
 $ score           : int  1 4 0 1 1 8 9 9 4 3 ...
 $ dummy_species   : Factor w/ 2 levels "CF","PF": 1 1 2 2 1 1 1 1 1 2 ...


model.tables(aov(score~dummy_species*location),"means")

Tables of means
Grand mean

2.993506

 dummy_species
         CF    PF
      3.529  1.88
rep 104.000 50.00

 location
          N      S
      2.742  3.686
rep 113.000 41.000

 dummy_species:location
             location
dummy_species N     S
          CF   3.19  4.48
          rep 77.00 27.00
          PF   1.81  2.07
          rep 36.00 14.00
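As a rough arithmetic check on the table of means, the sketch below recomputes the marginal means as trial-weighted averages of the four cell means. The numbers are copied from the table (rounded to two decimals there, so small discrepancies are expected); the object names are made up for illustration.

```r
# Cell means and trial counts copied from the table of means
cell_mean <- matrix(c(3.19, 4.48,
                      1.81, 2.07),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(dummy_species = c("CF", "PF"),
                                    location      = c("N", "S")))
cell_n <- matrix(c(77, 27,
                   36, 14),
                 nrow = 2, byrow = TRUE,
                 dimnames = dimnames(cell_mean))

# Raw marginal means: averages of the cell means weighted by trial counts
species_mean  <- rowSums(cell_mean * cell_n) / rowSums(cell_n)
location_mean <- colSums(cell_mean * cell_n) / colSums(cell_n)
grand_mean    <- sum(cell_mean * cell_n) / sum(cell_n)

print(round(species_mean, 3))
print(round(location_mean, 3))
print(round(grand_mean, 3))
```

The species means and the grand mean agree with the table to within rounding. The raw location means come out near 2.75 (N) and 3.66 (S) rather than the printed 2.742 and 3.686. This suggests that the location means in the table are adjusted for dummy_species rather than being raw averages; that is an inference from the numbers, not something the output states.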

TukeyHSD(aov(score~dummy_species*location))

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = score ~ dummy_species * location)

$dummy_species
           diff       lwr        upr     p adj
PF-CF -1.648846 -2.613568 -0.6841239 0.0009332

$location
         diff         lwr    upr     p adj
S-N 0.9440487 -0.07800284 1.9661 0.0699746

$`dummy_species:location`
               diff        lwr        upr     p adj
PF:N-CF:N -1.389250 -2.8774793 0.09898005 0.0766924
CF:S-CF:N  1.286676 -0.3619293 2.93528192 0.1824646
PF:S-CF:N -1.123377 -3.2649782 1.01822492 0.5246337
CF:S-PF:N  2.675926  0.7993571 4.55249475 0.0016744
PF:S-PF:N  0.265873 -2.0557788 2.58752484 0.9908082
PF:S-CF:S -2.410053 -4.8376320 0.01752615 0.0524523

Results

Anova(lm(score~dummy_species*location))
Anova Table (Type II tests)

Response: score
                        Sum Sq  Df F value    Pr(>F)
dummy_species            93.91   1 11.6673 0.0008186 ***
location                 26.82   1  3.3326 0.0699100 .
dummy_species:location    6.98   1  0.8675 0.3531437
Residuals              1207.39 150
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
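To make the columns of this table concrete, here is a short sketch that recomputes the F values and p-values from the printed sums of squares. The inputs are the rounded numbers from the table, so the results match only to about three significant digits; the variable names are made up for illustration.

```r
# Sums of squares and degrees of freedom copied from the Type II table
ss <- c(dummy_species = 93.91, location = 26.82,
        "dummy_species:location" = 6.98)
df <- c(1, 1, 1)
ss_resid <- 1207.39
df_resid <- 150

ms_resid <- ss_resid / df_resid      # residual mean square
f_value  <- (ss / df) / ms_resid     # F = effect mean square / residual mean square
p_value  <- pf(f_value, df, df_resid, lower.tail = FALSE)
names(p_value) <- names(f_value)

print(round(cbind(F = f_value, p = p_value), 4))
```

Each test compares the variation attributed to one term with the residual variation, using the same residual mean square (about 8.05) for every row.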


> summary(aov(score~dummy_species*location))
                        Df Sum Sq Mean Sq F value   Pr(>F)
dummy_species            1   91.8   91.80  11.405 0.000933 ***
location                 1   26.8   26.82   3.333 0.069910 .
dummy_species:location   1    7.0    6.98   0.868 0.353144
Residuals              150 1207.4    8.05
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


> summary(lm(score~dummy_species*location))

Call:
lm(formula = score ~ dummy_species * location)

Residuals:
    Min      1Q  Median      3Q     Max
-4.4815 -2.1948 -0.8056  2.1280  6.9286

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                 3.1948     0.3233   9.881   <2e-16 ***
dummy_speciesPF            -1.3892     0.5728  -2.425   0.0165 *
locationS                   1.2867     0.6346   2.028   0.0444 *
dummy_speciesPF:locationS  -1.0208     1.0960  -0.931   0.3531
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.837 on 150 degrees of freedom
Multiple R-squared:  0.09423,   Adjusted R-squared:  0.07611
F-statistic: 5.202 on 3 and 150 DF,  p-value: 0.001909
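The coefficients above can be read as cell means under R's default treatment coding, with CF and N as the reference levels. The sketch below rebuilds the four cell means from the printed estimates; the vector names are made up for illustration.

```r
# Estimates copied from summary(lm(...)); treatment contrasts with
# CF and N as reference levels (R's default for unordered factors)
b <- c(intercept = 3.1948, PF = -1.3892, S = 1.2867, PF_S = -1.0208)

cell_means <- c(
  CF_N = b[["intercept"]],
  PF_N = b[["intercept"]] + b[["PF"]],
  CF_S = b[["intercept"]] + b[["S"]],
  PF_S = b[["intercept"]] + b[["PF"]] + b[["S"]] + b[["PF_S"]]
)
print(round(cell_means, 2))

# The interaction coefficient is a difference of differences:
# (S - N for PF dummies) minus (S - N for CF dummies)
interaction_coef <- (cell_means[["PF_S"]] - cell_means[["PF_N"]]) -
                    (cell_means[["CF_S"]] - cell_means[["CF_N"]])
print(interaction_coef)
```

Read this way, the `locationS` row (p = 0.0444) tests the south minus north difference for CF dummies only, and the `dummy_speciesPF` row tests the PF minus CF difference in the north only. Neither is a main effect averaged over the other factor, which would explain why these p-values differ from those in the two ANOVA tables. Only the interaction row tests the same thing in all three outputs (p = 0.3531).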

Thank you for taking the time to have a look. Ideally, given the time invested (in the field and behind the screen), I would love to be able to conclude that male aggression is likely influenced by both location and species, but only if the lm approach is appropriate.

Also my apologies for the layout of my question.
