A-3 streams



Reading

library(tidyverse)
library(ggformula)
library(ggprism)

streams <- read.csv("../../data/Art, Design, and Vocation are all diff-different(Sheet1).csv")
print(head(streams))
  SN Degree Course Year Letter.Grade Score Gender
1  1  B.Des    CAC    2            A   8.0      F
2  2  B.Des    CAC    2            O   9.6      F
3  3  B.Des   IADP    2           A+   9.2      F
4  4  B.Des     CE    2            O   9.8      F
5  5  B.Des   BSSD    2            P   3.0      M
6  6  B.Des    CAC    2            O   9.5      F

## Research Question: Do Art, Design, and Vocation students score differently?

Munging

streams_modified <- streams %>%
  mutate(
    Gender = as.factor(Gender),
    Degree = as.factor(Degree),
    Letter.Grade = as.factor(Letter.Grade),
    Course = as.factor(Course)
  )

EDA

gf_histogram(~Score,
  fill = ~Degree, 
  data = streams_modified, 
  alpha = 0.5,  
  bins = 25  
) %>%
  gf_vline(xintercept = ~ mean(Score, na.rm = TRUE),  
            linetype = "dashed", color = "red") %>%
  gf_labs(
    title = "Histogram of scores across degrees",
    x = "Scores", 
    y = "Count"
  ) %>%
  gf_text(
    label = "Overall Mean", 
    x = mean(streams_modified$Score, na.rm = TRUE),  
    y = 2, 
    color = "red"
  ) %>%
  gf_refine(guides(fill = guide_legend(title = ""))) 

gf_boxplot(
  data = streams_modified,
  Score ~ Degree,
  fill = ~Degree,
  alpha = 0.5
) %>%
  gf_hline(yintercept = ~ mean(Score, na.rm = TRUE)) %>% # overall mean; Score is on the y-axis
  gf_labs(
    title = "Boxplots of scores by degree",
    x = "Degree",
    y = "Scores"
  ) %>%
  gf_refine(
    scale_x_discrete(guide = "prism_bracket"),
    guides(fill = guide_legend(title = ""))
  )
Warning: The S3 guide system was deprecated in ggplot2 3.5.0.
ℹ It has been replaced by a ggproto system that can be extended.

  • B.Des has the highest median score and B.Voc the lowest; B.FA lies between the two.

  • B.Voc has outliers.

  • The B.Voc distribution is skewed left; B.Des scores are concentrated at the higher end.

  • The B.FA distribution appears evenly spread out.
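
The boxplot readings above can be backed with per-degree numbers. The following is a minimal sketch in base R, assuming a small made-up data frame (`toy`, with hypothetical scores, not the real CSV); with the real data, `streams_modified` would take its place.

```r
# Hypothetical stand-in data (made-up scores, NOT the real CSV)
toy <- data.frame(
  Degree = rep(c("B.Des", "B.FA", "B.Voc"), each = 4),
  Score  = c(9.6, 9.2, 9.8, 8.0,
             7.0, 7.5, 8.0, 8.5,
             9.0, 8.8, 3.0, 9.1)
)

# One row per degree: count, mean, median, and standard deviation
summary_by_degree <- do.call(
  rbind,
  lapply(split(toy$Score, toy$Degree), function(s) {
    c(n = length(s), mean = mean(s), median = median(s), sd = sd(s))
  })
)
print(round(summary_by_degree, 2))
```

In the toy data, the single low B.Voc score pulls that group's mean well below its median, which is the pattern a left-skewed group with low outliers tends to show.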

Anova

streams_anova <- aov(Score ~ Degree, data = streams_modified)
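
Before the pairwise comparisons, it may help to look at the omnibus F-test and to check the equal-variance assumption that `var_equal = TRUE` relies on. This is a sketch only: `toy` is simulated stand-in data (hypothetical group means, 30 rows per degree so that the residual degrees of freedom come to 87), and Bartlett's test is one possible check, not necessarily the one intended here.

```r
# Hypothetical stand-in data: simulated scores, NOT the real CSV
set.seed(42)
toy <- data.frame(
  Degree = factor(rep(c("B.Des", "B.FA", "B.Voc"), each = 30)),
  Score  = c(rnorm(30, 8.5, 1), rnorm(30, 7.8, 1), rnorm(30, 8.3, 1))
)

toy_anova <- aov(Score ~ Degree, data = toy)

# Omnibus F-test: is there any difference among the three group means?
anova_table <- summary(toy_anova)[[1]]
print(anova_table)

# Bartlett's test of equal variances across groups
# (a small p-value would cast doubt on var_equal = TRUE)
var_check <- bartlett.test(Score ~ Degree, data = toy)
print(var_check$p.value)
```

With the real data, `streams_anova` and `streams_modified` would replace `toy_anova` and `toy`.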

supernova::pairwise(streams_anova,
  correction = "Bonferroni", # Try "Tukey"
  alpha = 0.05, # 95% CI calculation
  var_equal = TRUE, # We'll see
  plot = TRUE
)

── Pairwise t-tests with Bonferroni correction ─────────────────────────────────
Model: Score ~ Degree
Degree
Levels: 3
Family-wise error-rate: 0.049

  group_1 group_2   diff pooled_se      t    df  lower  upper  p_adj
  <chr>   <chr>    <dbl>     <dbl>  <dbl> <int>  <dbl>  <dbl>  <dbl>
1 B.FA    B.Des   -0.757     0.201 -3.770    87 -1.191 -0.323  .0009
2 B.Voc   B.Des   -0.173     0.201 -0.864    87 -0.607  0.261 1.0000
3 B.Voc   B.FA     0.583     0.201  2.906    87  0.149  1.017  .0139
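
As a cross-check, the `p_adj` column can be approximately recovered from the reported `t` and `df` values. The sketch below assumes two-sided t-tests and the standard Bonferroni adjustment (raw p-value times the number of comparisons, capped at 1); the vector names are illustrative labels, not part of the `supernova` output.

```r
# t statistics and residual degrees of freedom copied from the table above
t_stat <- c(FA_vs_Des = -3.770, Voc_vs_Des = -0.864, Voc_vs_FA = 2.906)
df_resid <- 87

# Two-sided raw p-values from the t distribution
p_raw <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

# Bonferroni: multiply by the number of comparisons (3) and cap at 1
p_adj <- p.adjust(p_raw, method = "bonferroni")
print(round(p_adj, 4))
```

Small differences from the table are expected because the `t` values used here are already rounded to three decimals.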

Inferences

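  • There is a significant difference between the mean scores of B.FA and B.Des (p_adj = .0009).

  • There is also a significant difference between B.Voc and B.FA (p_adj = .0139), but not between B.Voc and B.Des (p_adj = 1.0000; the interval from -0.607 to 0.261 includes 0).

    Reading diff as group_1 minus group_2, the mean scores order as B.Des > B.Voc > B.FA, with B.Des and B.Voc not distinguishable.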