# add essential pkgs here
library(tidyverse)
## ── Attaching packages ──────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0 ✔ purrr 0.2.5
## ✔ tibble 1.4.2 ✔ dplyr 0.7.8
## ✔ tidyr 0.8.2 ✔ stringr 1.3.1
## ✔ readr 1.1.1 ✔ forcats 0.3.0
## ── Conflicts ─────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
options(stringsAsFactors = FALSE)
Load the data: posts and comments.
load("data/HatePolitics_Hang_20190101_2019_06_01.rda")
# your code should be here
slice(1:5)
## # A tibble: 5 x 2
## commentor n
## <chr> <int>
## 1 gn02118620 7362
## 2 zeumax 5908
## 3 ninaman 5237
## 4 bottger 5062
## 5 bluesunflowe 4421
# your code should be here
# your code should be here
slice(1:5)
## # A tibble: 5 x 2
## poster n
## <chr> <int>
## 1 mark2165 198
## 2 kolod546 106
## 3 sincsnow 86
## 4 c1951 85
## 5 fantasy14 82
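cindex gives the position of a comment within its post (that is, which comment of the post it is), and ctotal gives the total number of comments on the post that the comment belongs to. Use group_by() and mutate() to create cindex and ctotal, and row_number() to get the order of each comment.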
# your code should be here
# your code should be here
select(tag, commentor, cindex, ctotal) %>%
head(n=20)
## # A tibble: 20 x 4
## tag commentor cindex ctotal
## <chr> <chr> <int> <int>
## 1 "→ " tarcowang 1 103
## 2 "推 " alex8725 2 103
## 3 "→ " lovebxcx 3 103
## 4 "推 " want150 4 103
## 5 "→ " want150 5 103
## 6 "→ " lovebxcx 6 103
## 7 "→ " dai26 7 103
## 8 "→ " lovebxcx 8 103
## 9 "→ " lovebxcx 9 103
## 10 "→ " lovebxcx 10 103
## 11 "推 " f124 11 103
## 12 "推 " willy0526 12 103
## 13 "→ " lovebxcx 13 103
## 14 "→ " dodolong0310 14 103
## 15 "噓 " poolplayer 15 103
## 16 "推 " Stigmata 16 103
## 17 "推 " YINCHAUN 17 103
## 18 "噓 " simon78410 18 103
## 19 "→ " lovebxcx 19 103
## 20 "→ " lovebxcx 20 103
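The cindex and ctotal step described above can be sketched on toy data as follows. This is only an illustration under assumptions: post_id is a hypothetical name for whatever column identifies the parent post in the real comments data, and the rows are assumed to be stored in posting order within each post already.

```r
library(dplyr)

# Toy stand-in for the comments data.
# ASSUMPTION: `post_id` is a hypothetical column name for the post identifier.
toy_comments <- tibble(
  post_id   = c("p1", "p1", "p1", "p2", "p2"),
  commentor = c("a", "b", "a", "c", "a")
)

toy_indexed <- toy_comments %>%
  group_by(post_id) %>%
  mutate(
    cindex = row_number(),  # position of the comment within its post
    ctotal = n()            # total number of comments on that post
  ) %>%
  ungroup()

toy_indexed
```

If the rows are not in posting order, an arrange() on a timestamp column before mutate() would be needed; the source does not show which column that would be.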
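Compute avg_order, the average comment position of each commenter (for example, if someone commented at positions 3, 6, and 9 across three posts, their average comment position is 6). Dividing each position by the total number of comments on that post turns this into an average relative position (the example output below uses the version divided by the total number of comments). Also compute the standard deviation (sd_order) and each commenter's total number of comments (tot_comment).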
# your code should be here
# your code should be here
slice(1:10)
## # A tibble: 10 x 4
## commentor avg_order sd_order tot_comment
## <chr> <dbl> <dbl> <int>
## 1 jma306 0.0549 0.0753 13
## 2 ckbdfrst 0.0898 0.0537 14
## 3 popy8789 0.0961 0.0665 15
## 4 flavorBZ 0.0969 0.0833 18
## 5 wangyc 0.105 0.0621 14
## 6 GalLe5566 0.127 0.0566 14
## 7 WeiYinChen16 0.133 0.176 16
## 8 hachime 0.142 0.0998 20
## 9 Rrrxddd 0.156 0.202 32
## 10 jk952840 0.159 0.100 14
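A minimal sketch of the relative-position summary, using toy data that already carries cindex and ctotal. Commenter "a" reproduces the positions 3, 6, and 9 from the description, here on posts assumed to have 10 comments each. The final arrange() is an assumption based on the sample output appearing sorted by avg_order; any filtering applied to the sample output is not shown in the source and is not reproduced here.

```r
library(dplyr)

# Toy data with comment position (cindex) and post size (ctotal).
toy_indexed <- tibble(
  commentor = c("a", "a", "a", "b", "b"),
  cindex    = c(3L, 6L, 9L, 1L, 2L),
  ctotal    = c(10L, 10L, 10L, 4L, 4L)
)

toy_order <- toy_indexed %>%
  group_by(commentor) %>%
  summarise(
    avg_order   = mean(cindex / ctotal),  # mean relative position
    sd_order    = sd(cindex / ctotal),    # spread of relative position
    tot_comment = n()                     # comments made by this commenter
  ) %>%
  arrange(avg_order)                      # ASSUMPTION: ascending, as in the sample output

toy_order
```

Dropping the division by ctotal gives the plain average position instead (6 for commenter "a").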
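For each commenter, compute the following values:
1. The number of posts they commented on (several comments on the same post count only once).
2. The largest number of comments they made on a single post.
3. The smallest number of comments they made on a single post.
4. The mean number of comments they made per post.
5. The standard deviation of their comment counts per post.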
# your code should be here
# your code should be here
slice(1:30)
## # A tibble: 30 x 6
## commentor num_post max_c2p min_c2p mean_c2p sd_c2p
## <chr> <int> <dbl> <dbl> <dbl> <dbl>
## 1 aylao 1775 29 1 4.27 5.45
## 2 bottger 1569 148 1 15.0 27.2
## 3 ninaman 1497 87 1 16.3 18.1
## 4 zeumax 1286 110 1 17.4 22.1
## 5 gn02118620 1206 128 1 36.4 36.7
## 6 bluesunflowe 1197 86 1 18.3 21.3
## 7 pupu20317 1190 103 1 15.6 24.6
## 8 formatted 1114 77 1 15.6 18.6
## 9 alex8725 984 64 1 12.1 13.9
## 10 c1951 886 38 1 6.19 7.05
## # ... with 20 more rows
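One possible way to obtain these five values is a two-step aggregation: first count comments per commenter and post, then summarise those counts per commenter. The sketch below runs on toy data; post_id is a hypothetical name for the post identifier column, c2p is an illustrative intermediate name, and the descending sort on num_post is an assumption based on how the sample output appears to be ordered.

```r
library(dplyr)

# Toy stand-in for the comments data.
# ASSUMPTION: `post_id` is a hypothetical column name for the post identifier.
toy_comments <- tibble(
  post_id   = c("p1", "p1", "p1", "p2", "p3", "p3", "p1"),
  commentor = c("a",  "a",  "a",  "a",  "a",  "a",  "b")
)

toy_c2p <- toy_comments %>%
  count(commentor, post_id) %>%  # step 1: comments per commenter per post
  rename(c2p = n) %>%
  group_by(commentor) %>%        # step 2: summarise those counts per commenter
  summarise(
    num_post = n(),              # distinct posts commented on
    max_c2p  = max(c2p),
    min_c2p  = min(c2p),
    mean_c2p = mean(c2p),
    sd_c2p   = sd(c2p)           # NA when a commenter appears on only one post
  ) %>%
  arrange(desc(num_post))

toy_c2p
```

Commenters with a single post get NA for sd_c2p, because the standard deviation of one value is undefined.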
# your idea here
# your code here