@billfitzgerald
Created October 27, 2021 02:46

Revisions

  1. billfitzgerald created this gist Oct 27, 2021.
    618 changes: 618 additions & 0 deletions pathways_to_nudity.txt
    @@ -0,0 +1,618 @@

    Identifying pathways to harmful
    groups about nudity

    A key component of the Drebbel system is to discover pathways to harmful entities a user might
    take when engaging with our recommendation surfaces. As part of this effort, we have built a
    workflow to identify entities that act as gateways to recognized harmful entities. In this note, we
    apply this workflow to focus on groups considered harmful due to nudity and sexual activity.

    * Gateway groups for nudity/sexual activity harm seem to facilitate eventual connections
      to non-rec Groups. We should consider interventions that are either targeted towards
      users in these gateway groups, or applied at the entity level, in order to prevent these
      downstream connections from happening.

    * Specific interventions we propose include: GYSJ seed filtering, invite friction and
      entity-level demotion. We are working with the Deamplification team to pursue experiments
      at both the entity level and the edge level.

    * We should stress, however, that not all gateway groups are potentially problematic in
      and of themselves; we should use other signals of harm (e.g., number of members
      flagged as non-rec, group demotion score etc.) in conjunction to determine the ones
      that we want to consider enforcing on more aggressively.

    * In addition, we believe Gateway groups can be used as (sparse) features to improve
      recall of existing models. We are working with the Entity & Actor Understanding team to
      evaluate models using these groups as features.

    REDACTED FOR CONGRESS



    Quick refresher on Gateway groups

    As part of studying pathways to harmful entities, we wanted to explore the question “Are there
    groups that facilitated and increased the probability of a user joining harmful groups?” We call
    such groups gateway groups as they often lead people to join harmful groups.

    Here, we provide a brief overview of how we detect gateway groups. For thorough details see this
    note.

    [Figure: Probability of joining harmful groups spikes after joining gateway group. The
    model evaluates the probability of a user joining harmful groups in order to detect the
    gateway groups.]

    To answer the question, we first build a classifier that, given a list of groups joined by a user, can
    predict with high accuracy whether the user will end up joining a given target harmful group. For a
    particular user, after every group they join, we evaluate the probability of them joining a harmful
    group in the future. If this probability spikes after a group join, that is a sign that the group just
    joined might be a gateway. If this spike happens for multiple users, after joining the same group,
    we identify it as a gateway group.
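
The detection loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual implementation: the names `find_gateway_groups` and `toy_model`, the `spike_threshold` of 0.2, and the `min_users` cutoff of 2 are all assumptions made for the example, and the note does not say how a "spike" or "multiple users" is defined in practice.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence


def find_gateway_groups(
    join_histories: Dict[str, List[str]],
    predict_harm_probability: Callable[[Sequence[str]], float],
    spike_threshold: float = 0.2,  # assumed definition of a "spike"
    min_users: int = 2,            # assumed meaning of "multiple users"
) -> List[str]:
    """Return groups after which the predicted probability spiked for enough users."""
    spike_counts: Counter = Counter()
    for groups in join_histories.values():
        previous = predict_harm_probability([])
        spiked_for_user = set()
        # Re-score the user after every join, in join order.
        for i, group in enumerate(groups, start=1):
            current = predict_harm_probability(groups[:i])
            if current - previous >= spike_threshold:
                spiked_for_user.add(group)
            previous = current
        # Count each group at most once per user.
        spike_counts.update(spiked_for_user)
    return sorted(g for g, n in spike_counts.items() if n >= min_users)


def toy_model(groups: Sequence[str]) -> float:
    """Hypothetical stand-in for the trained classifier."""
    return 0.6 if "g_gateway" in groups else 0.05


# Made-up join histories, ordered by join time.
histories = {
    "u1": ["g_cooking", "g_gateway", "g_hiking"],
    "u2": ["g_gateway", "g_music"],
    "u3": ["g_cooking", "g_music"],
}
gateways = find_gateway_groups(histories, toy_model)
print(gateways)
```

Counting each group once per user keeps a single very active user from making a group look like a gateway on their own.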

    For this note, the set of target groups consists of groups based in the US with at least 60
    content-level strikes for nudity and sexual activity in the month of March (source table
    @au_nudity_sexual_activity_strike_harm_source:integrity).

    What pathways lead from gateway groups to harmful nudity groups?

    source                       num        confirmed_joins
    gysj                         1326540    1234089
    mobile_group_join            800422     737317
    mobile_add_members           653997     408187
    [illegible]                  470540     423893
    search                       247682     225847
    group_mall                   239872     207585
    newsfeed_story_header        208814     185000
    newsfeed_reshared_story      202309     182748
    [illegible]                  182315     166570
    [illegible]                  132268     120918
    [illegible]                  106177     93785
    [illegible]                  88839      58065
    [illegible]                  61462      54135
    [illegible]                  45458      43628
    [...]enger_group_attachment  38879      35208



    [...] sources of joins of gateway group members to target harmful groups over all time. We [...]
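
One way to read the table above is to compare how often a join attempt from each source ends up confirmed. The sketch below uses only the rows whose source name is legible. The note does not define `num` or `confirmed_joins`; the code assumes `num` counts join events and `confirmed_joins` counts the subset that were confirmed, and the helper name `confirmation_rates` is made up for this example.

```python
# Legible rows from the table above: source -> (num, confirmed_joins).
JOIN_SOURCES = {
    "gysj": (1326540, 1234089),
    "mobile_group_join": (800422, 737317),
    "mobile_add_members": (653997, 408187),
    "search": (247682, 225847),
    "group_mall": (239872, 207585),
    "newsfeed_story_header": (208814, 185000),
    "newsfeed_reshared_story": (202309, 182748),
}


def confirmation_rates(rows):
    """Return confirmed_joins / num per source (assumed interpretation of the columns)."""
    return {source: confirmed / num for source, (num, confirmed) in rows.items()}


rates = confirmation_rates(JOIN_SOURCES)
# Print sources from lowest to highest confirmation rate.
for source, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{source:<26}{rate:.3f}")
```

Under that reading, `mobile_add_members` stands out with a much lower ratio than the other legible sources, which would be consistent with members being added by someone else rather than joining on their own.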



    source                     num       confirmed_joins
    [illegible]                320524    [illegible]
    [illegible]                268211    [illegible]
    [illegible]                251610    [illegible]
    [illegible]                149706    151795
    newsfeed_story_header      148850    134951
    newsfeed_reshared_story    142128    127599
    mobile_add_members         118133    63896
    [illegible]                62775     55977
    groups_discover_tab        45399     38031
    permalink                  40290     35186
    search                     35605     29506
    [illegible]                22375     18304
    [illegible]                21895     19170
    [illegible]                16014     [illegible]
    [illegible]                14232     [illegible]
    [illegible]                10827     5444

















    Is GYSJ a pathway from nudity gateway groups to other non-rec groups?

    * Users in gateway groups subsequently join non-rec groups because of exposure to
      GYSJ recommendations

    Results
    * 10.77% of users who joined one of the top 100 gateway groups we identify (ranked by highest
      gateway score) eventually joined a non-rec group through exposure to GYSJ, vs. 8.78% of
      those who had no exposure to GYSJ

    Mitigations
    * We should consider filtering out the top gateway groups from GYSJ seeds
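
To put the two rates in the Results above side by side, the following sketch computes the absolute and relative difference between them. It is plain arithmetic on the reported percentages; the function name `exposure_lift` is made up for this example, and nothing here says whether the difference is statistically significant, since the note gives no sample sizes.

```python
def exposure_lift(exposed_rate_pct: float, unexposed_rate_pct: float):
    """Return (difference in percentage points, relative increase as a fraction)."""
    if unexposed_rate_pct <= 0:
        raise ValueError("unexposed rate must be positive")
    diff_pp = exposed_rate_pct - unexposed_rate_pct
    return diff_pp, diff_pp / unexposed_rate_pct


# Rates reported in the note: 10.77% with GYSJ exposure vs. 8.78% without.
abs_diff_pp, relative_lift = exposure_lift(10.77, 8.78)
print(f"{abs_diff_pp:.2f} percentage points, {relative_lift:.1%} relative increase")
```

The gap is about 2 percentage points, or roughly 23% in relative terms.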

    [...] gateway groups being targeted by "super-inviters"?

    * [...] a big source of invitations from gateway groups

    * [...]red in PYMI invitations join more non-rec groups

    * [...]s join more non-rec groups through PYMK (friending > inv[...]

    * 35% of invites (~730K) to these harmful groups went to members after they joined one
      of the top 100 gateway groups. Of these 730K invites, 20% came from "super-inviters"

    * We did not see evidence supporting the PYMI hypothesis; roughly equal fractions of
      users in the control and test groups of the long-term PYMI holdout eventually joined
      non-rec groups.

    * We also did not see enough evidence to suggest that PYMK influences connections to
      harmful groups, either through featuring more users as candidates or through showing
      them more friend recommendations

    * Introduce feature limits on super-inviters, e.g., on the number of bulk invites that can be
      sent by super-inviters. We can make this more targeted by focusing only on invites going
      to users in a gateway group, but this is a more intrusive enforcement and would
      require more thought about how we communicate this intervention to the actor.
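
The invite figures above imply a few derived quantities that are not stated in the note. The sketch below works them out; because "35%", "~730K" and "20%" are rounded in the source, the results are rough estimates only, and the variable names are chosen for this example.

```python
# Figures quoted in the note (all rounded there).
invites_to_gateway_members = 730_000  # "~730K"
share_of_all_invites = 0.35           # "35% of invites"
super_inviter_share = 0.20            # "20% came from super-inviters"

# Implied total number of invites to the target harmful groups.
total_invites = invites_to_gateway_members / share_of_all_invites

# Invites to gateway-group members that came from super-inviters.
super_inviter_invites = invites_to_gateway_members * super_inviter_share

# The same invites expressed as a share of all invites to the target groups.
super_inviter_share_of_total = share_of_all_invites * super_inviter_share

print(f"total invites: ~{total_invites:,.0f}")
print(f"from super-inviters to gateway members: ~{super_inviter_invites:,.0f}")
print(f"share of all invites: ~{super_inviter_share_of_total:.0%}")
```

On these numbers, super-inviter invites to gateway-group members are about 146K, or roughly 7% of an estimated 2.1M invites overall.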



    Non-rec groups

    [...] themselves good predictors of non-rec groups

    Results
    * Out of the top 100 gateway groups for the nudity harm target list, 47 are correctly
      labeled non-rec; importantly, 42 of these were labeled as non-rec after the workflow
      ran. Although the model is not intended for predicting overall non-rec signal (the model
      is trained on a specific subset of harm strikes, nudity & sexual activity, and so
      would miss out on groups determined non-rec for other harms), this is nonetheless a
      strong indicator of how important the model could be as an upstream signal.
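
The result above is essentially a precision-at-k measurement against the non-rec label. A minimal sketch, assuming a ranked list of group ids and a set of groups labeled non-rec (both hypothetical inputs, as is the helper name `precision_at_k`), followed by the note's own figures:

```python
from typing import Sequence, Set


def precision_at_k(ranked_groups: Sequence[str], non_rec_groups: Set[str], k: int) -> float:
    """Fraction of the top-k ranked groups that carry the non-rec label."""
    top_k = ranked_groups[:k]
    if not top_k:
        raise ValueError("need at least one ranked group")
    hits = sum(1 for group in top_k if group in non_rec_groups)
    return hits / len(top_k)


# Toy check with made-up group ids.
toy_precision = precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=4)

# Figures reported in the note.
top_k_size = 100
labeled_non_rec = 47          # of the top 100 gateway groups
labeled_after_workflow = 42   # of those 47, labeled only after the workflow ran

precision = labeled_non_rec / top_k_size
labeled_later_fraction = labeled_after_workflow / labeled_non_rec
print(f"precision@100 = {precision:.2f}, labeled later = {labeled_later_fraction:.1%}")
```

The second ratio is the one the note stresses: about 89% of the matching groups received their non-rec label only after the workflow had already surfaced them.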

    Mitigations
    * We should use gateway groups as a (sparse) feature powering our entity models for
      determining non-amplifiable and non-rec entities.

    * In conjunction with other signals, such as content strike roll-ups, number of non-rec
      members, and entity strikes, we can pursue entity-level demotions. Our signal has high
      correlation with the number of group members considered non-rec and has positive
      correlation with other signals such as strikes and the CPI non-amplifiable flag.







    [Figure: correlation heatmap (color scale 0.0 to 1.0) over gateway_score, ci_ri_strikes,
    num_nr_members, ci_ri_severe_strikes, group_demote, non_amp and non_rec. Legible values in
    the gateway_score row: 0.079, 0.23, -0.031, 0.052, 0.025, 0.085; the remaining cells are only
    partly legible.]
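
A correlation matrix like the one in the figure can be computed from per-group signal columns. The note does not say which coefficient was used; the sketch below assumes Pearson correlation, and the `toy_signals` values are made up purely to exercise the code. The function names are illustrative as well.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise Pearson correlations between all named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up per-group values; one list entry per group.
toy_signals = {
    "gateway_score": [0.1, 0.4, 0.35, 0.8],
    "num_nr_members": [2.0, 9.0, 7.0, 20.0],
    "non_rec": [0.0, 0.0, 1.0, 1.0],
}
matrix = correlation_matrix(toy_signals)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```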



    From an ads perspective this might be an interesting feature to identify advertisers,
    business, or other commercial entities that might be worth enforcing against.

    [...] in case you see additional uses or other folks to tag.

    Also I'm going to call it here and now that ABP will become ABC at some point cause
    advertisers, business, and commerce just kinda rolls off the tongue better.





    thanks for the tag. Are you already connected with business integrity (BI)? Within BI, you
    probably want to talk to 2 groups:

    1. enforcement folks (I assume we also have rules against nudity in ads)

    2. actor level enforcement [...] if there are ad accounts, advertisers etc. that you've
       identified are problematic.

    Additionally, you might find some pages integrity folks helpful, I'm not sure who is the
    right person but start with Jan Kodovsky if you aren't already in contact with them.

    [...] another aspect we're studying in Drebbel - gateway entities along the path to
    harmful end states

    This is super interesting, how transferable is this approach to other areas with gateway
    groups? Wondering if we can leverage this approach for violence cc [...]

    [...] workflow is domain independent and finds gateway groups for any given set of target
    groups. We are already using it to find gateways for the militia network in Ethiopia. We are
    looking for other areas to apply this workflow on and would be great to collaborate!