- Uses a multilingual corpus, illustrating results on diverse European languages.
- Includes high-, medium-, and lower-resource languages, going beyond English.
- Expands on previous work (Plant et al., 2021) with MDP and CGT techniques.
- Compares privacy-preservation techniques across popular language models.
- Analysis demonstrates a significant reduction in relative attack success.
New paper with Richard Plant, "You Are What You Write: Author Re-identification Privacy Attacks in the Era of pre-trained Language Models", in Computer Speech & Language: doi.org/10.1016/j.cs...