dpUGC: Learn differentially private representation for user generated contents

Vu, X-S; Tran, Son; Jiang, L

File(s) under permanent embargo

conference contribution

posted on 2023-05-23, 14:36 authored by Vu, X-S; Tran, Son; Jiang, L

This paper first proposes a simple yet efficient generalized approach for applying differential privacy to text representations (i.e., word embeddings). Building on it, we propose a user-level approach to learning personalized differentially private word embedding models on user generated contents (UGC). To the best of our knowledge, this is the first work on learning user-level differentially private word embedding models from text for sharing. The proposed approaches protect individuals from re-identification and, in particular, provide a better trade-off between privacy and data utility on UGC data for sharing. The experimental results show that the trained embedding models are applicable to classic text analysis tasks (e.g., regression). Moreover, the proposed approaches to learning differentially private embedding models are both framework- and data-independent, which facilitates deployment and sharing. The source code is available at https://github.com/sonvx/dpText.

History

Publication title

Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing

Pagination

1-16

Department/School

School of Information and Communication Technology

Publisher

Springer

Place of publication

New York, United States

Event title

20th International Conference on Computational Linguistics and Intelligent Text Processing

Event Venue

La Rochelle, France

Date of Event (Start Date)

2019-04-07

Date of Event (End Date)

2019-04-13

Rights statement

Copyright unknown

Repository Status

Restricted

Socio-economic Objectives

Information systems, technologies and services not elsewhere classified

Keywords

privacy; generated text; private word embedding; differential privacy; UGC

Licence

In Copyright
