Abstract
Recommender systems collect and analyze user data to enhance the user experience.
However, once a recommender learns a user's set of items or their ratings, many privacy concerns arise. Various solutions have been proposed to enhance the privacy of existing recommender systems, but they focus on protecting either the items or the ratings. In this paper, we propose a recommender system that safeguards both the user's items and ratings. To achieve this, we develop new matrix factorization algorithms under local differential privacy (LDP). In an LDP-based recommender system, each user randomizes their own data so as to satisfy differential privacy and sends the randomized data to the recommender, which then computes aggregates over this randomized data.
This framework keeps each user's items and ratings private from the recommender. However, applying LDP to matrix factorization typically degrades utility because of the high dimensionality induced by the large number of items and the iterative nature of the estimation algorithm. To address these technical challenges, we employ a dimensionality reduction technique and a novel binary mechanism based on sampling. We also introduce a stabilization factor for the perturbed gradients.
Using the MovieLens and LibimSeTi datasets, we evaluate the recommendation accuracy of our system and show that our algorithm outperforms existing differentially private gradient descent algorithms for matrix factorization when stronger privacy is required.
Introduction
With the growing use of mobile devices, individuals increasingly rely on online marketplaces for their purchases. The resulting abundance of information confronts consumers with a vast number of choices, making decisions harder. Recommender systems assist consumers in discovering items of interest and even suggest additional items.
Recommender systems have become common in many aspects of daily life, such as movie and product recommendations. These systems gather and analyze user information to improve the user experience. However, because this information is personal, privacy concerns arise. To provide accurate recommendations, recommender systems collect (item, rating) pairs from each user.
Several studies have found that recommender systems can violate user privacy by extracting data that users did not specifically disclose. These systems can infer personal information like gender, age, and political beliefs based on the movies a user has watched. Our objective is to protect both user ratings and personal information while maintaining the quality of recommendations. One popular solution for addressing privacy concerns is differential privacy, which offers strong privacy guarantees.
Many works have applied differential privacy to recommender systems, most focusing on output perturbation of collaborative filtering algorithms. These works assume a trusted recommender. However, with the increasing popularity of recommendation services, there are now many untrusted recommenders online that may misuse users' data for purposes beyond their own service. Even for trusted recommenders, holding large amounts of sensitive personal information poses a significant security risk: Narayanan and Shmatikov demonstrated this by de-anonymizing users in the Netflix Prize dataset using public IMDb profiles.
To protect users' privacy from untrusted servers, local privacy models have been advocated in various application areas, and some earlier works on recommendation services have adopted them. Shen and Jin used differential privacy to protect the list of items of each user, but their randomization method may still reveal the user's preferred class of items (e.g., genre) within the predefined categories. More recently, Hua et al. also proposed local models in which a user's private information is randomized before being submitted to the recommender. However, these existing methods have two shortcomings.
First, these designs were intended to protect either the ratings or the rated items, but not both. In some cases, the items in a user's collection are as sensitive as the ratings themselves, because the set of items a user has rated can reveal private information such as political beliefs or sexual orientation. Protecting both the items and the ratings of each user therefore provides stronger privacy. Second, these designs protect only an individual item or rating value, rather than a user's entire set of items and ratings.
However, in recommender systems there are sets of highly correlated items (e.g., movies within the same genre), and masking the presence or absence of a single item among correlated items is not sufficient to protect users' sensitive data. Moreover, users usually submit multiple ratings at once, so protecting a single rating or item does not match how these systems are used in practice. One possible approach is to extend the existing schemes by applying differential privacy to every individual item and rating.
However, such an extension introduces a large error into the aggregated data, leading to lower-quality predictions. Local differential privacy (LDP), which has gained significant attention in recent years, has become a popular choice in the data analysis community and in industry for preventing untrusted servers from identifying users or accessing their personal information. In LDP, each user randomizes their data on their own device to satisfy differential privacy and then sends the randomized answers to the server. The goal of LDP is to obtain aggregates of users' data without collecting any individual's raw data. LDP originates from randomized response, first proposed by Warner in 1965.
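As a concrete illustration of randomized response, the following sketch (our own toy example, not a mechanism taken from this paper) shows a user reporting a sensitive yes/no answer and the server recovering an unbiased estimate of the population fraction:

```python
import math
import random

def randomize_bit(true_bit, epsilon):
    # Report the true bit with probability p = e^eps / (e^eps + 1),
    # otherwise report its complement; this satisfies eps-LDP.
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return true_bit if random.random() < p else 1 - true_bit

def estimate_fraction(reports, epsilon):
    # Server-side debiasing: invert E[report] = p*f + (1-p)*(1-f)
    # to recover an unbiased estimate of the true fraction f of 1s.
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```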
In recent years, the practicality of LDP has been demonstrated by Google through RAPPOR and its deployment in the Chrome browser, and Apple's adoption of LDP in iOS 10 has further established it as a promising privacy-protection technique. LDP aims to mask each user's presence in the data, which we refer to as per-user privacy. In the context of recommender systems, LDP ensures that even if a user were replaced by a different user with different items and ratings, the output of the collaborative filtering algorithm would remain nearly indistinguishable to an attacker. However, this stronger privacy requirement can lead to significant prediction errors when the number of items is large, and reducing the errors caused by this high dimensionality is challenging.
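For reference, the standard formal definition of ε-local differential privacy that captures this indistinguishability is given below (the notation is generic rather than taken from this paper):

```latex
% A randomized mechanism A satisfies \varepsilon-local differential privacy if,
% for any two inputs v and v' (e.g., any two users' sets of items and ratings)
% and any output o in the range of A,
\Pr[\,A(v) = o\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,A(v') = o\,].
```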
This paper aims to build a recommender system that ensures both user privacy and high-quality recommendations under local differential privacy (LDP). To achieve this, we adopt matrix factorization, a widely used collaborative filtering technique, and introduce new differentially private gradient descent algorithms for matrix factorization. Users' items, ratings, and profiles are protected using a recent LDP mechanism proposed by Nguyen et al.
For gradient perturbation, our matrix factorization formulation ensures that the resulting item profiles do not differ significantly whether or not any single user's data is included, so per-user privacy is maintained, and we reduce the amount of perturbation by applying a dimensionality reduction technique based on random projection. To the best of our knowledge, our work is the first LDP solution for matrix factorization that safeguards both users' items and ratings while ensuring per-user privacy and preserving the quality of recommendations.
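To make these two ingredients concrete, here is a minimal sketch, not the paper's exact algorithm: a single user's contribution to the item-profile gradient in standard matrix factorization, followed by a Johnson-Lindenstrauss-style random projection of that gradient. The projection matrix Phi, the sizes, and the Laplace noise (used purely as a stand-in for the paper's LDP mechanism) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_items, d_latent, d_proj = 1000, 10, 100   # illustrative sizes

# Random projection matrix shared by all users: it maps the item dimension
# (n_items) down to d_proj, shrinking the dimensionality that each user's
# perturbed gradient must cover.
Phi = rng.normal(0.0, 1.0 / np.sqrt(d_proj), size=(d_proj, n_items))

def user_item_gradient(user_profile, item_profiles, rated_items, ratings):
    # Gradient of the squared error w.r.t. the item-profile matrix,
    # contributed by a single user (rows for unrated items stay zero).
    grad = np.zeros_like(item_profiles)            # shape (n_items, d_latent)
    for j, r in zip(rated_items, ratings):
        err = item_profiles[j] @ user_profile - r
        grad[j] = err * user_profile
    return grad

def project_and_perturb(grad, epsilon):
    # Project the per-user gradient into the lower-dimensional space and add
    # noise there; in the paper the noise comes from an LDP mechanism, here
    # Laplace noise stands in only as a placeholder.
    projected = Phi @ grad                         # shape (d_proj, d_latent)
    noise = rng.laplace(scale=2.0 / epsilon, size=projected.shape)
    return projected + noise
```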
Conclusion
We have developed new matrix factorization algorithms within the framework of local differential privacy (LDP). These algorithms provide stronger privacy protection by guaranteeing per-user privacy and by protecting both users' items and their ratings.
We address the high-dimensionality limitation of private matrix factorization under LDP by incorporating dimensionality reduction techniques and the randomization mechanism of Nguyen et al. The proposed randomization mechanisms require each user to transmit only a single bit to the server, which greatly reduces the communication cost from users to the server, while dimensionality reduction further reduces the communication overhead from the server to users.
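As an illustration of how a sampling-based one-bit mechanism can work, the following sketch follows the spirit of Nguyen et al.'s mechanism for mean estimation over vectors with entries in [-1, 1]; the exact constants and the per-dimension sampling details of the paper's algorithm may differ, so treat this as an assumption-laden sketch rather than the authors' implementation.

```python
import math
import random

def perturb_one_bit(x, epsilon):
    # x: a user's private vector with entries in [-1, 1].
    # Sample one coordinate uniformly and report (index, bit); the bit is 1
    # with probability (x_j*(e^eps - 1) + e^eps + 1) / (2*(e^eps + 1)).
    d = len(x)
    j = random.randrange(d)
    e = math.exp(epsilon)
    p = (x[j] * (e - 1.0) + e + 1.0) / (2.0 * (e + 1.0))
    bit = 1 if random.random() < p else 0
    return j, bit

def aggregate(reports, d, n, epsilon):
    # Server side: each (index, bit) report contributes an unbiased value
    # +/- d*(e^eps + 1)/(e^eps - 1) to coordinate j; averaging over the
    # n users estimates the mean vector.
    e = math.exp(epsilon)
    c = d * (e + 1.0) / (e - 1.0)
    est = [0.0] * d
    for j, bit in reports:
        est[j] += c if bit == 1 else -c
    return [v / n for v in est]
```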
References
- https://www.wired.com/2016/06/apples-differential-privacycollecting-data/
- R. Balu and T. Furon. Differentially private matrix factorization using sketching techniques. In IH&MMSec, pages 57–62, 2016.
- R. Bassily and A. Smith. Local, private, efficient protocols for succinct histograms. In STOC, pages 127–135, 2015.
- A. Berlioz, A. Friedman, M. A. Kaafar, R. Boreli, and S. Berkovsky. Applying differential privacy to matrix factorization. In RecSys, pages 107–114, 2015.
- L. Brozovsky and V. Petricek. Recommender system for online dating service. In Znalosti, 2007. Available: http://www.occamslab.com/petricek/data/
- J. A. Calandrino, A. Kilzer, A. Narayanan, E. W. Felten, and V. Shmatikov. "You might also like:" Privacy risks of collaborative filtering. In IEEE S&P, pages 231–246, 2011.
- R. Chen, H. Li, A. K. Qin, S. P. Kasiviswanathan, and H. Jin. Private spatial data aggregation in the local setting. In ICDE, pages 289–300, 2016.
- J. C. Duchi, M. I. Jordan, and M. J. Wainwright. Local privacy and statistical minimax rates. In FOCS, pages 429–438, 2013.
- C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, pages 265–284, 2006.
- U. Erlingsson, V. Pihur, and A. Korolova. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In CCS, pages 1054–1067, 2014.
- G. Fanti, V. Pihur, and U. Erlingsson. Building a RAPPOR with the unknown: Privacy-preserving learning of associations and data dictionaries. PoPETs, 2016(3), 2016.
- F. M. Harper and J. A. Konstan. The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TiiS), 5:1–19, 2015.
- W. B. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. In Conference on Modern Analysis and Probability, volume 26 of Contemporary Mathematics, pages 189–206, 1984.
- S. K. Lam, D. Frankowski, and J. Riedl. Do you trust your recommendations? An exploration of security and privacy issues in recommender systems. In ETRICS, pages 14–29, 2006.
- Z. Liu, Y.-X. Wang, and A. J. Smola. Fast differentially private matrix factorization. In RecSys, pages 171–178, 2015.
- F. McSherry and I. Mironov. Differentially private recommender systems: Building privacy into the Netflix Prize contenders. In KDD, pages 627–636, 2009.
- A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In IEEE S&P, pages 111–125, 2008.