How Private AI Can Help with Compliance under China’s Personal Information Protection Law (PIPL)

China’s Personal Information Protection Law (PIPL), which came into force on November 1, 2021, sets out stringent requirements for the handling, processing, and protection of personal information. Organizations operating under this law must navigate a complex landscape of obligations, including limiting data collection, ensuring data security, managing sensitive information, and responding to data breaches. Private AI’s advanced machine-learning technology can play a crucial role in helping organizations meet these compliance requirements efficiently and accurately.

Note that the requirements under PIPL can apply to organizations outside of China as well. Article 3 clarifies that PIPL applies to personal information processing outside of China under the following circumstances:

  1. Where the purpose is to provide products or services to natural persons inside the borders;
  2. Where analyzing or assessing activities of natural persons inside the [Chinese] borders;
  3. Other circumstances provided in laws or administrative regulations.

Accurate Identification and Management of Personal Information

Under PIPL, the definition of personal information is broad: it covers any data related to identified or identifiable individuals and explicitly excludes anonymized information (Article 4). Anonymization is defined in Article 73 as follows:

“Anonymization” refers to the process of personal information undergoing handling to make it impossible to distinguish specific natural persons and impossible to restore.

This definition of personal information aligns closely with the concept of personal data under other global privacy laws, such as the GDPR. The GDPR, however, employs a somewhat less onerous definition of anonymized data, because it applies a reasonableness threshold when assessing the risk of re-identification.

Private AI’s machine-learning models are trained to recognize over 50 different types of personal data entities across 53 languages, including sensitive information categories as defined in Article 28 of PIPL (e.g., biometric data, health information, financial data). This capability ensures that organizations can accurately identify and manage the personal information they handle.

By providing precise identification of personal information within data sets—whether structured, semi-structured, or unstructured—Private AI enables organizations to comply with PIPL’s data minimization principle (Article 6). This principle requires that data collection and processing be limited to the minimum necessary to achieve specific purposes, prohibiting the collection of excessive personal information. With Private AI, organizations can assess whether they are collecting only the necessary data and identify areas where data can be minimized or removed altogether.
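As a rough illustration of how detected entities can feed a data minimization review, the sketch below counts entity types per field and lists fields that carry personal information without being needed for the stated purpose. It is a minimal sketch under stated assumptions: the two regular expressions are simple stand-ins for a trained detector, and the names `PATTERNS`, `inventory`, and `fields_to_review` are hypothetical. It does not show Private AI’s API.

```python
import re
from collections import Counter

# Hypothetical stand-in patterns. A production detector would rely on
# trained models and cover far more entity types than these two.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def inventory(records):
    """Count detected entity types per field across a list of dict records."""
    found = {}
    for record in records:
        for field, value in record.items():
            for label, pattern in PATTERNS.items():
                hits = len(pattern.findall(str(value)))
                if hits:
                    found.setdefault(field, Counter())[label] += hits
    return found


def fields_to_review(records, needed_fields):
    """Fields with detected personal information that the purpose does not need."""
    return sorted(f for f in inventory(records) if f not in needed_fields)
```

For example, if an order-processing purpose only needs an order number and a contact address, a free-text notes field that turns out to contain phone numbers would be returned by `fields_to_review` as a candidate for minimization.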

Ensuring Compliance with Data Handling and Retention Limits

PIPL mandates that personal information be retained only for the shortest period necessary to achieve the processing purpose (Article 19). Private AI’s technology can assist organizations in managing their data retention policies by identifying and categorizing personal information. This makes it easier to implement retention schedules and to comply with PIPL’s data retention and management requirements, such as the requirement under Article 51 to implement categorized management of personal information. With regard to limiting retention, note that although PIPL does not say so explicitly, it follows from the definition of personal information that anonymizing data is equivalent to disposing of it.
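Once records are categorized, a retention schedule can be checked mechanically. The following is a minimal sketch, assuming each record carries a category and a collection date. The retention periods in `RETENTION_DAYS` are invented for illustration and are not prescribed by PIPL, and the function name `expired` is hypothetical.

```python
from datetime import date, timedelta

# Hypothetical retention periods per category, in days. Real values depend
# on the processing purpose and must be set by the organization.
RETENTION_DAYS = {"contact": 365, "health": 180}


def expired(records, today):
    """Return the ids of records whose retention period has elapsed."""
    overdue = []
    for record in records:
        limit = timedelta(days=RETENTION_DAYS[record["category"]])
        if today - record["collected"] > limit:
            overdue.append(record["id"])
    return overdue
```

Records returned by `expired` would then be candidates for deletion or anonymization.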

Moreover, when organizations need to entrust third parties with the handling of personal information (Article 21), Private AI helps ensure that the entrusted data does not exceed the scope agreed upon in the contract. This is crucial for avoiding unauthorized processing or retention of personal information by third parties. In this context, too, data anonymization supported by Private AI can be helpful, as PIPL explicitly states that third-party data recipients are not permitted to retain the personal information they have received. Presumably, anonymizing the data would meet this requirement.

Handling Sensitive Personal Information

Sensitive personal information under PIPL requires additional protections and must be processed only when absolutely necessary (Article 28). Private AI’s machine learning models can detect and categorize sensitive data types, enabling organizations to apply stricter controls and obtain the necessary separate consent (Article 29). This is particularly important in sectors like healthcare, finance, and technology, where sensitive personal information is frequently processed.

Facilitating Incident Response and Data Breach Management

In the event of a data breach, PIPL requires prompt action and notification to relevant authorities and affected individuals (Article 57). Private AI’s solution can quickly identify the types of personal information involved in a breach, helping organizations assess the severity of the incident and determine whether notification obligations apply. Because a notification must identify the categories of personal information affected by the breach, reporting the correct information is essential. In this way, Private AI aids in the effective management of data breaches and compliance with PIPL’s incident response requirements.

Supporting Compliance with De-identification and Anonymization Requirements

PIPL distinguishes between de-identification and anonymization (Article 73), with specific requirements for each. De-identification, in which personal information is processed so that individuals cannot be identified without additional information, is one of the security measures under Article 51 that information handlers are obliged to adopt, taking into account the purpose of processing, the sensitivity of the information, possible security risks, and other factors.

Private AI’s technology can assist in the de-identification process by identifying and removing direct identifiers, helping organizations meet the de-identification standards set forth by PIPL.
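The removal of direct identifiers can be pictured as replacing each detected span with a typed placeholder. The sketch below is illustrative only: it uses two simple regular expressions as a stand-in for a trained detector, the names `PATTERNS` and `redact` are hypothetical, and it is not Private AI’s API.

```python
import re

# Hypothetical stand-in patterns. Regexes like these miss names, addresses,
# and context-dependent identifiers, which is where trained models are needed.
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("PHONE", re.compile(r"\+?\d[\d\s-]{7,}\d")),
]


def redact(text):
    """Replace each detected identifier with a placeholder naming its type."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Reach Li at li@example.com or +86 138 0013 8000."))
# prints: Reach Li at [EMAIL] or [PHONE].
```

Note that the name in the example sentence is left untouched by this toy detector, so output of this kind would not be de-identified yet. Broad and accurate entity coverage is what makes the step meaningful.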

In cases where anonymization is desirable because it excludes the information from the application of PIPL, Private AI’s technology provides the first essential step: the identification and removal of personal identifiers. Automating this step can save considerable time and effort, especially since the accuracy requirement under PIPL is so high. Recall that anonymization requires that the data can no longer be linked to an individual and that this cannot be reversed. For it to be impossible to identify an individual whose data is contained in a data set, the identification of personal identifiers in the data is crucial. Subsequently, depending on the data, an expert should be consulted to assess the re-identification risk and, where required, take further measures to ensure the high standard for anonymization is met.

Conducting Personal Information Protection Impact Assessments

PIPL requires organizations to conduct Personal Information Protection Impact Assessments in certain high-risk scenarios, such as handling sensitive personal information or transferring data abroad (Articles 55 and 56). Private AI’s detailed reports on the types and locations of personal information within data sets provide a robust foundation for these assessments, ensuring that relevant risks are identified effectively.

Conclusion

Private AI offers advanced solutions that align with the rigorous requirements of China’s PIPL. By leveraging cutting-edge machine-learning technology, organizations can accurately identify, manage, and protect personal information, ensuring compliance with key provisions of the law. Whether it’s minimizing data collection, managing sensitive information, responding to data breaches, or conducting protection impact assessments, Private AI equips organizations with the tools needed to navigate PIPL’s complex regulatory landscape effectively.

To see the tech in action, try our web demo, or get an API key to try it yourself on your own data.
