OpenAI open-sources Privacy Filter to mask PII locally
OpenAI released Privacy Filter under Apache 2.0 on GitHub and Hugging Face. The 1.5-billion-parameter model runs locally, masks eight PII types and scores 96% F1.
OpenAI published Privacy Filter under the Apache 2.0 license on GitHub and Hugging Face. The company made the model code and weights available for download, modification and commercial use.
Privacy Filter is a 1.5-billion-parameter model designed to run on ordinary laptops. It scans text, tags sensitive items and replaces them with generic placeholders such as [PRIVATE_PERSON], [PRIVATE_EMAIL] and [ACCOUNT_NUMBER].
The model targets eight categories of personally identifiable information: names, addresses, email addresses, phone numbers, URLs, dates, account numbers and secrets such as passwords and API keys. OpenAI says the model uses context from surrounding text rather than only pattern matching to decide whether a token is private.
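The masking behavior described above can be sketched with a few lines of code. This is an illustrative stand-in, not OpenAI's model: it handles only the pattern-matchable subset of the eight categories (emails, phone numbers, secrets), whereas Privacy Filter reportedly uses surrounding context. The `[PRIVATE_PHONE]` and `[SECRET]` placeholder names are assumptions; the article only shows `[PRIVATE_PERSON]`, `[PRIVATE_EMAIL]` and `[ACCOUNT_NUMBER]`.

```python
import re

# Regex-only sketch of placeholder masking. A context-aware model would
# also catch names, addresses and unusual identifiers that patterns miss.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[PRIVATE_EMAIL]"),   # emails
    (re.compile(r"\+?\d[\d\s().-]{7,}\d"), "[PRIVATE_PHONE]"),     # phone numbers
    (re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"), "[SECRET]"),          # API-key-like tokens
]

def mask_pii(text: str) -> str:
    """Replace matched spans with generic placeholders."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

For example, `mask_pii("Contact jane@example.com")` yields `"Contact [PRIVATE_EMAIL]"`. The gap between this sketch and a learned model is exactly the failure mode OpenAI flags below: patterns cannot decide whether a token is private from context.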
OpenAI demonstrated the filter on an email example in which a project file number, an email address and a phone number were replaced with the corresponding placeholders. The company reports a 96% F1 score on the PII-Masking-300k benchmark out of the box, and a corrected test result near 97.4%.
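For context, F1 is the harmonic mean of precision (the fraction of flagged spans that are truly PII) and recall (the fraction of true PII spans that get flagged), so a single high number requires both few false alarms and few misses. The numeric example below is illustrative, not OpenAI's published counts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative: a masker with precision 0.96 and recall 0.96 scores F1 = 0.96.
# The harmonic mean punishes imbalance: precision 1.0 with recall 0.92
# gives a lower F1 (~0.958) than a balanced 0.96/0.96.
```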
Running the model locally keeps raw text on the user's device: personal data is scrubbed before the text is sent to a cloud service or a chatbot, so less unmasked personal data ever leaves the machine.
OpenAI listed practical uses including businesses summarizing customer emails without sending names to a remote API, lawyers drafting notes while preserving client confidentiality, clinicians preparing referrals without patient identities and developers removing API keys before sharing code. The company says everyday users can mask identifying details before pasting text into a public or cloud-based assistant.
OpenAI cautioned that Privacy Filter is not an anonymization tool, a compliance certification or a substitute for policy review. The company warned the model can miss unusual identifiers, over-redact short sentences and perform unevenly across languages. OpenAI and privacy specialists describe the filter as one layer in a broader data protection workflow rather than a sole safeguard.
Releasing the project under Apache 2.0 allows researchers, developers and security teams to inspect the model, propose improvements and integrate it into products. The release follows a broader pattern of smaller models that can run on local hardware and let organizations process text without transmitting raw data to external servers.