PII Detection Versions

Application and Python script to identify, remove, and/or recode personally identifiable information (PII) in field experiment datasets.

v0.2.23

2 years ago

v0.2.22

2 years ago

v0.2.20

3 years ago

  • enum_name is now included in the restricted words
  • The app no longer freezes on the last two datasets shared with the developer
  • Outputs are all created in an 'outputs' folder, in the same directory as the source file
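The output location described in this release can be sketched as follows. This is a minimal illustration, not the application's actual code: the function name `output_path_for`, the `_deidentified` suffix, and the `create` flag are all assumptions made for the example.

```python
from pathlib import Path


def output_path_for(source_file, suffix="_deidentified", create=True):
    """Return a path inside an 'outputs' folder next to the source file.

    Hypothetical helper: the name, suffix, and `create` flag are
    assumptions for illustration, not taken from the application.
    """
    source = Path(source_file)
    # The 'outputs' folder sits in the same directory as the source file.
    outputs_dir = source.parent / "outputs"
    if create:
        # Create the folder if it does not exist yet.
        outputs_dir.mkdir(parents=True, exist_ok=True)
    # Keep the original file name and extension, adding the suffix.
    return outputs_dir / f"{source.stem}{suffix}{source.suffix}"
```

For example, a source file `data/survey.dta` would map to `data/outputs/survey_deidentified.dta` under these assumptions.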

v0.2.19

3 years ago
  • Fixed bugs
  • Added restricted words for column names
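One way a restricted-word check on column names might work is sketched below. The word list and the function `flag_columns` are hypothetical; only `enum_name` is mentioned in these release notes, and the application's real list and matching rules may differ.

```python
import re

# Hypothetical word list for illustration; only "enum_name" appears in
# the release notes, the rest are assumed examples.
RESTRICTED_WORDS = {"name", "phone", "address", "email", "gps", "enum_name"}


def flag_columns(columns, restricted=RESTRICTED_WORDS):
    """Return the column names that match a restricted word.

    A column is flagged if the whole name is restricted, or if any
    token of the name (split on non-alphanumeric characters) is.
    """
    restricted = {word.lower() for word in restricted}
    flagged = []
    for col in columns:
        lowered = col.lower()
        tokens = set(re.split(r"[^a-z0-9]+", lowered))
        if lowered in restricted or tokens & restricted:
            flagged.append(col)
    return flagged
```

Whole-token matching keeps false positives down but would miss names such as `surname`; a real tool would need to weigh that trade-off.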

v0.2.18

3 years ago

v0.2.12

3 years ago

v0.1.2

6 years ago

Changelog

  • Now bundled as an installable application: it opens much faster and without the black screen
  • Improved instruction text
  • Additional menu options, including for feedback

Introduction

This is an alpha release of an executable GUI application for identifying PII within a dataset. It has not been tested on IPA datasets containing PII because of a lack of data access.

As an alpha release, it is expected to contain bugs and may not work for all users. Please report any issues or feedback by filing an issue on GitHub or by email. Please do not share this application outside of IPA at this time.

This Windows 7* version lacks many of the features included in releases for modern operating systems. Features not included:

  • Reviewing PII in-app
  • Removing PII from dataset
  • Recoding variables
  • Logging activities

Instead, this version employs a number of methods to identify fields that may contain PII, and then it lists those fields for the user to review and take action on outside of the application. Ensuring the dataset is devoid of PII is ultimately still your responsibility.

Instructions

  1. Download the Windows Installer (.msi file)
  2. Run the downloaded file
  3. Follow the installation instructions
  4. Open the app and follow its instructions

Notes

Plans for future development are included in the issues: https://github.com/PovertyAction/PII_detection/issues and on Asana: https://app.asana.com/0/418411014871343/543165118458083

*This version is also compatible with Windows 10, though a separate Windows 10 release with more features is planned.

v0.1.1

6 years ago

Changelog

  • Stata variable labels are now included in analysis
  • Spanish and Swahili variable names are now included in detection
  • Improved sensitivity controls
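A rough sketch of how variable labels and multilingual terms could be checked together is shown below. The term lists, the function `flag_variables`, and the substring matching rule are assumptions for illustration, not the application's actual method. For Stata files, a name-to-label mapping like the one used here can typically be obtained from pandas via `pd.read_stata(path, iterator=True).variable_labels()`.

```python
# Hypothetical, deliberately short term lists (unaccented); the
# application's real lists are not documented in these notes.
PII_TERMS = {
    "en": ["name", "phone", "address"],
    "es": ["nombre", "telefono", "direccion"],
    "sw": ["jina", "simu"],
}


def flag_variables(variable_labels, terms=PII_TERMS):
    """Map each suspicious variable to the terms found in its name or label.

    `variable_labels` is a dict of variable name -> label (label may be
    empty or None). Matching is a simple case-insensitive substring test.
    """
    all_terms = [t for words in terms.values() for t in words]
    flagged = {}
    for var, label in variable_labels.items():
        # Search the variable name and its label together.
        text = f"{var} {label or ''}".lower()
        hits = [t for t in all_terms if t in text]
        if hits:
            flagged[var] = hits
    return flagged
```

Substring matching is intentionally loose and will produce false positives; it also does not normalize accents, so a label containing "teléfono" would need extra handling.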

Introduction

This is an alpha release of an executable GUI application for identifying PII within a dataset. It has not been tested on IPA datasets containing PII because of a lack of data access.

As an alpha release, it is expected to contain bugs and may not work for all users. Please report any issues or feedback by filing an issue on GitHub or by email. Please do not share this application outside of IPA at this time.

This Windows 7* version lacks many of the features included in releases for modern operating systems. Features not included:

  • Reviewing PII in-app
  • Removing PII from dataset
  • Recoding variables
  • Logging activities

Instead, this version employs a number of methods to identify fields that may contain PII, and then it lists those fields for the user to review and take action on outside of the application. Ensuring the dataset is devoid of PII is ultimately still your responsibility.

Instructions

  1. Download the .exe file, open it, and select "Run".
  2. Wait for the main application to load and open; this may take up to a minute.
  3. Follow the in-app instructions.

Notes

Plans for future development are included in the issues: https://github.com/PovertyAction/PII_detection/issues and on Asana: https://app.asana.com/0/418411014871343/543165118458083

*This version is also compatible with Windows 10, though a separate Windows 10 release with more features is planned.

v0.1.0

6 years ago

Introduction

This is the initial release of an executable GUI application for identifying PII within a dataset. It has not been tested on IPA datasets containing PII because of a lack of data access.

As the initial release, it is expected to contain bugs and may not work for all users. Please report any issues or feedback by filing an issue on GitHub or replying to this post on Chatter. Please do not share this application outside of IPA at this time.

This Windows 7* version lacks many of the features included in releases for modern operating systems. Features not included:

  • Reviewing PII in-app
  • Removing PII from dataset
  • Recoding variables
  • Logging activities

Instead, this version employs a number of methods to identify fields that may contain PII, and then it lists those fields for the user to review and take action on outside of the application. Ensuring the dataset is devoid of PII is ultimately still your responsibility.

Instructions

  1. Download the .exe file, open it, and select "Run".
  2. Wait for the main application to load and open; this may take up to a minute.
  3. Follow the in-app instructions.

Notes

Plans for future development are included in the issues: https://github.com/PovertyAction/PII_detection/issues and on Asana: https://app.asana.com/0/418411014871343/543165118458083

*This version is also compatible with Windows 10, though a separate Windows 10 release with more features is planned.