[email protected] wrote:
I will be receiving data files that use a social security number (SSN) as
the primary ID. Yes, using SSNs as IDs is a worrisome practice, but the
agency we are dealing with is not going to change this policy, at least
in the near term.
I am obligated to protect the privacy of this information, and if the data
production were a one-time event, I could use some variant of a uniformly
distributed random number to generate an alternative ID and keep the
cross-walk between the SSN and the uniform random number locked in a
separate location. However, there will be ongoing updates that will require
match merges based on the SSN, along with the addition of new cases to the
population.
Has anyone on the list developed code for encrypting/decrypting a field
that they could send me? I know that there is C++ code in the free Cryptlib
toolkit but I would prefer not to have to plunge into this unless it's
really necessary.
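
One possible approach to the requirement above (a stable alternative ID that still supports match merges when updates and new cases arrive) is a keyed hash rather than reversible encryption. The sketch below is not from the original thread. It is written in Python rather than Stata, the function name `pseudonymize_ssn` and the example key are hypothetical, and it assumes that every SSN contains exactly nine digits once punctuation is stripped. It uses HMAC-SHA256 from the Python standard library, so the same SSN and key always yield the same ID.

```python
import hashlib
import hmac
import re


def pseudonymize_ssn(ssn: str, key: bytes, length: int = 16) -> str:
    """Return a stable pseudonymous ID derived from an SSN with HMAC-SHA256.

    Hypothetical helper for illustration only. The same (ssn, key) pair
    always gives the same ID, so later deliveries can be merged on it.
    """
    digits = re.sub(r"\D", "", ssn)  # drop hyphens, spaces, other non-digits
    if len(digits) != 9:
        raise ValueError("expected an SSN with exactly 9 digits")
    mac = hmac.new(key, digits.encode("ascii"), hashlib.sha256)
    # Keep the first `length` hex characters as the alternative ID.
    return mac.hexdigest()[:length]


if __name__ == "__main__":
    # Placeholder key: in practice, generate a random key and store it
    # apart from the data, as you would the cross-walk.
    example_key = b"replace-with-a-secret-key-stored-separately"
    # "000-12-3456" is a made-up value used only to show the call.
    print(pseudonymize_ssn("000-12-3456", example_key))
```

The derived ID cannot be decrypted back to the SSN, so if you need to recover the SSN you would still keep a cross-walk locked in a separate location. Because there are only about a billion possible SSNs, anyone holding the key could rebuild the mapping by brute force, so the key needs the same protection as the cross-walk. An unkeyed hash of the SSN would not be adequate for the same reason.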
----------------------------------------------------------------------------
Did you ever receive a response to this?
I can't answer about encrypting variables, but I would be interested in
knowing how others in the Stata user community are approaching this, and
how they are adapting the ways in which they use Stata to meet the demands
of their institutions' data-protection or privacy-protection policies.
For example, if Stata is used as a client application with a database residing
on a server elsewhere on campus, I assume that there wouldn't be any
unexpected glitch using Stata with the various protocols for tunneling ODBC
traffic. And I assume that policy would require users to be trained in
proper procedure, but are institutions requiring Stata be "qualified" in
some respect before allowing its use on privacy-protected data?
Joseph Coveney