Re: st: Features for Stata 14
From
László Sándor <[email protected]>
To
[email protected]
Subject
Re: st: Features for Stata 14
Date
Tue, 3 Sep 2013 14:26:07 -0400
One more thing about methods: Stata could become more user-friendly
for regression discontinuity designs, probably along these lines:
http://www.mostlyharmlesseconometrics.com/2012/11/rd-news/
Other statistical methods that come to mind probably need to wait
until they have proven themselves, though among the bigger things out
there, moment inequalities look like a big one.
Smaller things: IV could do more, first and foremost the B-JIVE of Kolesár.
Oh, and on utilities: maps and geo things! Stata cannot fall behind on that.
The binned scatter plots put to great use at rajchetty.com and in other
projects could even be officially supported. Coding them efficiently is
surprisingly hard even with the existing tools.
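For readers unfamiliar with the idea, here is a minimal sketch of the computation behind a binned scatter plot, written in Python rather than Stata purely for illustration. It assumes equal-count bins of x and plots one point per bin at the mean of x and the mean of y; the function name `binned_means` and the toy data are hypothetical, and real implementations typically also handle controls and ties more carefully.

```python
from statistics import mean

def binned_means(x, y, n_bins=20):
    """Return one (mean_x, mean_y) point per equal-count bin of x."""
    pairs = sorted(zip(x, y))              # a single sort by x
    n = len(pairs)
    points = []
    for b in range(n_bins):
        lo, hi = b * n // n_bins, (b + 1) * n // n_bins
        chunk = pairs[lo:hi]
        if chunk:                          # bins can be empty when n < n_bins
            points.append((mean(p[0] for p in chunk),
                           mean(p[1] for p in chunk)))
    return points

# Toy data: y is exactly 2*x, so every binned point lies on that line.
points = binned_means(range(100), [2 * v for v in range(100)], n_bins=4)
```

The sort is the expensive step on large data, which is one reason an efficient implementation is harder than it looks.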
Renaming variables without using (loading) the entire data sounds like
something easy to do, and it would be a lifesaver for merging from raw
data when the data sources were not prudent with identifier names.
I could imagine an option for merge to drop duplicate observations from
the using data. This is what many of us do (after a proper
investigation), and it is painful to load the raw data again and again
just to do this. Otherwise we keep carrying along an almost identical,
huge data file that is basically the raw data just without the few
rogue duplicates…
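To make the requested behavior concrete, here is a small Python sketch (not Stata, and not an existing merge option). It assumes that "dropping duplicates" means keeping the first using row per key, and that master values win when both sides share a field; the function name `merge_keep_first` and the sample rows are hypothetical.

```python
def merge_keep_first(master, using, key):
    """Left-join master to using on key, keeping only the first using row per key."""
    lookup = {}
    for row in using:
        lookup.setdefault(row[key], row)   # later duplicates of a key are ignored
    # Master fields override using fields when both carry the same name.
    return [{**lookup.get(row[key], {}), **row} for row in master]

master = [{"id": 1, "y": 10}, {"id": 2, "y": 20}]
using = [{"id": 1, "x": "a"}, {"id": 1, "x": "b"}, {"id": 3, "x": "c"}]
merged = merge_keep_first(master, using, "id")
```

Which duplicate to keep is exactly the judgment call that needs the prior investigation; the sketch only shows that the using data never has to be rewritten to disk.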
Some simple profiling of code that has run would help, as it is hard to
keep track of bottlenecks (and memory after Stata 12!) in batch mode…
Built-in esttab/outreg2.
As so many things rely on sorting, some smarter or approximate sorting
methods for big data would help, at least for -xtile-, -egen, cut()-,
and other uses.
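One possible reading of "approximate" here is to estimate the quantile cut points from a random sample and then assign groups without sorting the full data. The Python sketch below illustrates that idea under this assumption; it is not how -xtile- works, and the names `approx_cutpoints` and `assign_group`, the default sample size, and the boundary convention are all hypothetical.

```python
import random
from bisect import bisect_right

def approx_cutpoints(values, n_groups, sample_size=10_000, seed=0):
    """Estimate n_groups - 1 quantile cut points from a random sample of values."""
    rng = random.Random(seed)
    if len(values) <= sample_size:
        sample = values                    # small data: use everything
    else:
        sample = rng.sample(values, sample_size)
    s = sorted(sample)                     # only the sample is sorted
    return [s[(k * len(s)) // n_groups] for k in range(1, n_groups)]

def assign_group(v, cuts):
    """Return the 1-based group of v; values equal to a cut go to the upper group."""
    return bisect_right(cuts, v) + 1

values = list(range(1000))
cuts = approx_cutpoints(values, 4)
groups = [assign_group(v, cuts) for v in values]
```

With a sample, group sizes are only approximately equal, which is the trade-off for avoiding a full sort.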
On Tue, Sep 3, 2013 at 1:59 PM, Phil Schumm <[email protected]> wrote:
> On Sep 3, 2013, at 5:50 AM, Daniel Feenberg <[email protected]> wrote:
>> On Tue, 3 Sep 2013, Simon wrote:
>>> Sorry if this is a little tangential. I occasionally work with data sets that I prefer to keep encrypted. At the moment I have to mount a TrueCrypt drive and then run Stata as required. Would it be possible to have an encrypted version of a .dta file where Stata manages access and requests the passphrase?
>>
>> Windows has the Encrypting File System for this function.
>>
>> Unix has similar file systems, but more easily you can do this by reading from a pipe. The examples below are about reading from a compressed file, but should work with an encrypted file also, using your choice of encryption code. See:
>>
>> http://www.stata.com/support/faqs/unix/read-data-from-pipe/
>> http://www.nber.org/stata/efficient/pipes.html
>> http://www.stata.com/statalist/archive/2012-09/msg00337.html
>>
>> The statalist message linked to above repeats a rumour that Stata no longer supports pipes, but that is contradicted by my experience, and would be very sad if true.
>>
>> I wouldn't think that Stata should take on the burden of having an encryption module itself, but it should be compatible with the encryption technology that comes with the OS.
>
>
> Agreed -- Stata should (if anything) simply facilitate working with existing encryption solutions. And I continue to have the same experience as you do WRT reading data from pipes -- under Stata 13, I have no problem using GnuPG to decrypt to a pipe and then reading that directly into Stata via -use-. FWIW, I am running Stata SE on OS X.
>
> To the OP, while the pipe strategy may be a viable alternative (assuming that it does indeed continue to work and that you are on Unix/Linux/OS X), I wouldn't be so quick to dismiss TrueCrypt, or equivalently, anything that allows you to create an encrypted volume (e.g., encrypted disk images on OS X). For one thing, if you're really concerned about data security, then you need to have Stata's tmpdir on an encrypted volume too -- otherwise, you're saving (decrypted) copies of your dataset every time you use tempfiles (or every time you execute a command that does). Thus, the easiest way to set up a semi-secure data analysis environment in Stata is to mount an encrypted volume, move your tmpdir to it, and keep all of your datasets there. And finally, if you don't like having to deal with your password/passphrase, you can store it in a keychain.
>
>
> -- Phil
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/