st: compressed data and named pipes on linux
From: James Sams <[email protected]>
To: [email protected]
Subject: st: compressed data and named pipes on linux
Date: Sat, 18 Aug 2012 18:46:20 -0500
I wish to compress my Stata datasets and read them through a named pipe. I was
hoping that the gz* tools would do this, but they appear to use a temporary
file: they write the entire file out and then read it back in, which slows
things down significantly because of the extra hard drive write/read cycle.
My first instinct was to use a named pipe, and past messages to Statalist and
a Stata FAQ suggest that this should work. However, with Stata 12 MP, I am
getting a core dump. Can someone tell me where I am going wrong?
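unzip.sh:
#!/bin/bash
# Usage: unzip.sh <gzip_file> <pipe_name>
gzip_file="$1"
pipe_name="$2"
# Remove anything already present at the pipe's path
if [ -e "$pipe_name" ]; then
    rm "$pipe_name"
fi
# Create the named pipe (FIFO)
mknod "$pipe_name" p
# Decompress into the pipe in the background; this blocks until a reader opens it
zcat "$gzip_file" > "$pipe_name" &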
Usage in Stata:
local gzip_file 2667.dta.gz
tempfile pipe_file
shell ./unzip.sh `gzip_file' `pipe_file' >& /dev/null < /dev/null
use `pipe_file'
shell rm `pipe_file'
Doing this manually also results in a core dump:
$ mkfifo tempfile
$ zcat 2667.dta.gz > tempfile &
$ stata -q use tempfile.
Segmentation fault (core dumped)
FWIW, doing this works just fine:
$ mkfifo tempfile
$ zcat 2667.dta.gz > tempfile &
$ cat tempfile > 2667.dta && stata -q use 2667.dta
$ stata -q use 2667.dta
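The pipe round trip itself can be checked without Stata. What follows is only a minimal sketch: the scratch directory, the file names, and the sample payload are all made up for illustration, and gzip -dc stands in for zcat. It decompresses through a FIFO to a sequential reader (cat) and confirms the bytes arrive intact, which is the part of the sequence above that works.

```shell
#!/bin/bash
# Sketch: stream a gzip file through a named pipe and verify the contents.
# All paths below are hypothetical and live in a scratch directory.

workdir=$(mktemp -d)

# Stand-in for a real .dta file.
printf 'sample payload\n' > "$workdir/sample.dat"
gzip -c "$workdir/sample.dat" > "$workdir/sample.dat.gz"

# Create the FIFO and start the writer in the background.
# The writer blocks until a reader opens the other end of the pipe.
mkfifo "$workdir/pipe"
gzip -dc "$workdir/sample.dat.gz" > "$workdir/pipe" &

# Read the pipe sequentially, as cat does in the working example.
cat "$workdir/pipe" > "$workdir/roundtrip.dat"
wait

# Compare the original with what came through the pipe.
if cmp -s "$workdir/sample.dat" "$workdir/roundtrip.dat"; then
    result="identical"
else
    result="different"
fi
echo "$result"

rm -rf "$workdir"
```

This only shows that a plain sequential reader handles the pipe; it does not say anything about how Stata reads the file.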
--
James Sams
[email protected]