Re: st: Importing subset of a pipe delimited textfile

From	Daniel Feenberg <[email protected]>
To	[email protected]
Subject	Re: st: Importing subset of a pipe delimited textfile
Date	Wed, 17 Oct 2012 07:19:08 -0400 (EDT)


On Wed, 17 Oct 2012, Rob Shaw wrote:

Hi

I have a very large (around 4Gb) text file that has been pipe
delimited. It won't all fit in memory so I want to process it in
parts.

For fixed datasets I would use infile with the in 1/10000000 option
then 10000001/20000000 etc. However, this dataset has been pipe
delimited so I would need to use insheet, but insheet doesn't seem to
permit the "in" option.

Can anyone help please?

I take it that there are commas in the data, so that converting the pipes to something else with filefilter won't work? You could convert the commas to "~"s first? Data already has "~"s? No unused character available at all?
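One way to answer the "unused character" question without loading the file is to scan it in blocks. The following is only a sketch in Python, not something from the original post; the function name unused_characters and the candidate list are made up for illustration, and it assumes the candidates are single-byte characters.

```python
def unused_characters(path, candidates="~^;", chunk_size=1 << 20):
    """Return the candidate characters that never occur in the file.

    The file is read in binary blocks, so a 4Gb file never has to fit
    in memory. Candidates are assumed to be single-byte characters.
    """
    remaining = {c.encode("latin-1") for c in candidates}
    with open(path, "rb") as f:
        while remaining:
            block = f.read(chunk_size)
            if not block:
                break
            # Drop every candidate that shows up in this block.
            remaining = {c for c in remaining if c not in block}
    return sorted(c.decode("latin-1") for c in remaining)
```

Any character it returns should be safe to use as a stand-in for the commas (or the pipes) before converting the file.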

In Unix there is the "split" command, which works on lines. In Windows there are many split commands available, none from MS and mostly splitting on bytes. That would work if your file has fixed record lengths. I see there is "Text-File-Splitter" which seems to work on lines. I haven't used it.
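If no line-based splitter is at hand, splitting on lines is easy to script. Below is a rough Python sketch, not a tested replacement for "split"; the function name split_by_lines, the part0001.txt naming and the has_header switch are all assumptions for illustration. It assumes the first line of the file holds the variable names and repeats it at the top of each piece so every piece can be read on its own.

```python
from pathlib import Path


def split_by_lines(path, lines_per_piece, out_dir, has_header=True):
    """Split a text file into pieces of at most lines_per_piece data lines.

    Lines are streamed one at a time in binary mode, so the file never
    has to fit in memory and its encoding is left untouched. If
    has_header is True, the first line is copied into every piece.
    Returns the list of piece paths in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pieces = []
    dst = None
    with open(path, "rb") as src:
        header = src.readline() if has_header else b""
        for i, line in enumerate(src):
            if i % lines_per_piece == 0:
                # Start a new piece every lines_per_piece data lines.
                if dst is not None:
                    dst.close()
                piece = out_dir / f"part{len(pieces) + 1:04d}.txt"
                dst = open(piece, "wb")
                dst.write(header)
                pieces.append(piece)
            dst.write(line)
    if dst is not None:
        dst.close()
    return pieces
```

Each piece can then be read separately with insheet, saved, and the results appended as needed.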

It is a shame that every input command in Stata is lacking useful features that most of the other input commands seem to have. -in-, -if- and -keep- are all things that should be universal.


dan feenberg


Many thanks
Rob
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

References:
- st: Importing subset of a pipe delimited textfile
  - From: Rob Shaw <[email protected]>