Hi Yu,
Perhaps just as succinct, if not more so, is to open the text file in a
text editor and replace every instance of "  " (two spaces) with
" " (one space).
You may have to repeat the replacement until the editor reports that
no instances of "  " (two spaces) could be found.
This assumes that you do not have a string variable (for example, an ID
variable) that contains two meaningful consecutive spaces, such as
"ABCD  EFG HIJK".
That's the only caveat I can think of.
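
If the file is too large to handle comfortably in an editor, the same
repeat-until-done replacement could be scripted. Below is a minimal
sketch in Python (my choice for illustration; nobody in this thread has
used it). The function name collapse_double_spaces and the sample
strings are made up for the example.

```python
def collapse_double_spaces(text):
    """Replace two spaces with one, repeating until no pair remains.

    Returns the cleaned text and the number of passes that were needed,
    mirroring the repeated find-and-replace done by hand in an editor.
    """
    passes = 0
    while "  " in text:
        text = text.replace("  ", " ")
        passes += 1
    return text, passes


# Hypothetical sample line: fields separated by two, four, and three spaces.
sample = "1  2.5    abc   7"
cleaned, passes = collapse_double_spaces(sample)
print(cleaned)  # fields are now separated by exactly one space
print(passes)   # more than one pass is needed for runs longer than two

# The caveat in action: a meaningful double space inside an ID is lost too.
id_value, _ = collapse_double_spaces("ABCD  EFG HIJK")
print(id_value)
```

Once every run of spaces is down to a single space, the file should be
readable with -insheet using file, delim(" ")-, as in the original
question.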
Best,
Dan
On 8/19/05, Jayesh Kumar <[email protected]> wrote:
> Since you are already working with Perl, you could find an easier way out.
> In this case, I'd replace each run of spaces with "|" and use the
> delimiter() option of the insheet command.
> In Perl you could say: perl -pe 's/ +/|/g' filename
>
> If you wish to do it manually: in any text editor, I'd replace every
> space with "|" using find-and-replace, then replace "||" with "|" until
> no consecutive "|" remain, and then insheet the file.
>
> HTH,
> Jayesh
> ===================
> Jayesh Kumar
>
> On 8/19/05, Joseph Coveney <[email protected]> wrote:
> > Yu Zhang wrote:
> >
> > I'm embarrassed to ask, but does anyone know how to read
> > data (text file) with multiple spaces between
> > variables? The number of spaces may vary, so I cannot
> > use:
> >
> > . insheet using file, delim(" ")
> >
> > The only way I figured out is to count the number of
> > variables first (e.g., using Perl) and then use:
> >
> > . infile var1-var# using file
> >
> > Is there a more direct way?
> >
> > --------------------------------------------------------------------------------
> >
> > My guess would be to do the same in Stata as you would do in Perl to
> > identify variables.
> >
> > For example, if there is only a single space between tokens within any
> > string variable, and there are at least two spaces (maybe more) between
> > each pair of variables, then:
> > 1. insheet into Stata into a single string variable (mind the limit for
> > string variable length),
> > 2. use Stata's limited regular expressions capability to convert multiple
> > spaces to a convenient delimiter (choose one not otherwise present in the
> > string variables' data),
> > 3. convert multiple delimiters to single delimiters (mind blank cells),
> > 4. export the delimited dataset as an ASCII spreadsheet from Stata (using
> > the -noquote- option) to a temporary file, and then
> > 5. re-import the delimited spreadsheet into Stata.
> >
> > Joseph Coveney
> >
> > * Creating demonstration spreadsheet
> > clear
> > set more off
> > set obs 3
> > generate str var1 = "column1  column2  column3"
> > replace var1 = ///
> > "This is the first column.  This is the second column.  " ///
> > + "This is the third column." in 2
> > replace var1 = ///
> > "The first-second is two spaces.  " ///
> > + "The second-third is four spaces.    " in 3
> > * Check these last lines above--they might have line-wrapped
> > * in the e-mail handler.
> > outsheet using space_delimited_text_spreadsheet.prn, noname noquote
> > clear
> > *
> > * Begin here
> > *
> > insheet using space_delimited_text_spreadsheet.prn
> > replace v1 = subinstr(v1, "  ", "; ", .)
> > replace v1 = subinstr(v1, "; ; ", "; ", .)
> > tempfile tmpfil0
> > outsheet using `tmpfil0', nonames noquote
> > insheet using `tmpfil0', names delimiter(";") clear
> > erase `tmpfil0'
> > list, clean
> > exit
> >
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/support/faqs/res/findit.html
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/
> >