RE: st: Re: Compress and saveold
Dear Maarten and Michael,
Thank you so much for your answers. I have now learnt that the 80-character
limit is a Stata/SE versus Intercooled Stata issue and not a Stata 8 versus
Stata 9 issue.
But most of all, I thank you for the options that you sent. Hacking off the
characters seems most appropriate at the moment, since these variables are
more like comments, but I certainly value the other solutions.
I tried to hack off the string and compress it afterwards, and it works!
Thank you again.
Regards,
Gauri
From: "Michael Blasnik" <[email protected]>
Reply-To: [email protected]
To: <[email protected]>
Subject: st: Re: Compress and saveold
Date: Sat, 20 Jan 2007 14:47:49 -0500
There are a couple of issues.
1) The 80 character limit is not a v8 vs. v9 issue but a Stata/SE vs.
regular (intercooled) Stata issue.
2) I'm not sure why you are surprised that -compress- didn't affect the
length of strings that actually use all of their length. You don't seem to
understand what -compress- does. It will change the storage type of a
variable to the type that requires the least space/memory without losing
any of the information. A 244 character string cannot fit into a str80
variable if it actually has more than 80 characters in it. You can try
to -trim()- the variables, but it sounds like you've already found that it
won't help.
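One quick way to see whether -compress- has any room to work is to look at
the longest value actually stored. This is only a sketch: mystring is the
placeholder name used in this thread, len is a hypothetical scratch variable
that must not already exist in the dataset, and length() is assumed to be
available in the Stata release being used.

```stata
* Length of each stored value; mystring is a placeholder variable name
generate int len = length(mystring)
* If the reported maximum is above 80, the variable cannot become a str80
summarize len
drop len
```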
To deal with this problem, you may want to look first at what information
you actually need that is in those long strings. Perhaps you can find
another way of holding the information besides long text strings. I see
three basic alternatives:
a) Just keep the first 80 characters and discard the rest:
replace mystring=substr(mystring,1,80)
compress mystring
b) Find ways to shorten the strings without losing information. For example,
if there are long substrings that you could abbreviate, you could use a
series of -subinstr()- calls to make the strings shorter:
replace mystring=subinstr(mystring,"Incorporated","Inc",.)
....
compress mystring
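c) Break the string into a series of shorter strings:
* the third argument of substr() is a length, not an end position
gen mystring1=substr(mystring,1,80)
gen mystring2=substr(mystring,81,80)
gen mystring3=substr(mystring,161,80)
* characters 241-244 of a full 244-character string
gen mystring4=substr(mystring,241,4)
drop mystring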
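A variation on option (c), offered as a sketch rather than a tested recipe,
builds the pieces in a loop and checks that they add back up to the original
before dropping it. It assumes it is run in Stata/SE (so the concatenated
string is not truncated), that mystring holds at most 244 characters, and
that the names mystring1 to mystring4 are free. Remember that the third
argument of substr() is the number of characters to take.

```stata
* Split mystring into four pieces of at most 80 characters each
forvalues i = 1/4 {
    local start = 80*(`i' - 1) + 1
    generate str80 mystring`i' = substr(mystring, `start', 80)
}
* Stop here if the pieces do not reproduce the original string
assert mystring == mystring1 + mystring2 + mystring3 + mystring4
drop mystring
compress mystring1 mystring2 mystring3 mystring4
```

If the -assert- fails, nothing has been dropped yet, so the original
variable is still available for inspection.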
The course you take should depend on what's in the strings that you need.
Michael Blasnik
----- Original Message ----- From: "Gauri Khanna" <[email protected]>
To: <[email protected]>
Sent: Saturday, January 20, 2007 2:20 PM
Subject: st: Compress and saveold
Dear List members,
I sent a Stata dataset created in version 9.2 to a Stata 8 user who ran
into the 80 characters limit problem for string variables. As you know,
Stata 8 only supports 80 characters on string variables whereas the
dataset I created in version 9.2 has a couple of string variables that are
in excess of 80.
1. So I tried to compress these variables. But when I look at them again
using the -describe- command, I find no difference in their length. Does
that mean that these variables are not compressed or cannot be compressed?
Here is the output I get on the 10 variables that I compress:
<snip>
These are exactly the same length as before compressing!
I also eyeballed the data to see if there are leading and trailing blanks
but unfortunately I do have some observations that fill in the entire
string length. So I cannot use -trim-, -ltrim-, and -substr()-.
2. I have used the command -saveold GM- where GM is the name of my dataset
but am not sure if that will work. Do any of you know about -saveold-
solving the problem?
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/