Re: st: joining strings
From: Scott Merryman <[email protected]>
To: [email protected]
Subject: Re: st: joining strings
Date: Wed, 28 Nov 2012 10:22:51 -0600

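clear*
input id str1 stringvar
1 x
1 y
1 z
2 a
3 d
4 g
4 h
end

* Record the original order, since -bysort- need not keep ties in order
gen long seq = _n
gen str244 foo = ""

* Build the expression stringvar[1]+stringvar[2]+...+stringvar[100];
* 100 is the largest number of observations expected for any one id
forv i = 1/100 {
    local a "`a' stringvar[`i']"
}
* Turn every space into "+", then drop the leading "+"
local b: subinstr local a " " "+", all
local c: subinstr local b "+" ""

* Under -by-, subscripts count within each id, and an out-of-range
* string subscript returns "", so ids with fewer rows still work
bys id (seq): replace foo = `c'
compress
l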
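
If the goal is one row per id, as in the desired output quoted below, a running concatenation within id is another option. This is only a sketch under assumptions: the names seq and joined are made up for illustration, str244 is carried over from the listing above, and the joined string for any id is assumed to fit in 244 characters.

```stata
clear
input id str1 stringvar
1 x
1 y
1 z
2 a
3 d
4 g
4 h
end

* seq (hypothetical name) keeps the original order within id
gen long seq = _n

* start each id with its first value
bysort id (seq): gen str244 joined = stringvar if _n == 1
* then append each later value to the running string from the row above
by id: replace joined = joined[_n-1] + stringvar if _n > 1

* the last observation of each id now holds the full string
by id: keep if _n == _N
drop stringvar seq
list
```

This makes a single pass over the sorted data, so it needs neither the reshape nor a hard-coded upper limit on the number of rows per id.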
On Wed, Nov 28, 2012 at 10:00 AM, Kevin McConeghy
<[email protected]> wrote:
> Hello users,
>
> I have a dataset with ~2.2 million obs like so:
>
> obs
> id stringvar + other variables
> 1 x
> 1 y
> 1 z
> 2 a
> 3 d
> 4 g
> 4 h
>
> I was trying to combine the stringvar to collapse and make id a unique
> key, like so:
>
> id stringvar
> 1 xyz
> 2 a
> 3 d
> 4 gh
>
> with the following code:
>
> quietly by id: gen x_ = cond(_N==1,0,_n)
> egen x = concat(x_), p(_)
> drop x_
>
> reshape wide stringvar, i(id) j(x) string
> egen str244 all_stringvar=concat(x1, x2, x3....)
>
> This worked for one quarter of the data, but it takes a lot of computing
> power with 1.9 million obs and up to 100 unique stringvar values per id.
> My computer won't run it on the whole dataset because it runs
> out of memory.
>
> Is there some way to skip the reshape step and join string variables
> by specifying the same id?:
> egen stringvar_all = concat(stringvar), by id
>
> Thanks for any help,
> Kevin
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/