Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: Re: odbc load from SQL if ID is in my Stata master dataset?
From
"Joseph Coveney" <[email protected]>
To
<[email protected]>
Subject
Re: st: Re: odbc load from SQL if ID is in my Stata master dataset?
Date
Wed, 19 Sep 2012 09:59:57 +0900
Jen Zhen wrote:
> When I use the statement - SELECT * FROM table JOIN idlist ON
> table.id=idlist.id -
> I get the error "Column 'id' in field list is ambiguous, presumably
> because the same id is contained more than once in my main table.
> It's like when in a Stata merge some values of the id variable are
> contained more than once in the master dataset and I want to merge in
> the information from the using dataset for each of them. Not sure if
> the JOIN command has an appropriate "option" for that?
Yes, it does have one. Specify taking only those columns from the first table,
as follows:
SELECT A.* FROM table AS A INNER JOIN idlist AS B ON A.id = B.id
This will prevent the second table's id from appearing in the selection and
causing the ambiguity error message.
> The second option I frankly don't fully understand.
> Assuming the string length allowed in SQL is at least as high as that
> in Stata, i.e. 244 characters, I think I could place all ID numbers
> into about 4 stringths and then subsequently place these 4 stringths
> into those IN(string) brackets.
The 244-character limit on string lengths applies to Stata dataset variables and
Stata scalars. It does not apply to local macro variables or global macro
variables. Local macros and global macros can hold string lengths of about
32,000 characters. That is why I built the SQL statement using a local macro
variable. So, there won't be a problem if you have only hundreds or a few
thousand ID numbers.
> However, I'm not sure how I could take all IDs contained in a Stata
> dataset column and put them into a string?
> I guess I could do so somehow manually by copy-pasting it into MS Word
> and then copy-pasting back into the do file, but that doesn't look
> very elegant or automatizable...
That's what the following does. It takes all IDs in the dataset and pastes each
of them (and a comma) into the SQL statement during each passage through the
loop. (The last dataset observation's ID is pasted with a close parenthesis
instead of a comma.)
> >
> > local SQL_statement SELECT * FROM MyTable WHERE ID IN (
> > drop if missing(ID)
contract ID, freq(NoNo) // Just to be safe
> > quietly count
> > forvalues row = 1/`=`r(N)'-1' {
> > local SQL_statement `SQL_statement' `=ID[`row']',
> > }
> > local SQL_statement `SQL_statement' `=ID[`r(N)']');
> > odbc load, exec("`SQL_statement'") . . .
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/