Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: summarizing data for each panel over chosen time windows
From
R Zhang <[email protected]>
To
[email protected]
Subject
Re: st: summarizing data for each panel over chosen time windows
Date
Mon, 17 Mar 2014 23:23:45 -0400
Dear all,
I tried to save the results but got an "invalid syntax" error after the
line of code "save E:\Data\Patents\pat_store, replace".
Below is sample data; the real data have 17 million firmid-year combinations.
patentID is the identification number of one of company AA’s patents;
citedID is the identification number of a patent that was cited by the
focal patent. I want to generate a dummy that flags the citedID under
the following condition:
the dummy equals 1 if the cited patent (e.g. 100002, cited in 1995) was
firm AA’s own patent filed over the past 5 years, or was a patent that
was cited by firm AA over the past 5 years.
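To make the condition concrete, here is a minimal sketch on three made-up rows. The values are hypothetical, not the real data, and the variable flag and the local hits are names introduced only for illustration. It checks the first observation alone: how many rows from the previous five years have the cited patent either as patentID or as citedID.

```stata
* Hypothetical three-row example for one firm (not the real data)
clear
input year str2 firmid patentID citedID
1995 "AA" 100001 100002
1994 "AA" 110001 100002
1989 "AA" 140001 100002
end

* Focal observation is row 1 (year 1995, cites 100002).
* Count rows dated 1990-1994 in which 100002 appears as the firm's own
* patent or as a cited patent.
count if (patentID == citedID[1] | citedID == citedID[1]) ///
    & inrange(year, year[1] - 5, year[1] - 1)
local hits = r(N)
display `hits'

* The dummy is 1 when at least one such row exists
generate byte flag = 0
replace flag = (`hits' > 0) in 1
```

Only the 1994 row falls in the window, so the count for row 1 is 1 and flag is set to 1; the 1989 row is more than five years back and is ignored.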
************* code *************
clear
input ///
year str2 firmid patentID citedID
1995 "AA" 100001 100002
1995 "AA" 100001 100003
1995 "AA" 100001 100004
1994 "AA" 110001 100002
1994 "AA" 110001 100005
1994 "AA" 110001 120001
1993 "AA" 120001 100006
1993 "AA" 120001 100007
1992 "AA" 130001 100008
1992 "AA" 130001 100009
1991 "AA" 140001 100010
1991 "AA" 140001 100011
1989 "AA" 140001 100011
1988 "AA" 140001 100011
1995 "BB" 100001 100002
1995 "BB" 100001 100003
1995 "BB" 100001 100004
1994 "BB" 110001 100002
1994 "BB" 110001 100005
1994 "BB" 110001 120001
1993 "BB" 120001 100006
1993 "BB" 120001 100007
1992 "BB" 130001 100008
1992 "BB" 130001 100009
1991 "BB" 140001 100010
1991 "BB" 140001 100011
end
egen groupid=group(firmid)
gen howmany = 0
save E:\Data\Patents\howmany,replace
local nfirms=r(max)
quie
forval n = 1/`nfirms' {
    use E:\Data\Patents\howmany, clear
    keep if firmid==`n'
    local nobs=_N
    forval i=1/`nobs' {
        count if (patentID == citedID[`i'] | citedID == citedID[`i']) ///
            & inrange(year, year[`i']-5, year[`i']-1)
        replace howmany = r(N) in `i'
    }
    append using E:\Data\Patents\pat_store
    save E:\Data\Patents\pat_store, replace
************* code *************
I am trying to save the data after each pass through the loop: there
will be millions of iterations, and if the computer shuts down I would
otherwise have to start over. My code may not be efficient, and I got
an "invalid syntax" error after the line of code
"save E:\Data\Patents\pat_store, replace".
-Rochelle
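
A hedged sketch of how the checkpointing idea could be written so that it runs and can resume after a shutdown. It is not a diagnosis of which line Stata actually stopped on, but several things in the posted loop would each halt it: r(max) is not a documented stored result of egen, firmid is a string being compared with the number `n', the first append has no pat_store file to read, and the outer forval loop has no closing brace. The sketch assumes a do-file, a tiny made-up dataset, and hypothetical file names (howmany, pat_store_1, pat_store_2, ...) in the working directory rather than the E:\ paths.

```stata
* Hypothetical four-row data set standing in for the real one
clear
input year str2 firmid patentID citedID
1995 "AA" 100001 100002
1994 "AA" 110001 100002
1995 "BB" 100001 100003
1993 "BB" 120001 100003
end

egen groupid = group(firmid)
generate howmany = 0
* summarize, not egen, is what leaves r(max) behind
summarize groupid, meanonly
local nfirms = r(max)
save howmany, replace

forvalues n = 1/`nfirms' {
    * Resume logic: skip any firm whose result file already exists.
    * Delete old pat_store_* files before a genuinely fresh run.
    capture confirm file "pat_store_`n'.dta"
    if _rc == 0 {
        continue
    }
    quietly {
        * select on the numeric group id, not the string firmid
        use if groupid == `n' using howmany, clear
        local nobs = _N
        forvalues i = 1/`nobs' {
            count if (patentID == citedID[`i'] | citedID == citedID[`i']) ///
                & inrange(year, year[`i'] - 5, year[`i'] - 1)
            replace howmany = r(N) in `i'
        }
        * one checkpoint file per firm
        save "pat_store_`n'", replace
    }
}

* Combine the per-firm checkpoint files at the end
clear
forvalues n = 1/`nfirms' {
    append using "pat_store_`n'"
}
save pat_store, replace
```

Writing one file per firm keeps each checkpoint independent, so a crash loses at most the firm in progress. With a very large number of firms, saving one file per batch of firms would be an obvious variation. The inner loop over observations is unchanged, so this addresses restartability only, not speed.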
On Mon, Mar 17, 2014 at 10:58 PM, R Zhang <[email protected]> wrote:
> Dear all,
>
> I have a panel dataset of 17 million observations (firm-year
> combinations). I am creating a count over the past five years for
> each firm. My original posting was
> http://www.stata.com/statalist/archive/2014-03/msg00215.html
>
> Please also refer to Nick's response. His code works just fine for
> the hypothetical data I posted.
>
> input ///
> year str2 firmid patentID citedID
> 1995 "AA" 100001 100002
> 1995 "AA" 100001 100003
> 1995 "AA" 100001 100004
> 1994 "AA" 110001 100002
> 1994 "AA" 110001 100005
> 1994 "AA" 110001 120001
> 1993 "AA" 120001 100006
> 1993 "AA" 120001 100007
> 1992 "AA" 130001 100008
> 1992 "AA" 130001 100009
> 1991 "AA" 140001 100010
> 1991 "AA" 140001 100011
> 1989 "AA" 140001 100011
> 1988 "AA" 140001 100011
> 1995 "BB" 100001 100002
> 1995 "BB" 100001 100003
> 1995 "BB" 100001 100004
> 1994 "BB" 110001 100002
> 1994 "BB" 110001 100005
> 1994 "BB" 110001 120001
> 1993 "BB" 120001 100006
> 1993 "BB" 120001 100007
> 1992 "BB" 130001 100008
> 1992 "BB" 130001 100009
> 1991 "BB" 140001 100010
> 1991 "BB" 140001 100011
> end
>
> The issue I have now is that the real data have 17 million
> observations. The computer ran for several days and then shut down
> suddenly, so I had to rerun the program, and it is still going.
>
> My question is: should I output the data in batches so that an
> unexpected computer shutdown does not cost me the whole run?
> What is good practice when running a huge dataset?
>
> Any suggestions would be greatly appreciated!
>
> -Rochelle
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/