st: Using the -copy- command to download google ngram data
From: "Madsen,Paul" <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: Using the -copy- command to download google ngram data
Date: Wed, 14 Dec 2011 15:53:05 +0000
Dear Statalist,
I would like to download Google's ngram data using Stata's -copy- command. The data are located here: http://books.google.com/ngrams/datasets
I'm running Stata/SE 11.2 for Windows (64-bit).
Here's the relevant line of Stata code, which is intended to copy the zip file to a local directory and name it download.zip:
copy http://commondatastorage.googleapis.com/books/ngrams/books/googlebooks-eng-us-all-1gram-20090715-0.csv.zip download.zip
The web address in the code was taken from the Google ngram website (by right-clicking the link to the file, copying its address, and pasting it into Stata).
When I run this code, I get the error:
file http://commondatastorage.googleapis.com/books/ngrams/books/googlebooks-eng-us-all-1gram-20090715-0.csv.zip not found
server says file temporarily redirected to http://v5.lscache6.c.bigcache.googleapis.com/books/ngrams/books/googlebooks-eng-us-all-1gram-20090715-0.csv.zip
This looks like an issue on Google's end. If I copy the new file location from the error text and run the Stata code:
copy http://v5.lscache6.c.bigcache.googleapis.com/books/ngrams/books/googlebooks-eng-us-all-1gram-20090715-0.csv.zip download.zip
I get the error message "unexpected end of file." The problem is not isolated to the specific Google ngram file in the example code; I've tried several of them with the same result. I have also tested the same code on a zip file from a different website, and it works well there.
It is hard for me to believe that Google's files would have some fundamental flaw that makes downloading them directly into Stata impossible. Can something be done in Stata to deal with such a problem (maybe using the -shell- command)?
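A minimal sketch of the shell-based idea mentioned above, not a tested solution. It assumes an external curl program is installed and on the system PATH (curl is not part of Stata), and that following the redirect with curl avoids the error seen with -copy-, which has not been confirmed here:

```stata
* Sketch only: assumes a curl executable is available on the PATH
* (an assumption; curl is a separate program, not part of Stata).
local url "http://commondatastorage.googleapis.com/books/ngrams/books/googlebooks-eng-us-all-1gram-20090715-0.csv.zip"

* -L tells curl to follow the server's redirect; -o names the output file
shell curl -L -o download.zip "`url'"

* Extract the archive into the current working directory
unzipfile download.zip
```

The download is handed to an outside tool so that the redirect is handled before Stata touches the file; any other downloader that follows redirects could be substituted for curl.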
Thanks!
Paul E. Madsen
University of Florida