Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: RE: why messy when importing a csv file?
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: RE: st: RE: why messy when importing a csv file?
Date: Thu, 6 May 2010 18:26:02 +0100
We now have two stories about what happens when you type -insheet using
firms.csv-: the one here and the one sent 15 minutes earlier at
<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1005/date/article-297.html>
So, are you saying that your results are not even consistent?
Nick
[email protected]
Jessie Grace
Nick, thank you for your reply.
Additionally, the csv file was downloaded from a certain database. If I
copy the contents of the file into Stata's editor window, everything goes
well.
. list
     +-------------------------------+
     | stkcd       accper   a00110~0 |
     |-------------------------------|
  1. |     2   1999-06-30   4.68e+08 |
  2. |     2   2002-09-30   1.17e+09 |
  3. |     2   2000-01-01   7.73e+08 |
  4. |     2   2000-06-30   9.12e+08 |
  5. |     2   2000-12-31   9.96e+08 |
     |-------------------------------|
  6. |     2   2009-03-31   2.69e+10 |
  7. |     2   1997-06-30          0 |
  8. |     2   1991-12-31   8.86e+07 |
  9. |     2   1992-12-31   2.05e+08 |
 10. |     3   1998-12-31   1.21e+08 |
     +-------------------------------+
If I copy the contents to a new csv file and type "insheet using
firms.csv", the results are as follows.
. list
+-------------------------+
| v1 |
|-------------------------|
1. | Stkcd,Accper,A001101000 |
2. | ,468010960.13 |
3. | ,1166858479.70 |
4. | ,772831829.15 |
5. | ,911966043.54 |
|-------------------------|
6. | ,995745160.05 |
7. | ,26921921879.80 |
8. | ,0 |
9. | ,88628783.34 |
10. | ,204653478.04 |
|-------------------------|
11. | ,120946052.36 |
+-------------------------+
I think the key points are that the contents of each row are in the same
cell, and that the second variable is enclosed in double quotes in my csv
file.
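One way to check that guess is to look at the raw contents of the re-saved
file before importing it. The following is only a sketch, assuming the file
is called firms.csv and sits in the current working directory. If every line
turned out to be wrapped in an extra pair of double quotes, that would be
consistent with -insheet- reading each row as a single string variable v1,
but nothing in the thread confirms this.

```stata
* Hypothetical diagnostic, not taken from the thread.
* Show the file exactly as it is stored on disk.
type firms.csv

* Summarize the bytes in the file (line-end characters, counts of
* control and printable characters, and so on).
hexdump firms.csv, analyze
```

Comparing this output for the downloaded file and for the re-saved copy
should show whether the two files really have the same contents.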
> From: [email protected]
> No definition of "messy" here.
>
> My guess: By default your third variable will be -float- type and will
> be assigned a format %8.0g. That wouldn't look exactly like the original
> without resetting the format.
>
> Otherwise put: without specifying more information, you are in effect
> _asking_ for Stata's default treatment in terms of storage types and
> formats. So, the results shouldn't seem surprising.
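As an illustration of supplying that extra information, here is a minimal
sketch. It assumes that the file imports as three variables and that the
third is named a001101000 (inferred from the abbreviated name a00110~0 in
the listing). The choice of %15.2f is also an assumption, picked so that the
largest value in the sample fits with two decimal places.

```stata
* Sketch only; the variable name and display format are assumptions.
* Read the comma-separated file, storing numeric variables as doubles
* rather than floats so that large values keep their decimals.
insheet using firms.csv, comma double clear

* Show the third variable with two decimal places instead of the
* default general format.
format a001101000 %15.2f

list
```

The -double- option matters here because a -float- cannot hold a value such
as 468010960.13 exactly, so changing the display format alone would not
reproduce the original figures.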
Jessie Grace
> I have a .csv file, which consists of the following.
>
> Stkcd,Accper,A001101000
> 000002,"1999-06-30",468010960.13
> 000002,"2002-09-30",1166858479.70
> 000002,"2000-01-01",772831829.15
> 000002,"2000-06-30",911966043.54
> 000002,"2000-12-31",995745160.05
> 000002,"2009-03-31",26921921879.80
> 000002,"1997-06-30",0
> 000002,"1991-12-31",88628783.34
> 000002,"1992-12-31",204653478.04
> 000003,"1998-12-31",120946052.36
>
> The first row contains variable names. The peculiarity of the file
> is that the contents of each row are in the same cell.
> Whether I type "insheet using firms.csv" or "insheet using
> firms.csv,comma", the imported results are messy.
> Could anyone tell me why, and how to solve this?