-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Tuesday, 27.09.2005 at 17:07 +0100, Nick Cox wrote:
> > > Can you suggest a way to 'transparently' import these dates as
> > > 'proper' date variables?
> >
> > Reply aimed at Kevin Crow:
> >
> > Joseph Coveney's very helpful workaround notwithstanding, can I assume
> > that a direct import of date-type values over ODBC in this
> > manner is not
> > possible with Stata?
> >
> > Is this considered a bug by the Stata developers? (This isn't a
> > criticism, rather a request for clarification!)
>
> For Stata, date variables are not a variable type,
> or a set of variable types. Dates are held as integer-valued
> numeric variables and the date part works through formats
> and functions.
>
> In Stata, naturally, you can also hold a date as a string,
> but its date attributes then lie in the eye of the beholder.
>
> Other programs have other attitudes, but the key point here,
> I believe, is that Stata has no concept of a date-type
> variable and even less concept of the meaning of a string
> variable.
Yeah, I appreciate the way that a date is stored 'behind the scenes',
however its failure to work as expected regarding importing from ODBC
sources is inconsistent with the way other data types are identified and
used:
e.g. An example table 'mixbag' which was defined as follows in
PostgreSQL and is artificially created to contain many different data
types: (c was originally 'float' and d was 'double precision', but I
think I have an option set to default all reals to double precision
hence c and d look the same)
Column | Type
- --------+--------------------------
a | smallint
b | integer
c | double precision
d | double precision
e | date
f | time without time zone
g | timestamp with time zone
h | character(10)
i | character varying(10)
j | text
In Stata:
. odbc describe mixbag
DataSource: pgstuff (query)
Table: mixbag (load)
- -------------------------------------------------------------------------------
Variable Name Variable Type
- -------------------------------------------------------------------------------
a int2
b int4
c float8
d float8
e date
f time
g timestamptz
h bpchar
i varchar
j text
- -------------------------------------------------------------------------------
So, at the 'describe' stage, Stata is perfectly aware of the correct
data types for each field.
A non-loading select shows the correct data:
. odbc exec("select * from mixbag"), dsn(pgstuff)
1. +------------------------------------------------------------------------+
|a | 1 |
|b | 2 |
|c | 1.5 |
|d | 2.123412341234 |
|e | 2005-09-23 |
|f | 15:15:15 |
|g | 2005-09-23 15:15:15 |
|h | abcdefghij |
|i | abcde |
|j | blah blah blah blah blah |
+------------------------------------------------------------------------+
Now we load the data for real:
. odbc load, exec("select * from mixbag") dsn(pgstuff)
. list
+-------------------------------------------------------------------------+
1. | a | b | c | d | e | f | g | h |
| 1 | 2 | 1.5 | 2.1234123 | 01jan1960 | 15:15:15 | 01jan1960 | abcdefghij |
|-------------------------------------------------------------------------|
| i | j |
| abcde | blah blah blah blah blah |
+-------------------------------------------------------------------------+
. desc
Contains data
obs: 1
vars: 10
size: 314 (99.9% of memory free)
- -------------------------------------------------------------------------------
storage display value
variable name type format label variable label
- -------------------------------------------------------------------------------
a int %8.0g
b long %12.0g
c double %10.0g
d double %10.0g
e double %td
f str8 %9s
g double %td
h str10 %10s
i str10 %10s
j str244 %244s
- -------------------------------------------------------------------------------
So:
1. Integer variables are identified as integers, and are correctly
imported as integers and appropriately formatted as integers - THIS
IS OK;
2. Real variables are identified as reals, and are correctly
imported as reals and appropriately formatted as reals - THIS IS OK;
3. True string variables are identified as strings, are imported
correctly as strings and appropriately formatted as strings - THIS
IS OK;
4. The date and date/time variables, e and g, are correctly
identified as 'date' and 'timestamptz', Stata tries to format these
variables as dates with the format %td (thus 'believing' them to be
dates), yet it has failed to actually import the correct values.
The time variable, f, is similarly mis-handled. THIS IS *WRONG*
My point (hey, I got there in the end!) is that the date fields are
handled inconsistently compared to the other data types. And, as such,
my request to Kevin about whether he considers this to be a bug or not
...
Dave.
- --
Dave Ewart
[email protected]
Computing Manager, Cancer Epidemiology Unit
Cancer Research UK / Oxford University
PGP: CC70 1883 BD92 E665 B840 118B 6E94 2CFD 694D E370
Get key from http://www.ceu.ox.ac.uk/~davee/davee-ceu-ox-ac-uk.asc
N 51.7518, W 1.2016
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
iD8DBQFDOlsvbpQs/WlN43ARAia4AKCeGLlLNS4XdDybf+tV+iLR1CDlBACaAiW2
O3bydZkZg4MLeyurWqlddHI=
=snzp
-----END PGP SIGNATURE-----
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/