Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: RE: Column types for ODBC


From   "Joseph Coveney" <jcoveney@bigplanet.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: RE: Column types for ODBC
Date   Sat, 22 May 2010 13:54:43 +0900

Michael Ewens wrote:

The goal is to keep the Stata internal data type as a date (which is 
just a numeric), but send it to the odbc command as date-readable by 
mysql/Access/etc.  If Stata could send the 'odbc insert' the formatted date

"1Jan2000"

rather than its numeric counterpart, it would be a lot easier to work 
with Stata-exported odbc dates.   As is, it appears that databases 
created with Stata's odbc inserts were not built for use outside of 
Stata (my example is Stata creates a MySQL database and R loads it). 
That of course is not a bug or defect, just an important caveat when 
working with 'odbc.'

[excerpted]

>
> I am inserting columns of Stata-formatted dates into an ODBC database
> (MySQL).  If one runs
>
> --
> gen date = mdy(month,year,date)
> format date %td
> odbc insert, dsn(datasource) t(test) u(u) create sqlshow
> --
>
> Stata creates a 'float' column in the new table.  If one wants to
> subsequently use this new table outside of Stata, the date information
> is effectively lost.  Is there a way to force the create option to use
> particularl column types?
>
> My current workaround first converts the 'date' variable to a string
> 'YYYYMMDD.'  The next step could be to run something like
>
> --
> odbc exec("ALTER TABLE test MODIFY date DATE"), dsn(datasource) u(u)
> --
>
> Unfortunately, this approach requires a line of code for all the date
> variables in the data.  Is there a better approach to dealing with dates
> and odbc?  Ideally, the 'datestring' option available for 'odbc load'
> would be available for 'odbc insert'.

--------------------------------------------------------------------------------

You can do this programmatically, without any tedium, in something like the ado 
file below.  (Below that, there's a do-file that tests it out on an MS Access 
database file--I assume that you would need to modify the ado-file's 
ALTER TABLE SQL statement in order to comply with MySQL's or other relational 
database management system's local syntax for the SQL statement.)

As an aside, from your description of the circumstances, it almost seems as if 
MySQL is used primarily as an enhanced file system for interchange of data 
between Stata and R.  If that's its major use, then I believe that there are 
easier and more direct ways of doing that.

If MySQL's purpose goes beyond that, then I recommend treating the relational 
database as an equal partner to Stata and R in the data management & analysis 
enterprise.  I've discovered that it's good practice to create the database 
beforehand by means of considered data-definition language (DDL) in a 
stand-alone SQL script file.  ("Considered" here means "after formal data 
modeling".)  It results in fewer things to worry about than after relinquishing 
DDL control to -odbc insert . . ., create-, and I imagine that that would apply 
after its R counterpart, too.

Joseph Coveney

------ado-file---------

program define odbc_with_dates
    version 11.0
    syntax [varlist], table(string) create [*]

    if ("`varlist'" == "") {
        local varlist _all
    }
    else {
        // no op
    }

    local date_variable_list
    foreach variable of varlist `varlist' {
        local variable_format : format `variable'
        if (strpos("`variable_format'", "%t") == 1) {
            local date_variable_list `date_variable_list' `variable'
        }
        else {
            // continue
        }
    }

    preserve

    quietly tostring `date_variable_list', force replace ///
        usedisplayformat

    odbc insert `varlist', table("`table'") create `options'

    foreach variable of varlist `date_variable_list' {
        odbc exec("ALTER TABLE `table' ALTER COLUMN `variable' DATETIME;"), ///
            `options'
    }

    restore
end

-----do file------------

version 11.0

clear *
set more off
set obs 5

generate int first_dt = date("2010-05-" + string(_n), "YMD")
format first_dt %tdCCYY-NN-DD

generate double second_dt = cofd(first_dt)
format second_dt %tcCCYY-NN-DD_HH:MM:SS

generate str happy_tx = "Happy!"

local dsn MS Access Database;DBQ=F:\Database1.accdb;

// odbc exec("DROP TABLE TestData;"), dsn("`dsn'")

odbc_with_dates , table(TestData) dsn("`dsn'") create

local line_size `c(linesize)'
set linesize 70

odbc describe TestData, dsn("`dsn'")
odbc exec("SELECT * FROM TestData;"), dsn("`dsn'")

set linesize `line_size'

exit

----printout (partial)------

DataSource: MS Access Database;DBQ=F:\Database1.accdb; (\Database1.acc
> db;":query)
Table:      TestData (load)
----------------------------------------------------------------------
> ---------
Variable Name                               Variable Type
----------------------------------------------------------------------
> ---------
first_dt                                    DATETIME
second_dt                                   DATETIME
happy_tx                                    CHAR
----------------------------------------------------------------------
> ---------

. odbc exec("SELECT * FROM TestData;"), dsn("`dsn'")

    1. +-------------------------------------------------------+
       |first_dt | 2010-05-01 00:00:00                         |
       |second_dt| 2010-05-01 00:00:00                         |
       |happy_tx | Happy!                                      |
       +-------------------------------------------------------+

    2. +-------------------------------------------------------+
       |first_dt | 2010-05-02 00:00:00                         |
       |second_dt| 2010-05-02 00:00:00                         |
       |happy_tx | Happy!                                      |
       +-------------------------------------------------------+

    3. +-------------------------------------------------------+
       |first_dt | 2010-05-03 00:00:00                         |
       |second_dt| 2010-05-03 00:00:00                         |
       |happy_tx | Happy!                                      |
       +-------------------------------------------------------+

    4. +-------------------------------------------------------+
       |first_dt | 2010-05-04 00:00:00                         |
       |second_dt| 2010-05-04 00:00:00                         |
       |happy_tx | Happy!                                      |
       +-------------------------------------------------------+

    5. +-------------------------------------------------------+
       |first_dt | 2010-05-05 00:00:00                         |
       |second_dt| 2010-05-05 00:00:00                         |
       |happy_tx | Happy!                                      |
       +-------------------------------------------------------+


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index