Title | Use the round() function to transform the numeric time variable so that the new variable contains only integer values | |
Author | Pei-Chun Lai, StataCorp |
I imported a dataset from an Excel file. When I use the xtset command to declare the panel data, why do I receive the following error message?
xtset panelvar timevar time variable must contain only integer values r(451)
xtset requires the panel and time identifier variables to be numeric, and they must only contain integers.
Excel rounds datetime values to the nearest millisecond, but Stata does not. When you import a dataset from an Excel file, it is possible that a fraction of a millisecond for the time variable is left in Stata. Because the time variable does not have integer values, you will see the r(451) error message after the xtset command.
Stata stores datetimes as the number of milliseconds elapsed since January 1, 1960 00:00:00.000. To ensure that a time variable has only integer values, we suggest using Stata’s round(. ,1) function to round the values of the panel time variable to the nearest millisecond or using round(. ,1000) to round the values of the panel time variable to the nearest second. Then, when you use the xtset command, Stata will not report an error. Here is an example of an Excel spreadsheet with panel data:
We import the panel data file into Stata and change the formatting of the time variable so we can see the milliseconds:
. import excel "test.xlsx", sheet("Sheet1") firstrow clear (3 vars, 9 obs) . format TestTime %tcnn/dd/ccYY_hh:MMam.SS.sss . list
Patien~D TestTime TestRe~t | |
1. | 1 9/27/2019 12:51pm.00.000 10.9 |
2. | 1 9/27/2019 5:25pm.03.999 13.3 |
3. | 1 9/28/2019 7:40am.00.000 9.1 |
4. | 1 9/28/2019 12:00pm.00.000 17.5 |
5. | 1 9/28/2019 4:52pm.11.000 6.8 |
6. | 2 9/29/2019 7:50am.30.999 10 |
7. | 2 9/29/2019 11:55am.05.000 19.3 |
8. | 2 9/29/2019 12:19pm.41.999 17.9 |
9. | 2 9/29/2019 5:29pm.59.999 18 |
When we use the xtset command to declare the panel data, Stata reports an error message because a fraction of a millisecond for the TestTime variable is left:
xtset PatientID TestTime time variable must contain only integer values r(451)
Therefore, we would need to use generate double with the round(., 1) function to create a double-type numeric variable that contains the values of TestTime to the nearest millisecond. This would prevent loss of precision:
. generate double TestTime2=round(TestTime,1) . format TestTime2 %tcnn/dd/ccYY_hh:MMam.SS.ss . list TestTime TestTime2
TestTime TestTime2 | |
1. | 9/27/2019 12:51pm.00.000 9/27/2019 12:51pm.00.00 |
2. | 9/27/2019 5:25pm.03.999 9/27/2019 5:25pm.04.00 |
3. | 9/28/2019 7:40am.00.000 9/28/2019 7:40am.00.00 |
4. | 9/28/2019 12:00pm.00.000 9/28/2019 12:00pm.00.00 |
5. | 9/28/2019 4:52pm.11.000 9/28/2019 4:52pm.11.00 |
6. | 9/29/2019 7:50am.30.999 9/29/2019 7:50am.31.00 |
7. | 9/29/2019 11:55am.05.000 9/29/2019 11:55am.05.00 |
8. | 9/29/2019 12:19pm.41.999 9/29/2019 12:19pm.42.00 |
9. | 9/29/2019 5:29pm.59.999 9/29/2019 5:30pm.00.00 |
TestTime2 has integer values because Stata stores datetimes as the number of milliseconds since January 1, 1960 00:00:00.000. We can now use xtset with TestTime2 to declare the panel data structure:
. xtset PatientID TestTime2 panel variable: PatientID (unbalanced) time variable: TestTime2, 9/27/2019 12:51pm.00.00 to 9/29/2019 5:30pm.00.00, but with gaps delta: .001 seconds
Success!
To learn more about how Excel and Stata handle datetime values, please see Why does my Excel datetime value seem to be behind in Stata?