|
Creating Stata datasets
- Input data from the command line
- Input data saved from spreadsheets
- Read data using a dictionary
- Read any type of ASCII data
- Read and write data in the format required by the FDA for NDA submissions
- Read and write XML-formatted data files, including those produced by Microsoft Excel
- Convert datasets directly from other statistical packages, spreadsheets, and databases using third-party software
ODBC support
- Import data from any ODBC data source, such as Access, Excel, Postgres, or MySQL
- Export data to new or existing ODBC tables
- Execute raw SQL commands individually or in batches
- Support for ODBC on Windows, Mac, and Linux
Built-in spreadsheet editor
- For Windows, Mac, and Unix
Variables Manager
- Change storage types, names, and formats
- Add and edit value labels
- Attach notes to variables
- Filter variables
Data-management functions
Data reorganization
- Row–column transposition
- Data reshaping
- Stacking of variables
- Collapsing into means, totals, etc.
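As a rough illustration of reshaping and collapsing, the sketch below uses a small made-up dataset (the variables `id`, `inc2019`, and `inc2020` are assumptions for the example, not part of any shipped data). Transposition and stacking are handled by the separate `xpose` and `stack` commands, which are not shown here.

```stata
* Hypothetical wide-format data: one row per person, income in two years
clear
input id inc2019 inc2020
1 100 110
2 200 190
3 150 165
end

* Reshape wide -> long: one row per person-year, with year taken from the suffix
reshape long inc, i(id) j(year)

* Collapse to one row per year holding the mean and total of income
collapse (mean) mean_inc=inc (sum) total_inc=inc, by(year)
list
```

Note that `collapse` replaces the data in memory, so in practice it is often wrapped in `preserve` and `restore`.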
Labels
- Dataset labels
- Variable labels
- Value labels (e.g., male and female for 0 and 1)
- Ability to switch between multiple sets of data, variable, and value labels
- Missing-value labels
- Support for multiple languages
Notes
- Extensive notes can be attached to a dataset
Data snapshots

- Allow multiple levels of undo for modified datasets
|
Sorting
- Ascending or descending sorts
- Multiple-key sorts
- Numeric and string sorts
Merging datasets
- Merge datasets
  - By key variables
  - By observations
- Join datasets
  - Outer join
- Append datasets
- Append time series
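A merge by key variable might look like the following minimal sketch. Both datasets and the variable names (`id`, `sales`, `region`) are invented for illustration, and the block is meant to be run as a single do-file so that the temporary file name stays in scope.

```stata
* Hypothetical using dataset: one region per id, saved to a temporary file
clear
input id str5 region
1 "North"
2 "South"
4 "East"
end
tempfile regions
save `regions'

* Hypothetical master dataset
clear
input id sales
1 10
2 20
3 30
end

* Match-merge on the key variable id; _merge records where each row came from
merge 1:1 id using `regions'
tabulate _merge
```

By default `merge` keeps unmatched observations from both datasets, so `_merge` can be used afterwards to keep or drop them. Appending works the same way with `append using`.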
Special datasets
Utilities
- Compress (make dataset as small as possible without loss of accuracy)
- Formatted and unformatted disk I/O
- Zip-file support

Variable management
- Generation of new variables
- Replacement of existing variables
- Encoding and decoding string variables
Dataset reports
- Data signatures to verify the integrity of new data
- Flexible description of variables, labels, and types
- Codebooks for variables
- Value-label reports
- Duplicates and missing values

Variable types
- Byte
- Integer (int)
- Long
- Float
- Double
- String
- Dates
- Dates and times
Saved results
- Save results to disk for later use
- Store up to 300 sets of results in memory
- Create tables to compare results
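One way these features fit together is sketched below, using the `auto` example dataset that ships with Stata. The model names `m1` and `m2` and the file name `m2_results` are arbitrary choices for the example.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Fit two models and store each set of results in memory under a name
regress price mpg
estimates store m1
regress price mpg weight
estimates store m2

* Compare coefficients and standard errors side by side
estimates table m1 m2, b se stats(N r2)

* Save the most recent results to disk for later use (hypothetical file name)
estimates save m2_results, replace
```

Results saved this way can be reloaded in a later session with `estimates use`.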
|