Title: Stata
1 CCPR Computing Services
Workshop 2: Stata
October 20, 2004
2 Outline
- Converting Data between Statistical Packages
- Stata
- Basic Commands
- Command Syntax
- Abbreviations
- Missing Values
- Combining Data
- Using do-files
- Getting Help
- Updating Stata
3 Converting Data: Windows Stat/Transfer
- SAS, Stata, S-Plus, SPSS, Excel, and more
- Windows interface
- Enter the input data file and the output data file
- Enter info on other tabs as necessary
- Check results!
4 Converting Data: Unix Stat/Transfer
- From within Stat/Transfer
- invoke Stat/Transfer (specific to Unix machine)
- at the Stat/Transfer prompt, enter
- copy datfile1.ext1 datfile2.ext2
- datfile1.ext1 = original file, datfile2.ext2 = new file
- From Unix prompt
- st datfile1.ext1 datfile2.ext2
- (replace st with the local Stat/Transfer invocation)
- See manual for more info and options
- Check results!
5 Converting Data: DBMS/Copy
- DBMS/Copy for Unix (without X Windows)
- From Unix prompt
- dbmsnox indatfile.ext1 outdatfile.ext2
- ext1 and ext2 are pseudo extensions
- spsswin = SPSS for Windows
- stata7 = Stata 7
- sas7sun = SAS for Unix v7
- ssdsun = SAS for Unix v6
- Example: SPSS for Windows to Stata 7
- dbmsnox mydat.spsswin mydat.stata7
- Check results!
6 Converting Data
- See ATS website for transferring files between SAS, Stata, and SPSS
- http://www.ats.ucla.edu/stat/sas/faq/convert_pkg.htm
7 Stata - Getting Started
- Windows: Programs > Stata8
- Command window: enter commands
- Results window
- Other: Review, Variables, Do-file Editor
- Unix
- Interactive Stata
- commands and results show in same window
- Batch Stata
- nice +10 stata -b do myjob.do
8 Basic Commands
- Handout 1 (green)
- Reading raw data
- insheet, input, infix, infile
- Using/saving a Stata dataset
- use, webuse
- save
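The commands on this slide can be tried without any handout data. The following is a minimal sketch, assuming the auto dataset that ships with Stata; myauto.dta and myauto.csv are hypothetical file names written to the working directory. Newer Stata releases favor import delimited and export delimited over insheet and outsheet, though the older commands still run.

```stata
* Sketch only: file names below are hypothetical.
sysuse auto, clear                     // load the auto data shipped with Stata
save myauto, replace                   // save a Stata dataset (myauto.dta)

* Write a small comma-separated raw file, then read it back with insheet.
outsheet make mpg rep78 using myauto.csv, comma replace
insheet using myauto.csv, comma clear  // clear drops the data in memory first
describe

use myauto, clear                      // reload the saved Stata dataset
```

The clear option matters: use and insheet refuse to overwrite unsaved data in memory unless it is given.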
9 Basic Commands, cont.
- Describing data
- describe
- summarize
- codebook
- inspect
- Listing data
- list
- Tables of statistics
- table
- tab1 varlist (one-way tabulation of variables)
- tab2 varlist (two-way tabulations of variables)
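As a rough illustration of the describing, listing, and tabulating commands above, the sketch below runs each one on the auto dataset shipped with Stata. The choice of variables is arbitrary and only for demonstration.

```stata
* Sketch only: variables chosen for illustration.
sysuse auto, clear
describe                       // variable names, storage types, labels
summarize price mpg            // n, mean, sd, min, max
codebook rep78                 // detailed description of one variable
inspect mpg                    // quick look at the distribution of values
list make price mpg in 1/5     // first five observations
table foreign                  // table of frequencies by foreign
tab1 rep78 foreign             // a one-way tabulation for each variable
tab2 rep78 foreign             // two-way tabulation of the pair
```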
10 Basic Commands, cont.
- Changing data
- drop
- keep
- generate
- encode var, generate newvar
- recode
- replace
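The data-changing commands listed above might be combined as in the following sketch. It assumes the auto dataset shipped with Stata; the new variable names (price_k, cheap, rep3, origin, origin_code) are hypothetical. decode is used only to create a string variable for encode to work on.

```stata
* Sketch only: new variable names are hypothetical.
sysuse auto, clear
keep make price mpg rep78 foreign        // keep only these variables
drop if missing(rep78)                   // drop observations with missing rep78

generate price_k = price / 1000          // new variable from an expression
generate byte cheap = 0
replace cheap = 1 if price < 5000        // change values of an existing variable

* Collapse the five repair categories into three, keeping the original.
recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3), generate(rep3)

decode foreign, generate(origin)         // string variable from value labels
encode origin, generate(origin_code)     // back to labeled numeric codes
list make price cheap rep78 rep3 origin_code in 1/5
```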
11 Basic Commands, cont.
- Labeling data
- label variable
- label define
- label values
- label list
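The four labeling commands usually work together: label variable describes the variable, label define creates a named set of value labels, label values attaches that set to a variable, and label list displays it. A minimal sketch, assuming the auto dataset and a hypothetical indicator variable cheap with a hypothetical label set yesno:

```stata
* Sketch only: cheap and yesno are hypothetical names.
sysuse auto, clear
generate byte cheap = price < 5000

label variable cheap "Price below 5,000 dollars"   // describe the variable
label define yesno 0 "No" 1 "Yes"                   // create a set of value labels
label values cheap yesno                            // attach the set to the variable
label list yesno                                    // display the set
tabulate cheap                                      // output now shows No / Yes
```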
12 A few other commands
- compress - saves data more efficiently
- reshape long/wide
- sort / gsort
- order
- rename
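To see these commands in action, the sketch below builds a tiny wide dataset by hand (the values are made up purely for illustration) and converts it to long form. The variable names id, inc2002, inc2003, and income are hypothetical.

```stata
* Sketch only: the data are invented for illustration.
clear
input id inc2002 inc2003
1 100 110
2 200 .
3 150 160
end

reshape long inc, i(id) j(year)   // one row per id-year; year takes 2002, 2003
gsort -year id                    // gsort allows descending order (-year)
sort id year                      // sort is ascending only
rename inc income
order id year income              // set the order of variables in the dataset
compress                          // store each variable in the smallest type that fits
list
```

reshape wide income, i(id) j(year) would take the data back to one row per id.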
13 Stata Syntax
- Basic command syntax
- [by varlist:]
- command [varlist] [=exp] [if exp] [in range] [weighttype=weight] [, options]
- Brackets: optional portions
- Italics: user specified
14 Stata Syntax, cont.
- Complete syntax
- [by varlist:]
- command [varlist] [=exp] [if exp] [in range] [weighttype=weight] [, options]
- Example 1 (webuse union)
- Stata Command
- . summarize
- Result: Summarizes all dataset variables (_all)
15 Stata Syntax, cont.
- Complete syntax
- [by varlist:]
- command [varlist] [=exp] [if exp] [in range] [weighttype=weight] [, options]
- Example 2 (webuse union)
- Stata Command
- . summarize age
- Result: Summarizes variable age
16 Stata Syntax, cont.
- Complete syntax
- [by varlist:]
- command [varlist] [=exp] [if exp] [in range] [weighttype=weight] [, options]
- Example 3 (webuse union)
- Stata Command
- . summarize age if year > 80
- Result
- Summarizes age, includes only observations with year > 80
17 Stata Syntax, cont.
- Complete syntax
- [by varlist:]
- command [varlist] [=exp] [if exp] [in range] [weighttype=weight] [, options]
- Example 4 (webuse union)
- Stata Command
- . summarize age if year > 80 in 1/100
- Result
- Summarizes variable age, includes only obs that are among the first 100 and have year > 80
18 Stata Syntax, cont.
- Complete syntax
- [by varlist:]
- command [varlist] [=exp] [if exp] [in range] [weighttype=weight] [, options]
- Example 5 (webuse union)
- Stata Command
- . by black: summarize age if year > 80
- Result
- Summarizes age separately for different values of black, including only obs for which year > 80
- Note: by requires the data to be sorted by black; bysort sorts first
19 Stata Syntax, cont.
- Complete syntax
- [by varlist:]
- command [varlist] [=exp] [if exp] [in range] [weighttype=weight] [, options]
- Example 6 (webuse union)
- Stata Command
- . bysort black: summarize age if year > 80, detail
- Result
- Detailed summaries of variable age, separated over different values of black, includes only obs with year > 80
20 Stata Syntax, cont.
- Complete syntax
- [by varlist:]
- command [varlist] [=exp] [if exp] [in range] [weighttype=weight] [, options]
- Example 7 (webuse union)
- Generally =exp is used with the commands generate and replace
- Stata Commands
- . generate agelt30 = age
- . replace agelt30 = 1 if age < 30
- . replace agelt30 = 0 if age >= 30 & age < .
- Result
- Variable agelt30 set equal to 1, 0, or missing (missing where age is missing)
21 Stata Syntax, cont.
- Complete syntax
- [by varlist:]
- command [varlist] [=exp] [if exp] [in range] [weighttype=weight] [, options]
- Example 8
- Stata Command
- . summarize race [aweight=final_wt]
- Results
- Summarizes variable race, weighting by the probability weight called final_wt.
- Note
- There are four different types of weights in Stata (fweight, pweight, aweight, iweight); be careful.
- summarize does not accept pweights; aweight gives the same weighted mean.
22 Abbreviations in Stata
- Abbreviating command, option, and variable names
- shortest uniquely identifying name is sufficient
- Example
- Variables in use: make, price, mpg
- Stata command, not abbreviated
- . summarize make price
- Stata command, abbreviated
- . su ma p
- Exceptions
- describe (d), list (l), and some others have shorter reserved abbreviations
- Commands that change or delete data cannot be abbreviated
- Commands implemented by ado-files cannot be abbreviated
23 Missing Values in Stata 8
- Stata 8
- 27 representations of numerical missing
- ., .a, .b, ..., .z
- Relational comparisons
- Biggest number < . < .a < .b < ... < .z
- Mathematical functions
- missing + nonmissing = missing
- String missing
- Empty quotes ("")
24 Missing Values in Stata - Pitfalls
- Pitfall 1
- Stata 7 vs. Stata 8 missing values
- Pitfall 2
- Do NOT
- . replace weightlt200 = 0 if weight >= 200
- INSTEAD
- . replace weightlt200 = 0 if weight >= 200 & weight < .
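Pitfall 2 follows from the ordering of missing values: because every missing value is larger than any number, a condition such as weight >= 200 is also true when weight is missing. The sketch below shows this on a four-row dataset invented for illustration; heavy_wrong and heavy_ok are hypothetical variable names.

```stata
* Sketch only: the data are invented for illustration.
clear
input weight
150
250
.
.a
end

* Missing values compare as larger than any number, so they pass this test.
generate byte heavy_wrong = weight >= 200

* Restricting to weight < . leaves the new variable missing where weight is missing.
generate byte heavy_ok = weight >= 200 if weight < .
list

count if weight >= 200               // counts 250, ., and .a
count if weight >= 200 & weight < .  // counts only 250
```

The test weight < . excludes all 27 missing values, since . is the smallest of them.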
25 Combining Data
- Append vs. Merge
- Append: same variables, different observations
- Merge: same or related observations, different variables
- Appending data in Stata
- Handout 2
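Handout 2 is not reproduced here, so the following is only a stand-in sketch of appending: two small datasets with the same variables (values invented for illustration) are stacked into one. It is meant to be run as a do-file, since the tempfile macro disappears when the do-file ends.

```stata
* Sketch only: the data are invented for illustration.
clear
input id age
1 30
2 41
end
tempfile first                 // temporary file, erased when the do-file ends
save `first'

clear
input id age
3 25
4 58
end

append using `first'           // add the observations from the first dataset
sort id
list
```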
26 Combining Data - merge and joinby
- Demonstrate with two sample datasets
- Neighborhood and County samples
- One-to-one merge
- Handout 3
- One-to-many merge: use match merge
- Handout 4
- Many-to-many merge: use joinby
- Handout 5
27 Combining Data
- Variable _merge (generated by merge and joinby)
- update option also produces _merge = 4, 5
- update changes the default action when a matched observation has missing values in the master data and non-missing values in the using data
- Pitfalls
- Pitfall_merge1: Handout 6
- Pitfall_merge2: Handout 7
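The handouts are not reproduced here, so the sketch below stands in for a match merge and the _merge variable it creates. The neighborhood and county data are invented for illustration, and the merge line uses the older merge syntax that matches this workshop's Stata version; it requires both datasets to be sorted by the key. Stata 11 and later would write the same step as merge m:1 county using with the file name.

```stata
* Sketch only: the data are invented; run as a do-file (uses tempfile).
clear
input county pop
1 1000
2 2000
3 3000
end
sort county                       // the older merge syntax needs sorted data
tempfile counties
save `counties'

clear
input nbhd county
10 1
11 1
12 2
13 4
end
sort county

merge county using `counties'     // match merge on county
tabulate _merge                   // 1 = master only, 2 = using only, 3 = both
list, sepby(county)
```

Here county 4 appears only in the master (neighborhood) data and county 3 only in the using (county) data, so tabulating _merge right after the merge is a quick check that the match behaved as intended.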
28 Do-files
- What is a do-file?
- Stata commands can be executed interactively or via a do-file
- A do-file is a text file containing commands that can be read by Stata
- Handouts are do-files
- Stata command
- . do dofilename.do
29 Do-files
- Why use a do-file?
- Documentation
- Communication
- Reproduce interactive session?
- Interactive vs. do-files
- Record EVERYTHING to recreate results in your
do-file!!
30 Do-files > Header, Version Control
- Header
- Include in do-files: name, project, project location, date, purpose, inputs, outputs, special instructions
- Version Control
- include the version command at the top of the do-file
- Why?
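One possible layout for such a header is sketched below; every entry in it is a placeholder rather than a real project detail. The version command answers the "Why?": it tells Stata which release the do-file was written for, so that later releases interpret the commands the same way.

```stata
* ------------------------------------------------------------
* Name:      myjob.do                      (placeholder)
* Project:   Example project               (placeholder)
* Location:  /path/to/project              (placeholder)
* Date:      October 20, 2004
* Purpose:   Summarize price in the auto data
* Inputs:    auto.dta (shipped with Stata)
* Outputs:   none
* Special instructions: none
* ------------------------------------------------------------
version 8                 // interpret this do-file as Stata 8 would

sysuse auto, clear
summarize price
```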
31 Do-files > End of Line Character
- Commands requiring multiple lines
- #delimit ;
- This command tells Stata to read semicolons as the end-of-line character instead of the carriage return
- Comment out the carriage return with /* at the end of line and */ at the beginning of next
- Comment out the carriage return with ///
32 Do-files > End of Line Character
- Example 1: #delimit
- #delimit ;
- keep firstname lastname birth death
- age weight height;
- #delimit cr
- Example 2: /* */
- keep firstname lastname birth /*
- */ age weight height
- Example 3: ///
- keep firstname lastname birth ///
- age weight height
33 Do-files > Comments
- Comments
- Lines beginning with * will be ignored
- Words between /* and */ will be ignored (spanning multiple lines ok)
- Words between // and end of line will be ignored
- Words between /// and beginning of next line will be ignored (one way to spread command over two lines)
34 Do-files > Comments
* SAMPLE EXCERPT OF STATA DO-FILE
* This line will be ignored by Stata.
use mydata.dta /* These words will be ignored */
do myjob.do // The remainder of this line will be ignored.
keep age race sex /// The remainder of this line will be ignored, including return
first_name height weight last_name /* This line: continuation of the last line */
35 Saving output
- Work in do-files and log your sessions!
- log using filename
- options: replace, append
- log close
- Output choices
- .log file - ASCII file
- .smcl file - nicer format for viewing and
printing in Stata
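A typical logging pattern inside a do-file might look like the sketch below. The file name mysession.log is hypothetical, and the auto dataset is used only so there is some output to record.

```stata
* Sketch only: mysession.log is a hypothetical file name.
capture log close                       // close any log left open; no error if none
log using mysession.log, replace text   // text = plain ASCII log; replace overwrites

sysuse auto, clear
summarize price mpg

log close
```

Using append instead of replace adds the new session to the end of an existing log, and omitting text with a .smcl file name produces the SMCL format.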
36 Basic Commands, cont.
- Graphs are not saved in log files
- Use saving option of graph commands
- saving(graph.ext)
- Export current graph
- graph export graph.ext
- Ex: graph export graph.eps
- Supported formats
- .ps, .eps, .wmf, .emf, .pict
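Both ways of keeping a graph can be sketched as follows, assuming the auto dataset and the hypothetical file name mygraph. The saving() option stores the graph in Stata's own .gph format, while graph export writes the current graph in the format implied by the file extension.

```stata
* Sketch only: mygraph is a hypothetical file name.
sysuse auto, clear
scatter mpg weight, saving(mygraph, replace)   // writes mygraph.gph
graph export mygraph.eps, replace              // export the current graph as EPS
```

Which export formats are available depends on the platform; .ps and .eps are the usual choices on Unix.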
37 Getting Help in Stata
- help command_name
- abbreviated version of manual
- search
- search keywords, local
- search keywords, net
- search keywords, all
- findit keywords
- same as search keywords, all
- Search Stata Listserver and Stata FAQ
38 Stata Resources
- www.stata.com > Resources and Support
- Search Stata Listserver
- Search Stata (FAQ)
- Stata Journal (SJ)
- articles for subscribers
- programs free
- Stata Technical Bulletin (STB)
- replaced with the Stata Journal
- Articles available for purchase, programs free
- Courses (for fee)
39 Updating Stata