I'm stuck on how to use sas to categorize my data while using the original excel file. I need to categorize as DMV or Non DMV... Does anyone know how to do it? I think an if/then statement is required.
Prompt:
Most Marine Corps Marathon runners are from the greater DC-Maryland-Virginia (DMV) metro area. Nevertheless, the race attracts enthusiasts nationally and internationally. Since out-of-the-area runners incur additional travel, hotel, food, etc expenses, it seems plausible they might, as a group, be more dedicated runners than local participants.
For simplicity, define anyone from DC, Maryland or Virginia as originating from the DMV. Moreover, consider anyone with a finish time under 6 hours as a “runner” and with 6 hours or more as a “walker”. Exclude walkers from the analysis.
Sample 1:
170 | 72 | 10665 | 10933 | 6963 | 7022 | 5037 | 4265 | 15606 | 14513 |
623 | 709 | 11168 | 11293 | 7723 | 7274 | 6215 | 5053 | 16667 | 16113 |
1044 | 1950 | 11169 | 11497 | 8821 | 7677 | 6363 | 5066 | 16812 | 17698 |
1180 | 2288 | 11691 | 12228 | 9242 | 8728 | 6522 | 5233 | 17038 | 18104 |
1921 | 3391 | 15528 | 14118 | 9503 | 9751 | 6678 | 6530 | 18697 | 18639 |
So far I've had sas produce a table with just the sample 1 runners in it.
Code:
%LET WorkingDirectory = C:\;
PROC IMPORT DATAFILE="&WorkingDirectory.mcm_2017.csv"
OUT=Runners DBMS=csv REPLACE;
RUN;
*Question 1;
Data Sample01;
INPUT Overall_Place @@;
DATALINES;
170 72 10665 10933 6963
7022 5037 4265 15606 623
709 11168 11293 7723 7274
6215 5053 * 1044 1950
11169 11497 8821 7677 6363
5066 * 1180 2288 11691
12228 9242 8728 6522 5233
* 1921 3391 * 14118
9503 9751 6678 6530 *
;
RUN;
*Sample 01 Runners;
PROC SQL;
CREATE TABLE Sample01_Runners as
SELECT Name, Gender, Age, Time, b.Overall_Place, City, State, Country
FROM WORK.RUNNERS a, WORK.SAMPLE01 b
WHERE a.Overall_Place=b.Overall_Place;
Run;
If you have questions don't hesitate to ask.
Using the data in Sample1 how do you identify the participants in DMV? I do not see any information that indicates where anyone is from.
You likely do not need to add any variables at all. You can create analysis groups in data based on a single variable using a custom format.
Something like:
proc format library=work;
value $dmv
'MD', 'VA', 'DC' = 'DMV'
other= 'non DMV'
;
run;
Then use the format for your state variable in procs such as freq, means, ttest etc.
Proc freq data=Sample01_Runners ;
tables state;
format state $dmv.;
run;
If /then /else (or Select when or Case when) will be needed if two or more variables are involved.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.