BookmarkSubscribeRSS Feed
kilimanjaro_256
Calcite | Level 5

I'm stuck on how to use sas to categorize my data while using the original excel file. I need to categorize as DMV or Non DMV... Does anyone know how to do it? I think an if/then statement is required.

 

Prompt:

 

Most Marine Corps Marathon runners are from the greater DC-Maryland-Virginia (DMV) metro area. Nevertheless, the race attracts enthusiasts nationally and internationally. Since out-of-the-area runners incur additional travel, hotel, food, etc expenses, it seems plausible they might, as a group, be more dedicated runners than local participants.

For simplicity, define anyone from DC, Maryland or Virginia as originating from the DMV. Moreover, consider anyone with a finish time under 6 hours as a “runner” and with 6 hours or more as a “walker”. Exclude walkers from the analysis.   

  1. (5 Pts) Using Sample01, test the assertion that runners from outside the DMV have a faster mean (or median) finish time than runners from the DMV. Use . What do you conclude about the assertion?
  2. (5 Pts) Provide a 95% confidence interval for the difference (in minutes) between the mean (or median) finish time of outside the DMV and DMV runners.

Sample 1:

 

170

72

10665

10933

6963

7022

5037

4265

15606

14513

623

709

11168

11293

7723

7274

6215

5053

16667

16113

1044

1950

11169

11497

8821

7677

6363

5066

16812

17698

1180

2288

11691

12228

9242

8728

6522

5233

17038

18104

1921

3391

15528

14118

9503

9751

6678

6530

18697

18639

 

So far I've had sas produce a table with just the sample 1 runners in it. 

 

Code:

 

%LET WorkingDirectory = C:\;

PROC IMPORT DATAFILE="&WorkingDirectory.mcm_2017.csv"
OUT=Runners DBMS=csv REPLACE;
RUN;

*Question 1;
Data Sample01;
INPUT Overall_Place @@;
DATALINES;
170 72 10665 10933 6963
7022 5037 4265 15606 623
709 11168 11293 7723 7274
6215 5053 * 1044 1950
11169 11497 8821 7677 6363
5066 * 1180 2288 11691
12228 9242 8728 6522 5233
* 1921 3391 * 14118
9503 9751 6678 6530 *
;
RUN;

*Sample 01 Runners;
PROC SQL;
CREATE TABLE Sample01_Runners as
SELECT Name, Gender, Age, Time, b.Overall_Place, City, State, Country
FROM WORK.RUNNERS a, WORK.SAMPLE01 b
WHERE a.Overall_Place=b.Overall_Place;
Run;

 

If you have questions don't hesitate to ask.

1 REPLY 1
ballardw
Super User

Using the data in Sample1 how do you identify the participants in DMV? I do not see any information that indicates where anyone is from.

 

 

You likely do not need to add any variables at all. You can create analysis groups in data based on a single variable using a custom format.

Something like:

proc format library=work;

value $dmv

'MD', 'VA', 'DC' = 'DMV'

other= 'non DMV'

;

run;

 

Then use the format for your state variable in procs such as freq, means, ttest etc.

 

Proc freq data=Sample01_Runners ;

   tables state;

   format state $dmv.;

run;

 

If /then /else (or Select when or Case when) will be needed if two or more variables are involved.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 608 views
  • 0 likes
  • 2 in conversation