toneill
Calcite | Level 5

New, and slightly reluctant, SAS user (previously STATA user)

SAS dates have me scratching my head a bit. I have a longitudinal dataset, but I am currently only interested in baseline (e.g. first questionnaire) data for which I have created a new dataset.

Variables:

(a) Date of Baseline Questionnaire: var1 = DDMMYYYY (SAS Informat: DDMMYY8.)

(b) Known Disease Exposure?: var2 = yes (1), no (0), don't know (88), refused (99) --> ONLY those who responded YES were asked the next question:

(c) Date of Disease Exposure: var3 = MM (0-12, 0=unknown, 1=January etc.), var4 = YYYY

Goal:

Determine "Time Since Exposure" from baseline questionnaire

What I'd like to do/think I should be doing:

1. Only analyse those individuals at baseline who responded "YES(1)" to var2

2. Assign/impute the middle of each month (e.g. 15th day) as a new variable to every individual (create: var5)

3. Create a new variable: 'DateExp' (var6) -- this will combine var3-5 into a single variable represented as: DDMMYYYY (equivalent to: var5-var3-var4)

- I believe I need to use the MDY function; but the SAS examples are not easily understood (again, being a new user!)

4. Calculate time since exposure (TimeSinceExp=var2-var6/365.25) with an output in Years, rounded to a single decimal (e.g. 5.4 years)

Any assistance would be appreciated. I am pondering this away today and will check in tomorrow to see how close (or far off!) my own code is....so far about 20% of my more complicated coding works (which I would say isn't too bad for 2nd week of use).

Best & many thanks in advance.

-Tyler

ballardw
Super User

1. Only analyse those individuals at baseline who responded "YES(1)" to var2

Most SAS procedures let you subset the data, either with a separate WHERE statement, e.g.

proc means data=have;

where var2 = 1;

<other code>

or with the WHERE= data set option:

proc means data=have (where=(var2=1)) ;

2. Assign/impute the middle of each month (e.g. 15th day) as a new variable to every individual (create: var5)

Var5 = 15;

3. Create a new variable: 'DateExp' (var6) -- this will combine var3-5 into a single variable represented as: DDMMYYYY (equivalent to: var5-var3-var4)

var6 = mdy(var3, var5, var4); (or DateExp = mdy(var3, var5, var4);)

format var6 mmddyy10. ; /* so the date will look like a date; use ddmmyy10. for day-first display */

Actually, if you are not going to use var5 for anything else, since it holds a fixed value, skip the var5 step:

var6= mdy(var3,15,var4);

4. Calculate time since exposure (TimeSinceExp=var2-var6/365.25) with an output in Years, rounded to a single decimal (e.g. 5.4 years)

TimeSinceExp = round((var1 - var6)/365.25, .1); /* var1 is the baseline date; var2 is the yes/no answer, not a date */

I would strongly recommend using variable names that mean something (Month instead of var3, Year instead of var4, DateExp instead of var6) and/or assigning labels.

label DateExp = 'Date of exposure';

label TimeSinceExp = 'Years since exposure';
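
Putting the steps above together, a minimal sketch of a single data step might look like the following. The dataset names `have` and `exposed`, the `id` variable, and the sample rows are made up for illustration. It also assumes the baseline date (var1) is already a SAS date value and that month 0 (unknown) should leave the exposure date missing, which may not be what you want.

```sas
/* Hypothetical sample data; the real dataset and values will differ. */
data have;
    input id var1 :ddmmyy8. var2 var3 var4;
    format var1 ddmmyy10.;
    datalines;
1 01062015 1 1 2010
2 15032016 1 0 2012
3 20092014 0 . .
;
run;

data exposed;
    set have;
    where var2 = 1;                      /* keep only the "yes" responders */

    /* MDY returns a missing value (and writes a note to the log) for an
       invalid month such as 0, so build the date only when the month
       and year are usable. */
    if 1 <= var3 <= 12 and not missing(var4) then
        DateExp = mdy(var3, 15, var4);   /* impute the 15th of the month */

    /* SAS dates are stored as day counts, so the difference is in days. */
    if not missing(DateExp) then
        TimeSinceExp = round((var1 - DateExp) / 365.25, 0.1);

    format DateExp ddmmyy10.;
    label DateExp      = 'Date of exposure'
          TimeSinceExp = 'Years since exposure';
run;

proc print data=exposed label;
run;
```

With these sample rows, id 1 (baseline 01/06/2015, exposure imputed as 15/01/2010) gets TimeSinceExp = 5.4, id 2 has a missing DateExp and TimeSinceExp because the month is coded 0, and id 3 is dropped by the WHERE statement.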

toneill
Calcite | Level 5

Yes - I plan to use alternative names; I just used these because there are some privacy issues regarding the data, so the names needed to be "general"! Thanks again.
