toneill
Calcite | Level 5

New, and slightly reluctant, SAS user (previously STATA user)

SAS dates have me scratching my head a bit. I have a longitudinal dataset, but I am currently only interested in baseline (e.g. first questionnaire) data for which I have created a new dataset.

Variables:

(a) Date of Baseline Questionnaire: var1 = DDMMYYYY (SAS informat: DDMMYY8.)

(b) Known Disease Exposure?: var2 = yes (1), no (0), don't know (88), refused (99) --> ONLY those who responded YES were asked the next question:

(c) Date of Disease Exposure: var3 = MM (0-12, 0 = unknown, 1 = January, etc.), var4 = YYYY

Goal:

Determine "Time Since Exposure" from baseline questionnaire

What I'd like to do/think I should be doing:

1. Only analyse those individuals at baseline who responded "YES(1)" to var2

2. Assign/impute the middle of each month (e.g. 15th day) as a new variable to every individual (create: var5)

3. Create a new variable: 'DateExp' (var6) -- this will combine var3-5 into a single variable represented as: DDMMYYYY (equivalent to: var5-var3-var4)

- I believe I need to use the MDY function; but the SAS examples are not easily understood (again, being a new user!)

4. Calculate time since exposure (TimeSinceExp = (var1 - var6)/365.25) with an output in years, rounded to a single decimal (e.g. 5.4 years)

Any assistance would be appreciated. I am working through this today and will check in tomorrow to see how close (or far off!) my own code is. So far about 20% of my more complicated coding works (which I would say isn't too bad for my 2nd week of use).

Best & many thanks in advance.

-Tyler

ballardw
Super User

1. Only analyse those individuals at baseline who responded "YES(1)" to var2

Most SAS procedures let you subset the data, either with a separate WHERE statement, e.g.

proc means data=have;

where var2 = 1;

<other code>

or with a WHERE= data set option:

proc means data=have (where=(var2=1)) ;

2. Assign/impute the middle of each month (e.g. 15th day) as a new variable to every individual (create: var5)

Var5 = 15;

3. Create a new variable: 'DateExp' (var6) -- this will combine var3-5 into a single variable represented as: DDMMYYYY (equivalent to: var5-var3-var4)

var6 = mdy(var3, var5, var4); (or DateExp = mdy(var3, var5, var4);)

format var6 mmddyy10. ; /* this so the date will look like a date*/

Actually, if you are not going to use var5 for anything else, since it holds a fixed value, skip the var5 step:

var6= mdy(var3,15,var4);

4. Calculate time since exposure (TimeSinceExp = (var1 - var6)/365.25) with an output in years, rounded to a single decimal (e.g. 5.4 years)

TimeSinceExp = round((var1 - var6)/365.25, .1); /* var1 is the baseline questionnaire date */

I would strongly recommend using variable names that mean something (Month instead of var3, Year instead of var4, DateExp instead of var6) and/or assigning labels.

label DateExp = 'Date of exposure';

label TimeSinceExp = 'Years since exposure';
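
Putting the steps above together, a minimal sketch might look like the following. The output data set name baseline_exposed and the sample rows are invented for illustration. The sketch also assumes var1 is already stored as a SAS date, and it leaves the exposure date missing when the month is 0 (unknown) instead of imputing one, which is only one possible choice and is not settled in this thread.

```sas
/* Hypothetical sample data: variable names follow the thread, values are invented */
data have;
    input var1 :ddmmyy10. var2 var3 var4;
    format var1 ddmmyy10.;
    datalines;
15/06/2012 1 3 2007
01/02/2013 1 0 2009
20/11/2012 0 . .
;
run;

data baseline_exposed;              /* hypothetical output data set name */
    set have;
    where var2 = 1;                 /* keep only those who answered "yes" */

    /* Month 0 means "unknown": leave DateExp missing in that case (assumption) */
    if 1 <= var3 <= 12 then DateExp = mdy(var3, 15, var4);

    /* var1 is the baseline questionnaire date; both values are SAS dates (days) */
    if not missing(DateExp) then
        TimeSinceExp = round((var1 - DateExp) / 365.25, 0.1);

    format DateExp ddmmyy10.;
    label DateExp      = 'Date of exposure (day imputed as the 15th)'
          TimeSinceExp = 'Years since exposure';
run;

proc print data=baseline_exposed label;
run;
```

Checking the month before calling mdy avoids the "invalid argument" notes that mdy writes to the log when it is given a month of 0.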

toneill
Calcite | Level 5

Yes - I plan to use alternative names; I only used these because there are some privacy issues regarding the data, so they needed to be "general"! Thanks again.
