## MDY & datdiff functions (longitudinal data set)

Occasional Contributor
Posts: 14


I'm a new, and slightly reluctant, SAS user (previously a Stata user).

SAS dates have me scratching my head a bit. I have a longitudinal dataset, but I am currently only interested in the baseline (i.e. first questionnaire) data, for which I have created a new dataset.

Variables:

(a) Date of Baseline Questionnaire: var1 = DDMMYYYY (SAS informat: DDMMYY8.)

(b) Known Disease Exposure?: var2: yes(1), no(0), don'tknow(88), refused(99) --> ONLY those that responded YES had the next question asked:

(c) Date of Disease Exposure: var3 =  MM (0-12, 0=unknown, 1=January etc.), var4 = YYYY

Goal:

Determine "Time Since Exposure" from baseline questionnaire

What I'd like to do/think I should be doing:

1. Only analyse those individuals at baseline who responded "YES(1)" to var2

2. Assign/impute the middle of each month (e.g. 15th day) as a new variable to every individual (create: var5)

3. Create a new variable: 'DateExp' (var6) -- this will combine var3-5 into a single variable represented as: DDMMYYYY (equivalent to: var5-var3-var4)

- I believe I need to use the MDY function; but the SAS examples are not easily understood (again, being a new user!)

4. Calculate time since exposure (TimeSinceExp=(var1-var6)/365.25, i.e. baseline date minus exposure date) with an output in years, rounded to a single decimal (e.g. 5.4 years)

Any assistance would be appreciated. I am pondering this away today and will check in tomorrow to see how close (or far off!) my own code is....so far about 20% of my more complicated coding works (which I would say isn't too bad for 2nd week of use).

Best & many thanks in advance.

-Tyler

Super User
Posts: 13,317

## Re: MDY & datdiff functions (longitudinal data set)

1. Only analyse those individuals at baseline who responded "YES(1)" to var2

Most SAS procedures let you subset the data, either with a separate WHERE statement, e.g.

proc means data=have;

where var2 = 1;

<other code>

or with the WHERE= data set option:

proc means data=have (where=(var2=1)) ;
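
If several steps need the same subset, it may be simpler to filter once in a DATA step. This is only a minimal sketch: `have` is the placeholder name used above, and `baseline_exposed` is a hypothetical output name.

```sas
/* Keep only respondents who answered YES (1) to var2.   */
/* `have` and `baseline_exposed` are placeholder names.  */
data baseline_exposed;
    set have;
    where var2 = 1;
run;

/* Quick check that only var2 = 1 remains in the subset. */
proc freq data=baseline_exposed;
    tables var2 / missing;
run;
```

Later procedures can then read `baseline_exposed` directly without repeating the WHERE condition.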

2. Assign/impute the middle of each month (e.g. 15th day) as a new variable to every individual (create: var5)

Var5 = 15;

3. Create a new variable: 'DateExp' (var6) -- this will combine var3-5 into a single variable represented as: DDMMYYYY (equivalent to: var5-var3-var4)

var6 = mdy(var3, var5, var4); (or DateExp = mdy(var3, var5, var4);)

format var6 mmddyy10. ; /* this so the date will look like a date; use ddmmyy10. for day-first display */

Actually, since var5 is a fixed value, if you are not going to use it for anything else you can skip creating it and pass the constant directly:

var6= mdy(var3,15,var4);
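
One thing to watch: the question says var3 = 0 codes an unknown month. MDY cannot build a date from month 0, so it returns a missing value and writes an invalid-argument note to the log. A hedged sketch that leaves the date missing for those records on purpose (the data set names `have` and `want` are placeholders):

```sas
data want;
    set have;
    /* Build the exposure date only when the month is a real month (1-12) */
    /* and the year is present; var3 = 0 means "unknown month".           */
    if 1 <= var3 <= 12 and not missing(var4) then
        DateExp = mdy(var3, 15, var4);   /* 15th = imputed mid-month day */
    else
        DateExp = .;                     /* leave unknown dates missing  */
    format DateExp ddmmyy10.;            /* displays as DD/MM/YYYY       */
run;
```

How to treat the unknown-month records (drop them, or impute mid-year) is an analysis decision; the sketch only avoids the log noise.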

4. Calculate time since exposure (TimeSinceExp=(var1-var6)/365.25) with an output in years, rounded to a single decimal (e.g. 5.4 years)

Note that the subtraction needs the baseline date (var1), not the yes/no response (var2):

TimeSinceExp= round ( (var1-var6)/365.25, .1);

I would strongly recommend using variable names that mean something (Month instead of Var3, Year instead of Var4, DateExp instead of Var6) and/or assigning labels.

label DateExp = 'Date of exposure';

label TimeSinceExp = 'Years since exposure';
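
Putting the steps together, here is a self-contained sketch you can run to see the pieces working. The three data rows are made up for illustration, the data set name `demo` is hypothetical, and the baseline date is read with slashes (DDMMYY10.) only to keep the example unambiguous. YRDIF with the 'ACT/ACT' basis is shown as an alternative to dividing by 365.25; the two can differ slightly.

```sas
data demo;
    /* var1 = baseline date, var2 = exposure (1 = yes), */
    /* var3 = month (0 = unknown), var4 = year          */
    input var1 :ddmmyy10. var2 var3 var4;

    /* Only exposed respondents with a known month get an exposure date */
    if var2 = 1 and 1 <= var3 <= 12 then do;
        DateExp      = mdy(var3, 15, var4);
        TimeSinceExp = round((var1 - DateExp) / 365.25, 0.1);
        /* Alternative: let SAS count the years between the two dates */
        TimeSinceExp2 = round(yrdif(DateExp, var1, 'ACT/ACT'), 0.1);
    end;

    format var1 DateExp ddmmyy10.;
    label DateExp       = 'Date of exposure'
          TimeSinceExp  = 'Years since exposure'
          TimeSinceExp2 = 'Years since exposure (YRDIF)';
    datalines;
01/06/2015 1 1 2010
15/03/2016 1 0 2012
20/09/2014 0 . .
;
run;

proc print data=demo label;
run;
```

The second row (unknown month) and the third row (not exposed) should come out with missing DateExp and TimeSinceExp, which is the intended behaviour here.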

Occasional Contributor
Posts: 14

## Re: MDY & datdiff functions (longitudinal data set)

Yes - I plan to use alternative names; I only used these because there are some privacy issues regarding the data, so the names needed to be "general"! Thanks again.
