## data logic

Solved
Occasional Contributor
Posts: 12

# data logic

Hello,

I have the following data structure and need to create a dataset based on the following , can you please help me to create a variable avalc for the above subjects based on the following condition Value = 'Y' if any two non-missing consecutive date are >=7 days apart  when DOSE > 0,   othervise 'N'

Subj        EXSTDTC           DOSE

111        2015-04-01         100

111        2015-04-09

111        2015-04-15         100

111        2015-04-22         100

222        2015-04-01         100

222        2015-04-09         100

333        2015-04-01         100

333        2015-04-09

444      2015-04-10          100

444                                  100

444      2015-04-20          100

Accepted Solutions
Solution
‎04-15-2016 09:17 PM
Contributor
Posts: 30

## Re: data logic

This isn't the most elegant solution but I believe it gets you what you want.  I also assumed that you wanted it BY subject.  So that piece is in there as well.

I also worked around the date (the second HAVE dataset due to my SAS not being able to read the dates you provided).

DATA Have;
INPUT SUBJ 1-3 EXSTDTC \$5-14 DOSE 16-18;
DATALINES;
111 2015-04-01 100
111 2015-04-09
111 2015-04-15 100
111 2015-04-22 100
222 2015-04-01 100
222 2015-04-09 100
333 2015-04-01 100
333 2015-04-09
444 2015-04-10 100
444 100
444 2015-04-20 100
;
RUN;

DATA HAVE;
SET HAVE (RENAME = (EXSTDTC = EXSTDTC_OLD));
YEAR = SUBSTR(EXSTDTC_OLD,1,4);
MONTH = SUBSTR(EXSTDTC_OLD,6,2);
DAY = SUBSTR(EXSTDTC_OLD,9,2);
EXSTDTC = MDY(MONTH,DAY,YEAR);
FORMAT EXSTDTC MMDDYY10.;
DROP YEAR MONTH DAY EXSTDTC_OLD;
RUN;

PROC SORT DATA = HAVE;
BY SUBJ;
RUN;

DATA HAVE2 WANT;
SET HAVE;
BY SUBJ;
LAG_DT = LAG(EXSTDTC);
IF FIRST.SUBJ THEN LAG_DT = .;
DIFF = EXSTDTC- LAG_DT;
IF DOSE NE . AND DIFF >= 7 THEN FLAG = 'Y';
ELSE FLAG = 'N';
OUTPUT HAVE2;
IF FLAG = 'Y' THEN OUTPUT WANT;
FORMAT LAG_DT MMDDYY10.;
RUN;

All Replies
Solution
‎04-15-2016 09:17 PM
Contributor
Posts: 30

## Re: data logic

This isn't the most elegant solution but I believe it gets you what you want.  I also assumed that you wanted it BY subject.  So that piece is in there as well.

I also worked around the date (the second HAVE dataset due to my SAS not being able to read the dates you provided).

DATA Have;
INPUT SUBJ 1-3 EXSTDTC \$5-14 DOSE 16-18;
DATALINES;
111 2015-04-01 100
111 2015-04-09
111 2015-04-15 100
111 2015-04-22 100
222 2015-04-01 100
222 2015-04-09 100
333 2015-04-01 100
333 2015-04-09
444 2015-04-10 100
444 100
444 2015-04-20 100
;
RUN;

DATA HAVE;
SET HAVE (RENAME = (EXSTDTC = EXSTDTC_OLD));
YEAR = SUBSTR(EXSTDTC_OLD,1,4);
MONTH = SUBSTR(EXSTDTC_OLD,6,2);
DAY = SUBSTR(EXSTDTC_OLD,9,2);
EXSTDTC = MDY(MONTH,DAY,YEAR);
FORMAT EXSTDTC MMDDYY10.;
DROP YEAR MONTH DAY EXSTDTC_OLD;
RUN;

PROC SORT DATA = HAVE;
BY SUBJ;
RUN;

DATA HAVE2 WANT;
SET HAVE;
BY SUBJ;
LAG_DT = LAG(EXSTDTC);
IF FIRST.SUBJ THEN LAG_DT = .;
DIFF = EXSTDTC- LAG_DT;
IF DOSE NE . AND DIFF >= 7 THEN FLAG = 'Y';
ELSE FLAG = 'N';
OUTPUT HAVE2;
IF FLAG = 'Y' THEN OUTPUT WANT;
FORMAT LAG_DT MMDDYY10.;
RUN;

Super User
Posts: 10,784

## Re: data logic

```What output do you want ?

DATA Have;
INPUT SUBJ 1-3 EXSTDTC : yymmdd10. DOSE 16-18;
format EXSTDTC  yymmdd10. ;
DATALINES;
111 2015-04-01 100
111 2015-04-09
111 2015-04-15 100
111 2015-04-22 100
222 2015-04-01 100
222 2015-04-09 100
333 2015-04-01 100
333 2015-04-09
444 2015-04-10 100
444 100
444 2015-04-20 100
;
RUN;
data want;
set have;
if SUBJ = lag(SUBJ) and dif(EXSTDTC) ge 7 and DOSE gt 0 then avalc ='Y';
else avalc ='N';
run;

```
Super User
Posts: 10,279

## Re: data logic

``````data have;
infile cards dsd dlm=' ' missover;
input subj exstdtc:yymmdd10. dose;
format exstdtc yymmdd10.;
cards;
111 2015-04-01 100
111 2015-04-09
111 2015-04-15 100
111 2015-04-22 100
222 2015-04-01 100
222 2015-04-09 100
333 2015-04-01 100
333 2015-04-09
444 2015-04-10 100
444  100
444 2015-04-20 100
;
run;

proc sort
data=have (where=(exstdtc ne . and dose > 0))
out=int
;
by subj exstdtc;
run;

data want;
set int;
by subj;
if not first.subj and intck('days',lag(exstdtc),exstdtc) > 7
then avalc = 'Y';
else avalc = 'N';
run;

proc print;
run;
``````

Result:

```                                             Obs    subj       exstdtc    dose    avalc

1      111    2015-04-01     100      N
2      111    2015-04-15     100      Y
3      111    2015-04-22     100      N
4      222    2015-04-01     100      N
5      222    2015-04-09     100      Y
6      333    2015-04-01     100      N
7      444    2015-04-10     100      N
8      444    2015-04-20     100      Y
```
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
🔒 This topic is solved and locked.