DATA Step, Macro, Functions and more

data logic

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 12
Accepted Solution

data logic

Hello,

 

I have the following data structure and need to create a dataset based on the following , can you please help me to create a variable avalc for the above subjects based on the following condition Value = 'Y' if any two non-missing consecutive date are >=7 days apart  when DOSE > 0,   othervise 'N'

 

 

 

Subj        EXSTDTC           DOSE              

111        2015-04-01         100

111        2015-04-09

111        2015-04-15         100

111        2015-04-22         100

 

222        2015-04-01         100

222        2015-04-09         100

 

333        2015-04-01         100

333        2015-04-09   

 

444      2015-04-10          100

444                                  100

444      2015-04-20          100

 

Thanks in Advance


Accepted Solutions
Solution
‎04-15-2016 09:17 PM
Contributor
Posts: 30

Re: data logic

This isn't the most elegant solution but I believe it gets you what you want.  I also assumed that you wanted it BY subject.  So that piece is in there as well.

 

I also worked around the date (the second HAVE dataset due to my SAS not being able to read the dates you provided).

 

DATA Have;
INPUT SUBJ 1-3 EXSTDTC $5-14 DOSE 16-18;
DATALINES;
111 2015-04-01 100
111 2015-04-09
111 2015-04-15 100
111 2015-04-22 100
222 2015-04-01 100
222 2015-04-09 100
333 2015-04-01 100
333 2015-04-09
444 2015-04-10 100
444 100
444 2015-04-20 100
;
RUN;

 

DATA HAVE;
SET HAVE (RENAME = (EXSTDTC = EXSTDTC_OLD));
YEAR = SUBSTR(EXSTDTC_OLD,1,4);
MONTH = SUBSTR(EXSTDTC_OLD,6,2);
DAY = SUBSTR(EXSTDTC_OLD,9,2);
EXSTDTC = MDY(MONTH,DAY,YEAR);
FORMAT EXSTDTC MMDDYY10.;
DROP YEAR MONTH DAY EXSTDTC_OLD;
RUN;

 

PROC SORT DATA = HAVE;
BY SUBJ;
RUN;

 

DATA HAVE2 WANT;
SET HAVE;
BY SUBJ;
LAG_DT = LAG(EXSTDTC);
IF FIRST.SUBJ THEN LAG_DT = .;
DIFF = EXSTDTC- LAG_DT;
IF DOSE NE . AND DIFF >= 7 THEN FLAG = 'Y';
ELSE FLAG = 'N';
OUTPUT HAVE2;
IF FLAG = 'Y' THEN OUTPUT WANT;
FORMAT LAG_DT MMDDYY10.;
RUN;

View solution in original post


All Replies
Solution
‎04-15-2016 09:17 PM
Contributor
Posts: 30

Re: data logic

This isn't the most elegant solution but I believe it gets you what you want.  I also assumed that you wanted it BY subject.  So that piece is in there as well.

 

I also worked around the date (the second HAVE dataset due to my SAS not being able to read the dates you provided).

 

DATA Have;
INPUT SUBJ 1-3 EXSTDTC $5-14 DOSE 16-18;
DATALINES;
111 2015-04-01 100
111 2015-04-09
111 2015-04-15 100
111 2015-04-22 100
222 2015-04-01 100
222 2015-04-09 100
333 2015-04-01 100
333 2015-04-09
444 2015-04-10 100
444 100
444 2015-04-20 100
;
RUN;

 

DATA HAVE;
SET HAVE (RENAME = (EXSTDTC = EXSTDTC_OLD));
YEAR = SUBSTR(EXSTDTC_OLD,1,4);
MONTH = SUBSTR(EXSTDTC_OLD,6,2);
DAY = SUBSTR(EXSTDTC_OLD,9,2);
EXSTDTC = MDY(MONTH,DAY,YEAR);
FORMAT EXSTDTC MMDDYY10.;
DROP YEAR MONTH DAY EXSTDTC_OLD;
RUN;

 

PROC SORT DATA = HAVE;
BY SUBJ;
RUN;

 

DATA HAVE2 WANT;
SET HAVE;
BY SUBJ;
LAG_DT = LAG(EXSTDTC);
IF FIRST.SUBJ THEN LAG_DT = .;
DIFF = EXSTDTC- LAG_DT;
IF DOSE NE . AND DIFF >= 7 THEN FLAG = 'Y';
ELSE FLAG = 'N';
OUTPUT HAVE2;
IF FLAG = 'Y' THEN OUTPUT WANT;
FORMAT LAG_DT MMDDYY10.;
RUN;

Super User
Posts: 10,041

Re: data logic

What output do you want ?



DATA Have; 
INPUT SUBJ 1-3 EXSTDTC : yymmdd10. DOSE 16-18; 
format EXSTDTC  yymmdd10. ;
DATALINES; 
111 2015-04-01 100
111 2015-04-09
111 2015-04-15 100
111 2015-04-22 100
222 2015-04-01 100
222 2015-04-09 100
333 2015-04-01 100
333 2015-04-09
444 2015-04-10 100
444 100
444 2015-04-20 100
; 
RUN;
data want;
 set have;
 if SUBJ = lag(SUBJ) and dif(EXSTDTC) ge 7 and DOSE gt 0 then avalc ='Y';
  else avalc ='N';
 run;



Super User
Posts: 7,830

Re: data logic

How about this:

data have;
infile cards dsd dlm=' ' missover;
input subj exstdtc:yymmdd10. dose;
format exstdtc yymmdd10.;
cards;
111 2015-04-01 100
111 2015-04-09
111 2015-04-15 100
111 2015-04-22 100
222 2015-04-01 100
222 2015-04-09 100
333 2015-04-01 100
333 2015-04-09
444 2015-04-10 100
444  100
444 2015-04-20 100
;
run;

proc sort
  data=have (where=(exstdtc ne . and dose > 0))
  out=int
;
by subj exstdtc;
run;

data want;
set int;
by subj;
if not first.subj and intck('days',lag(exstdtc),exstdtc) > 7
then avalc = 'Y';
else avalc = 'N';
run;

proc print;
run;

Result:

                                             Obs    subj       exstdtc    dose    avalc

                                              1      111    2015-04-01     100      N  
                                              2      111    2015-04-15     100      Y  
                                              3      111    2015-04-22     100      N  
                                              4      222    2015-04-01     100      N  
                                              5      222    2015-04-09     100      Y  
                                              6      333    2015-04-01     100      N  
                                              7      444    2015-04-10     100      N  
                                              8      444    2015-04-20     100      Y  
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 239 views
  • 0 likes
  • 4 in conversation