PROC SQL; retrain flag

radhikaa4 · Posted 05-07-2019 05:42 PM

Hello

I have the following dataset

subject_id	date	volume	flag_1	flag_2
1234	1/5/2014	23	0	0
1234	1/6/2014	11	1	0
1234	1/7/2014	93	0	1
1234	1/8/2014	12	0	0
1234	1/9/2014	43	0	0
4432	5/23/2015	32	0	0
4432	5/24/2015	21	0	0
4432	5/25/2015	76	1	0
4432	5/26/2015	23	0	1

What I would like to do is extract values before flag_2. Something like this:

subject_id	date	volume	flag_1	flag_2
1234	1/5/2014	23	0	0
1234	1/6/2014	11	1	0
1234	1/7/2014	93	0	1
4432	5/23/2015	32	0	0
4432	5/24/2015	21	0	0
4432	5/25/2015	76	1	0
4432	5/26/2015	23	0	1

I am completely stuck using PROC SQL;

SELECT subjectid, date, volume, flag_1, flag_2

FROM a.dataset

where flag_2 > flag_1;

ballardw · Posted 05-07-2019 06:11 PM

It looks like you want dates prior to or the same as when flag_2 =1, not a comparison of flag_1 and flag_2.

Is there ever more than one flag_2 = 1 for the same subjected? If so, what is the selection rule then?

What do you want if there are no flag_2=1 for a subject?

This untested code might get you started if there is only one flag_2=1:

proc sql;
   create table want as
   SELECT a.subjectid, a.date, a.volume, a.flag_1, a.flag_2
   FROM a.dataset as a
        left join 
        ( select * from a.dataset where 
          flag_2=1) as b
        on a.subject_id = b.subject_id
   where a.date le b.date
   ;
quit;

radhikaa4 · Posted 05-07-2019 07:09 PM

Hi this works!

But now I ran into another problem (which I did not realize)

So:

subject_id

date

volume

flag_1

flag_2

1234

1/5/2014

23

0

0

1234

1/6/2014

11

1

0

1234

1/7/2014

93

0

1

4432

5/23/2015

32

0

0

4432

5/24/2015

21

0

0

4432

5/25/2015

76

1

0

4432

5/26/2015

23

0

1

I also have to exclude the entries BEFORE flag_1 = 1

So:

subject_id

date

volume

flag_1

flag_2

1234

1/6/2014

11

1

0

1234

1/7/2014

93

0

1

4432

5/25/2015

76

1

0

4432

5/26/2015

23

0

1

Thanks!

PGStats · Posted 05-07-2019 11:16 PM

Trying to keep it simple:

proc sql;
select *
from have
group by subject_id
having 
    date >= min(case when flag_1 then date else . end) and
    date <= min(case when flag_2 then date else constant('BIG') end);
quit;

but at the same time, being careful to handle cases when there is no flag_1 = 1 or no flag_2 = 1. Using the fact that everything is > Null and nothing is > constant('BIG')

PG

Astounding · Posted 05-07-2019 06:22 PM

Can the same ID have more than one observation where FLAG_2=1 ? What should happen?

In any case, I would lean toward a DATA step instead of SQL, as long as your data is already in sorted order:

data want;
   set have;
   by id;
   retain output_flag 'N';
   if first.id then output_flag='Y';
   if output_flag='Y' then output;
   if flag_2=1 then output_flag='N';
   drop output_flag;
run;

Processing the records sequentially should be much faster than anything that involves a join.

novinosrin · Posted 05-07-2019 06:29 PM

Hi @radhikaa4 it's fairly straight forward-->

data have;
input subject_id	date :mmddyy10.	volume	flag_1	flag_2 ;
format date mmddyy10.;
cards;
1234	1/5/2014	23	0	0
1234	1/6/2014	11	1	0
1234	1/7/2014	93	0	1
1234	1/8/2014	12	0	0
1234	1/9/2014	43	0	0
4432	5/23/2015	32	0	0
4432	5/24/2015	21	0	0
4432	5/25/2015	76	1	0
4432	5/26/2015	23	0	1
;

proc sql;
create table want as
select *
from have
group by subject_id
having date<= max((flag_2=1)*date) 
order by subject_id,date;
quit;

novinosrin · Posted 05-07-2019 07:04 PM

Hello @radhikaa4 I too think you should switch to datastep. If your company is paying for a commercial SAS enterprise license, why not use better approaches right?

data have;
input subject_id	date :mmddyy10.	volume	flag_1	flag_2 ;
format date mmddyy10.;
cards;
1234	1/5/2014	23	0	0
1234	1/6/2014	11	1	0
1234	1/7/2014	93	0	1
1234	1/8/2014	12	0	0
1234	1/9/2014	43	0	0
4432	5/23/2015	32	0	0
4432	5/24/2015	21	0	0
4432	5/25/2015	76	1	0
4432	5/26/2015	23	0	1
;

data want;
set have;
by subject_id date;
if first.subject_id then d=1;
if d<=1 ;
d+flag_2;
drop d;
run;

PROC SQL; retrain flag

Re: PROC SQL; retrain flag

Re: PROC SQL; retrain flag

Re: PROC SQL; retrain flag

Re: PROC SQL; retrain flag

Re: PROC SQL; retrain flag

Re: PROC SQL; retrain flag

SAS Innovate 2026 Registration is Open