Dear,
I need assign base line flag based on a sort order. In my data, for some subjects have equal sort order values. My code assigning flag to only last OBS. I need to assign to all last OBS with EQUAL values. Please help in my code.
For ID=2 need to have flags for the last two OBS because subject has equal pid
my code;
proc sort data=data1;
by id test date pid;
run;
data dat2;
data1;
by id test date pid;
if last.test then flag = 'y';
run;
data
id test date pid treatment date
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
output need;
id test date pid treatment date flag
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07 y
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07 y
2 dbp 2015-04-06 2 2015-04-07 y
output getting;
d test date pid treatment date flag
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07 y
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07 y
One way would be to sort the data in reverse order. e.g.:
data have;
informat date treatment_date anydtdte10.;
input id test $ date pid treatment_date;
cards;
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
;
proc sort data=have;
by id test descending date descending pid;
run;
data want;
set have;
by id test date;
if first.test or not first.test and pid eq lag(pid) then flag = 'y';
run;
proc sort data=want;
by id test date pid;
run;
HTH,
Art, CEO, AnalystFinder.com
Dear,
Thanks for the help. I am not getting exactly the output I need.
With your code I am also flagging for the second OBS for ID=2.
data
data
id test date pid treatment date
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
output need;
id test date pid treatment date flag
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07 y
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07 y
2 dbp 2015-04-06 2 2015-04-07 y
output getting;
d test date pid treatment date flag
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07 y
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 1 2015-04-07 y
2 dbp 2015-04-06 2 2015-04-07 y
2 dbp 2015-04-06 2 2015-04-07 y
I expanded your dataset and added an extra condition to the code. Does the following do what you want/need?
data have;
informat date treatment_date anydtdte10.;
input id test $ date pid treatment_date;
cards;
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
1 xbp 2015-04-06 1 2015-04-07
1 xbp 2015-04-06 2 2015-04-07
1 xbp 2015-04-06 3 2015-04-07
2 xbp 2015-04-06 1 2015-04-07
2 xbp 2015-04-06 1 2015-04-07
2 xbp 2015-04-06 2 2015-04-07
2 xbp 2015-04-06 2 2015-04-07
;
proc sort data=have;
by id test descending date descending pid;
run;
data want (drop=last_:);
set have;
by id test date;
if first.test then flag = 'y';
last_flag=lag(flag);
last_pid=lag(pid);
if not first.test and last_flag eq 'y' and pid eq last_pid then flag='y';
run;
proc sort data=want;
by id test date pid flag;
run;
Art, CEO, AnalystFinder.com
Dear,
Thanks again. But I am getting the output I need for OBS where only two OBS with equal pid values present
But when there are 3 OBS with equal PID, your code only flagging only two OBS.
For the ID=2, the three OBS with pid=2 should have flagged. By using your code I am getting the output with only for two OBS. Thanks.
example;
data
data
id test date pid treatment date
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
output need;
id test date pid treatment date flag
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07 y
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07 y
2 dbp 2015-04-06 2 2015-04-07 y
2 dbp 2015-04-06 2 2015-04-07 y
output getting;
id test date pid treatment date flag
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07 y
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07 y
2 dbp 2015-04-06 2 2015-04-07 y
2 dbp 2015-04-06 2 2015-04-07
Then I'd do it with a retain statement rather than using the lag function. e.g.:
data have;
informat date treatment_date anydtdte10.;
input id test $ date pid treatment_date;
cards;
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
;
proc sort data=have;
by id test descending date descending pid;
run;
data want (drop=last_:);
set have;
by id test date;
retain last_flag last_pid;
if first.test then do;
flag = 'y';
last_flag='y';
last_pid=pid;
end;
else if last_flag eq 'y' and pid eq last_pid then flag='y';
run;
proc sort data=want;
by id test date pid flag;
run;
HTH,
Art, CEO, AnalystFinder.com
You may as well try the proc sql approach
data have ;
input id test$ date:$10. pid treatment_date:$10.;
cards;
1 dbp 2015-04-06 1 2015-04-07
1 dbp 2015-04-06 2 2015-04-07
1 dbp 2015-04-06 3 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 1 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
2 dbp 2015-04-06 2 2015-04-07
;
proc sql ;
create table test as select a.*,b.flag from have as a left join (select distinct id, test,pid, 'Y' as flag from have group by id, test having pid=max(pid)) as b
on a.id=b.id and a.test=b.test and a.pid=b.pid;
quit;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.