I want to select the first instance where the var1 value is >= 0.5;
Help please
data have;
input id visitid pageid date var1 start_date;
datalines;
1 1 1 10/1/2019 0.1 9/30/2019
1 1 2 10/2/2019 0.2 9/30/2019
1 1 2 10/3/2019 0.5 9/30/2019
1 2 1 10/4/2019 1 9/30/2019
1 2 2 10/5/2019 3 9/30/2019
1 2 3 10/6/2019 5 9/30/2019
2 1 1 9/28/2019 0.1 9/27/2019
2 1 2 9/29/2019 0.2 9/27/2019
2 1 2 9/29/2019 0.5 9/27/2019
2 1 3 10/4/2019 1 9/27/2019
2 2 1 10/5/2019 3 9/27/2019
2 2 2 10/6/2019 5 9/27/2019
;
run;
data want;
input id visitid pageid date var1 start_date;
datalines;
1 1 2 10/3/2019 0.5 9/30/2019
2 1 2 9/29/2019 0.5 9/27/2019
;
run;
If the data is in proper order by ID:
data want;
set have;
by id;
where var1 >= 0.5;
if first.id;
run;
The interaction of the BY and WHERE statements is important. WHERE sets up first.id and last.id based on only the observations that pass the WHERE filter.
data want;
set have;
by id;
retain flag;
if first.id then flag = 0;
if not flag and var1 >= 0.5
then do;
flag = 1;
output;
end;
drop flag;
run;
Untested, as I'm on my tablet.
data have;
input id visitid pageid date :mmddyy10. var1 start_date :mmddyy10.;
format date start_date mmddyy10.;
datalines;
1 1 1 10/1/2019 0.1 9/30/2019
1 1 2 10/2/2019 0.2 9/30/2019
1 1 2 10/3/2019 0.5 9/30/2019
1 2 1 10/4/2019 1 9/30/2019
1 2 2 10/5/2019 3 9/30/2019
1 2 3 10/6/2019 5 9/30/2019
2 1 1 9/28/2019 0.1 9/27/2019
2 1 2 9/29/2019 0.2 9/27/2019
2 1 2 9/29/2019 0.5 9/27/2019
2 1 3 10/4/2019 1 9/27/2019
2 2 1 10/5/2019 3 9/27/2019
2 2 2 10/6/2019 5 9/27/2019
;
run;
data want;
do _n_=0 by 0 until(last.id);
set have;
by id;
if var1>= 0.5 and _n_=0 then do;
output ;
_n_=1;
end;
end;
run;
Hi @GopiV
Maybe you can try this :
proc sort data=have out=have2 (where=(var1 >= 0.5));
by id var1;
run;
data want;
set have2;
by id var1;
if first.id then output;
run;
Best regards,
Very nice @ed_sas_member If one can assume, all Var1 values will be either >=0.5 following the 1st occurance, and if that holds true the where filter is fine, also I am not sure whether a SORT is needed beforehand considering OP's dataset seems to be already sorted by ID and DATE.
So a sequential where , going with assumption can be without the SORT i would think?
data want;
set have(where=(var1 >= 0.5));
by id ;
if first.id ;
run;
If the data is in proper order by ID:
data want;
set have;
by id;
where var1 >= 0.5;
if first.id;
run;
The interaction of the BY and WHERE statements is important. WHERE sets up first.id and last.id based on only the observations that pass the WHERE filter.
Thank you
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.