I want to select the first instance where the var1 value is >= 0.5;
Help please
data have;
input id visitid pageid date var1 start_date;
datalines;
1 1 1 10/1/2019 0.1 9/30/2019
1 1 2 10/2/2019 0.2 9/30/2019
1 1 2 10/3/2019 0.5 9/30/2019
1 2 1 10/4/2019 1 9/30/2019
1 2 2 10/5/2019 3 9/30/2019
1 2 3 10/6/2019 5 9/30/2019
2 1 1 9/28/2019 0.1 9/27/2019
2 1 2 9/29/2019 0.2 9/27/2019
2 1 2 9/29/2019 0.5 9/27/2019
2 1 3 10/4/2019 1 9/27/2019
2 2 1 10/5/2019 3 9/27/2019
2 2 2 10/6/2019 5 9/27/2019
;
run;
data want;
input id visitid pageid date var1 start_date;
datalines;
1 1 2 10/3/2019 0.5 9/30/2019
2 1 2 9/29/2019 0.5 9/27/2019
;
run;
If the data is in proper order by ID:
data want;
set have;
by id;
where var1 >= 0.5;
if first.id;
run;
The interaction of the BY and WHERE statements is important. WHERE sets up first.id and last.id based on only the observations that pass the WHERE filter.
data want;
set have;
by id;
retain flag;
if first.id then flag = 0;
if not flag and var1 >= 0.5
then do;
flag = 1;
output;
end;
drop flag;
run;
Untested, as I'm on my tablet.
data have;
input id visitid pageid date :mmddyy10. var1 start_date :mmddyy10.;
format date start_date mmddyy10.;
datalines;
1 1 1 10/1/2019 0.1 9/30/2019
1 1 2 10/2/2019 0.2 9/30/2019
1 1 2 10/3/2019 0.5 9/30/2019
1 2 1 10/4/2019 1 9/30/2019
1 2 2 10/5/2019 3 9/30/2019
1 2 3 10/6/2019 5 9/30/2019
2 1 1 9/28/2019 0.1 9/27/2019
2 1 2 9/29/2019 0.2 9/27/2019
2 1 2 9/29/2019 0.5 9/27/2019
2 1 3 10/4/2019 1 9/27/2019
2 2 1 10/5/2019 3 9/27/2019
2 2 2 10/6/2019 5 9/27/2019
;
run;
data want;
do _n_=0 by 0 until(last.id);
set have;
by id;
if var1>= 0.5 and _n_=0 then do;
output ;
_n_=1;
end;
end;
run;
Hi @GopiV
Maybe you can try this :
proc sort data=have out=have2 (where=(var1 >= 0.5));
by id var1;
run;
data want;
set have2;
by id var1;
if first.id then output;
run;
Best regards,
Very nice @ed_sas_member If one can assume, all Var1 values will be either >=0.5 following the 1st occurance, and if that holds true the where filter is fine, also I am not sure whether a SORT is needed beforehand considering OP's dataset seems to be already sorted by ID and DATE.
So a sequential where , going with assumption can be without the SORT i would think?
data want;
set have(where=(var1 >= 0.5));
by id ;
if first.id ;
run;
If the data is in proper order by ID:
data want;
set have;
by id;
where var1 >= 0.5;
if first.id;
run;
The interaction of the BY and WHERE statements is important. WHERE sets up first.id and last.id based on only the observations that pass the WHERE filter.
Thank you
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.