Hello,
I have a dataset in which there is a column that has all the clicks data. There are 12 main keywords and if any of this keywords are found in the data then software should give the result till the stop keyword appears which are "Home" and start keywords as well. And it should be grouped by Id. So for each group it should run and give me the output.
For example:
Id column:
1 a
1 b
1 d
1 g
1 d
2 a
2 f
2 v
2 b
2 p
2 home
The keywords are : b and f So the software should start from b and stop whenever it encounters home or b or f and this would be 1st output(b d g d ) then it should go to next group again start from F and stop at home or b or f (f v) this will be second output and then it will again start from b and stop if it encounters b,f or home (b,p,home) this will be my 3rd output. My code is able to form the group and also give me the members in each group but it does not stop at the end of the 1st id. For example I am getting the output as : 1st output: b d g d a
I should not get 'a'.
Code is:
data want ;
set have ;
retain group 0 member 0 ;
if member then member+1;
if column in ('b','f') then do;
member=1;
group+1;
end;
if member then output;
if column = 'home' then member=0;
run;
data want ;
set have ;
retain group 0 member 0 ;
by ID; *this is the id variable;
if first.id then do;
*reset things that need to reset;
end;
if member then member+1;
if column in ('b','f') then do;
member=1;
group+1;
end;
if member then output;
if column = 'home' then member=0;
run;
See the sections I added.
That's for you to determine.
@Vrushank wrote:
I tried its not working in the reset area what should I reset ? should I put the retain statement out there ?
It would also help if you used consistent naming. Your variables are shown as ID column but then the code uses different variable names.
Also pay attention to the sort order, if the order changes then you might miss some values.
Check this alternate solution:
data test_ ;
input Id column $;
datalines;
1 a
1 b
1 d
1 g
1 d
1 home
2 a
2 f
2 v
2 b
2 p
2 home
;
run;
proc sql;
create table test as
select monotonic() as row_no,* from test_
order by 1,2;
quit;
data want;
Format Str $50.;
do until (last.id);
set test;
by id row_no;
Str=strip(Str)||"\"||strip(column);
end;
Need=substr(Str,prxmatch("m/f|b/oi",strip(Str)));
run;
Try this:
data test2(drop=row_no Lag_ct);
retain count;
set test;
Lag_ct=LAG(Count);
by id;
if first.id then count=0;
if prxmatch("m/f|b|home/oi",strip(Column))>0 then count+1;
if count=0 then delete;
if last.id and Count^=Lag_ct then Count=Lag_ct;
run;
proc transpose data=test2 out=need(drop=_name_);
by id count;
var column;
run;
data want(drop=COL: Count);
set need;
Str=CATX("/",of COL:);
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.