Hello all!
I have a brainteaser here…
I have sorted data where I mark an indicator by Z or X, depending on some condition.
From the sample data, I need to count the occurrences of X within the first field. If X occurs more than one time continuously, I need to count it as 1…
Any suggestions?
Thank you in advance!
I am on SAS 9.2 using DI 4.21
sample:
000017861 | 0001 | 120 | Z | |
000017861 | 0002 | 160 | Z | |
000017861 | 0003 | 110 | X | 1 |
000017861 | 0004 | 190 | X | 1 |
000017861 | 0005 | 110 | X | 1 |
000017861 | 0006 | 140 | Z | |
000017869 | 0001 | 120 | Z | |
000017869 | 0002 | 140 | Z | |
000017869 | 0003 | 110 | X | 2 |
000017869 | 0004 | 190 | X | 2 |
000017869 | 0005 | 110 | X | 2 |
000017869 | 0006 | 140 | Z | |
000017869 | 0007 | 110 | X | 3 |
000017869 | 0008 | 190 | X | 3 |
000017869 | 0009 | 110 | X | 3 |
000017869 | 0010 | 190 | X | 3 |
000017869 | 0011 | 110 | X | 3 |
000017869 | 0012 | 140 | Z | |
000017869 | 0013 | 110 | X | 4 |
000017869 | 0014 | 140 | Z | |
000017869 | 0015 | 110 | X | 5 |
000017869 | 0016 | 140 | Z | |
000017869 | 0017 | 110 | X | 6 |
000017869 | 0018 | 140 | Z | |
000017869 | 0019 | 180 | Z | |
000017869 | 0020 | 200 | X | 7 |
000017877 | 0001 | 120 | Z | |
000017877 | 0002 | 150 | Z | |
000017877 | 0003 | 140 | Z | |
000017877 | 0004 | 110 | X | 1 |
000017877 | 0005 | 140 | Z | |
000017877 | 0006 | 110 | X | 2 |
000017877 | 0007 | 140 | Z | |
000017877 | 0008 | 110 | X | 3 |
000017877 | 0009 | 140 | Z | |
000017877 | 0010 | 110 | X | 4 |
000017878 | 0001 | 120 | Z | |
000017878 | 0002 | 140 | Z | |
000017878 | 0003 | 110 | X | 1 |
000017878 | 0005 | 110 | X | 1 |
000017878 | 0007 | 110 | X | 1 |
000017878 | 0008 | 140 | Z | |
000017878 | 0010 | 180 | Z |
Please post test data in the form of a datastep, its not fun for me to have to type that in!
data have; infile datalines dlm=" " missover; input id $ num_id $ val type $; datalines; 000017861 0001 120 Z 000017861 0002 160 Z 000017861 0003 110 X 000017861 0004 190 X 000017861 0005 110 X 000017861 0006 140 Z 000017869 0001 120 Z 000017869 0002 140 Z 000017869 0003 110 X 000017869 0004 190 X 000017869 0005 110 X 000017869 0006 140 Z 000017869 0007 110 X 000017869 0008 190 X 000017869 0009 110 X 000017869 0010 190 X 000017869 0011 110 X ; run; data want; set have; by id type notsorted; retain total curr; if first.id then total=0; if first.type and type="X" then do; total=total+1; curr=total; end; if first.type and type ne "X" then curr=.; run;
Basically keep a running total within id, and set a retained value based on type on or off.
Why is '1' in
000017877 0004 110 X 1
I can't figure the rules when to restart counting.
For the first part, up to restarting the count
assuming variables are v1 v2 v3 vX counter you can
data test;
set have;
retain phase 0 counter;
if vx = 'X' and phase=0 then do;
phase=1;
counter +1;
end;
if vx ne 'X' then phase = 0;
run;
can do:
Thank you for you quick response. The 1 is not part of the data. I put it there to show the result that I need.
Please post test data in the form of a datastep, its not fun for me to have to type that in!
data have; infile datalines dlm=" " missover; input id $ num_id $ val type $; datalines; 000017861 0001 120 Z 000017861 0002 160 Z 000017861 0003 110 X 000017861 0004 190 X 000017861 0005 110 X 000017861 0006 140 Z 000017869 0001 120 Z 000017869 0002 140 Z 000017869 0003 110 X 000017869 0004 190 X 000017869 0005 110 X 000017869 0006 140 Z 000017869 0007 110 X 000017869 0008 190 X 000017869 0009 110 X 000017869 0010 190 X 000017869 0011 110 X ; run; data want; set have; by id type notsorted; retain total curr; if first.id then total=0; if first.type and type="X" then do; total=total+1; curr=total; end; if first.type and type ne "X" then curr=.; run;
Basically keep a running total within id, and set a retained value based on type on or off.
Thank you for your quick response! Sorry about that! I attached a sample file and tried to copy data in the original message...
I will try the solution!
Runs like a champion!
Thank you!
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.