Hello all!
I have a brainteaser here…
I have sorted data where I mark an indicator by Z or X, depending on some condition.
From the sample data, I need to count the occurrences of X within the first field. If X occurs more than one time continuously, I need to count it as 1…
Any suggestions?
Thank you in advance!
I am on SAS 9.2 using DI 4.21
sample:
000017861 | 0001 | 120 | Z | |
000017861 | 0002 | 160 | Z | |
000017861 | 0003 | 110 | X | 1 |
000017861 | 0004 | 190 | X | 1 |
000017861 | 0005 | 110 | X | 1 |
000017861 | 0006 | 140 | Z | |
000017869 | 0001 | 120 | Z | |
000017869 | 0002 | 140 | Z | |
000017869 | 0003 | 110 | X | 2 |
000017869 | 0004 | 190 | X | 2 |
000017869 | 0005 | 110 | X | 2 |
000017869 | 0006 | 140 | Z | |
000017869 | 0007 | 110 | X | 3 |
000017869 | 0008 | 190 | X | 3 |
000017869 | 0009 | 110 | X | 3 |
000017869 | 0010 | 190 | X | 3 |
000017869 | 0011 | 110 | X | 3 |
000017869 | 0012 | 140 | Z | |
000017869 | 0013 | 110 | X | 4 |
000017869 | 0014 | 140 | Z | |
000017869 | 0015 | 110 | X | 5 |
000017869 | 0016 | 140 | Z | |
000017869 | 0017 | 110 | X | 6 |
000017869 | 0018 | 140 | Z | |
000017869 | 0019 | 180 | Z | |
000017869 | 0020 | 200 | X | 7 |
000017877 | 0001 | 120 | Z | |
000017877 | 0002 | 150 | Z | |
000017877 | 0003 | 140 | Z | |
000017877 | 0004 | 110 | X | 1 |
000017877 | 0005 | 140 | Z | |
000017877 | 0006 | 110 | X | 2 |
000017877 | 0007 | 140 | Z | |
000017877 | 0008 | 110 | X | 3 |
000017877 | 0009 | 140 | Z | |
000017877 | 0010 | 110 | X | 4 |
000017878 | 0001 | 120 | Z | |
000017878 | 0002 | 140 | Z | |
000017878 | 0003 | 110 | X | 1 |
000017878 | 0005 | 110 | X | 1 |
000017878 | 0007 | 110 | X | 1 |
000017878 | 0008 | 140 | Z | |
000017878 | 0010 | 180 | Z |
Please post test data in the form of a datastep, its not fun for me to have to type that in!
data have; infile datalines dlm=" " missover; input id $ num_id $ val type $; datalines; 000017861 0001 120 Z 000017861 0002 160 Z 000017861 0003 110 X 000017861 0004 190 X 000017861 0005 110 X 000017861 0006 140 Z 000017869 0001 120 Z 000017869 0002 140 Z 000017869 0003 110 X 000017869 0004 190 X 000017869 0005 110 X 000017869 0006 140 Z 000017869 0007 110 X 000017869 0008 190 X 000017869 0009 110 X 000017869 0010 190 X 000017869 0011 110 X ; run; data want; set have; by id type notsorted; retain total curr; if first.id then total=0; if first.type and type="X" then do; total=total+1; curr=total; end; if first.type and type ne "X" then curr=.; run;
Basically keep a running total within id, and set a retained value based on type on or off.
Why is '1' in
000017877 0004 110 X 1
I can't figure the rules when to restart counting.
For the first part, up to restarting the count
assuming variables are v1 v2 v3 vX counter you can
data test;
set have;
retain phase 0 counter;
if vx = 'X' and phase=0 then do;
phase=1;
counter +1;
end;
if vx ne 'X' then phase = 0;
run;
can do:
Thank you for you quick response. The 1 is not part of the data. I put it there to show the result that I need.
Please post test data in the form of a datastep, its not fun for me to have to type that in!
data have; infile datalines dlm=" " missover; input id $ num_id $ val type $; datalines; 000017861 0001 120 Z 000017861 0002 160 Z 000017861 0003 110 X 000017861 0004 190 X 000017861 0005 110 X 000017861 0006 140 Z 000017869 0001 120 Z 000017869 0002 140 Z 000017869 0003 110 X 000017869 0004 190 X 000017869 0005 110 X 000017869 0006 140 Z 000017869 0007 110 X 000017869 0008 190 X 000017869 0009 110 X 000017869 0010 190 X 000017869 0011 110 X ; run; data want; set have; by id type notsorted; retain total curr; if first.id then total=0; if first.type and type="X" then do; total=total+1; curr=total; end; if first.type and type ne "X" then curr=.; run;
Basically keep a running total within id, and set a retained value based on type on or off.
Thank you for your quick response! Sorry about that! I attached a sample file and tried to copy data in the original message...
I will try the solution!
Runs like a champion!
Thank you!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.