Hi Y'all,
Please be patient with my explanation as my request is a tad cumbersome.
I have data that looks like this:
ID | SEQNO |
101 | 1 |
102 | 6 |
103 | 2 |
103 | 3 |
104 | 9 |
104 | 10 |
104 | 12 |
104 | 17 |
105 | 1 |
105 | 2 |
105 | 5 |
105 | 7 |
105 | 8 |
105 | 12 |
106 | 27 |
106 | 28 |
106 | 29 |
For every unique ID, I created the programming variable called "SEQNO". I cleaned the data, so the SEQNO values are no longer complete; however, they still go from smallest to largest value for each ID. What I need now is to do a count called NUM based on the following three conditions:
(Quick note, SEQNO value does not always start at 1; however, the key point is the numerical order).
Here's one way of how the table I want may be:
ID | SEQNO | NUM |
101 | 1 | 1 |
102 | 6 | 1 |
103 | 2 | 1 |
103 | 3 | 1 |
104 | 9 | 1 |
104 | 10 | 1 |
104 | 12 | 2 |
104 | 17 | 3 |
105 | 1 | 1 |
105 | 2 | 1 |
105 | 5 | 2 |
105 | 7 | 3 |
105 | 8 | 3 |
105 | 12 | 4 |
106 | 27 | 1 |
106 | 28 | 1 |
106 | 29 | 1 |
Even though in the table above, I kept every observation, I do not need every row. So an alternate table that I can use could be:
ID | SEQNO | NUM |
101 | 1 | 1 |
102 | 6 | 1 |
103 | 3 | 1 |
104 | 17 | 3 |
105 | 12 | 4 |
106 | 29 | 1 |
(Do not worry about the SEQNO, it was just a programmatic variable meant to help me clean the data).
I am using SAS 9.4
Let me know if my request needs additional clarifications.
Muchas Gracias!
Assuming your data is in the order shown then BY group processing does what you want. We can make both tables in a single pass.
If I understand the requirement:
data want1 want2;
set have;
by id;
retain Num;
DifSeq= dif(Seqno);
if first.id then Num=1;
else if DifSeq>1 then Num+1;
if last.id then output want2;
output want1;
drop difseq;
run;
Or alternatively instead of separate output sets have a flag set for the second group instead of Output want1 such as:
Group2Flag=Last.id; and remove the want2 and If last.id ... statement.
Assuming your data is in the order shown then BY group processing does what you want. We can make both tables in a single pass.
If I understand the requirement:
data want1 want2;
set have;
by id;
retain Num;
DifSeq= dif(Seqno);
if first.id then Num=1;
else if DifSeq>1 then Num+1;
if last.id then output want2;
output want1;
drop difseq;
run;
Or alternatively instead of separate output sets have a flag set for the second group instead of Output want1 such as:
Group2Flag=Last.id; and remove the want2 and If last.id ... statement.
That's a good approach. A small point ... I think what you meant is:
else if DifSeq > 1 then ...
@Astounding wrote:
That's a good approach. A small point ... I think what you meant is:
else if DifSeq > 1 then ...
Good catch!.
Thanks for the help you two! I was thinking of the "By" function but wasn't sure how to handle the breaks in numerical order. Your code makes sense and best of all, it works! 🙂
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.