- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have a data set that I'm trying to add a count variable for. Here is s my dataset:
key | date | type |
1234 | 1/28/2022 | A |
1234 | 2/15/2022 | A |
1234 | 2/28/2022 | A |
1234 | 3/6/2022 | B |
1234 | 4/1/2022 | B |
1234 | 4/15/2022 | C |
1234 | 5/1/2022 | C |
1133 | 12/21/2021 | A |
1133 | 1/20/2022 | A |
1133 | 3/18/2022 | B |
1133 | 4/4/2022 | C |
1133 | 5/3/2022 | C |
3434 | 2/16/2022 | A |
3434 | 3/10/2022 | A |
3434 | 4/3/2022 |
C |
I've tried to use the following code:
data add_count;
set test_data;
count +1;
by type;
if first.type then count=1;
run;
However, this does not work. I get the following error:
ERROR: BY variables are not properly sorted on data set WORK.test_data.
I need to count the type, for each key in the order they are sorted by date. If I add in the "date" to the by statement in the code above, then my count is just 1 for every observation, since the dates are different.
How can I get this output:
key | date | type | count |
1234 | 1/28/2022 | A | 1 |
1234 | 2/15/2022 | A | 2 |
1234 | 2/28/2022 | A | 3 |
1234 | 3/6/2022 | B | 1 |
1234 | 4/1/2022 | B | 2 |
1234 | 4/15/2022 | C | 1 |
1234 | 5/1/2022 | C | 2 |
1133 | 12/21/2021 | A | 1 |
1133 | 1/20/2022 | A | 2 |
1133 | 3/18/2022 | B | 1 |
1133 | 4/4/2022 | C | 1 |
1133 | 5/3/2022 | C | 2 |
3434 | 2/16/2022 | A | 1 |
3434 | 3/10/2022 | A | 2 |
3434 | 4/3/2022 | C | 1 |
Thank you!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You need to tell the data step that the you want the BY statement to work even though the data is NOT sorted by those variables.
Use the NOTSORTED keyword.
data add_count;
set test_data;
by type notsorted;
count +1;
if first.type then count=1;
run;
If you want the numbering to reset also when KEY changes that add that variable to the BY statement also.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try this:
by key notsorted type;
This assumes that all values of KEY are consecutive, and that all values of TYPE are consecutive.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You need to tell the data step that the you want the BY statement to work even though the data is NOT sorted by those variables.
Use the NOTSORTED keyword.
data add_count;
set test_data;
by type notsorted;
count +1;
if first.type then count=1;
run;
If you want the numbering to reset also when KEY changes that add that variable to the BY statement also.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Wow, that was so easy.. Thank you, that worked perfectly!