Hi everybody,
i need to enumerate the records based on three variables. The dataset must be orderd in ascending order by date.
I tried the following,
data want;
set have;
ord + 1;
by ID CREATED PROD SUBPROD ;
if first.ID or first.PROD or first.SUBPROD
then ord = 1;
run;
but it doesn't work as you can se in the table below:
ID | CREATED | PROD | SUBPROD | ORD |
1 | 01-ott-18 | A | A | 1 |
1 | 02-ott-18 | A | A | 1 |
1 | 02-ott-18 | A | A | 2 |
1 | 02-ott-18 | A | A | 3 |
1 | 05-ott-18 | B | B | 1 |
2 | 02-ott-18 | A | A | 1 |
3 | 03-ott-18 | B | B | 1 |
3 | 04-ott-18 | B | B | 1 |
3 | 05-ott-18 | B | B | 1 |
4 | 01-ott-18 | C | C | 1 |
4 | 02-ott-18 | C | C | 1 |
4 | 02-ott-18 | C | C | 2 |
4 | 02-ott-18 | D | D | 1 |
4 | 05-ott-18 | D | D | 1 |
4 | 06-ott-18 | D | D | 1 |
The data set i want has to be like this:
ID | CREATED | PROD | SUBPROD | ORD |
1 | 01-ott-18 | A | A | 1 |
1 | 02-ott-18 | A | A | 2 |
1 | 02-ott-18 | A | A | 3 |
1 | 02-ott-18 | A | A | 4 |
1 | 05-ott-18 | B | B | 1 |
2 | 02-ott-18 | A | A | 1 |
3 | 03-ott-18 | B | B | 1 |
3 | 04-ott-18 | B | B | 2 |
3 | 05-ott-18 | B | B | 3 |
4 | 01-ott-18 | C | C | 1 |
4 | 02-ott-18 | C | C | 2 |
4 | 02-ott-18 | C | C | 3 |
4 | 02-ott-18 | D | D | 1 |
4 | 05-ott-18 | D | D | 2 |
4 | 06-ott-18 | D | D | 3 |
Thank you
Without considering date:
proc sort data=have;
by id prod subprod;
run;
data want;
set have;
by id prod subprod;
if first.subprod then ord=0;
ord+1;
run;
So you can add date into the PROC SORT at the end of the BY statement.
Ok, i'll try .
I have to ask yo one question.
Why you wrote just first.subprod in the if statement without writing the three variables?
Thank you
If your data is already sorted by date,
data have;
input (ID CREATED PROD SUBPROD) ($);
cards;
1 01-ott-18 A A 1
1 02-ott-18 A A 1
1 02-ott-18 A A 2
1 02-ott-18 A A 3
1 05-ott-18 B B 1
2 02-ott-18 A A 1
3 03-ott-18 B B 1
3 04-ott-18 B B 1
3 05-ott-18 B B 1
4 01-ott-18 C C 1
4 02-ott-18 C C 1
4 02-ott-18 C C 2
4 02-ott-18 D D 1
4 05-ott-18 D D 1
4 06-ott-18 D D 1
;
data want;
set have;
by ID PROD SUBPROD ;
if first.subprod then ord=1;
else ord+1;
run;
Just to be cute, here is a one-step solution, possibly handy with a large (but sorted) raw data file:
data have;
input (ID CREATED PROD SUBPROD) ($);
ord+1;
if catx(' ',id,prod,subprod)
^=lag(catx(' ',id,prod,subprod)) then ord=1;
cards;
1 01-ott-18 A A 1
1 02-ott-18 A A 1
1 02-ott-18 A A 2
1 02-ott-18 A A 3
1 05-ott-18 B B 1
2 02-ott-18 A A 1
3 03-ott-18 B B 1
3 04-ott-18 B B 1
3 05-ott-18 B B 1
4 01-ott-18 C C 1
4 02-ott-18 C C 1
4 02-ott-18 C C 2
4 02-ott-18 D D 1
4 05-ott-18 D D 1
4 06-ott-18 D D 1
;
@gabras wrote:
Ok, i'll try .
I have to ask yo one question.
Why you wrote just first.subprod in the if statement without writing the three variables?
Thank you
It is not necessary (and wrong) to include all three variables. You want the ORD variable to reset when SUBPROD changed, not when all three change.
I was too superficial.
I can also have the following situation:
ID | CREATED | PROD | SUBPROD | ITEM |
1 | 01-ott-18 | A | B | D |
1 | 02-ott-18 | A | B | D |
1 | 02-ott-18 | A | B | D |
1 | 02-ott-18 | A | B | D |
1 | 05-ott-18 | A | A | E |
and i want this
ID | CREATED | PROD | SUBPROD | ITEM | ORD |
1 | 01-ott-18 | A | B | D | 1 |
1 | 02-ott-18 | A | B | D | 2 |
1 | 02-ott-18 | A | B | D | 3 |
1 | 02-ott-18 | A | B | D | 4 |
1 | 05-ott-18 | A | A | E | 1 |
did you try @mkeintz code. That logic should work in your case.
data have;
input (ID CREATED PROD SUBPROD) ($);
ord+1;
if catx(' ',id,prod,subprod)
^=lag(catx(' ',id,prod,subprod)) then ord=1;
cards;
1 01-ott-18 A A 1
1 02-ott-18 A A 1
1 02-ott-18 A A 2
1 02-ott-18 A A 3
1 05-ott-18 B B 1
2 02-ott-18 A A 1
3 03-ott-18 B B 1
3 04-ott-18 B B 1
3 05-ott-18 B B 1
4 01-ott-18 C C 1
4 02-ott-18 C C 1
4 02-ott-18 C C 2
4 02-ott-18 D D 1
4 05-ott-18 D D 1
4 06-ott-18 D D 1
;
@gabras wrote:
I was too superficial.
I can also have the following situation:
ID CREATED PROD SUBPROD ITEM 1 01-ott-18 A B D 1 02-ott-18 A B D 1 02-ott-18 A B D 1 02-ott-18 A B D 1 05-ott-18 A A E
and i want this
ID CREATED PROD SUBPROD ITEM ORD 1 01-ott-18 A B D 1 1 02-ott-18 A B D 2 1 02-ott-18 A B D 3 1 02-ott-18 A B D 4 1 05-ott-18 A A E 1
Surely you can modify the code I provided to handle this situation.
You have asked for help.
Help was provided.
If you need more help, please show the code you tried, and show the result of that code, and describe what part of the result is in error.
Help us help you.
@gabras wrote:
Could you help me please
Earlier, I stated the explanation
It is not necessary (and wrong) to include all three variables. You want the ORD variable to reset when SUBPROD changed, not when all three change.
so apply this explanation to your new situation involving variable ITEM.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.