Hello
I want to add a new field (column) of sequential number by multiple groups.
The code that I wrote is not working well because the sequential numbers should be related to ID+date and not only to ID.
Data aaa;
attrib
ID length = 8 format=best8. informat = best8.
date length = 8 format =DDMMYY10. informat = DDMMYY10.
SaleAmount length = 8 format=best8. informat = best8.;
INPUT ID date SaleAmount;
cards;
1 1/1/2011 100
1 1/1/2011 50
1 5/1/2011 150
1 5/1/2011 51300
1 3/4/2011 200
2 5/1/2010 300
2 5/1/2010 290
2 5/1/2010 310
2 6/3/2010 40
2 6/3/2010 45
;
run;
proc sort data=aaa;by ID date;Run;
data bbb;
set aaa;
by ID date;
if first.ID and first.date then seq_id=0;
seq_id+1;
run;
Data aaa;
attrib
ID length = 8 format=best8. informat = best8.
date length = 8 format =DDMMYY10. informat = DDMMYY10.
SaleAmount length = 8 format=best8. informat = best8.;
INPUT ID date SaleAmount;
cards;
1 1/1/2011 100
1 1/1/2011 50
1 5/1/2011 150
1 5/1/2011 51300
1 3/4/2011 200
2 5/1/2010 300
2 5/1/2010 290
2 5/1/2010 310
2 6/3/2010 40
2 6/3/2010 45
;
run;
proc sort data=aaa;by ID date;Run;
data bbb;
set aaa;
by ID date;
if first.date then seq_id=0;
seq_id+1;
run;
Data aaa;
attrib
ID length = 8 format=best8. informat = best8.
date length = 8 format =DDMMYY10. informat = DDMMYY10.
SaleAmount length = 8 format=best8. informat = best8.;
INPUT ID date SaleAmount;
cards;
1 1/1/2011 100
1 1/1/2011 50
1 5/1/2011 150
1 5/1/2011 51300
1 3/4/2011 200
2 5/1/2010 300
2 5/1/2010 290
2 5/1/2010 310
2 6/3/2010 40
2 6/3/2010 45
;
run;
proc sort data=aaa;by ID date;Run;
data how_does_first_dot_work;
format first_id first_date;
set aaa;
by id date;
first_id=first.id;
first_date=first.date;
run;
data want;
format seq_id seq_date;
set aaa;
by id date;
if first.id then seq_id=0;
if first.date then seq_date=0;
seq_id+1;
seq_date+1;
run;
Is one of the seq* variables in data want what you're looking for?
Thank you for your reply.
You created to series of sequential number.
However, I want to create only one series that will get following numbers(related to ID+date):
1
2
1
2
1
1
2
3
1
2
Did you even look at the results of what I posted?
Data aaa;
attrib
ID length = 8 format=best8. informat = best8.
date length = 8 format =DDMMYY10. informat = DDMMYY10.
SaleAmount length = 8 format=best8. informat = best8.;
INPUT ID date SaleAmount;
cards;
1 1/1/2011 100
1 1/1/2011 50
1 5/1/2011 150
1 5/1/2011 51300
1 3/4/2011 200
2 5/1/2010 300
2 5/1/2010 290
2 5/1/2010 310
2 6/3/2010 40
2 6/3/2010 45
;
run;
proc sort data=aaa;by ID date;Run;
data bbb;
set aaa;
by ID date;
if first.date then seq_id=0;
seq_id+1;
run;
It is perfect solution.
I just dont understand.
Both ID and date define the "group".
Why do you write "if first.date" and not "if first.date and first.ID?
Why do you write "if first.date" and not "if first.ID?
May you please explain?
You have grouped the dataset by id and then all of the rows with the same id are further broken down by date. So first.id is always the start of a new id group and first.date is always the start of a new date within that idgroup.
It might be worth assigning variables to equal the first.id and first date to see how it is working exactly. Hope this makes sense. If it doesn't I'll try and create a suitable example when I'm at a computer.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.