I have several groups of data with varying number of rows of observations. How do I fill the data so that each group has the same number of rows, say 13, of observations filled with the last row of observation available?
libname x v9 'C:\Users\xiakeshan\Downloads';
data want;
set x.have;
by id;
if first.id then count=-1;
count+1;
output;
if last.id then do;
do count=count+1 to 12;
date=intnx('month',date,1,'e');
output;
end;
end;
run;
Thank you for the kind suggestion. Here's the data I have:
ID | Date | Value | Count |
A | 3/30/2001 | 26 | 1 |
A | 4/30/2001 | 39 | 2 |
B | 7/30/1999 | 28 | 1 |
B | 8/31/1999 | 81 | 2 |
B | 9/30/1999 | 22 | 3 |
B | 10/29/1999 | 8 | 4 |
B | 11/30/1999 | 79 | 5 |
B | 12/31/1999 | 94 | 6 |
C | 8/31/1994 | 79 | 1 |
C | 9/30/1994 | 10 | 2 |
C | 10/31/1994 | 61 | 3 |
C | 11/30/1994 | 23 | 4 |
C | 12/31/1994 | 49 | 5 |
C | 1/31/1995 | 77 | 6 |
C | 2/28/1995 | 23 | 7 |
C | 3/31/1995 | 67 | 8 |
C | 4/30/1995 | 86 | 9 |
C | 5/31/1995 | 62 | 10 |
C | 6/30/1995 | 47 | 11 |
Here's the date that I want:
ID | Date | Value | Count |
A | 3/30/2001 | 26 | 1 |
A | 4/30/2001 | 39 | 2 |
A | 5/31/2001 | 39 | 3 |
A | 6/30/2001 | 39 | 4 |
A | 7/30/2001 | 39 | 5 |
A | 8/31/2001 | 39 | 6 |
A | 9/30/2001 | 39 | 7 |
A | 10/31/2001 | 39 | 8 |
A | 11/30/2001 | 39 | 9 |
A | 12/31/2001 | 39 | 10 |
A | 1/31/2002 | 39 | 11 |
A | 2/28/2002 | 39 | 12 |
A | 3/30/2002 | 39 | 13 |
B | 7/30/1999 | 28 | 1 |
B | 8/31/1999 | 81 | 2 |
B | 9/30/1999 | 22 | 3 |
B | 10/29/1999 | 8 | 4 |
B | 11/30/1999 | 79 | 5 |
B | 12/31/1999 | 94 | 6 |
B | 1/31/2000 | 94 | 7 |
B | 2/28/2000 | 94 | 8 |
B | 3/31/2000 | 94 | 9 |
B | 4/30/2000 | 94 | 10 |
B | 5/31/2000 | 94 | 11 |
B | 6/30/2000 | 94 | 12 |
B | 7/30/2000 | 94 | 13 |
C | 8/31/1994 | 79 | 1 |
C | 9/30/1994 | 10 | 2 |
C | 10/31/1994 | 61 | 3 |
C | 11/30/1994 | 23 | 4 |
C | 12/31/1994 | 49 | 5 |
C | 1/31/1995 | 77 | 6 |
C | 2/28/1995 | 23 | 7 |
C | 3/31/1995 | 67 | 8 |
C | 4/30/1995 | 86 | 9 |
C | 5/31/1995 | 62 | 10 |
C | 6/30/1995 | 47 | 11 |
C | 7/31/1995 | 47 | 12 |
C | 8/31/1995 | 47 | 13 |
Thank you!
Looks like you want to make sure each ID has at least 13 observations. (Is there a REASON why that will help you??)
First let's convert your listing into a data step we have some actual data to code with. While we are at it let's use a display format for the dates that won't confuse half of the audience. Notice how not only will it make it so others can create a copy of your data it also displays the data in post using much less space without all of those lines on the screen.
data have;
input id $ date :mmddyy. value count;
format date yymmdd10.;
cards;
A 3/30/2001 26 1
A 4/30/2001 39 2
B 7/30/1999 28 1
B 8/31/1999 81 2
B 9/30/1999 22 3
B 10/29/1999 8 4
B 11/30/1999 79 5
B 12/31/1999 94 6
C 8/31/1994 79 1
C 9/30/1994 10 2
C 10/31/1994 61 3
C 11/30/1994 23 4
C 12/31/1994 49 5
C 1/31/1995 77 6
C 2/28/1995 23 7
C 3/31/1995 67 8
C 4/30/1995 86 9
C 5/31/1995 62 10
C 6/30/1995 47 11
;
So to make sure there are at least 13 observations you can use BY group processing. To increment the DATE value yoi can use the INTNX() function with the MONTH interval.
data want;
set have;
by id;
output;
if last.id then do count=count+1 to 13 ;
date=intnx('month',date,1,'e');
output;
end;
run;
Thank you for your kind admonition. Newbie here. I will heed your advice next time. I'm running an analysis where I'd like to have the same number of observations for each ID. But some IDs do not have enough observations, so I tried to fill up with the last observation available to meet this data requirement. Thank you!
libname x v9 'C:\Users\xiakeshan\Downloads';
data want;
set x.have;
by id;
if first.id then count=-1;
count+1;
output;
if last.id then do;
do count=count+1 to 12;
date=intnx('month',date,1,'e');
output;
end;
end;
run;
Thank you! They worked just as desired!
Hi,
Due to rising cybersecurity concerns, community members most likely don't download external files. And it's recommended to paste the code and dummy data in the body of your post using code inserting tools above (eg. symbol </>).
Are you trying to create the same categories for each group? Like using sparse option in proc freq?
eg.
proc freq data=sashelp.cars noprint;
tables origin*type/sparse out=want1;
run;
proc freq data=sashelp.cars noprint;
tables origin*type/out=want2;
run;
In Want1 dataset SPARSE option adds missing TYPE "Hybrid" and "Truck" for Europe , and TYPE "Hybrid" for USA.
Thank you for the kind reminder. I didn't want to take up too much space but overlooked the fact that users are wary of downloading data. I will take note in the future.
I'm not trying to create the same categories for each group. I'm running an analysis where I'd like to have the same number of observations for each group. But some groups do not have enough observations, so I tried to fill up with the last observation available to meet this data requirement.
If none of your ID's has more than 13 observations, then:
data want;
set have;
by id;
if first.id then count=0;
output;
count+1;
if last.id then do while (count<=12);
date=intnx('month',date,1,'end');
output;
count+1;
end;
run;
This produces values of COUNT from 0 through 12, per your sample WANT dataset.
But do you have any ID's with MORE than 13 obs? If so, what do you want for such ID's? The program above outputs all incoming observations, even if there are more than 13 of them.
Thank you so much! Yours worked just as wonderfully. Unfortunately only one solution can be accepted. For this dataset, I don't have any ID with more than 13 observations.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.