- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have several groups of data with varying number of rows of observations. How do I fill the data so that each group has the same number of rows, say 13, of observations filled with the last row of observation available?
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
libname x v9 'C:\Users\xiakeshan\Downloads';
data want;
set x.have;
by id;
if first.id then count=-1;
count+1;
output;
if last.id then do;
do count=count+1 to 12;
date=intnx('month',date,1,'e');
output;
end;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for the kind suggestion. Here's the data I have:
ID | Date | Value | Count |
A | 3/30/2001 | 26 | 1 |
A | 4/30/2001 | 39 | 2 |
B | 7/30/1999 | 28 | 1 |
B | 8/31/1999 | 81 | 2 |
B | 9/30/1999 | 22 | 3 |
B | 10/29/1999 | 8 | 4 |
B | 11/30/1999 | 79 | 5 |
B | 12/31/1999 | 94 | 6 |
C | 8/31/1994 | 79 | 1 |
C | 9/30/1994 | 10 | 2 |
C | 10/31/1994 | 61 | 3 |
C | 11/30/1994 | 23 | 4 |
C | 12/31/1994 | 49 | 5 |
C | 1/31/1995 | 77 | 6 |
C | 2/28/1995 | 23 | 7 |
C | 3/31/1995 | 67 | 8 |
C | 4/30/1995 | 86 | 9 |
C | 5/31/1995 | 62 | 10 |
C | 6/30/1995 | 47 | 11 |
Here's the date that I want:
ID | Date | Value | Count |
A | 3/30/2001 | 26 | 1 |
A | 4/30/2001 | 39 | 2 |
A | 5/31/2001 | 39 | 3 |
A | 6/30/2001 | 39 | 4 |
A | 7/30/2001 | 39 | 5 |
A | 8/31/2001 | 39 | 6 |
A | 9/30/2001 | 39 | 7 |
A | 10/31/2001 | 39 | 8 |
A | 11/30/2001 | 39 | 9 |
A | 12/31/2001 | 39 | 10 |
A | 1/31/2002 | 39 | 11 |
A | 2/28/2002 | 39 | 12 |
A | 3/30/2002 | 39 | 13 |
B | 7/30/1999 | 28 | 1 |
B | 8/31/1999 | 81 | 2 |
B | 9/30/1999 | 22 | 3 |
B | 10/29/1999 | 8 | 4 |
B | 11/30/1999 | 79 | 5 |
B | 12/31/1999 | 94 | 6 |
B | 1/31/2000 | 94 | 7 |
B | 2/28/2000 | 94 | 8 |
B | 3/31/2000 | 94 | 9 |
B | 4/30/2000 | 94 | 10 |
B | 5/31/2000 | 94 | 11 |
B | 6/30/2000 | 94 | 12 |
B | 7/30/2000 | 94 | 13 |
C | 8/31/1994 | 79 | 1 |
C | 9/30/1994 | 10 | 2 |
C | 10/31/1994 | 61 | 3 |
C | 11/30/1994 | 23 | 4 |
C | 12/31/1994 | 49 | 5 |
C | 1/31/1995 | 77 | 6 |
C | 2/28/1995 | 23 | 7 |
C | 3/31/1995 | 67 | 8 |
C | 4/30/1995 | 86 | 9 |
C | 5/31/1995 | 62 | 10 |
C | 6/30/1995 | 47 | 11 |
C | 7/31/1995 | 47 | 12 |
C | 8/31/1995 | 47 | 13 |
Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Looks like you want to make sure each ID has at least 13 observations. (Is there a REASON why that will help you??)
First let's convert your listing into a data step we have some actual data to code with. While we are at it let's use a display format for the dates that won't confuse half of the audience. Notice how not only will it make it so others can create a copy of your data it also displays the data in post using much less space without all of those lines on the screen.
data have;
input id $ date :mmddyy. value count;
format date yymmdd10.;
cards;
A 3/30/2001 26 1
A 4/30/2001 39 2
B 7/30/1999 28 1
B 8/31/1999 81 2
B 9/30/1999 22 3
B 10/29/1999 8 4
B 11/30/1999 79 5
B 12/31/1999 94 6
C 8/31/1994 79 1
C 9/30/1994 10 2
C 10/31/1994 61 3
C 11/30/1994 23 4
C 12/31/1994 49 5
C 1/31/1995 77 6
C 2/28/1995 23 7
C 3/31/1995 67 8
C 4/30/1995 86 9
C 5/31/1995 62 10
C 6/30/1995 47 11
;
So to make sure there are at least 13 observations you can use BY group processing. To increment the DATE value yoi can use the INTNX() function with the MONTH interval.
data want;
set have;
by id;
output;
if last.id then do count=count+1 to 13 ;
date=intnx('month',date,1,'e');
output;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for your kind admonition. Newbie here. I will heed your advice next time. I'm running an analysis where I'd like to have the same number of observations for each ID. But some IDs do not have enough observations, so I tried to fill up with the last observation available to meet this data requirement. Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
libname x v9 'C:\Users\xiakeshan\Downloads';
data want;
set x.have;
by id;
if first.id then count=-1;
count+1;
output;
if last.id then do;
do count=count+1 to 12;
date=intnx('month',date,1,'e');
output;
end;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you! They worked just as desired!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
Due to rising cybersecurity concerns, community members most likely don't download external files. And it's recommended to paste the code and dummy data in the body of your post using code inserting tools above (eg. symbol </>).
Are you trying to create the same categories for each group? Like using sparse option in proc freq?
eg.
proc freq data=sashelp.cars noprint;
tables origin*type/sparse out=want1;
run;
proc freq data=sashelp.cars noprint;
tables origin*type/out=want2;
run;
In Want1 dataset SPARSE option adds missing TYPE "Hybrid" and "Truck" for Europe , and TYPE "Hybrid" for USA.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for the kind reminder. I didn't want to take up too much space but overlooked the fact that users are wary of downloading data. I will take note in the future.
I'm not trying to create the same categories for each group. I'm running an analysis where I'd like to have the same number of observations for each group. But some groups do not have enough observations, so I tried to fill up with the last observation available to meet this data requirement.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If none of your ID's has more than 13 observations, then:
data want;
set have;
by id;
if first.id then count=0;
output;
count+1;
if last.id then do while (count<=12);
date=intnx('month',date,1,'end');
output;
count+1;
end;
run;
This produces values of COUNT from 0 through 12, per your sample WANT dataset.
But do you have any ID's with MORE than 13 obs? If so, what do you want for such ID's? The program above outputs all incoming observations, even if there are more than 13 of them.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you so much! Yours worked just as wonderfully. Unfortunately only one solution can be accepted. For this dataset, I don't have any ID with more than 13 observations.