Solved: Re: How to fill data to a specific number of rows

Amelia6 · Posted 02-19-2025 06:27 PM

I have several groups of data with varying number of rows of observations. How do I fill the data so that each group has the same number of rows, say 13, of observations filled with the last row of observation available?

Ksharp · Posted 02-19-2025 09:04 PM

libname x v9 'C:\Users\xiakeshan\Downloads';
data want;
 set x.have;
 by id;
 if first.id then count=-1;
 count+1;
 output;
 if last.id then do;
   do count=count+1 to 12;
    date=intnx('month',date,1,'e');
	output;
   end;
 end;
run;

View solution in original post

quickbluefish · Posted 02-19-2025 08:52 PM

Would recommend you post screenshots of your two datasets instead of providing them as downloads, as most people aren't comfortable downloading unknown files.

Amelia6 · Posted 02-19-2025 09:15 PM

Thank you for the kind suggestion. Here's the data I have:

ID	Date	Value	Count
A	3/30/2001	26	1
A	4/30/2001	39	2
B	7/30/1999	28	1
B	8/31/1999	81	2
B	9/30/1999	22	3
B	10/29/1999	8	4
B	11/30/1999	79	5
B	12/31/1999	94	6
C	8/31/1994	79	1
C	9/30/1994	10	2
C	10/31/1994	61	3
C	11/30/1994	23	4
C	12/31/1994	49	5
C	1/31/1995	77	6
C	2/28/1995	23	7
C	3/31/1995	67	8
C	4/30/1995	86	9
C	5/31/1995	62	10
C	6/30/1995	47	11

Here's the date that I want:

ID	Date	Value	Count
A	3/30/2001	26	1
A	4/30/2001	39	2
A	5/31/2001	39	3
A	6/30/2001	39	4
A	7/30/2001	39	5
A	8/31/2001	39	6
A	9/30/2001	39	7
A	10/31/2001	39	8
A	11/30/2001	39	9
A	12/31/2001	39	10
A	1/31/2002	39	11
A	2/28/2002	39	12
A	3/30/2002	39	13
B	7/30/1999	28	1
B	8/31/1999	81	2
B	9/30/1999	22	3
B	10/29/1999	8	4
B	11/30/1999	79	5
B	12/31/1999	94	6
B	1/31/2000	94	7
B	2/28/2000	94	8
B	3/31/2000	94	9
B	4/30/2000	94	10
B	5/31/2000	94	11
B	6/30/2000	94	12
B	7/30/2000	94	13
C	8/31/1994	79	1
C	9/30/1994	10	2
C	10/31/1994	61	3
C	11/30/1994	23	4
C	12/31/1994	49	5
C	1/31/1995	77	6
C	2/28/1995	23	7
C	3/31/1995	67	8
C	4/30/1995	86	9
C	5/31/1995	62	10
C	6/30/1995	47	11
C	7/31/1995	47	12
C	8/31/1995	47	13

Thank you!

Tom · Posted 02-19-2025 09:32 PM

Looks like you want to make sure each ID has at least 13 observations. (Is there a REASON why that will help you??)

First let's convert your listing into a data step we have some actual data to code with. While we are at it let's use a display format for the dates that won't confuse half of the audience. Notice how not only will it make it so others can create a copy of your data it also displays the data in post using much less space without all of those lines on the screen.

data have;
  input id $ date :mmddyy. value count;
  format date yymmdd10.;
cards;
A  3/30/2001 26  1
A  4/30/2001 39  2
B  7/30/1999 28  1
B  8/31/1999 81  2
B  9/30/1999 22  3
B 10/29/1999  8  4
B 11/30/1999 79  5
B 12/31/1999 94  6
C  8/31/1994 79  1
C  9/30/1994 10  2
C 10/31/1994 61  3
C 11/30/1994 23  4
C 12/31/1994 49  5
C  1/31/1995 77  6
C  2/28/1995 23  7
C  3/31/1995 67  8
C  4/30/1995 86  9
C  5/31/1995 62 10
C  6/30/1995 47 11
;

So to make sure there are at least 13 observations you can use BY group processing. To increment the DATE value yoi can use the INTNX() function with the MONTH interval.

data want;
  set have;
  by id;
  output;
  if last.id then do count=count+1 to 13 ;
    date=intnx('month',date,1,'e');
    output;
  end;
run;

Amelia6 · Posted 02-19-2025 10:35 PM

Thank you for your kind admonition. Newbie here. I will heed your advice next time. I'm running an analysis where I'd like to have the same number of observations for each ID. But some IDs do not have enough observations, so I tried to fill up with the last observation available to meet this data requirement. Thank you!

Ksharp · Posted 02-19-2025 09:04 PM

libname x v9 'C:\Users\xiakeshan\Downloads';
data want;
 set x.have;
 by id;
 if first.id then count=-1;
 count+1;
 output;
 if last.id then do;
   do count=count+1 to 12;
    date=intnx('month',date,1,'e');
	output;
   end;
 end;
run;

Amelia6 · Posted 02-19-2025 10:24 PM

Thank you! They worked just as desired!

A_Kh · Posted 02-19-2025 09:13 PM

Hi,
Due to rising cybersecurity concerns, community members most likely don't download external files. And it's recommended to paste the code and dummy data in the body of your post using code inserting tools above (eg. symbol </>).
Are you trying to create the same categories for each group? Like using sparse option in proc freq?
eg.

proc freq data=sashelp.cars noprint;
tables origin*type/sparse out=want1;
run; 

proc freq data=sashelp.cars noprint;
tables origin*type/out=want2;
run;

In Want1 dataset SPARSE option adds missing TYPE "Hybrid" and "Truck" for Europe , and TYPE "Hybrid" for USA.

Amelia6 · Posted 02-19-2025 10:31 PM

Thank you for the kind reminder. I didn't want to take up too much space but overlooked the fact that users are wary of downloading data. I will take note in the future.

I'm not trying to create the same categories for each group. I'm running an analysis where I'd like to have the same number of observations for each group. But some groups do not have enough observations, so I tried to fill up with the last observation available to meet this data requirement.

mkeintz · Posted 02-19-2025 09:23 PM

If none of your ID's has more than 13 observations, then:

data want;
  set have;
  by id;
  if first.id then count=0;
  output;
  count+1;
  if last.id then do while (count<=12);
    date=intnx('month',date,1,'end');
    output;
    count+1;
  end;
run;

This produces values of COUNT from 0 through 12, per your sample WANT dataset.

But do you have any ID's with MORE than 13 obs? If so, what do you want for such ID's? The program above outputs all incoming observations, even if there are more than 13 of them.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Amelia6 · Posted 02-19-2025 10:26 PM

Thank you so much! Yours worked just as wonderfully. Unfortunately only one solution can be accepted. For this dataset, I don't have any ID with more than 13 observations.

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away