BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
hein68
Quartz | Level 8

I have a long file with data like this:

 

hein68_0-1730908528973.png

 

Some projids have one or more of the 13 rows missing, but it's random as to which row(s) are missing.  What is the easiest way to write code to add the rows that are missing for each projid?

 

Thanks!!

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

This program has two data steps, but only demands the original data from storage  be retrieved once, because it uses a data set view.

 

 

proc format;
  value timept 0="Baseline" 1="1 year" 2="2 year" 3="3 year"
		4="4 year" 5="5 year" 6="6 year" 7="7 year" 8="8 year"
		9="9 year" 10="10 year" 11="11 year" 12="12 year" 13="13 year"
		14="14 year" 15="15 year" 16="16 year" 17="17 year"
		18="18 year" 19="19 year";

data years;
input id time var1;
format time timept. ;
cards;
12345  0   0.089
12345  1   0.099
12345  3   0.0762
12345  5   0.103
12345  6   0.089
12345  7   0.089
12345  8   0.089
12345  9   0.089
12345  10   0.089
12345  11   0.089
run;

data all_time_values/view=all_time_values ;
  set years (keep=id);
  by id;
  if first.id;
  do time=0 to 13;  /* Baseline to Year 13 */
    output;
  end;
run;


data want ;
  merge all_time_values years;
  by id time;
run;

 

 

This assumes the data are sorted by ID and time.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

6 REPLIES 6
andreas_lds
Jade | Level 19

How many obs should each projid have? Is the number constant?

What is the real contents of "time"? Can you show the unformatted values?

Can you please post data in usable form, so that we have something to work with?

Usable form = data step with datalines.

PaigeMiller
Diamond | Level 26

Here is my code that works on fake data. This could be extremely slow on large data sets. But really, we need answers to all the questions from @andreas_lds 

 

data years;
input id time var1;
cards;
12345  1   0.089
12345  2   0.099
12345  3   0.0762
12345  5   0.103
12345  6   0.089
12345  7   0.089
12345  8   0.089
12345  9   0.089
12345 10   0.089
12345 11   0.089
;
data thirteen;
    do time=1 to 13;
        output;
    end;
run;

proc sql;
    create table ids as select distinct id from years;
    create table all_years as select distinct a.id,b.time from years as a,thirteen as b; /* Cartesian join */
    create table want as select a.id,a.time,b.var1 from all_years as a left join years as b
        on a.id=b.id and a.time=b.time;
quit;
--
Paige Miller
hein68
Quartz | Level 8

Each projid should have 13 observations/rows.  In the example I show in the screenshot, the original data source does not include a row for observation 10 (for 10 years).  I need to get a blank row added that would be for observation 10 (10 years).

 

In another example, the projid had 2 observations (baseline and 1 year), so in this case I need to add the blank rows for years 3 through 13.

 

This code shows how the time variable is formatted.

 

value timept 0="Baseline" 1="1 year" 2="2 year" 3="3 year"
		4="4 year" 5="5 year" 6="6 year" 7="7 year" 8="8 year"
		9="9 year" 10="10 year" 11="11 year" 12="12 year" 13="13 year"
		14="14 year" 15="15 year" 16="16 year" 17="17 year"
		18="18 year" 19="19 year";

Thanks!

 

data_null__
Jade | Level 19

This will fill missing rows at the end and in the middle.

 

data have;
   do projid='28533', '29344';
      do year = 0 to 3, 5;
         y = ranuni(1234);
         output;
         end;
      end;
   run;

proc format;
   value xyear(notsorted)
      0 = 'Baseline'
      1 = '1 Year'
      2 = '2 Year'
      3 = '3 Year'
      4 = '4 Year'
      5 = '5 Year'
      6 = '6 Year'
      7 = '7 Year'
      ;
   quit;
proc summary data=have nway missing completetypes;
   by projid;
   class year / preloadfmt;
   format year xyear.;
   output out=expanded(drop=_type_);
   run;
data want;
   merge expanded have;
   by projid year;
   run;
proc print;
   run;

Capture.PNG

 

 

mkeintz
PROC Star

This program has two data steps, but only demands the original data from storage  be retrieved once, because it uses a data set view.

 

 

proc format;
  value timept 0="Baseline" 1="1 year" 2="2 year" 3="3 year"
		4="4 year" 5="5 year" 6="6 year" 7="7 year" 8="8 year"
		9="9 year" 10="10 year" 11="11 year" 12="12 year" 13="13 year"
		14="14 year" 15="15 year" 16="16 year" 17="17 year"
		18="18 year" 19="19 year";

data years;
input id time var1;
format time timept. ;
cards;
12345  0   0.089
12345  1   0.099
12345  3   0.0762
12345  5   0.103
12345  6   0.089
12345  7   0.089
12345  8   0.089
12345  9   0.089
12345  10   0.089
12345  11   0.089
run;

data all_time_values/view=all_time_values ;
  set years (keep=id);
  by id;
  if first.id;
  do time=0 to 13;  /* Baseline to Year 13 */
    output;
  end;
run;


data want ;
  merge all_time_values years;
  by id time;
run;

 

 

This assumes the data are sorted by ID and time.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
hein68
Quartz | Level 8

This worked beautifully!  Thank you!

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1014 views
  • 1 like
  • 5 in conversation