BookmarkSubscribeRSS Feed
attjooo
Calcite | Level 5

My data set MYTAB has variables ID, MONTH, QUANTITY.

If an ID has no data for a MONTH in the sequence 1 - 12, I want to add a record with ID, MONTH and QUANTITY = . (dot for missing numeric value).

So all ID:s should have 12 records each.

Have you any idea how to solve this?

12 REPLIES 12
data_null__
Jade | Level 19

For this example where all ID's have the same month values you can do that with PROC SUMMARY CLASSDATA=

data have;
   input id $ month quantitiy;
   cards;
a 2 11
a 7 12
b 10 11
b 4 9
;;;;
   run;
data classdata;
   if 0 then set have(keep=month);
   do month=1 to 12;
     
output;
     
end;
  
stop;
  
run;
proc summary nway data=have classdata=classdata;
   by id;
   class month;
   output out=filled(drop=_type_ _freq_) idgroup(out(q:)=);
   run;
proc print;
  
run;
attjooo
Calcite | Level 5

Thanks for your answer. I am sure that it works, but for a person with little experience it's hard to understand the code.

Wouldn't it be possible to use CASE WHEN in PROC SQL?

Some sort of pseudo code: When month is missing in [1,12} add record with ID, MONTH, QUANTITY (= . ).

RW9
Diamond | Level 26 RW9
Diamond | Level 26

I would suggest the easiest thing is to create a template dataset with the sequence you require and then merge that on.  For example (and I haven't tested this):

data template;

     do I=1 to 12;

          month=I;

          output;

     end;

run;

proc sql;

     create table WANT as

     select     SOME_VARIABLES,     /* note update per your table */

                    COALESCE(A.MONTH,B.MONTH) as MONTH

     from         WORK.HAVE A

     full join      WORK.TEMPLATE B

     on           A.MONTH=B.MONTH;

quit;

This will add in any missing months and the other variables will be missing.  If you need other variables populated then create your template with all necessary variables.

attjooo
Calcite | Level 5

The COALESCE function returns the first non-missing value from a list of numeric arguments.

I don't understand how that function could be of any help.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Its a trick with merging and coalescing.  Say you have these two datasets and full join them on month:

Data A                                             Data B

Month     Result                                   Month

Jan          12                                         Jan

Feb          10                                        Feb

                                                            Mar

...                                                            ...

The full join expands the first table to include values from the right, however the month would be missing if we did not do the coalesce for Mar. Eg.

select a.month...   = Month=missing, Result=missing as A.MONTH does not contain anything.

select coalesce(a.month,b.month) = Month=Mar, Result = missing, as the data from b is used as a is missing.

So, the full join creates all the necessary extra rows, the coalesce ensures that the variables are populated where necessary.

attjooo
Calcite | Level 5

Your code doesn't work.

I get the same number of records, but in ascending order with regard to MONTH. So first all MONTH =1, then all MONTH =2 , etc.

When I started each ID:s all records was displayed together( in ascending order with regard to MONTH ).

RW9
Diamond | Level 26 RW9
Diamond | Level 26

So here is the full code, tested this time.  The point is to create a dataset which will be the template for the other one.

data have;
  attrib id format=$1. month quantity format=best.;
  infile datalines delimiter=",";
  input id $ month quantity;
datalines;
a,1,100
a,2,200
a,4,150
b,1,60
;
run;

/* Create template, need to base it on your have data */
proc sql;
  create table IDS as
  select  distinct ID
  from    WORK.HAVE;
quit;
data template (drop=i);
  set ids;
  do I=1 to 12;
    month=I;
    output;
  end;
run;

proc sql;
  create table WANT as
  select  COALESCE(A.ID,B.ID) as ID,   
          COALESCE(A.MONTH,B.MONTH) as MONTH,
          A.QUANTITY
  from    WORK.HAVE A
  full join WORK.TEMPLATE B
  on      A.ID=B.ID
  and     A.MONTH=B.MONTH;
quit;

attjooo
Calcite | Level 5

Thankyou RW9 for your solution. Now it worked.

I am forced to use SAS.

This would have been an easy problem to solve in an ordinary programming language, working with arrays, and that code would have been easy to understand for anyone with basic knowledge in programming.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

The suggestion posted was merely for one method of doing such a task.  SAS also has arrays syntax:

     array months {12} 3. (1,2,3,4,5,6,7,8,9,10,11,12);

You could also look at this snippet:

retain lstmonth;

if month ne lstmonth then do;

     do I=lstmonth to month-1;

          month=I;

          output;

     end;

end;

output;

So keep the last value of month, and if the current rows month is not the previous month +1 then output rows for the difference.  Note you need to check for month=12, and set other values to missing for these additional rows.

As with most problems there is no one singular method to get what you want Smiley Happy

Kurt_Bremser
Super User

try this:

*get all ID's;

proc sort

  data=mytab (keep=ID)

  out=control

  nodupkey

;

by ID;

run;

*to be on the safe side;

proc sort data=mytab;

by ID MONTH;

run;

*make 12 months for every ID;

data control2;

set control;

do MONTH = 1 to 12;

output;

end;

run;

data result;

merge

  mytab (in=a)

  control2 (in=b)

;

if b;

by ID MONTH;

*everytime there is no record from mytab, quantity is implicitely set to .;

run;

Edit: Ups, forgot "by ID MONTH;" in the last data step.

attjooo
Calcite | Level 5

Thankyou KurtBremser.

Your code was more easy to understand.

Ksharp
Super User

EASY FOR SQL.

data have;
   input id $ month quantitiy;
   cards; 
a 2 11
a 7 12
b 10 11
b 4 9
;;;;
   run; 
data m;
 do month=1 to 12;
  output;
 end;
run;
proc sql;
create table temp as
 select * from      (select distinct id from have) as a,m;
create table want as
 select temp.*,have.quantitiy
  from have right join  temp on temp.id=have.id and temp.month=have.month;

quit;

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 5819 views
  • 2 likes
  • 5 in conversation