BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KermitTheFrog
Fluorite | Level 6

I am running a program that creates a table new_table everyday. The very next day I want to create the deltas, meaning I want to create the insert, delete and update data sets between the old and new table.

 

Here is a minimal reproducible example:

 

data old_table;
infile datalines4 delimiter=",";
input ID $ START END;
format START END yymmddd10.-l;
datalines4;
ID1,20332,2936547
ID2,20361,22249
ID3,20240,22249
ID4,20240,22249
;;;;

data new_table;
infile datalines4 delimiter=",";
input ID $ START END;
format START END yymmddd10.-l;
datalines4;
ID1,20332,22249
ID2,20361,22249
ID4,20240,22249
ID5,20240,20820
;;;;

 

Basically I want to end up with the following as an output:

 

Delete table:

ID3,20240,22249

 

Update table:

ID5,20240,20820

 

Insert table:

ID1,20332,22249

 

In summary:

  • Deletion of ID3 (because not in the new table).
  • Update of ID1 (because END date is different in the new table).
  • Insert of ID5 (because not in old table).
  • Nothing change for ID2 and ID4 (they are both in new and oldtables and their values does not change).

I would like to know what could be the most efficient way of doing this kind of comparison/updates. I was thinking maybe a hash table that would "explode" the table into the three desired output tables might be the go-to option here.

 

In a second time, I would like to know if it was possible to directly update this table (meaning deleting/inserting and updating the observations at the same time, without creating the intermediary tables like before).

 

I have seen @SuryaKiran's answer on this Hash Output to update the Initial Dataset but this seems to be missing the delete and insert actions (it only updated UD1).

 

 

data old_table;  
  if _n_=1 then do;
    declare hash ud(dataset:'new_table');
    ud.defineKey('ID');
    ud.defineData(all:'y');
    ud.defineDone();
  end;
  modify old_table;
  rcUpdate = ud.find();
  if rcUpdate=0 then replace;
run;

 

 

Nota Bene: I know that the easy solution is just to replace the old table with the new one. However, in the actual data sets 99% of the observations does not change at all. Hence I only want to "update" the table and not replacing it.

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

For this minimal example you may use next code:

data deleted added updated;
   merge old_table(in=inold rename=(start=_start end=_end)))
              new_table(in=innew);
      by ID;
           if inold and not innew then output deleted; else
           if innew and not inold then output added; else
           if start ne _start or end ne _end then output updated;
run;

In case you need it for various pair tables with different variables and number of variables

this same code should be rewritten to get dynamically the variable names (from sashelp.vcolumn or sql dictionary.columns) of a table and generate the program to run;

View solution in original post

1 REPLY 1
Shmuel
Garnet | Level 18

For this minimal example you may use next code:

data deleted added updated;
   merge old_table(in=inold rename=(start=_start end=_end)))
              new_table(in=innew);
      by ID;
           if inold and not innew then output deleted; else
           if innew and not inold then output added; else
           if start ne _start or end ne _end then output updated;
run;

In case you need it for various pair tables with different variables and number of variables

this same code should be rewritten to get dynamically the variable names (from sashelp.vcolumn or sql dictionary.columns) of a table and generate the program to run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 372 views
  • 1 like
  • 2 in conversation