BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gnyanendra
Calcite | Level 5

Hi,

(a).  I'm facing difficulties with UPDATING on duplicate records in a table. Can someone help me on syntax?

(b).  Also, we have some 10-12 fact tables and equal number of dimension tables in star schema and need to process over a million of records across ~30 tables on  daily basis. Can anyone suggest if EG is the best tool for performaing this kind of task?

thanks a lot!

Gnyanendra

1 ACCEPTED SOLUTION

Accepted Solutions
LarryWorley
Fluorite | Level 6

I have a couple of techniques which may work for you.

The first method assumes you can always update the complete record in your secondary table.  So it deletes the records to be updated and then inserts the new record.  In some cases this may be optimal.

The second method uses Proc SQL update functionality and will give you the flexibility to update only a subset of columns if you wish.  The code that I present below is more complicated than what might be needed with other versions of SQL.  Some versions will allow you to write the update referencing the source table only once, but I found that I had to reference the source table once for each column being updated plus again for the update overall.  I might be missing something here but I tried several variations before getting this to work.

Note that if you can update the complete record, then you may want to benchmark both methods to see which performs best.  Performance can be dependent upon the both SAS optimization and DB optimization. 

Let me know if this helps.

Setup code: This section of code sets up some tables to mimic your tables.

PROC SQL;

* code to simulate primary1 and secondary1 tables 

                        ------------------------------------------------;

  create table work.primary1 (id char(4) primary key                            

                                         ,name char(1) 

                                         ,code char(4)  

                                        ,status char(1) 

                                        )   ;

  insert into work.primary1

       values('1639','F','TA1','1')

       values('1065','M','ME3','1')

       values('1400','M','ME1','1')

       values('1561','M',null,'1')

       values('1221','F','FA3','1')

       ;

   create table work.secondary1 as

   select *

     from work.primary1

   ;

quit ;

First Method:  Here I show the first method which is to delete records to be updated and then

insert the new records.

   * two step process which deletes records from target table with

     keys for records with updated status.

     then insert changed records into table

     -----------------------------------------;

  proc sql ;

   * simulate a change in primary table ;

   update work.primary1

     set code = 'abc'

          ,status = '0'

   where id = '1561'

     ;

   delete *

     from work.secondary1

     where id in (select id

                         from work.primary1

                       where status = '0'

                      )

     ;

   insert into work.secondary1

   select id

           ,name

           ,code

          ,status

     from work.primary1

   where status = '0'

     ;

quit ;

Second Method: Here, I use explicit updating.

* one step process for updating

              -----------------------------;

proc sql ;

* simulate a change in primary table ;

   update work.primary1

        set code = 'xyz'

       ,status = '0'

    where id = '1561'  

    ; 

* This is the real update here. **************************;

   update work.secondary1 t1

        set name = ( select name                  

                              from work.primary1 t2

                            where t2.status = '0'

                          )

             ,code = ( select code

                             from work.primary1 t3

                           where t1.id = t3.id

                              and t3.status = '0'

                         )

            ,status = ( select status

                              from work.primary1 t4

                            where t1.id = t4.id

                               and t4.status = '0'

                          )

      where t1.id = (select id

                                  from work.primary1

                               where status = '0'

                              )   ;

      QUIT;

View solution in original post

5 REPLIES 5
Doc_Duke
Rhodochrosite | Level 12

I don't know about (a).

For (b), Enterprise Guide generates SQL code that it passes to the SAS Server for execution.  So, if your comparison for 'bestness' is regular SAS, they are equal.  The SAS Server does some optimization of code for each database that it supports.

Within EGuide, you can also write code nodes with SQL pass-thru, in which you pass exactly the code that you want to the SQL database.    Technically, that is the "best", but it puts the entire onus on you to write efficient code. 

Doc Muhlbaier

Duke

LarryWorley
Fluorite | Level 6

I am confused with regards to (a); your title says help with UPDATING unique key, but text referes to difficulties with UPDATING with duplicate records, which implies non-unique keys.  Can you clarify?

I can make some guesses about issue but would be best if you can provide samples of sql and error message?  Also can you provide proc contents on the table you are trying to update? Also some sample records from the UPDATE input would be useful.

Building on Doc's response to (b), I would start with EG.  Then if it is not efficient enough from processing standpoint, then look at modifying the EG generated code to make more efficient.  EG generates generic code based upon your directions and there is always the possibility that with knowledge of your database and the data, you can optimize better than it can in a specific solution.  This step could include the pass-thru code to which Doc refers.

Keep in mind that as you perform additional optimizations, that you are generally making your code more complex and less maintainable as well as using coder time to gain processing efficiencies.  The best solution balances machine effeciency against those other concerns. 

My approach is to start simple -- use easiest tool (EG for example) -- and then only complicate enough to get needed efficiency. 

gnyanendra
Calcite | Level 5

Hi Larry,

My mistake in notwriting it clearly.

in (a), I meant I need to UPDATE older record with new one if KEY is same.

below text should help you get the problem better.

thanks!

----------------------

Hereis the SQL code I’m using:

PROC SQL;

INSERT INTO BETA.secondary1 (a.id, name, code, status)

SELECT

t1.a.id,

t1.name,

t1.code,

t1.status

FROM BETA.primary1 t1

WHERE status = ‘1’;

QUIT;

I now want toUPDATE table BETA.secondary1 again as the status changes to ‘0’ for which I’mtrying UPDATE clause instead INSERT with different combinations of SELECT

Like

UPDATE secondary1 (a.id, name, code, status)

SELECT

t1.a.id,

t1.name,

t1.code,

t1.status

FROM BETA.primary1 t1

WHERE status = ‘0’;

This is a simple query, I have lot more and complex which Ican’t post but I think you get the idea.

Here is the error log

1          ;*';*";*/;quit;run;

2          OPTIONS PAGENO=MIN;

3          %LET _CLIENTTASKLABEL='Program';

4          %LET _CLIENTPROJECTPATH='';

5          %LET _CLIENTPROJECTNAME='';

6          %LET _SASPROGRAMFILE=;

7         

8          ODS _ALL_ CLOSE;

9          OPTIONS DEV=ACTIVEX;

10         GOPTIONS XPIXELS=0 YPIXELS=0;

11         FILENAME EGSR TEMP;

12         ODS tagsets.sasreport12(ID=EGSR)FILE=EGSR STYLE=Analysis

12       !STYLESHEET=(URL="file:///C:/sas93/software/x86/SASEnterpriseGuide/4.3/Styles/Analysis.css")

12       ! NOGTITLE NOGFOOTNOTE GPATH=&sasworklocation ENCODING=UTF8options(rolap="on");

NOTE: WritingTAGSETS.SASREPORT12(EGSR) Body file: EGSR

13        

14         GOPTIONS ACCESSIBLE;

15         PROC SQL;

16           UPDATE BETA.secondary1 (a.id, name,code, status)

                                         _

                                         79

                                         76

ERROR 79-322: Expecting a ).

ERROR 76-322: Syntax error, statementwill be ignored.

17                                SELECT

18                                       t1.a.id,

19                                       t1.name,

20                                       t1.code,

21                                       t1.status

22

23

24                                FROMBETA.primary1 t1

25                                WHEREstatus = ‘1’;

NOTE: PROC SQL set option NOEXECand will continue to check the syntax of statements.

26         QUIT;

NOTE: The SAS System stoppedprocessing this step because of errors.

NOTE: PROCEDURE SQL used (Totalprocess time):

      real time           0.00 seconds

      cpu time            0.00 seconds

     

27        

28         GOPTIONS NOACCESSIBLE;

29         %LET _CLIENTTASKLABEL=;

30         %LET _CLIENTPROJECTPATH=;

31         %LET _CLIENTPROJECTNAME=;

32         %LET _SASPROGRAMFILE=;

33        

34         ;*';*";*/;quit;run;

2                                           TheSAS System            06:41 Thursday,January 12, 2012

35         ODS _ALL_ CLOSE;

36        

37        

38         QUIT; RUN;

39        

LarryWorley
Fluorite | Level 6

I have a couple of techniques which may work for you.

The first method assumes you can always update the complete record in your secondary table.  So it deletes the records to be updated and then inserts the new record.  In some cases this may be optimal.

The second method uses Proc SQL update functionality and will give you the flexibility to update only a subset of columns if you wish.  The code that I present below is more complicated than what might be needed with other versions of SQL.  Some versions will allow you to write the update referencing the source table only once, but I found that I had to reference the source table once for each column being updated plus again for the update overall.  I might be missing something here but I tried several variations before getting this to work.

Note that if you can update the complete record, then you may want to benchmark both methods to see which performs best.  Performance can be dependent upon the both SAS optimization and DB optimization. 

Let me know if this helps.

Setup code: This section of code sets up some tables to mimic your tables.

PROC SQL;

* code to simulate primary1 and secondary1 tables 

                        ------------------------------------------------;

  create table work.primary1 (id char(4) primary key                            

                                         ,name char(1) 

                                         ,code char(4)  

                                        ,status char(1) 

                                        )   ;

  insert into work.primary1

       values('1639','F','TA1','1')

       values('1065','M','ME3','1')

       values('1400','M','ME1','1')

       values('1561','M',null,'1')

       values('1221','F','FA3','1')

       ;

   create table work.secondary1 as

   select *

     from work.primary1

   ;

quit ;

First Method:  Here I show the first method which is to delete records to be updated and then

insert the new records.

   * two step process which deletes records from target table with

     keys for records with updated status.

     then insert changed records into table

     -----------------------------------------;

  proc sql ;

   * simulate a change in primary table ;

   update work.primary1

     set code = 'abc'

          ,status = '0'

   where id = '1561'

     ;

   delete *

     from work.secondary1

     where id in (select id

                         from work.primary1

                       where status = '0'

                      )

     ;

   insert into work.secondary1

   select id

           ,name

           ,code

          ,status

     from work.primary1

   where status = '0'

     ;

quit ;

Second Method: Here, I use explicit updating.

* one step process for updating

              -----------------------------;

proc sql ;

* simulate a change in primary table ;

   update work.primary1

        set code = 'xyz'

       ,status = '0'

    where id = '1561'  

    ; 

* This is the real update here. **************************;

   update work.secondary1 t1

        set name = ( select name                  

                              from work.primary1 t2

                            where t2.status = '0'

                          )

             ,code = ( select code

                             from work.primary1 t3

                           where t1.id = t3.id

                              and t3.status = '0'

                         )

            ,status = ( select status

                              from work.primary1 t4

                            where t1.id = t4.id

                               and t4.status = '0'

                          )

      where t1.id = (select id

                                  from work.primary1

                               where status = '0'

                              )   ;

      QUIT;

gnyanendra
Calcite | Level 5

Hi Larry,

I think First Method should work for me. I just tested it with dummy data and it worked fine, however I will do it on original database and test on all complex queries, data consistensies and all.

thanks you so much for the help.

Just to mention, we tried with secoind method but it involved writing query for each of the value column which didn't fit our requirement and was too tedious.

Anyways, I will ask any further question if doubt arises.

thanks

Gnyanendra

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 2916 views
  • 6 likes
  • 3 in conversation