Arpit_Agarwal
Fluorite | Level 6

Hi,

 

I am trying to insert more than 1.3 million records into an Oracle table using PROC SQL.

 

A few of the fields are of type XMLTYPE, i.e. the value of these fields is an XML string with all headers in place and a value for each tag. It is taking nearly 1 hour for every 100k records to load, which is really slow.

 

I believe this has to do with XML validation before the records are inserted into the table, as the INSERT statement is checking that the values being loaded into the XMLTYPE columns conform to the XML tagging rules.

 

Is there a workaround?

 

Please advise.

6 REPLIES
Quentin
Super User
Suggest you show your code for how you are running the insert. To test your theory about XML validation, you might try loading the data to Oracle as a plain string instead, to see if that has an impact. If it is really Oracle's validation of the XML that is slow, you might try posting to some Oracle boards.
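For example, something along these lines (a minimal sketch with hypothetical names: MY_STAGE would be a scratch Oracle table whose XML_TXT column is declared as CLOB or VARCHAR2 rather than XMLTYPE, SRC the source table, and the connection options mirror the ones used elsewhere in this thread). If loading the raw string is fast, the bottleneck is likely the XMLTYPE conversion rather than the data transfer:

PROC SQL FEEDBACK STIMER;
CONNECT TO ORACLE AS AAA (USER = "&USER." PASS = "&PASSWD." PATH = "&DBPATH.");
EXECUTE (
   /* hypothetical staging table: XML_TXT is CLOB, so no XML parsing happens */
   INSERT INTO MY_STAGE (ID, XML_TXT)
   SELECT ID, XML_COL
   FROM SRC ) BY AAA;
DISCONNECT FROM AAA;
QUIT;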
Arpit_Agarwal
Fluorite | Level 6

Here's a sample -

 

TABLE_1 is the table we are fetching from and TABLE_2 is the one we are populating.

FIELD_3/4/5/6 are the variables holding XML data.

 

PROC SQL FEEDBACK STIMER;
CONNECT TO ORACLE AS AAA (USER = "&USER." PASS = "&PASSWD." PATH = "&DBPATH.");
EXECUTE (
   INSERT INTO TABLE_2 (
      FIELD_1,
      FIELD_2,
      FIELD_3,
      FIELD_4,
      FIELD_5,
      FIELD_6 )
   SELECT
      ......,
      ......,
      XMLTYPE(FIELD_3),
      CASE
         WHEN FIELD_4 IS NULL THEN NULL
         ELSE XMLTYPE(FIELD_4)
      END,
      CASE
         WHEN FIELD_5 IS NULL THEN NULL
         ELSE XMLTYPE(FIELD_5)
      END,
      CASE
         WHEN FIELD_6 IS NULL THEN NULL
         ELSE XMLTYPE(FIELD_6)
      END
   FROM TABLE_1 ) BY AAA;
EXECUTE ( COMMIT ) BY AAA;
DISCONNECT FROM AAA;
QUIT;

Patrick
Opal | Level 21

The code you've posted is pure Oracle (explicit pass-through). SAS involvement here is only "remote control". 

 

You might get more answers in an Oracle forum (only post the pass-through bit there).

 

BTW:

You don't need the explicit commit EXECUTE ( COMMIT ) BY AAA; SAS issues such a commit implicitly at the end of an EXECUTE statement, and you can't suppress this implicit commit even though that would sometimes really be a good thing.
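In other words, the end of the posted step can simply be shortened to something like this (same connection alias as in the post; the column list and SELECT are abbreviated here):

EXECUTE (
   INSERT INTO TABLE_2 ( FIELD_1, ..., FIELD_6 )
   SELECT ......
   FROM TABLE_1 ) BY AAA;   /* SAS commits implicitly when this EXECUTE ends */
DISCONNECT FROM AAA;        /* no separate EXECUTE ( COMMIT ) BY AAA; needed */
QUIT;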

Arpit_Agarwal
Fluorite | Level 6

Thanks for your input guys.

 

However, what I am asking for is a workaround that achieves the same result more efficiently.

 

It doesn't necessarily have to be SQL; I am fine with a SAS solution if someone can suggest one.

 

Thanks!

Patrick
Opal | Level 21

...workaround to achieve the same result

 

What is "the same result"? If this is about loading data from a source Oracle table into a target Oracle table where some of the fields in target are of type XMLTYPE then keeping all the processing inside Oracle will be fastest.

 

Depending on what you want to do with your XML data, you could also store it in a LOB - or if you can be sure that these are always only short messages then you could even use a VARCHAR2.
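As a rough sketch of the LOB variant (hypothetical target name and placeholder data types for FIELD_1/FIELD_2; the XML fields become CLOBs and the insert then needs no XMLTYPE() conversion at all, run inside the same CONNECT TO ORACLE block as in the original post):

EXECUTE (
   CREATE TABLE TABLE_2_CLOB (
      FIELD_1 NUMBER,          /* placeholder type */
      FIELD_2 VARCHAR2(200),   /* placeholder type */
      FIELD_3 CLOB,
      FIELD_4 CLOB,
      FIELD_5 CLOB,
      FIELD_6 CLOB )
) BY AAA;
EXECUTE (
   INSERT INTO TABLE_2_CLOB (FIELD_1, FIELD_2, FIELD_3, FIELD_4, FIELD_5, FIELD_6)
   SELECT ......, ......, FIELD_3, FIELD_4, FIELD_5, FIELD_6
   FROM TABLE_1 ) BY AAA;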

 

Do you really know where your performance bottleneck is? Maybe you're just loading really big volumes (big XML documents).

 

If you've got an Oracle DBA at your site then try to get this person's support.

Sven111
Pyrite | Level 9

@Patrick This is somewhat of a tangent from the main topic, but in some cases it is still necessary to explicitly send an EXECUTE(COMMIT) BY AAA. I recently discovered this when I was inserting data into a table while passing the APPEND hint. Without the APPEND hint I could sequentially insert multiple times into the table, but with the APPEND hint I received the error ORA-12840, which states:

 

ORA-12840: cannot access a remote table after parallel/insert direct load txn
Cause: Within a transaction, an attempt was made to perform distributed access after a PDML or insert direct statement had been issued.
Action: Commit/rollback the PDML transaction first, and then perform the distributed access, or perform the distributed access before the first PDML statement in the transaction.

 

Adding the explicitly passed commit between the inserts resolved the problem and allowed multiple inserts. I don't know why this is, since I also assumed everything was being committed automatically in each explicit pass-through block.
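The pattern that worked looks roughly like this (TARGET_TBL and SOURCE_TBL_A/SOURCE_TBL_B are placeholder names; the commit between the two direct-path inserts is what avoids the ORA-12840):

PROC SQL;
CONNECT TO ORACLE AS AAA (USER = "&USER." PASS = "&PASSWD." PATH = "&DBPATH.");
EXECUTE (
   INSERT /*+ APPEND */ INTO TARGET_TBL
   SELECT * FROM SOURCE_TBL_A ) BY AAA;
EXECUTE ( COMMIT ) BY AAA;    /* ends the direct-path transaction */
EXECUTE (
   INSERT /*+ APPEND */ INTO TARGET_TBL
   SELECT * FROM SOURCE_TBL_B ) BY AAA;
EXECUTE ( COMMIT ) BY AAA;
DISCONNECT FROM AAA;
QUIT;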

 
