maheshtalla
Quartz | Level 8

Hi,

We are working on a deduplication process. We processed the full bulk data set and created a cluster table, which assigns a unique cluster ID to each group of matching records. We are now receiving incremental data, which may contain existing records as well as new ones. If we run the same process on it, it will create the same set of cluster IDs, similar to the earlier ones. How do we join these clusters? Do we need to process the whole data set (including the incremental data) every time, or is there a way to process only the incremental data?

We are also facing a performance issue: we fetch 10 million records, run match code clustering, and insert the results into a database table, which takes more than 4 days. Please suggest how to improve this.

Thanks in advance.

1 REPLY
TomKari
Onyx | Level 15

One piece that I can make a suggestion on is the insert to database part. If it is running slowly, investigate the bulk load options for whatever database you're using.
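
To illustrate the general idea behind bulk loading, here is a minimal Python sketch that inserts rows in large batches with one commit per batch, instead of one statement and one commit per row. It is not SAS code and not the poster's actual setup: SQLite stands in for the real database, and the table name `cluster_map`, the function `load_clusters`, and the batch size are all hypothetical. A database's native bulk loader will usually do better than this, so treat it only as a picture of why batching helps.

```python
import sqlite3


def load_clusters(conn, rows, batch_size=10_000):
    """Insert (record_id, cluster_id) pairs in batches, one commit per batch.

    `cluster_map` is a hypothetical table name used only for this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cluster_map ("
        "record_id INTEGER PRIMARY KEY, cluster_id INTEGER)"
    )
    total = 0
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            # One multi-row call and one commit for the whole batch
            conn.executemany("INSERT INTO cluster_map VALUES (?, ?)", batch)
            conn.commit()
            total += len(batch)
            batch = []
    if batch:
        # Flush the final partial batch
        conn.executemany("INSERT INTO cluster_map VALUES (?, ?)", batch)
        conn.commit()
        total += len(batch)
    return total


# In-memory SQLite database as a stand-in for the real target database
conn = sqlite3.connect(":memory:")
rows = ((i, i // 3) for i in range(25_000))  # made-up sample data
inserted = load_clusters(conn, rows)
row_count = conn.execute("SELECT COUNT(*) FROM cluster_map").fetchone()[0]
```

The generator keeps memory use flat because only one batch is held at a time. Which batch size works best depends on the database and the network, so it is worth measuring rather than assuming.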

Tom

