maheshtalla
Quartz | Level 8

Hi,

We are working on a deduplication process. We have processed the full bulk data set and created a cluster table, which assigns a unique cluster ID to each group of matching records. We now receive incremental data that may contain existing records as well as new records. If we run the same process on it, it creates a set of cluster IDs similar to the earlier ones. How do we join these clusters to the existing ones? Do we need to process the whole data set (including the incremental data) every time, or is there a way to process only the incremental data?

We are also facing a performance issue: we fetch 10 million records, run match code clustering, and insert the results into a database table, which takes more than 4 days. Please suggest how to improve this.

Thanks in advance.
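
To make the incremental part of the question concrete, here is a minimal Python sketch of one possible approach, not a SAS or Data Quality implementation. It assumes that a cluster is defined by a single exact match code, so that an incoming record can reuse the cluster ID already stored for its match code. The names `assign_clusters`, `existing_clusters`, and the sample match codes are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def assign_clusters(
    existing: Dict[str, int],
    incoming: Iterable[Tuple[str, str]],
) -> List[Tuple[str, int]]:
    """Assign cluster IDs to incoming (record_id, match_code) pairs.

    `existing` maps match_code -> cluster_id and is updated in place,
    so it can be persisted and reused for the next increment.
    """
    # New clusters continue numbering after the highest existing ID.
    next_id = max(existing.values(), default=0) + 1
    assigned = []
    for record_id, match_code in incoming:
        cluster_id = existing.get(match_code)
        if cluster_id is None:
            # Unseen match code: open a new cluster.
            cluster_id = next_id
            existing[match_code] = cluster_id
            next_id += 1
        assigned.append((record_id, cluster_id))
    return assigned


# Hypothetical state from the earlier bulk run.
existing_clusters = {"MC_A": 1, "MC_B": 2}
# Hypothetical increment: one known match code, one new one seen twice.
increment = [("r10", "MC_B"), ("r11", "MC_C"), ("r12", "MC_C")]
result = assign_clusters(existing_clusters, increment)
```

Under this assumption only the increment is processed, and the bulk data is touched only through the stored match code lookup. If clusters are built from several match codes combined with OR rules, an incoming record can link two existing clusters, and a merge step would be needed that this sketch does not cover.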

1 REPLY
TomKari
Onyx | Level 15

One piece that I can make a suggestion on is the insert to database part. If it is running slowly, investigate the bulk load options for whatever database you're using.
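
As a rough illustration of why the load method matters, the Python sketch below inserts rows in large batches, one transaction per batch, instead of one statement and commit per row. It uses the standard-library `sqlite3` module purely as a stand-in; the table name `clusters`, its columns, and the batch size are assumptions, and a real database's own bulk loader would normally replace this code altogether.

```python
import sqlite3
from itertools import islice


def insert_in_batches(conn, rows, batch_size=10_000):
    """Insert (record_id, cluster_id) rows in batches; return the row count."""
    it = iter(rows)
    total = 0
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        # `with conn` wraps the batch in a single transaction and commits it.
        with conn:
            conn.executemany(
                "INSERT INTO clusters (record_id, cluster_id) VALUES (?, ?)",
                batch,
            )
        total += len(batch)
    return total


# In-memory database used only so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clusters (record_id TEXT, cluster_id INTEGER)")
inserted = insert_in_batches(conn, ((f"r{i}", i % 100) for i in range(25_000)))
```

The general idea carries over to other databases: fewer round trips and fewer commits per row usually reduce load time, and a native bulk-load facility typically goes further than batched inserts.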

Tom
