maheshtalla
Quartz | Level 8

Hi,

We are working on a deduplication process in which we processed the full bulk data and created a cluster table that assigns a unique cluster ID to each group of matching records. Next time we will receive incremental data, which may contain existing records as well as new records. If we run the same process on it, it will create a set of cluster IDs similar to the earlier ones. How do we join these clusters? Do we need to process the whole data (including the incremental data) every time, or is there a way to process only the incremental data? We are also facing a performance issue: we fetch 10 million records, run match code clustering, and insert the results into a database table, which takes more than 4 days. Please suggest how to handle this.

thanks in advance

1 REPLY
TomKari
Onyx | Level 15

One piece that I can make a suggestion on is the insert to database part. If it is running slowly, investigate the bulk load options for whatever database you're using.
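
As a rough illustration of why the load path matters, here is a minimal Python sketch that inserts rows in large batches with one commit per batch instead of row by row. It uses the standard-library `sqlite3` module only as a stand-in; the table name `clusters`, its columns, and the helper `insert_in_batches` are assumptions for the example, not part of the original process. A real bulk loader is database-specific and is usually faster still.

```python
import sqlite3
from itertools import islice


def insert_in_batches(conn, rows, batch_size=10_000):
    """Insert (record_id, cluster_id) rows in batches, one commit per batch."""
    it = iter(rows)
    total = 0
    while True:
        # Pull the next batch lazily so the full data set is never held in memory.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        conn.executemany(
            "INSERT INTO clusters (record_id, cluster_id) VALUES (?, ?)", batch
        )
        conn.commit()
        total += len(batch)
    return total


# Hypothetical target table and generated sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clusters (record_id TEXT, cluster_id INTEGER)")
inserted = insert_in_batches(conn, ((f"r{i}", i % 100) for i in range(25_000)))
```

The batch size is a tuning knob: larger batches mean fewer round trips and commits, at the cost of more memory per batch.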

Tom
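
On the incremental question, one possible approach is to keep a lookup from match code to cluster ID from the full run and assign incoming records against it, creating new IDs only for match codes not seen before. The Python sketch below assumes each record has a single match code and that a cluster is defined by exact match code equality; the function `assign_clusters` and the sample values are hypothetical. If clusters are built from several conditions that can link groups transitively, an increment may merge two existing clusters, and this simple lookup would not be enough.

```python
from typing import Dict, Iterable, List, Tuple


def assign_clusters(
    existing: Dict[str, int],
    incoming: Iterable[Tuple[str, str]],
) -> List[Tuple[str, int]]:
    """Assign a cluster ID to each incoming (record_id, match_code) pair.

    `existing` maps match code -> cluster ID from the earlier run and is
    updated in place whenever a new match code appears.
    """
    next_id = max(existing.values(), default=0) + 1
    assignments = []
    for record_id, match_code in incoming:
        cluster_id = existing.get(match_code)
        if cluster_id is None:
            # Unseen match code: open a new cluster and remember it.
            cluster_id = next_id
            existing[match_code] = cluster_id
            next_id += 1
        assignments.append((record_id, cluster_id))
    return assignments


# Hypothetical state after the full run, plus a small increment.
existing_clusters = {"MC_A": 1, "MC_B": 2}
increment = [("r101", "MC_B"), ("r102", "MC_C"), ("r103", "MC_C")]
result = assign_clusters(existing_clusters, increment)
```

With this shape, only the increment is scanned on each run, and existing cluster IDs stay stable because they are never regenerated.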
