I am still very new to SAS VA, but I am working with the append feature and see in the code tab that the underlying command is PROC IMSTAT. I did some research on the command and do not see a "no dups" feature. While learning how the process works, I accidentally ran the append feature twice on the same file set, with what I thought was a new week's worth of data. Questions: 1. Is there a way to stop that from happening in the first place (check box, custom code option, etc.)? 2. If not, is there an easy fix that avoids reloading the base file and all its appended subsets? Sorry if this is an old solved question; I did not see an answer when searching for "PROC IMSTAT and duplicates". -KJ
PS: I did see a delete reference in a communities post; however, I am hoping for a clean "no dups" option on the load side.
How many rows does your LASR table have? I find it easier to construct and maintain tables to be loaded into LASR in a normal SAS data library; let's call it the LASR load library. That means if anything goes wrong with the data, I fix it first in the LASR load library and then do a complete table load into LASR from there. Our biggest table has more than 20 million rows, but it still loads in just a few minutes.
And I think you are correct. There is no feature for preventing duplicate rows in PROC IMSTAT.
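To make the load-library idea concrete, here is a rough sketch of how a double append could be guarded against on the staging side. It is not a feature of PROC IMSTAT, only ordinary Base SAS steps run before the load. Everything specific in it is an assumption for illustration: the library path, the table names `base` and `weekly_new`, the key column `record_id`, and the LASR host, port, and tag. It also assumes each row can be identified by a single key column.

```sas
/* Sketch only: the path, table names, key column RECORD_ID, and the
   LASR host/port/tag are hypothetical placeholders, not real settings. */
libname stage "/data/lasr_load";     /* the "LASR load library" */
libname valasr sasiola host="lasr.example.com" port=10010 tag="hps";

/* 1. Keep only incoming rows whose key is not already in the base table. */
proc sql;
   create table work.new_only as
   select n.*
   from stage.weekly_new as n
   where not exists
      (select * from stage.base as b
       where b.record_id = n.record_id);
quit;

/* 2. Append the filtered rows to the staging copy of the base table.
      Re-running steps 1 and 2 with the same weekly file adds no rows. */
proc append base=stage.base data=work.new_only;
run;

/* 3. Optional repair if duplicates already got in: keep one row per key. */
proc sort data=stage.base nodupkey;
   by record_id;
run;

/* 4. Replace the in-memory table with a complete reload from staging. */
proc datasets lib=valasr nolist;
   delete base;
quit;

data valasr.base;
   set stage.base;
run;
```

Step 3 is the part that answers the "easy fix" question: the cleanup happens in the load library, and only the final table is reloaded rather than the base file plus every appended subset. In a VA deployment the table may also need to be registered or refreshed in metadata, which this sketch does not cover.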
4.5 million, growing at 1.1 million a year at this time, but we are on a single server, not a distributed one... Thanks for the pointers; I will look into the data library you reference. I was able to do some appending late Friday afternoon, and it does seem fairly fast. The data load, however, seems quite slow by comparison. Maybe the library is the fix I need. 😎 Thank you again for your post.
I am IT support in a higher education setting. I just found a high-detail table, 'Student_test_comp', that I had not reviewed to date. It holds test scores for students, and for four years of data it has 23.8 million rows. Wow, I had no idea that it was growing that fast... 5x my base data set. I have been working on making surrogate keys for the tables to build a star schema; my source data warehouse uses natural keys.
Thanks for the info.