kjohnsonm
Lapis Lazuli | Level 10

Hello All,

I am still very new to SAS VA, but I am working with the append feature and see in the code tab that the underlying command is PROC IMSTAT. I did some research on the command and do not see a no-dups feature. While learning how the process works, I accidentally ran the append twice on the same file set, with what I thought was a new week's worth of data. Questions: 1. Is there a way to stop that from happening in the first place (check box, custom code option, etc.)? 2. If not, is there an easy fix that does not require reloading the base file and all its appended subsets? Sorry if this is an old, solved question; I did not see an answer when searching for "PROC IMSTAT and duplicates". -KJ
PS I did see a delete reference in a Communities post; however, I am hoping for a clean no-dups option on the load side.

1 ACCEPTED SOLUTION

SASKiwi
PROC Star

We also run non-distributed on Windows and the > 20m row table loads to LASR in less than 4 minutes. I'm quite happy to do full table loads with that level of performance. 


4 REPLIES
SASKiwi
PROC Star

How many rows does your LASR table have? I find it easier to construct and maintain tables to be loaded into LASR in a normal SAS data library - let's call it the LASR load library. That means if anything goes wrong with the data I fix it first in the LASR load library then I do complete table loads into LASR from there. Our biggest table is > 20m rows but it still loads in just a few minutes.
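The load-library workflow described above can be sketched roughly as follows. This is only an illustration in plain Python, not SAS or SAS VA code: the `LoadLibrary` class, the `full_reload` function, and the batch id `"week-12"` are hypothetical names standing in for the staging library, the complete table load, and a weekly file set. The idea it shows is that the staging side remembers which batches it has already taken, so an accidental rerun is skipped before anything reaches the in-memory table.

```python
# Illustrative sketch only: plain Python stands in for the SAS load library
# and the in-memory LASR table. All names here are hypothetical.


class LoadLibrary:
    """Staging area that remembers which batches were already appended."""

    def __init__(self):
        self.rows = []               # the staged table
        self.loaded_batches = set()  # ids of batches appended so far

    def append_batch(self, batch_id, rows):
        """Append a batch once; return False if it was already appended."""
        if batch_id in self.loaded_batches:
            return False
        self.rows.extend(rows)
        self.loaded_batches.add(batch_id)
        return True


def full_reload(staging):
    """Rebuild the in-memory table from scratch out of the staged rows."""
    return list(staging.rows)


staging = LoadLibrary()
week_12 = [{"id": 1, "score": 88}, {"id": 2, "score": 93}]

first = staging.append_batch("week-12", week_12)
second = staging.append_batch("week-12", week_12)  # accidental rerun is skipped
in_memory_table = full_reload(staging)
```

Because the in-memory table is always rebuilt from the staged rows, any problem is repaired once in staging and the next full load picks up the fix.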

 

And I think you are correct. There is no feature for preventing duplicate rows in PROC IMSTAT.
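Since the append step itself offers no duplicate check, one option is to remove duplicates on the staging side before the table is loaded. A minimal sketch of that idea, in plain Python rather than SAS, is below; the function `drop_duplicate_rows` and the column names are hypothetical. In a normal SAS library the comparable step would typically be a PROC SORT with the NODUPKEY option.

```python
def drop_duplicate_rows(rows, key_columns=None):
    """Keep the first occurrence of each row, preserving the original order.

    With key_columns=None the whole row is compared; otherwise only the
    named columns decide whether two rows count as duplicates.
    """
    seen = set()
    unique = []
    for row in rows:
        if key_columns is None:
            key = tuple(sorted(row.items()))
        else:
            key = tuple(row[c] for c in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical staged table after the same weekly batch was appended twice.
staged = [
    {"student_id": 1, "week": 12, "score": 88},
    {"student_id": 2, "week": 12, "score": 93},
    {"student_id": 1, "week": 12, "score": 88},
    {"student_id": 2, "week": 12, "score": 93},
]

clean = drop_duplicate_rows(staged)
clean_by_key = drop_duplicate_rows(staged, key_columns=["student_id", "week"])
```

Comparing whole rows is the safer default when no reliable key exists; comparing on key columns is stricter and will also drop rows whose non-key values differ.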

kjohnsonm
Lapis Lazuli | Level 10

Hi SASKiwi,

4.5 million, growing at 1.1 million a year at this time, but we are on a single server, not a distributed ...  Thanks for the pointers; I will look into the data library you reference. I was able to do some appending late Friday afternoon, and it does seem fairly fast.  The data load, however, seems quite slow by comparison. Maybe the library is the fix I need. 😎  Thank you again for your post.

SASKiwi
PROC Star

We also run non-distributed on Windows and the > 20m row table loads to LASR in less than 4 minutes. I'm quite happy to do full table loads with that level of performance. 

kjohnsonm
Lapis Lazuli | Level 10

SASKiwi,

I am IT support in a higher education setting. I just found a high-detail table, 'Student_test_comp', that I had not reviewed to date. It holds test scores for students; for four years of data it has 23.8 million rows.  Wow, I had no idea that it was growing that fast ... 5X my base data set. I have been working on making surrogate keys for the tables to build a star schema; my source data warehouse uses natural keys.
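For readers unfamiliar with the surrogate-key step mentioned here, a rough sketch of the general idea follows, in plain Python rather than SAS. The function `assign_surrogate_keys` and the sample values are hypothetical and are not taken from the actual warehouse: each distinct natural key gets a sequential integer, and the fact rows are rewritten to carry those integers.

```python
def assign_surrogate_keys(natural_keys, start=1):
    """Map each distinct natural key to a sequential integer surrogate key."""
    mapping = {}
    for nk in natural_keys:
        if nk not in mapping:
            mapping[nk] = start + len(mapping)
    return mapping


# Hypothetical fact rows keyed by natural keys (student id, test code).
fact_rows = [("S001", "ACT"), ("S002", "SAT"), ("S001", "SAT")]

student_sk = assign_surrogate_keys(r[0] for r in fact_rows)
test_sk = assign_surrogate_keys(r[1] for r in fact_rows)

# Fact table rewritten to reference the dimension tables by surrogate key.
fact_with_sk = [(student_sk[s], test_sk[t]) for s, t in fact_rows]
```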

SASKiwi,

Thanks for the info.
