PROC IMSTAT Statement with no dups

Frequent Contributor
Posts: 135

Hello All,

I am still very new to SAS VA. I am working with the append feature and see in the code tab that the underlying command is PROC IMSTAT. I did some research on the procedure and do not see a no-dups option. While learning how the process works, I accidentally ran the append twice on the same file set, using what I thought was a new week's worth of data.

Questions:
1. Is there a way to stop that from happening in the first place (check box, custom code option, etc.)?
2. If not, is there an easy fix that does not require reloading the base file and all its appended subsets?

Sorry if this is an old, solved question; I did not see an answer when searching for "PROC IMSTAT and duplicates".

-KJ

PS: I did see a delete reference in a communities post; however, I am hoping for a clean no-dups option on the load side.


All Replies
Super User
Posts: 4,034

Re: PROC IMSTAT Statement with no dups

Posted in reply to kjohnsonm

How many rows does your LASR table have? I find it easier to construct and maintain tables to be loaded into LASR in a normal SAS data library - let's call it the LASR load library. That means if anything goes wrong with the data I fix it first in the LASR load library then I do complete table loads into LASR from there. Our biggest table is > 20m rows but it still loads in just a few minutes.
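
As a rough sketch of that workflow (not tested against your setup; the librefs LOADLIB and LASRLIB, the path, host, port, tag and the table name ENROLLMENT are all placeholders I made up, so substitute your own):

```sas
/* Sketch only: every name, path, host, port and tag below is a placeholder. */

/* Ordinary SAS library that holds the tables to be loaded into LASR */
libname loadlib "/data/lasr_load";

/* SASIOLA engine libref pointing at the LASR server */
libname lasrlib sasiola host="lasrhost.example.com" port=10010 tag="hps";

/* Fix the data in the load library first: remove rows that are exact
   duplicates across all columns (only valid if such rows are never
   legitimate in your data). The table is sorted in place. */
proc sort data=loadlib.enrollment nodupkey;
  by _all_;
run;

/* Drop the in-memory copy from LASR before the full reload */
proc datasets lib=lasrlib nolist;
  delete enrollment;
quit;

/* Full table load from the load library into LASR */
data lasrlib.enrollment;
  set loadlib.enrollment;
run;
```

This only covers the data movement. Any metadata registration or permissions your site needs for the table to show up in VA are not shown here.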

 

And I think you are correct. There is no feature for preventing duplicate rows in PROC IMSTAT.
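
Since the procedure will not do it for you, one way to guard against a repeated append is to do the check yourself in the load library before anything goes to LASR. A minimal sketch, assuming each row can be identified by some key; the columns STUDENT_ID and TERM_WEEK, the table names and the library path are hypothetical:

```sas
/* Sketch only: table names, key columns and the path are hypothetical. */
libname loadlib "/data/lasr_load";

/* Keep only the incoming rows whose key is not already in the base table */
proc sql;
  create table work.new_only as
  select n.*
  from work.new_week as n
  where not exists
    (select 1
     from loadlib.enrollment as b
     where b.student_id = n.student_id
       and b.term_week  = n.term_week);
quit;

/* Append just the unseen rows; rerunning with the same file adds nothing */
proc append base=loadlib.enrollment data=work.new_only;
run;
```

With this in place, running the same week's file a second time leaves WORK.NEW_ONLY empty, so the base table in the load library is unchanged and can then be reloaded to LASR as a full table load.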

Frequent Contributor
Posts: 135

Re: PROC IMSTAT Statement with no dups

Hi SASKiwi,

4.5 million, growing at 1.1 million a year at this time, but we are on a single server, not a distributed one. Thanks for the pointers; I will look into the data library you reference. I was able to do some appending late Friday afternoon, and it does seem fairly fast. The data load, however, seems quite slow by comparison. Maybe the library is the fix I need. 8)  Thank you again for your post.

Solution
‎03-19-2018 05:40 PM
Super User
Posts: 4,034

Re: PROC IMSTAT Statement with no dups

Posted in reply to kjohnsonm

We also run non-distributed on Windows and the > 20m row table loads to LASR in less than 4 minutes. I'm quite happy to do full table loads with that level of performance. 

Frequent Contributor
Posts: 135

Re: PROC IMSTAT Statement with no dups

SASKiwi,

I am IT support in a higher education setting. I just found a high-detail table, 'Student_test_comp', that I had not reviewed to date. It holds test scores for students; for four years of data it has 23.8 million rows. Wow, I had no idea it was growing that fast: 5x my base data set. I have been working on making surrogate keys for the tables to build a star schema; my source data warehouse uses natural keys.

SASKiwi,

Thanks for the info.

 

☑ This topic is solved.
