BookmarkSubscribeRSS Feed
0 Likes
data a;
b=3; output;
b=.; output;
run;
 
proc datasets nolist;
modify a;
ic create not null(b);
ic create check(where=(b=3));
quit;
 
Currently both these ICs will display:
 
ERROR: Integrity constraint _CK0001_ was rejected because 1 observations failed the constraint.
 
Would be handy to have a sample of the errors, similar to the data step:
 
…was rejected because _n_=2 b=.
 
Currently after receiving the error you have to regenerate the data set (if you're in batch), then subset it based on the IC to find the failing observations.
 
(Forum software was complaining non-specifically about an HTML error, then about me spamming when I tried again, so have removed all formatting. An hour lockout seems excessive.)
4 Comments
Kurt_Bremser
Super User

Use the "little running man" button for code, and you won't get problems:

data a;
b = 3;
output;
b = .;
output;
run;
 
proc datasets nolist;
modify a;
  ic create not null(b);
  ic create check (where=(b=3));
quit;

There's no need for listing the individual occurrences; after knowing that at least one observation failed, you need to inquire anyway (usually by issuing a query with a WHERE condition), and then you'll see if it's only one, or 10 million, or all observations, and if there's a pattern.

ballardw
Super User

Along with an implementation for:

Would be handy to have a sample of the errors, similar to the data step:
 
…was rejected because _n_=2 b=.
 
Currently after receiving the error you have to regenerate the data set (if you're in batch), then subset it based on the IC to find the failing observations.

would likely come a similar default of 20 or some number of  "errors". Suppose you are  running a program with 2 million records and see 20 errors like the data step all point to the same constraint error. You should still check ALL the constraints because, just like identical invalid data errors in a data step if you only fix ONE variable value you still have other values that may need fixing that are revealed the next time you run the data step.

 

barefootguru
Fluorite | Level 6

@Kurt_Bremser @ballardw the most recent example is a job which takes 2 hours to run.  It failed on Sex in('F','M','U'), so I had to rerun it interactively, and apply the constraint in a data step to see there were some Sex='O'.  I don't need to investigate further as it's a logical value not a data error, so just need to add it into the constraint.  This is regular in our warehouse, even if you haven't encountered the scenario.

ballardw
Super User
  I don't need to investigate further as it's a logical value not a data error, so just need to add it into the constraint.  
This is regular in our warehouse, even if you haven't encountered the scenario.

I submit that if this is regular in your warehouse, i.e unexpected values appear in data, that someone failed seriously in the design, assuming by "warehouse" you mean a data warehouse.

Or perhaps something in how the data is brought into SAS is poorly designed.