BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Walternate
Obsidian | Level 7

Hi,

 

I have a data quality check program, which parses records in a data file for various data quality issues. If an issue is found, the program:

 

1. Assigns a sequential issue number

2. Outputs the record

 

I have a few checks in which a large subset of variables (say 20 variables) is checked for a given issue. The issue could occur in zero, one, two, or up to twenty (all) of the variables and there's no way to distinguish. So I want to add a column which will say exactly which variable(s) is having an issue. 

 

Here is an example of input data, in which we are looking for the issue of does not equal 1 in v1 through v10:

 

data test;
input v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 ;
cards;
1 1 1 1 1 1 1 1 1 1 
0 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 0 
1 1 1 1 0 1 1 1 1 1 
0 0 1 1 1 1 1 1 1 1 
1 1 1 0 1 1 1 0 1 1 
1 0 1 1 1 0 0 1 1 1 
0 0 0 0 1 1 1 1 1 1 
1 1 1 1 1 0 0 0 0 0 
1 0 1 0 1 0 1 0 1 0 
0 1 0 1 0 1 0 1 0 1 
1 1 0 1 0 0 0 1 0 0 
0 0 1 0 0 1 0 1 0 0 
1 0 0 0 0 0 0 0 0 1 
0 0 0 0 0 1 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 
;
run;

 

The output data would have one row per record that has an issue (one of the vars in v1-v10 not equal to 1), including an issue number and a column which summarizes which variables specifically were flagged as having issues:

 

issue_number               vars_with_issues          

1                                             v1

1                                             v10

1                                             v5

1                                         v1, v2

1                                        v4, v8

 

Etc. So:

 

1. There should be one row of output per row WITH ISSUES in the input file

2. Each row in the output file should have an issue number assigned (the same one, since we are only talking about one issue)

3. Each row in the output file should have a column which summarizes which specific variables of v1-v10 were actually flagged as having an issue. 

 

I'm good on 1 and 2, but have not been able to figure out how to set up #3. 

 

Any help is much appreciated!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @Walternate,

 

Try this:

data want(where=(vars_with_issues) drop=i v1-v10);
set test;
array a[*] v1-v10;
retain issue_number 1; /* issue "value ≠ 1" */
length vars_with_issues $100;
do i=1 to dim(a);
  if a[i] ne 1 then vars_with_issues=catx(', ',vars_with_issues,vname(a[i]));
end;
run;

Make sure that the length of vars_with_issues (100 in the example) is large enough for your real data. Different types of variable lists (or an explicit list of variable names, possibly created dynamically) could be used in the array definition.

View solution in original post

2 REPLIES 2
FreelanceReinh
Jade | Level 19

Hi @Walternate,

 

Try this:

data want(where=(vars_with_issues) drop=i v1-v10);
set test;
array a[*] v1-v10;
retain issue_number 1; /* issue "value ≠ 1" */
length vars_with_issues $100;
do i=1 to dim(a);
  if a[i] ne 1 then vars_with_issues=catx(', ',vars_with_issues,vname(a[i]));
end;
run;

Make sure that the length of vars_with_issues (100 in the example) is large enough for your real data. Different types of variable lists (or an explicit list of variable names, possibly created dynamically) could be used in the array definition.

Walternate
Obsidian | Level 7
Thank you, that's exactly what I was looking for 🙂

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1009 views
  • 0 likes
  • 2 in conversation