i have dataset with columns from R1 to R70 along with additional columns.
Values from R1 to R70 are grouped based on certain filters
Like group A will be combination of R5_chk R5a_chk R5b_chk R6_chk R6a_chk
Like group B will be like R7_chk to R22_chk and so on.
if values in group is not null then i need to create a flag in below example chk_grp_a and
concatanate values as shown in chk_grp_a_desc.
The below code works but i was wondering is there a efficient way to do below task using arrays?. So i dont have to specify variable names for group B or C which comprise of 15 to 30 variables.
data groupa (keep= R5_chk R5a_chk R5b_chk R6_chk R6a_chk chk_grp_a chk_grp_a_desc ) ;
set wk.input_chk_rules_ed;
if
R5_chk ne '' OR R5a_chk ne '' OR R5b_chk ne '' OR R6_chk ne '' OR R6a_chk eq ''
then do;
chk_grp_a = 'Yes';
chk_grp_a_desc = CATX
(' : ',
IFC( R5_chk eq '' , ' ' , vvalue(R5_chk)),
IFC( R5b_chk eq '' , ' ' , vvalue(R5b_chk)),
IFC( R5a_chk eq '' , ' ' , vvalue(R5a_chk)),
IFC( R6_chk eq '' , ' ' , vvalue(R6_chk)),
IFC( R6a_chk eq '' , ' ' , vvalue(R6a_chk))
);
end;
run;
"R5_chk" "R5a_chk" "R5b_chk" "R6_chk" "R6a_chk" "chk_grp_a " "chk_grp_a_desc"
abc abc bgcbs jfhfg shdg Yes abc:abc:bgcbs:jfhfg:shdg:
abc bgcbs shdg Yes abc:bgcbs:shdg
abc abc bgcbs jfhfg Yes abc:abc:bgcbs:jfhfg
data groupa (keep= R5_chk R5a_chk R5b_chk R6_chk R6a_chk chk_grp_a chk_grp_a_desc ) ;
set wk.input_chk_rules_ed;
array R{*} $ R5_chk R5a_chk R5b_chk R6_chk;
if cmiss(of R{*}) < dim(R) or not missing(R6a_chk) then do;
chk_grp_a = 'Yes';
chk_grp_a_desc = CATX(' : ', of R{*}, R6a_chk);
end;
run;
It really looks like you are not taking advantage of the CATX() function.
length chk_grp_a_desc $200 ;
chk_grp_a_desc = catx(':',of R5_chk R5a_chk R5b_chk R6_ch R6a_chk);
if chk_grp_a_desc ne ' ' then chk_grp_a = 'Yes';
If you have a reason to address those variables again it may be worth while to create an array otherwise not. Here's a complete data step to expand on @Tom's solution show you can see how that may work.
data want; set have; array GrpA R5_chk R5a_chk ne R5b_chk R6_chk R6a_chk; length chk_grp_a_desc $200 ; chk_grp_a_desc = catx(':',of GrpA(*)); if chk_grp_a_desc ne ' ' then chk_grp_a = 'Yes'; run;
Possible things of interest would be "how many of the GrpA are populated"
GrpA_Count = dim(GrpA) - cmiss(of GrpA(*));
data groupa (keep= R5_chk R5a_chk R5b_chk R6_chk R6a_chk chk_grp_a chk_grp_a_desc ) ;
set wk.input_chk_rules_ed;
array R{*} $ R5_chk R5a_chk R5b_chk R6_chk;
if cmiss(of R{*}) < dim(R) or not missing(R6a_chk) then do;
chk_grp_a = 'Yes';
chk_grp_a_desc = CATX(' : ', of R{*}, R6a_chk);
end;
run;
Thank you for your response. the code below runs as expected,
i was just trying to understand few lines in code, i will appreciate if you can explan.
why do we have R6a_chk seperate and not part of array?
i know cmiss option looks for count of missing value in R5_chk R5a_chk R5b_chk R6_chk and compares with
count of dim(R) which in this case is 4 i believe since R6a_chk is not part of array. if you can please explain below line of code.
if cmiss(of R{*}) < dim(R) or not missing(R6a_chk) then do;
also can i apply vvalue format vvalue(R5_chk) to those values since it translates to some kind of description based on value of R5_chk?
Thanks
I could not treat R6a_chk like the other variables because it is not subject to the same condition as the other variables in your code
R5_chk ne '' OR R5a_chk ne '' OR R5b_chk ne '' OR R6_chk ne '' OR R6a_chk eq ''
Thanks, one last question can i use vvalue function?
since concatanated field abc:abc:bgcbs:jfhfg:shdg are the codes that i need to translate to description
for example
code description
abc this is for value a
bgcbs contact customer
jfhfg wait for response
shdg should be ready
so i need to apply that description to concataned field. right now i am using proc format as below
data _null_ ;
set wk.rulelkup end=done;
if _n_ = 1 then call execute('proc format library=work;value $abc (multilabel) ');
call execute ('"' !! trim(ID) !! '" = "' !! trim(Rule) !!'" ');
if done then call execute(';run;');
run;
after applying format it should look like
abc:abc:bgcbs:jfhfg:shdg
this is for value a : this is for value a : contact customer : wait for response : should be ready
thanks
Hi anyone can you please help with above query?
Maybe it will be better if you posted a new question
Hi PGStats,
from what I understood fromt the code, the conditiond is:
<if at least one of the variables within the R array is nonmissing or R6a_chk IS missing>
Then shouldn't the code be:
if cmiss(of R{*}) < dim(R) or missing(R6a_chk) then do;
that is, shouldn't the "not" be omitted from the condition for variable R6a_chk because you do want it to be missing in order to proceed with the "then do"?
You are right @ilikesas. My error.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.