i have dataset with columns from R1 to R70 along with additional columns.
Values from R1 to R70 are grouped based on certain filters
Like group A will be combination of R5_chk R5a_chk R5b_chk R6_chk R6a_chk
Like group B will be like R7_chk to R22_chk and so on.
if values in group is not null then i need to create a flag in below example chk_grp_a and
concatanate values as shown in chk_grp_a_desc.
The below code works but i was wondering is there a efficient way to do below task using arrays?. So i dont have to specify variable names for group B or C which comprise of 15 to 30 variables.
data groupa (keep= R5_chk R5a_chk R5b_chk R6_chk R6a_chk chk_grp_a chk_grp_a_desc ) ;
set wk.input_chk_rules_ed;
if
R5_chk ne '' OR R5a_chk ne '' OR R5b_chk ne '' OR R6_chk ne '' OR R6a_chk eq ''
then do;
chk_grp_a = 'Yes';
chk_grp_a_desc = CATX
(' : ',
IFC( R5_chk eq '' , ' ' , vvalue(R5_chk)),
IFC( R5b_chk eq '' , ' ' , vvalue(R5b_chk)),
IFC( R5a_chk eq '' , ' ' , vvalue(R5a_chk)),
IFC( R6_chk eq '' , ' ' , vvalue(R6_chk)),
IFC( R6a_chk eq '' , ' ' , vvalue(R6a_chk))
);
end;
run;
"R5_chk" "R5a_chk" "R5b_chk" "R6_chk" "R6a_chk" "chk_grp_a " "chk_grp_a_desc"
abc abc bgcbs jfhfg shdg Yes abc:abc:bgcbs:jfhfg:shdg:
abc bgcbs shdg Yes abc:bgcbs:shdg
abc abc bgcbs jfhfg Yes abc:abc:bgcbs:jfhfg
data groupa (keep= R5_chk R5a_chk R5b_chk R6_chk R6a_chk chk_grp_a chk_grp_a_desc ) ;
set wk.input_chk_rules_ed;
array R{*} $ R5_chk R5a_chk R5b_chk R6_chk;
if cmiss(of R{*}) < dim(R) or not missing(R6a_chk) then do;
chk_grp_a = 'Yes';
chk_grp_a_desc = CATX(' : ', of R{*}, R6a_chk);
end;
run;
It really looks like you are not taking advantage of the CATX() function.
length chk_grp_a_desc $200 ;
chk_grp_a_desc = catx(':',of R5_chk R5a_chk R5b_chk R6_ch R6a_chk);
if chk_grp_a_desc ne ' ' then chk_grp_a = 'Yes';
If you have a reason to address those variables again it may be worth while to create an array otherwise not. Here's a complete data step to expand on @Tom's solution show you can see how that may work.
data want; set have; array GrpA R5_chk R5a_chk ne R5b_chk R6_chk R6a_chk; length chk_grp_a_desc $200 ; chk_grp_a_desc = catx(':',of GrpA(*)); if chk_grp_a_desc ne ' ' then chk_grp_a = 'Yes'; run;
Possible things of interest would be "how many of the GrpA are populated"
GrpA_Count = dim(GrpA) - cmiss(of GrpA(*));
data groupa (keep= R5_chk R5a_chk R5b_chk R6_chk R6a_chk chk_grp_a chk_grp_a_desc ) ;
set wk.input_chk_rules_ed;
array R{*} $ R5_chk R5a_chk R5b_chk R6_chk;
if cmiss(of R{*}) < dim(R) or not missing(R6a_chk) then do;
chk_grp_a = 'Yes';
chk_grp_a_desc = CATX(' : ', of R{*}, R6a_chk);
end;
run;
Thank you for your response. the code below runs as expected,
i was just trying to understand few lines in code, i will appreciate if you can explan.
why do we have R6a_chk seperate and not part of array?
i know cmiss option looks for count of missing value in R5_chk R5a_chk R5b_chk R6_chk and compares with
count of dim(R) which in this case is 4 i believe since R6a_chk is not part of array. if you can please explain below line of code.
if cmiss(of R{*}) < dim(R) or not missing(R6a_chk) then do;
also can i apply vvalue format vvalue(R5_chk) to those values since it translates to some kind of description based on value of R5_chk?
Thanks
I could not treat R6a_chk like the other variables because it is not subject to the same condition as the other variables in your code
R5_chk ne '' OR R5a_chk ne '' OR R5b_chk ne '' OR R6_chk ne '' OR R6a_chk eq ''
Thanks, one last question can i use vvalue function?
since concatanated field abc:abc:bgcbs:jfhfg:shdg are the codes that i need to translate to description
for example
code description
abc this is for value a
bgcbs contact customer
jfhfg wait for response
shdg should be ready
so i need to apply that description to concataned field. right now i am using proc format as below
data _null_ ;
set wk.rulelkup end=done;
if _n_ = 1 then call execute('proc format library=work;value $abc (multilabel) ');
call execute ('"' !! trim(ID) !! '" = "' !! trim(Rule) !!'" ');
if done then call execute(';run;');
run;
after applying format it should look like
abc:abc:bgcbs:jfhfg:shdg
this is for value a : this is for value a : contact customer : wait for response : should be ready
thanks
Hi anyone can you please help with above query?
Maybe it will be better if you posted a new question
Hi PGStats,
from what I understood fromt the code, the conditiond is:
<if at least one of the variables within the R array is nonmissing or R6a_chk IS missing>
Then shouldn't the code be:
if cmiss(of R{*}) < dim(R) or missing(R6a_chk) then do;
that is, shouldn't the "not" be omitted from the condition for variable R6a_chk because you do want it to be missing in order to proceed with the "then do"?
You are right @ilikesas. My error.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.