I have various values that I want to assign formats to. However, some of these values are substrings. For example, say I have the following:
if AccountID = '2BCD' then Client = 'A';
else if AccountID = '2ACD' then Client = 'B';
else if substr(AccountID,1,3) = '2CD' then Client = 'C';
else if...
So, instead of having a ton of if else statements I want to assign a format called $Client. to the variable Client. So, I would do the following:
proc format;
value $Client '2BCD' = 'A'
'2ACD' = 'B'
this is where I get stuck because I cannot simply say '2CD' = 'C' because it is only a substring of the AccountID.
I would use PROC FCMP to define the desired format $Client (requires SAS 9.3 or higher).
proc fcmp outlib=work.funcs.test;
function accfmt(AccountID $) $;
length Client $20; /* Please adapt the length to your needs! */
if AccountID = '2BCD' then Client = 'A';
else if AccountID = '2ACD' then Client = 'B';
else if substr(AccountID,1,3) = '2CD' then Client = 'C';
/* else if ... */
return(Client);
endsub;
quit;
options cmplib=work.funcs;
proc format;
value $Client (default=20) /* replace 20 by the length of variable CLIENT above */
other=[accfmt()];
run;
/* Test */
data _null_;
input AccountID $;
put AccountID $Client.;
cards;
2BCD
2ACD
2CDX
2CDY
;
Here's a partial workaround. Define two formats, one based on the entire ID, and a second based on 3 characters. Be sure to include the OTHER= category. For example:
proc format;
value $Client '2BCD' = 'A'
'2ACD' = 'B'
other=' ';
value $sub '2CD' = 'C'
other=' ';
run;
Then you can use a DATA step that is only slightly longer:
data want;
set have;
client = put(account_ID, $client.);
if client=' ' then client=put( substr(account_id, 1, 3), $sub.);
run;
This might be case for a multilabel format. If you use the format with anything except proc tabulate or report while specifying use of the multilabel then the result could well be what you want. Used with a Put assignment such as Client = put(accountid,$clients.) you will get the top label only.
proc format library=work;
value $clients (multilabel)
'2BCD' = 'A'
'2ACD' = 'B'
'2CDA','2CDB','2CDC'= 'C'
'2CDA'= "C: Branch A"
'2CDB'= "C: Branch B"
'2CDC'= "C: Branch C"
;
run;
Data have;
input branch $ amount;
datalines;
2BCD 501
2ACD 300
2CDA 400
2CDB 600
2CDC 800
;
RUN;
proc freq data=have;
tables branch;
format branch $clients.;
run;
proc tabulate data=have;
class branch / mlf;
var amount;
format branch $clients.;
table branch, amount*sum;
run;
proc print data=have;
format branch $clients.;
run;
I would use PROC FCMP to define the desired format $Client (requires SAS 9.3 or higher).
proc fcmp outlib=work.funcs.test;
function accfmt(AccountID $) $;
length Client $20; /* Please adapt the length to your needs! */
if AccountID = '2BCD' then Client = 'A';
else if AccountID = '2ACD' then Client = 'B';
else if substr(AccountID,1,3) = '2CD' then Client = 'C';
/* else if ... */
return(Client);
endsub;
quit;
options cmplib=work.funcs;
proc format;
value $Client (default=20) /* replace 20 by the length of variable CLIENT above */
other=[accfmt()];
run;
/* Test */
data _null_;
input AccountID $;
put AccountID $Client.;
cards;
2BCD
2ACD
2CDX
2CDY
;
An alternative to formats, which can be helpful in certain circumstances is to use datasets. The example below shows the ability to generate any form of code to apply a codelist to a value. (Note the code here is only for the substr() example given, if its more complicated then you might need to add more code):
data have;
input account_id $10.;
datalines;
2BCD
2ACD
AA2CDBB
2CDERT
ABCD
;
run;
data codelist;
length code long_value $10 logic $50;
code="2BCD"; long_value="A"; logic="account_id"; output;
code="2ACD"; long_value="B"; logic="account_id"; output;
code="2CD"; long_value="C"; logic="substr(account_id,1,3)"; output;
run;
/* This is where the code is generated */
data _null_;
set codelist end=last;
if _n_=1 then call execute('data have; set have; length long_value $10;');
call execute(' if '||strip(logic)||'="'||strip(code)||'" then long_value="'||strip(long_value)||'";');
if last then call execute('run;');
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.