Each observation consists of text with embedded variable names. During concatenation, each variable name should be replaced by its value to form the complete observation. The variable names are enclosed in square brackets, and their values should be stripped while concatenating to avoid unwanted spaces in the final observation. The number of variables is not fixed, and the text may also contain special characters. How can I make this work? "Original" is the text and "Output" is what is expected; both are shown at the end.
I tried the code below, but something is missing in the DO loop that splits the text on the delimiter ']'.
data a;
text="It was found that the customer transacted an average of [AVERAGE] transaction per month in [START_YEAR] and closed account in [END_YR]." ;
call symputx('txt',countw(text,"]"));
run ;
data b;
set a ;
array txt{&txt} $200. ;
do i = 1 to countw(text,"]");
if i =1 then txt{i}=substr(text,1,find(text,"]"));
else txt{i}=substr(text,find(text,"]")+1,find(text,"]"));
end;
run ;
Original | Output |
It was found that the customer transacted an average of [AVERAGE] transaction per month in [START_YEAR] and closed account in [END_YR]. ; | "It was found that the customer transacted an average of" || strip(AVERAGE) || "transaction per month in" || strip(START_YEAR) || "and closed account in" || strip(END_YR)."; |
Is it always the same? Three TRANWRD calls or regex replacements would be the easiest:
data b;
  set a;
  text=tranwrd(text,'[AVERAGE]','"||strip(AVERAGE)||"');
  text=tranwrd(text,'[START_YEAR]','"||strip(START_YEAR)||"');
  text=tranwrd(text,'[END_YR]','"||strip(END_YR)||"');
run;
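The regular-expression route mentioned above could look like the sketch below. It assumes PRXCHANGE is the function meant, that every bracketed token is a valid variable name made of letters, digits, and underscores, and that the result fits in the hypothetical variable expr (its length is chosen arbitrarily here).

```sas
data b;
  set a;
  length expr $1000;   /* assumed length; size it to your longest text */
  /* Rewrite every [NAME] as "||strip(NAME)||" ; -1 replaces all matches */
  expr = prxchange('s/\[(\w+)\]/"||strip($1)||"/', -1, text);
run;
```

Because the pattern only matches brackets that enclose a name, other special characters in the text are left untouched.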
Thanks for the immediate response. But the text is not always the same: it differs for every iteration, the variables differ, and we also need to add quotes around the string before and after each identified variable.
"Original" is what is present in the dataset and "Output" is what is expected. In the output, the text is enclosed in quotes, the concatenation operator (||) is placed before the variable, and after the variable the text is quoted again until the next variable is identified. This repeats until the end of the string.
Original | Output |
It was found that the customer transacted an average of [AVERAGE] transaction per month in [START_YEAR] and closed account in [END_YR]. ; | "It was found that the customer transacted an average of" || strip(AVERAGE) || "transaction per month in" || strip(START_YEAR) || "and closed account in" || strip(END_YR)."; |
My code does include quotes. If the text changes each time, will the text between square brackets always be a variable name? If so, then replace the text
[ with "||strip(
and the text
] with )||"
E.g.
data want;
  set have;
  text=tranwrd(text,'[','"||strip(');
  text=tranwrd(text,']',')||"');
run;
Note the double quotes used within the replacement strings.
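As a fuller sketch of the bracket replacement, the step below also wraps the result in an opening and closing double quote so that the literal parts at both ends are quoted. The dataset names have and want follow the example above; the variable expr and the lengths are assumptions made for illustration.

```sas
data have;
  length text $400;   /* assumed length */
  text = "It was found that the customer transacted an average of [AVERAGE] transaction per month in [START_YEAR] and closed account in [END_YR].";
run;

data want;
  set have;
  length expr $1000;  /* assumed length; the expression is longer than the text */
  /* Each [ closes the current literal and opens a STRIP call */
  expr = tranwrd(text, '[', '"||strip(');
  /* Each ] closes the STRIP call and opens the next literal */
  expr = tranwrd(expr, ']', ')||"');
  /* Add the outer quotes; CATS removes the trailing blanks of expr first */
  expr = cats('"', expr, '"');
run;
```

With this approach the spaces next to each bracket stay inside the quoted literals, so the words do not run together once the stripped values are concatenated. It assumes that square brackets appear in the text only around variable names.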
Why are you requiring the use of the || operator? And how is the "output" variable used?