Your SAS programs, embedded in web apps and elsewhere

How to get %substr to read values, as opposed to variable name

Posts: 1

How to get %substr to read values, as opposed to variable name

/*I have 61 datasets, numbered A1 thru A94. All 61 datasets have a variable FourDig, with values such as 3112, 4561. I need to substring the first 2 digits and save them into a new variable TwoDig in each of the datasets (in this case, 31 and 45).

I tried following code. I got a new var TwoDig ok, but the variable is empty. I also get a new empty variable (I believe the first two digits of FourDig).

A SAS novice, I am looking forward to help from you great people! */

%macro Four2two;
%do i = 1 %to 94;
data A&i;
set A&i;
TwoDig = substr(FourDig,1,2);
%mend Four2Two;
Posts: 8,868

Re: How to get %substr to read values, as opposed to variable name

I'm not sure what this has to do with SAS Stored Processes or the BI Platform, but first, you should be aware that there is a HUGE difference between %substr and substr. Your program uses the SUBSTR function in a data step program.

Do you have a working SAS program? It is almost impossible to make a macro program from "scratch" without first having a working SAS program that you can turn into a macro program. Right now, it looks like you're reading in a WORK dataset and modifying that same WORK dataset. Where do you save the file out to a permanent library? I am not a big fan of the technique you show:
data a1;
set a1;
[/pre] it is possible to "accidentally" delete the original file named in the set statement under some error conditions. Or, if you do use this technique, I would be absolutely sure that you do NOT have any errors in your program BEFORE you convert that program to be a SAS macro program.

What error messages or warnings do you get in the log? I suspect that if you look in the log, you might see something like this:
NOTE: Numeric values have been converted to character values at the places given by: (Line)Smiley SadColumn).

You do not say whether the FourDig variable is chaa racter or numeric variable. Each of these variable types offers challenges in the use of SUBSTR-- the SUBSTR function is a CHARACTER function -- it returns a CHARACTER string, not a numeric value. Similarly, the %SUBSTR function returns a CHARACTER string from the substrining of a macro variable value or some text string created by a macro variable. But, you are using SUBSTR -- and not %SUBSTR -- so what comes next applies to the program code you showed.

--If FourDig is numeric, you are forcing SAS to convert FourDig to a character variable for the substr function using the BEST12. format. You can control this automatic conversion and the format used by modifying your SUBSTR function. Does your SUBSTR work in a non-macro program??

--If FourDig is character, then what is the length of FourDig??? Does your SUBSTR work in a non-macro program?

What do what the output from the program to be?? Should TwoDig be character or numeric?? Since SUBSTR is a character function, it will, by default, create a character variable. Is that what you want?

In addition to the SUBSTR function, you might want to consider the use of 2 other functions, the PUT function and the INPUT function. Let's say I have these 4 rows in a data file and that NAME is character and that FOURDIG is numeric:
Name FourDig
alan 3112
bob 1234
carl 2345
doug 3456
ed 4567

Then if I run the following program, followed by a PROC PRINT, I can compare what happens with the BEST12 format being used for conversion from numeric to character versus the 4.0 format being used.
data a1;
set a1;
length TD_bad TwoDig $2 TD_num 8;
TD_bad = substr(put(FourDig,best12.0),1,2);
TwoDig = substr(put(FourDig,4.0),1,2);
TD_num = input(TwoDig,2.0);

In the program, the PUT function is used to convert a numeric value to a character string -- for purposes of the SUBSTR function. If you wanted TwoDig to be numeric, you then have to use the INPUT function to convert the TwoDig character string into a numeric value.

In the program, the assignment statement that creates TD_bad is replicating what your assignment statement is doing if FourDig is numeric.If you use a numeric variable in a character function, SAS converts the variable value using the BEST12. format -- which results in blanks as the returned value of the SUBSTR function. On the other hand, if you control the conversion using the 4.0 format with the PUT function, then you have a 4 character string that can be substringed. But, as you can see in the LISTING output below, TwoDig is a character string (it is left-justified, like NAME), while FourDig and TD_num are numeric variables (they are right-justified).

Here's the output from the above program:
Obs name FourDig TwoDig TD_bad TD_num

1 alan 3112 31 31
2 bob 1234 12 12
3 carl 2345 23 23
4 doug 3456 34 34
5 ed 4567 45 45


If you do a PROC CONTENTS and discover that FourDig is a character string, then there is something else wrong with your SUBSTR function.

In any case, you need to have a working SAS program -- such as that shown above, that returns the results you want BEFORE you start to code a Macro program.

You might wish to contact SAS Technical Support for further help, especially since things may get complicated if you are working with files in the BI platform for which information maps have been designed. It is possible that even if you add a variable TwoDig to the physical data, if the new variable is not defined in the metadata for the dataset, or defined in an information map, it may not be available for further queries and/or you may render the file unusable.

Posts: 0

Re: How to get %substr to read values, as opposed to variable name

Showing us the code is often not enough. Seeing some data, or messages from the log is usually more informative than seeing what you think might work, that is producing some other (unspecified) but unsatisfactory result. I'm also not certain this is a stored process issue, but perhaps more data step. Still, I'll offer some thoughts.

If FOURDIG is a four byte character string, then this should work, but if it is numeric, then there will be an implicit numeric to character conversion to derive the string needed for the substring function, and the conversion is likely to use a Best8. format which will give you four leading spaces.

Kind regards

Ask a Question
Discussion stats
  • 2 replies
  • 3 in conversation