Using column of a table in data step

Reply
Contributor
Posts: 40

Using column of a table in data step

Hi All,

 

I am trying to run the following code:

 

%macro run_me;
data work.x;
set work.y;
if y.column_name ~=('a_15','a_16') and y.MATCH=1 // column_name is a column in y dataset
then
%let start = trim(left(substr(y.column_name, index(y.column_name, '_')+ 1)));
if a_16 = . then a_16 = a_&start;
if a_&start = . then a_&start = a_16;
Dif = a_16-a_&start;
%do i = &start+1 %to 15;
a_&i = a_&start + (((Dif)/(16-&start))*(&i-&start));
%end;
drop y.column_name;
run;
%mend;
%run_me;

 

But it is giving me error as y is not an object. How can I use/refer to a column of a table in a data step? I want to iterate over the column.

Thanks!

Super User
Posts: 7,465

Re: Using column of a table in data step

What you are probably looking for is array processing.

But you will have to state your intentions first.

 

Your code has no chance at all at working. Mixing macro and data step language like that does not work. Macro is a preprocessor to create code dynamically. But once that code has been created, the macro processor is finished with its work, and only the created code will do something in the resulting step.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 40

Re: Using column of a table in data step

Hi Kurt,

 

I am trying to iterate over the y.clumn_name column. Except for the initial 'if' condition, the code works for other part of my program.  

Super User
Posts: 7,465

Re: Using column of a table in data step

Is clumn_name a column in dataset y?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Regular Contributor
Posts: 169

Re: Using column of a table in data step

The first step is to remove "y." - you reference the set in the statement "set work.y", so SAS knows any column should come from that set.

 

Can you provide a sample dataset? I tried running an example (and indented the code for readability), but I'm still running into problems. Another issue is that if the column_name variable is NOT a_15 or 1_16, the variable start won't get initialized, which will potentially cause issues further on.

 

data work.y;
    input column_name $ MATCH;
    datalines;
a_15 1
a_15 2
a_16 1
;

%macro run_me;
    data work.x;
        set work.y;
        if column_name ~= ('a_15', 'a_16') and MATCH=1
        then
            call symputx('start', trim(left(substr(column_name, index(column_name, '_')+ 1))));
        if a_16 = . then a_16 = a_&start;
        if a_&start = . then a_&start = a_16;
        Dif = a_16-a_&start;
        %do i = &start+1 %to 15;
            a_&i = a_&start + (((Dif)/(16-&start))*(&i-&start));
        %end;
        drop column_name;
    run;
%mend;
%run_me;

Super User
Posts: 7,465

Re: Using column of a table in data step


paulkaefer wrote:

The first step is to remove "y." - you reference the set in the statement "set work.y", so SAS knows any column should come from that set.

 

Can you provide a sample dataset? I tried running an example (and indented the code for readability), but I'm still running into problems. Another issue is that if the column_name variable is NOT a_15 or 1_16, the variable start won't get initialized, which will potentially cause issues further on.

 

data work.y;
    input column_name $ MATCH;
    datalines;
a_15 1
a_15 2
a_16 1
;

%macro run_me;
    data work.x;
        set work.y;
        if column_name ~= ('a_15', 'a_16') and MATCH=1
        then
            call symputx('start', trim(left(substr(column_name, index(column_name, '_')+ 1))));
        if a_16 = . then a_16 = a_&start;
        if a_&start = . then a_&start = a_16;
        Dif = a_16-a_&start;
        %do i = &start+1 %to 15;
            a_&i = a_&start + (((Dif)/(16-&start))*(&i-&start));
        %end;
        drop column_name;
    run;
%mend;
%run_me;


You are using call symput to create macro variable start, but all your references to &start will be evaluated by the macro processor before the data step is compiled and executed.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Regular Contributor
Posts: 169

Re: Using column of a table in data step

How would that be different using the original %let?

Super User
Posts: 7,465

Re: Using column of a table in data step

%let is done by the macro processor BEFORE the data step is compiled.

call execute is done by the data step during its execution.

 

That's why I stated that the code can never work at all.

 

In the initial code, the macro variable start would receive the whole formula up to the semicolon as a string, and that string would then be inserted everywhere &start was referenced, causing the data step compiler to just create a series of "invalid statement" ERROR messages.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Regular Contributor
Posts: 169

Re: Using column of a table in data step

@KurtBremser: I see. That makes sense.

 

@akhilesh_joshi: It would be most helpful to see a sample dataset and a brief explanation in words (not code) what your goal is. It is possible that a different method or dataset layout is the best solution.

Contributor
Posts: 40

Re: Using column of a table in data step

Hi Paul,

 

If I remove y it gives me a different errors which are :

 

ERROR: Undeclared array referenced: a_trim.
ERROR: Variable a_trim has not been declared as an array.

ERROR: Required operator not found in expression: &start+1
ERROR: The %FROM value of the %DO I loop is invalid.

ERROR 388-185: Expecting an arithmetic operator.

ERROR 68-185: The function DCP_TRIM is unknown, or cannot be accessed.

 

Please find below a sample data set. Have removed some values for privacy issues.

 

IDRENTAL_DATEcolumn_nameT1T2T3T4T5T6T7T8T9T10MATCH
x20161104a_1           

 

Thanks!

Contributor
Posts: 40

Re: Using column of a table in data step

HI Kurt and Paul,

 

There will be a scenario where for a certain a_(1 to 13) to a_16, teh T1 column will have values for a_(1 to 13) and a_16. USing the formula mentioned earlier, I am trying to interpolate the missing values for a_(1 to 13) till a_16. 

Super User
Posts: 7,465

Re: Using column of a table in data step


akhilesh_joshi wrote:

HI Kurt and Paul,

 

There will be a scenario where for a certain a_(1 to 13) to a_16, teh T1 column will have values for a_(1 to 13) and a_16. USing the formula mentioned earlier, I am trying to interpolate the missing values for a_(1 to 13) till a_16. 


So you have variable a_1 to a_16 in your dataset, and need to iterate through them.

For this, SAS provides array functions:

array a_vars{*} a1-a16;
do i = 1 to 16;
  if a_vars{i} ......
end;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 7,465

Re: Using column of a table in data step

Just to get you some feel for what is wrong here, let us follow the actions of the macro preprocessor:

%macro run_me;
data work.x;
set work.y;
if y.column_name ~=('a_15','a_16') and y.MATCH=1 // column_name is a column in y dataset
then
%let start = trim(left(substr(y.column_name, index(y.column_name, '_')+ 1)));

At this point, the macro processor creates the variable start and assigns it the string(!)

trim(left(substr(y.column_name, index(y.column_name, '_')+ 1)))

No program text is created by the macro processor.

if a_16 = . then a_16 = a_&start;

The macro processor realizes that &start has been referenced, and resolves it, creating the following statement:

if a_16 = . then a_16 = a_trim(left(substr(y.column_name, index(y.column_name, '_')+ 1)));

next, in

if a_&start = . then a_&start = a_16;

two references to &start are found, and the following statement is the result:

if a_trim(left(substr(y.column_name, index(y.column_name, '_')+ 1))) = . then a_trim(left(substr(y.column_name, index(y.column_name, '_')+ 1))) = a_16;

and so on, and so on.

Since a_trim is neither a defined function nor a defined array, you'll get error messages when the data step compiler tries to compile that text.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 40

Re: Using column of a table in data step

Thanks Kurt. That was really good stuff. But how do I solve my problem then?

Super User
Posts: 7,465

Re: Using column of a table in data step


akhilesh_joshi wrote:

Thanks Kurt. That was really good stuff. But how do I solve my problem then?


First of all, provide us some test data so we get a feel for the structure of the dataset (variables and variable types),

You can use a data step with datalines; to create such a dataset.

 

Then you provide a view of the data step that you want to have once you're finished, probably with some remarks how the needed values were created depending on the input data.

 

("because x1 is 12, I took the value from a_12 and multiplied it with a_16")

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Ask a Question
Discussion stats
  • 19 replies
  • 498 views
  • 4 likes
  • 5 in conversation