DATA Step, Macro, Functions and more

How to specify dataset name as prefix in column name in SAS datastep?

Reply
Contributor
Posts: 22

How to specify dataset name as prefix in column name in SAS datastep?

Hi, 

 

I am trying to merge 2 SAS datasets on a column, and also specifying some condition on a common column. But, when I specify the column name, it is being applied in first dataset's column but I want to apply it on second dataset's column. In PROC SQL, this is easily being accomplished using a.column_name or b.column_name. But, how can I do the same in Data step?

 

Thanks in advance!

Super User
Posts: 17,828

Re: How to specify dataset name as prefix in column name in SAS datastep?

If you merge data sets and they have variables with the same name, the one on the right overwrites the other so you only end up with one variable in the final dataset. You'll need to explicitly rename the columns ahead of time or in the merge statement to avoid errors. 

Super User
Super User
Posts: 7,401

Re: How to specify dataset name as prefix in column name in SAS datastep?

Post your code, and some test data (in the form of a datastep), its hard to tell what your trying to do from the text, is it a where clause?  You can provide options and refer to datasets using alias:

data want;
  merge dataseta (in=a) datasetb (in=b);
  by <idvars>;
  if a and ...;
run;
Super User
Posts: 5,082

Re: How to specify dataset name as prefix in column name in SAS datastep?

There are many ways to approach this, depending on what you want to do.  First, Reeza's suggestion:

 

data combined;

merge a b (rename=(common_column=common_column_from_b));

by id;

run;

 

You can also apply WHERE= as a data set option without renaming:

 

data combined;

merge a b (where=(common_column > 5));

by id;

run;

 

When you specify the column name, it will always be applied to the second data set's value (at least for the first observation per ID).  The rules are more complex after that first observation.  Since you are posting that the condition is applied based on the first data set's value, this indicates that you have a many to one merge (with many observations in the first data set, to one observation in the second data set).  So  you could also code:

 

data combined;

merge a b;

by id;

if first.id then first_value = common_column;

if    /* some condition based on first_value */;

retain first_value;

run;

 

Give us a little more context ... something will definitely be possible.

Ask a Question
Discussion stats
  • 3 replies
  • 199 views
  • 0 likes
  • 4 in conversation