Create a new column in DataSet with a Macro - PROC SQL

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 14
Accepted Solution

Create a new column in DataSet with a Macro - PROC SQL

[ Edited ]

Hi everyone,

 

I have a problem to create a column with a formula that allow me group by the field named OBJECTIVE in the attachment example.

 

The "NumS_Com" column represents a complementary information of the "NumS" column, when "NumS_Com" is equal to "-" character means that "NumS" value is like a PRINCIPAL registry, and when "NumS_Com" it is equal to registry in the "NumS" row above represents the complementary information. I need a column with the PRINCIPAL register as in the field named OBJECTIVE on the example.

 

I would like to achieve the result shown in the column entitled OBJECTIVE,

 

Could you help me please!?

 

Thanks in advance!


Accepted Solutions
Solution
‎07-26-2016 05:24 PM
Super User
Super User
Posts: 6,500

Re: Create a new column in DataSet with a Macro - PROC SQL

[ Edited ]

PROC SQL does not support recursive queries that some newer SQL implementations support.

But you can use  a macro to repeatly update the data.

First let's make a version of your data with names like CHILD, PARENT and TOPNODE to make it clearer.

data nodes;
  nodeid+1;
  set have (rename=(nums=child nums_com=parent));
  topnode=parent;
  if topnode='-' then topnode=child;
run;

TOPNODE is where we will try to re-create your OBJECTIVE column.  When the CHILD is already the top node of its sub tree (ie when PARENT='-') then use the CHILD value, otherwise use the PARENT value.

 

Then let's make a macro to recursive update the TOPNODE value until no more can be updated.  

 

We want to replace TOPNODE with the PARENT of the value that it currently has.  But not when there is no PARENT.

 

It will keep making updates until there are no more updates being made.  In that case the automatic macro variable SQLOBS will be 0 (which SAS considers to mean false).

 

%macro fixit;
proc sql undo_policy=none;
%do %until(not &sqlobs);
  update nodes a
   set topnode =(select b.parent from nodes b where a.topnode=b.child)
   where exists (select c.parent from nodes c where a.topnode=c.child and c.parent ne '-')
  ;
%end;
quit;
%mend fixit;
%fixit;

Capture.PNG

View solution in original post


All Replies
Super User
Posts: 10,500

Re: Create a new column in DataSet with a Macro - PROC SQL

Please post example data preferable as a datastep ( instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... )

or as a text file. Excel files have enough potential issues that many of our IT systems will block them. Also if we have to convert Excel into a SAS data set it may not be like yours.

 

Also please show the result should be. Actuall columns and values are best.

 

Not that Proc SQL, and SQL in general does not have a concept of "before" or "after".

Occasional Contributor
Posts: 14

Re: Create a new column in DataSet with a Macro - PROC SQL

Sorry my friend @ballardw !! I modified the post already!!
Super User
Posts: 10,500

Re: Create a new column in DataSet with a Macro - PROC SQL

This manages to create the desired output:

data want;
   set have;
   length Objective $8.;
   Lnums = lag(nums);
   if _n_=1 then Objective=Nums;
   Else if lnums=Nums_com then do;
      Objective= cats(substr(Nums_com,1,4),'00');
   end;
   else do;
      objective=Nums;
   end;
   drop lnums;
run;

BUT I have no idea what the "Principal" stuff is about in your description.

 

If your values get longer for the start then you may need to find the position of the second - in the string to use the correct position in the substr call. Also since ALL of your example data had 00 for the last two characters I have no idea what contitions other than a missing NumS_Com might have a value other than 00 at the end.

Occasional Contributor
Posts: 14

Re: Create a new column in DataSet with a Macro - PROC SQL

Hi @ballardw, thank you for prompt response,

 

I´m very newbie with SAS code and i did not see that little mistake in the example data, sorry!

 

Not necessarily the last two digits are equal or cero value,

 

The PRINCIPAL it is the "NumS" that i really need to group by the example data in three row, of course, the idea is assign that PRINCIPAL´ number to each complementary row.

 

Attachment example corrected.

 

Super User
Super User
Posts: 6,500

Re: Create a new column in DataSet with a Macro - PROC SQL

I cannot follow your description. You posted this data as an attachement.

data have ;
   length NumS	 NumS_Com Objective $20;
   input NumS Nums_Com Objective ;
cards;
1-1-00	-	1-1-00
1-1-01	1-1-00	1-1-00
1-1-02	1-1-01	1-1-00
1-1-03	1-1-02	1-1-00
1-2-00	-	1-2-00
1-3-00	-	1-3-00
1-3-01	1-3-00	1-3-00
;

It sounds like you want to treat the value '-' as a missing value and just coalesce the value of first two columns to get the result.  But in that case how to explain the value of OBJECTIVE in the 3rd and 4th data rows?

data want ;
  set have ;
  if NumS_Com = '-' then call missing(NumS_Com);
  Priciple = coalescec(NumS_Com,NumS);
run;

Capture.PNG

 

Super User
Super User
Posts: 6,500

Re: Create a new column in DataSet with a Macro - PROC SQL

[ Edited ]

Perhaps what you really meant is that NUMS_COM='-' is an idication that the you should use the value of NUMS as the value of PRINCIPLE that will then be carried forward onto the following records until you see another row with NUMS_COM='-'?

In that case you need to be very sure that your data is ordered as you expect it to be.

 

data want ;
  set have ;
  if NumS_Com = '-' then Principle = NumS ;
  retain Principle ;
run;

Capture.PNG

Occasional Contributor
Posts: 14

Re: Create a new column in DataSet with a Macro - PROC SQL

Hi @Tom ,

 

I did not know that RETAIN statement, it is very usefull, thanks!

 

In my third post, I placed a second text file, more real to my data,

 

The idea, more or less; it is search the NumS_Com info in the NumS column, if find the NumS_Com then i need look for my PRINCIPAL NumS associated, THAT "NumS" has the "-" character.

 

IEj1.JPG

 

 

 

 

 

 

 

 

 

 

Sorry for the BAD drawing hahaha,

 

Sees the relation each the rows, the other values are complementary to "1-1-598". "1-1-598" is my PRINCIPAL value associted to "-" character. IMPORTANT: I can not sort.       

Super User
Super User
Posts: 6,500

Re: Create a new column in DataSet with a Macro - PROC SQL

So you what to find the top node in the sub tree that your parent-child relation table describes.

So you want to divide your relationship into subtrees and return the top level node of the tree.

 

I long time ago I use PROC NETDRAW from SAS/OR to generate subtrees.  I also made a macro that recursively inserted records use PROC SQL to generate the sub strees.

 

Now I think that someone with HASH skills could do it in a data step.

 

 

Occasional Contributor
Posts: 14

Re: Create a new column in DataSet with a Macro - PROC SQL

@Tom Exactly my friend, i really needed that drawing hehehe, 

 

My original data is generated in a PROC SQL. The Data Step is a option, but i really wish make a column into the PROC SQL code, it sound imposible, right? hahaha

 

Solution
‎07-26-2016 05:24 PM
Super User
Super User
Posts: 6,500

Re: Create a new column in DataSet with a Macro - PROC SQL

[ Edited ]

PROC SQL does not support recursive queries that some newer SQL implementations support.

But you can use  a macro to repeatly update the data.

First let's make a version of your data with names like CHILD, PARENT and TOPNODE to make it clearer.

data nodes;
  nodeid+1;
  set have (rename=(nums=child nums_com=parent));
  topnode=parent;
  if topnode='-' then topnode=child;
run;

TOPNODE is where we will try to re-create your OBJECTIVE column.  When the CHILD is already the top node of its sub tree (ie when PARENT='-') then use the CHILD value, otherwise use the PARENT value.

 

Then let's make a macro to recursive update the TOPNODE value until no more can be updated.  

 

We want to replace TOPNODE with the PARENT of the value that it currently has.  But not when there is no PARENT.

 

It will keep making updates until there are no more updates being made.  In that case the automatic macro variable SQLOBS will be 0 (which SAS considers to mean false).

 

%macro fixit;
proc sql undo_policy=none;
%do %until(not &sqlobs);
  update nodes a
   set topnode =(select b.parent from nodes b where a.topnode=b.child)
   where exists (select c.parent from nodes c where a.topnode=c.child and c.parent ne '-')
  ;
%end;
quit;
%mend fixit;
%fixit;

Capture.PNG

Occasional Contributor
Posts: 14

Re: Create a new column in DataSet with a Macro - PROC SQL

WHOOOOHUU, Thank you so much @Tom , works perfectly,

 

But, How works that Loop?? hahaha please!!, I don't understand the conditions in the WHERE statement,

Super User
Super User
Posts: 6,500

Re: Create a new column in DataSet with a Macro - PROC SQL

Updated with some more comments.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 1148 views
  • 0 likes
  • 3 in conversation