BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
act_nico_ven
Calcite | Level 5

Hi everyone,

 

I have a problem to create a column with a formula that allow me group by the field named OBJECTIVE in the attachment example.

 

The "NumS_Com" column represents a complementary information of the "NumS" column, when "NumS_Com" is equal to "-" character means that "NumS" value is like a PRINCIPAL registry, and when "NumS_Com" it is equal to registry in the "NumS" row above represents the complementary information. I need a column with the PRINCIPAL register as in the field named OBJECTIVE on the example.

 

I would like to achieve the result shown in the column entitled OBJECTIVE,

 

Could you help me please!?

 

Thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

PROC SQL does not support recursive queries that some newer SQL implementations support.

But you can use  a macro to repeatly update the data.

First let's make a version of your data with names like CHILD, PARENT and TOPNODE to make it clearer.

data nodes;
  nodeid+1;
  set have (rename=(nums=child nums_com=parent));
  topnode=parent;
  if topnode='-' then topnode=child;
run;

TOPNODE is where we will try to re-create your OBJECTIVE column.  When the CHILD is already the top node of its sub tree (ie when PARENT='-') then use the CHILD value, otherwise use the PARENT value.

 

Then let's make a macro to recursive update the TOPNODE value until no more can be updated.  

 

We want to replace TOPNODE with the PARENT of the value that it currently has.  But not when there is no PARENT.

 

It will keep making updates until there are no more updates being made.  In that case the automatic macro variable SQLOBS will be 0 (which SAS considers to mean false).

 

%macro fixit;
proc sql undo_policy=none;
%do %until(not &sqlobs);
  update nodes a
   set topnode =(select b.parent from nodes b where a.topnode=b.child)
   where exists (select c.parent from nodes c where a.topnode=c.child and c.parent ne '-')
  ;
%end;
quit;
%mend fixit;
%fixit;

Capture.PNG

View solution in original post

12 REPLIES 12
ballardw
Super User

Please post example data preferable as a datastep ( instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... )

or as a text file. Excel files have enough potential issues that many of our IT systems will block them. Also if we have to convert Excel into a SAS data set it may not be like yours.

 

Also please show the result should be. Actuall columns and values are best.

 

Not that Proc SQL, and SQL in general does not have a concept of "before" or "after".

act_nico_ven
Calcite | Level 5
Sorry my friend @ballardw !! I modified the post already!!
ballardw
Super User

This manages to create the desired output:

data want;
   set have;
   length Objective $8.;
   Lnums = lag(nums);
   if _n_=1 then Objective=Nums;
   Else if lnums=Nums_com then do;
      Objective= cats(substr(Nums_com,1,4),'00');
   end;
   else do;
      objective=Nums;
   end;
   drop lnums;
run;

BUT I have no idea what the "Principal" stuff is about in your description.

 

If your values get longer for the start then you may need to find the position of the second - in the string to use the correct position in the substr call. Also since ALL of your example data had 00 for the last two characters I have no idea what contitions other than a missing NumS_Com might have a value other than 00 at the end.

act_nico_ven
Calcite | Level 5

Hi @ballardw, thank you for prompt response,

 

I´m very newbie with SAS code and i did not see that little mistake in the example data, sorry!

 

Not necessarily the last two digits are equal or cero value,

 

The PRINCIPAL it is the "NumS" that i really need to group by the example data in three row, of course, the idea is assign that PRINCIPAL´ number to each complementary row.

 

Attachment example corrected.

 

Tom
Super User Tom
Super User

I cannot follow your description. You posted this data as an attachement.

data have ;
   length NumS	 NumS_Com Objective $20;
   input NumS Nums_Com Objective ;
cards;
1-1-00	-	1-1-00
1-1-01	1-1-00	1-1-00
1-1-02	1-1-01	1-1-00
1-1-03	1-1-02	1-1-00
1-2-00	-	1-2-00
1-3-00	-	1-3-00
1-3-01	1-3-00	1-3-00
;

It sounds like you want to treat the value '-' as a missing value and just coalesce the value of first two columns to get the result.  But in that case how to explain the value of OBJECTIVE in the 3rd and 4th data rows?

data want ;
  set have ;
  if NumS_Com = '-' then call missing(NumS_Com);
  Priciple = coalescec(NumS_Com,NumS);
run;

Capture.PNG

 

Tom
Super User Tom
Super User

Perhaps what you really meant is that NUMS_COM='-' is an idication that the you should use the value of NUMS as the value of PRINCIPLE that will then be carried forward onto the following records until you see another row with NUMS_COM='-'?

In that case you need to be very sure that your data is ordered as you expect it to be.

 

data want ;
  set have ;
  if NumS_Com = '-' then Principle = NumS ;
  retain Principle ;
run;

Capture.PNG

act_nico_ven
Calcite | Level 5

Hi @Tom ,

 

I did not know that RETAIN statement, it is very usefull, thanks!

 

In my third post, I placed a second text file, more real to my data,

 

The idea, more or less; it is search the NumS_Com info in the NumS column, if find the NumS_Com then i need look for my PRINCIPAL NumS associated, THAT "NumS" has the "-" character.

 

IEj1.JPG

 

 

 

 

 

 

 

 

 

 

Sorry for the BAD drawing hahaha,

 

Sees the relation each the rows, the other values are complementary to "1-1-598". "1-1-598" is my PRINCIPAL value associted to "-" character. IMPORTANT: I can not sort.       

Tom
Super User Tom
Super User

So you what to find the top node in the sub tree that your parent-child relation table describes.

So you want to divide your relationship into subtrees and return the top level node of the tree.

 

I long time ago I use PROC NETDRAW from SAS/OR to generate subtrees.  I also made a macro that recursively inserted records use PROC SQL to generate the sub strees.

 

Now I think that someone with HASH skills could do it in a data step.

 

 

act_nico_ven
Calcite | Level 5

@Tom Exactly my friend, i really needed that drawing hehehe, 

 

My original data is generated in a PROC SQL. The Data Step is a option, but i really wish make a column into the PROC SQL code, it sound imposible, right? hahaha

 

Tom
Super User Tom
Super User

PROC SQL does not support recursive queries that some newer SQL implementations support.

But you can use  a macro to repeatly update the data.

First let's make a version of your data with names like CHILD, PARENT and TOPNODE to make it clearer.

data nodes;
  nodeid+1;
  set have (rename=(nums=child nums_com=parent));
  topnode=parent;
  if topnode='-' then topnode=child;
run;

TOPNODE is where we will try to re-create your OBJECTIVE column.  When the CHILD is already the top node of its sub tree (ie when PARENT='-') then use the CHILD value, otherwise use the PARENT value.

 

Then let's make a macro to recursive update the TOPNODE value until no more can be updated.  

 

We want to replace TOPNODE with the PARENT of the value that it currently has.  But not when there is no PARENT.

 

It will keep making updates until there are no more updates being made.  In that case the automatic macro variable SQLOBS will be 0 (which SAS considers to mean false).

 

%macro fixit;
proc sql undo_policy=none;
%do %until(not &sqlobs);
  update nodes a
   set topnode =(select b.parent from nodes b where a.topnode=b.child)
   where exists (select c.parent from nodes c where a.topnode=c.child and c.parent ne '-')
  ;
%end;
quit;
%mend fixit;
%fixit;

Capture.PNG

act_nico_ven
Calcite | Level 5

WHOOOOHUU, Thank you so much @Tom , works perfectly,

 

But, How works that Loop?? hahaha please!!, I don't understand the conditions in the WHERE statement,

Tom
Super User Tom
Super User

Updated with some more comments.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 3631 views
  • 0 likes
  • 3 in conversation