BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
IvyHeng88
Calcite | Level 5
How to construct query for the following expected outpit:-

Source table:-
Cust_id.     P_cust_id
111.            .
222.           111
333.           222
444.           333
555.           444

Expected result:-
Cust_id.      Expected_cust_id
111.             555
222.             555
333.             555
444.             555
111.             444
222.             444
333.             444
111.             333
222.             333
111.             222
1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

If there is a unique p_cust_id for any cust_id then you can go up the chain in a simple way by indexing your dataset:

 

data have;
input (Cust_id   P_cust_id) ($);
cards;
111            .
222           111
333           222
444           333
555           444
;

proc sql;
create unique index cust_id on have (cust_id);
quit;

data want;
set have;
Expected_cust_id = cust_id;
do while(not missing(p_cust_id));
    output;
    cust_id = p_cust_id;
    set have key=cust_id/unique;
    if _IORC_ ne 0 then leave;
    end;
_error_ = 0;
rename p_cust_id=cust_id;
drop cust_id;
run;

proc print data=want noobs; run;
PG

View solution in original post

16 REPLIES 16
ballardw
Super User

Is the logic that each P_cust_id to be paired with the other cust_id and cust_id is less than P_cust_id? If not explain the logic.

 

 

Reeza
Super User

Do you have dates attached at all? If you do it becomes a bit easier. 

There's a macro here that will group the ID's , exp=8 means that it looks for a max of 8 depth. If you need further you can increase it, but the slower it gets. Once it groups the ID, if you have a date, you can take the latest ID and then you'll have a master table of IDs that you can then merge with SQL.  

 

%macro SubGraphs(arcs,from=from,to=to,out=Clusters,exp=8);
data _null_;
if 0 then set &arcs(keep=&from rename=(&from=node)); /* get node data type */
length clust 8;
declare hash nodes(hashexp:&exp);
nodes.defineKey('node');
nodes.defineData('node', 'clust');
nodes.defineDone();
declare hiter nodeList('nodes');

do newClust = 1 by 1 while(not endLoop);
	set &arcs end=endLoop;
	call missing(clust); node = &from;
	if 0^=nodes.find() then nodes.add(); 
	fromClust = clust; 
	call missing(clust); node = &to;
	if 0^=nodes.find() then nodes.add(); 
	toClust = clust;
	if n(fromClust, toClust) = 0 then do;
		nodes.replace(key:&from, data:&from, data:newClust);
		nodes.replace(key:&to, data:&to, data:newClust);
	end;
	else if missing(toClust) then 
		nodes.replace(key:&to, data:&to, data:fromClust);
	else if missing(fromClust) then 
		nodes.replace(key:&from, data:&from, data:toClust);
	else if fromClust ne toClust then do;
		rc = nodeList.first();
		do while (rc = 0);
			if clust = fromClust then 
				nodes.replace(key:node, data:node, data:toClust);
			rc = nodeList.next();
		end;
	end;
end;
nodes.output(dataset:"&out");
stop;
run;
%mend SubGraphs;

data have;
infile cards dlm='.' truncover;
input Cust_id     P_cust_id;
cards;
222.           111
333.           222
444.           333
555.           444
;
run;



%SubGraphs(have,from=p_cust_id,to=cust_id,out=Clusters,exp=5);

https://gist.github.com/statgeek/14e3aa2a9f718f551cd98134e9ceed30

PGStats
Opal | Level 21

@Reeza, exp=8 in the SubGraphs macro call is the power of two that determines the size of the hash table, 2^8 = 256 here. There is no limit to the search depth.

PG
Reeza
Super User
@PGStats Thanks! @IvyHeng88 This is originally PGStats code as noted in the link.
novinosrin
Tourmaline | Level 20
data have;
input (Cust_id   P_cust_id) ($);
cards;
111            .
222           111
333           222
444           333
555           444
;

proc sql;
create table temp as
select a.Cust_id,b.P_cust_id
from have a, have b;
quit;

data _null_;
if _n_=1 then do;
 dcl hash H (ordered: "d",multidata:'y') ;
   h.definekey  ("Cust_id") ;
   h.definedata ('Cust_id',"p_cust_id") ;
   h.definedone () ;
end;
_grp+1;
do _n=1 by 1 until(last.cust_id);
set temp end=last;
by cust_id;
where not missing(P_cust_id);
if _n<=_grp-1 then h.add();
end;
if last then h.output(dataset:'want');
run;
TomKari
Onyx | Level 15

Here's a similar example using PROC BOM:

 

Tom

 

data ParComp0;
infile datalines dlm='09'x;
length Parent $ 4 Component $ 4;
input Parent Component QtyPer;
cards;
LA01	B100	1
LA01	S100	1
LA01	A100	1
B100	1100	1
B100	1200	1
B100	1300	1
B100	1400	4
A100	1500	1
A100	1600	1
A100	1700	1
1100	2100	26
1500	1400	2
1700	2200	12
1700	2300	1
run;
data PMaster0;
infile datalines dlm='09'x;
length Part $ 4 Desc $ 32 Unit $ 12;
input Part Desc Unit;
cards;
1100	Finished shaft	Each
1200	6-Diameter steel plate	Each
1300	Hub	Each
1400	1/4-20 Screw	Each
1500	Steel holder	Each
1600	One-way socket	Each
1700	Wiring assembly	Each
2100	3/8 Steel tubing	Inches
2200	16-Gauge lamp cord	Feet
2300	Standard plug terminal	Each
A100	Socket assembly	Each
B100	Base assembly	Each
LA01	Lamp LA	Each
S100	Black shade	Each
run;

/* Create the indented BOM and summarized parts list */
proc bom data=ParComp0 pmdata=PMaster0 
         out=IndBOM0 summaryout=SumBOM0; 
   structure / part=Part 
               parent=Parent
               component=Component
               quantity=QtyPer
               id=(Desc Unit);
run;
novinosrin
Tourmaline | Level 20

@TomKari   Blimey, that shows learning sas can be a never ending process. Nice and Thanks

Reeza
Super User

@novinosrin PROC BOM requires SAS/OR which isn't installed in a lot of SAS installations.

novinosrin
Tourmaline | Level 20

@Reeza right. Yes, I noticed on a quick google search. Hmm ocean of procs and modules Smiley Surprised

PGStats
Opal | Level 21

If there is a unique p_cust_id for any cust_id then you can go up the chain in a simple way by indexing your dataset:

 

data have;
input (Cust_id   P_cust_id) ($);
cards;
111            .
222           111
333           222
444           333
555           444
;

proc sql;
create unique index cust_id on have (cust_id);
quit;

data want;
set have;
Expected_cust_id = cust_id;
do while(not missing(p_cust_id));
    output;
    cust_id = p_cust_id;
    set have key=cust_id/unique;
    if _IORC_ ne 0 then leave;
    end;
_error_ = 0;
rename p_cust_id=cust_id;
drop cust_id;
run;

proc print data=want noobs; run;
PG
novinosrin
Tourmaline | Level 20
data have;
input (Cust_id   P_cust_id) ($);
cards;
111            .
222           111
333           222
444           333
555           444
;
data _null_;
if _n_=1 then do;
 dcl hash H (ordered: "d",multidata:'y') ;
   h.definekey  ("Cust_id") ;
   h.definedata ('Cust_id',"p_cust_id") ;
   h.definedone () ;
end;
set have(keep=cust_id) end=last;
_grp+1;
do n=1 to nobs ;
_k=n-1;
set have(keep=p_cust_id) point=n nobs=nobs;
if _k and  _k<=_grp-1 then h.add();
end;
if last then h.output(dataset:'want');
run;
Ksharp
Super User
data have;
input (Cust_id   P_cust_id) ($);
n+1;
cards;
111            .
222           111
333           222
444           333
555           444
;
data want;
 if 0 then set have;
 declare hash h() ;
   h.definekey  ("p_cust_id") ;
   h.definedata ("Cust_id") ;
   h.definedone ();
 declare hash hh(dataset:'have',ordered:'y');
 declare hiter hi('hh');
   hh.definekey('n');
   hh.definedata("Cust_id","p_cust_id");
   hh.definedone();

do n=hh.num_items to 1 by -1;
h.clear();
rc=hi.first();
do while(rc=0);
  h.add();
  rc=hi.next();
end;

hh.remove();
rc=hi.first();
do while(rc=0);
  _Cust_id=Cust_id;
  rx=h.check(key:Cust_id);
  do while(h.find(key:Cust_id)=0);
  end;
  if rx ne 0 then call missing(_p_cust_id);
   else _p_cust_id=Cust_id;output; 
  rc=hi.next();
end;
end;
keep _: ;
run;

proc print noobs;run;
PGStats
Opal | Level 21

@Ksharp, your program generates these notes:

 

NOTE: Character values have been converted to numeric
      values at the places given by: (Line):(Column).
      39:20
NOTE: There were 5 observations read from the data set WORK.HAVE.
NOTE: DATA STEP stopped due to looping.
NOTE: The data set WORK.WANT has 10 observations and 2 variables.

Under SAS 9.4 TS Level 1M5

 

PG
Ksharp
Super User

@PGStats

That is just NOTE not WARNING.

 

I believe should try this to get rid of this kind of NOTE.

 

data have;
input Cust_id   P_cust_id;
n+1;
cards;
111            .
222           111
333           222
444           333
555           444
;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 16 replies
  • 6838 views
  • 20 likes
  • 8 in conversation