BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JayVD305
Calcite | Level 5

Hello All,

 

I have a dataset of paired companies. The pairs appear in a pattern of the same groups, but I do not have a unique identifier for each group.

 

How can I create a unique identifier for each group of pairs in SAS? Companies do repeat themselves in the sample - for instance they could have multiple pairs and then appear again as the pair for another company. The groups are consistent throughout the sample year. Hopefully this will make more sense after reviewing the sample data below. Please let me know if more information is needed!

 

Here's what I have and the Group identifier that I'd like to create:

CompanyPaired CompanyGroup
AC1
AE1
AF1
AG1
BH2
BI2
BJ2
CA1
CE1
CF1
CG1
DK3
DL3
EA1
EC1
EF1
EG1
1 ACCEPTED SOLUTION

Accepted Solutions
Jagadishkatam
Amethyst | Level 16

Please try the below code and check the want2 dataset where you will see the expected output,

 

data have;
input Company$ Paired_Company$;
cards;	
A C  
A E  
A F  
A G  
B H  
B I  
B J  
C A  
C E  
C F  
C G  
D K  
D L  
E A  
E C  
E F  
E G
;


proc sort data=have;
by Paired_Company Company;
run;

proc sql;
create table want as select case when count(Paired_Company)>1 then 0 else count(Paired_Company) end as count, case when count(Paired_Company)>1 then 'Sample' else Company end as comp, 
* from have group by Paired_Company;
quit;

proc sort data=want;
by count comp;
run;

data want2;
set want;
by count comp;
retain group;
if first.comp then group+1;
drop count comp;
run;

proc sort data=want2;
by Company Paired_Company ;
run;
Thanks,
Jag

View solution in original post

4 REPLIES 4
Jagadishkatam
Amethyst | Level 16

Please try the below code and check the want2 dataset where you will see the expected output,

 

data have;
input Company$ Paired_Company$;
cards;	
A C  
A E  
A F  
A G  
B H  
B I  
B J  
C A  
C E  
C F  
C G  
D K  
D L  
E A  
E C  
E F  
E G
;


proc sort data=have;
by Paired_Company Company;
run;

proc sql;
create table want as select case when count(Paired_Company)>1 then 0 else count(Paired_Company) end as count, case when count(Paired_Company)>1 then 'Sample' else Company end as comp, 
* from have group by Paired_Company;
quit;

proc sort data=want;
by count comp;
run;

data want2;
set want;
by count comp;
retain group;
if first.comp then group+1;
drop count comp;
run;

proc sort data=want2;
by Company Paired_Company ;
run;
Thanks,
Jag
JayVD305
Calcite | Level 5

Hi Jag,

 

Thank you for replying so quickly...

When I run this code, I get a unique identifier for one of the pairs in each group, but the others are all assigned a value of 1. Any idea what I could be doing wrong?

 

In bold below, I have highlighted the issue. The B grouping should all have a Group value of 2 and the D grouping a value of 3. Essentially I'm creating an industry identifier for each company based on its pairs. All companies in a particular industry should have the same Group value.

 

CompanyPaired_CompanyGroupExpected Value
AC1 
AG1 
AF1 
AE1 
BJ2 
BI12
BH12
CG1 
CF1 
CE1 
CA1 
DL3 
DK13
EC1 
EA1 
EG1 
EF1 
PGStats
Opal | Level 21

proc optnet will find the connected companies:

 

data test;
input Company$ PairedCompany$;
datalines;
A   C 
A   E 
A   F 
A   G 
B   H 
B   I 
B   J 
C   A 
C   E 
C   F 
C   G 
D   K 
D   L 
E   A 
E   C 
E   F 
E   G 
;

proc optnet data_links=test out_nodes=groups GRAPH_DIRECTION=UNDIRECTED;
data_links_var from=company to=pairedcompany;
concomp;
run;

proc sql;
create table want as
select a.*, b.concomp as group
from 
	test as a left join
	groups as b on a.company=b.node;
quit;

Data want:

 

Company 	PairedCompany 	group
A 	C 	1
A 	G 	1
A 	F 	1
A 	E 	1
B 	J 	2
B 	I 	2
B 	H 	2
C 	G 	1
C 	F 	1
C 	E 	1
C 	A 	1
D 	L 	3
D 	K 	3
E 	C 	1
E 	A 	1
E 	G 	1
E 	F 	1
PG
Ksharp
Super User

If you do not have SAS/OR .

 

data test;
input Company$ PairedCompany$;
datalines;
A   C 
A   E 
A   F 
A   G 
B   H 
B   I 
B   J 
C   A 
C   E 
C   F 
C   G 
D   K 
D   L 
E   A 
E   C 
E   F 
E   G 
;


data have;
set test;
rename  Company=from PairedCompany=to;
run;
data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node;
  output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;

data final_want;
 if _n_=1 then do;
  if 0 then set want;
  declare hash h(dataset:'want');
  h.definekey('node');
  h.definedata('household');
  h.definedone();
 end;
set test;
h.find(key:company);
drop node;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 824 views
  • 2 likes
  • 4 in conversation