Hi, I have two datasets
data deal;
compressed_a = 'WYETH'; compressed_b = 'FORTDODGEANIMALHEALTH'; year = 1945;output;
compressed_a = 'ELILILLYANDCOMPANY'; compressed_b = 'HYBRITECHINCORPORATED'; year = 1985;output;
compressed_a = 'GENENTECH'; compressed_b = 'ABBOTTLABORATORIES'; year = 1990;output;
run;data patent;
firmname = 'GENENTECH'; pnyear = 1985; IPC = 'B05C'; output;
firmname = 'ELILILLYANDCOMPANY'; pnyear = 1983; IPC = 'A01N'; output;
run;table deal has the M&A data of two firms, including firm names (compressed_a, compressed_b) and the year of the deal was conducted.
table patent has the list of ipc codes of a firm has acquired.
my main question is this: How many number of patents does a company has, at a focal year (deal year).
expected output(ipc_a indicating number of patents, accumulative)
| compressed_a | compressed_d | year | ipc_a | ipc_b |
| WYETH | FORTDODGEANIMALHEALTH | 1945 | 0 | 0 |
| ELILILLYANDCOMPANY | HYBRITECHINCORPORATED | 1985 | 1 | 0 |
| GENENTECH | ABBOTTLABORATORIES | 1990 | 1 | 0 |
thus i have constructed a code as below:
data size_merge5;
if _n_=1 then do;
if 0 then set patent;
dcl hash H (dataset:'patent',multidata:'y') ;
h.definekey ("firmname") ;
h.definedata ("pnyear","ipc") ;
h.definedone () ;
end;
set deal;
IPC_A=0;
do rc=h.find(key:'compressed_a') by 0 while(rc=0);
if pnyear < year then IPC_A =sum(IPC_A,1);
rc=h.find_next();
end;
IPC_B=0;
do rc=h.find(key:'compressed_b') by 0 while(rc=0);
if pnyear < year then IPC_B =sum(IPC_B,1);
rc=h.find_next();
end;
run;but unfortunately this code gives me all zeros for ipc_a and ipc_b.
cannot find what's wrong. please help.
First, thank you for provide a complete program, with sample data.
There are two problems - one having to do with hash object method use, and the other with character variable length:
data deal;
length compressed_a compressed_b $24;
compressed_a = 'WYETH'; compressed_b = 'FORTDODGEANIMALHEALTH'; year = 1945;output;
compressed_a = 'ELILILLYANDCOMPANY'; compressed_b = 'HYBRITECHINCORPORATED'; year = 1985;output;
compressed_a = 'GENENTECH'; compressed_b = 'ABBOTTLABORATORIES'; year = 1990;output;
run;
data patent;
length firmname $24;
firmname = 'GENENTECH'; pnyear = 1985; IPC = 'B05C'; output;
firmname = 'ELILILLYANDCOMPANY'; pnyear = 1983; IPC = 'A01N'; output;
run;
data size_merge5;
if _n_=1 then do;
if 0 then set patent;
dcl hash H (dataset:'patent',multidata:'y') ;
h.definekey ("firmname") ;
h.definedata ("pnyear","ipc") ;
h.definedone () ;
end;
set deal;
IPC_A=0;
do rc=h.find(key:compressed_a) by 0 while(rc=0);
if pnyear < year then IPC_A =sum(IPC_A,1);
rc=h.find_next();
end;
IPC_B=0;
do rc=h.find(key:compressed_b) by 0 while(rc=0);
if pnyear < year then IPC_B =sum(IPC_B,1);
rc=h.find_next();
end;
run;
Why not just merge on based on compressed_a=firmname? Thats all you seem to want from that snippet? If it needs to match both, then merge twice (same datastep) and just rename the variable?
/* Note both assumes sorted */
data want;
merge deal
patent (rename=(firmname=compressed_a ipc=ipc1));
by compressed_a;
run;
data want;
merge want
patent (rename=(firmname=compress_b ipc=ipc2));
by compressed_b;
run;
First, thank you for provide a complete program, with sample data.
There are two problems - one having to do with hash object method use, and the other with character variable length:
data deal;
length compressed_a compressed_b $24;
compressed_a = 'WYETH'; compressed_b = 'FORTDODGEANIMALHEALTH'; year = 1945;output;
compressed_a = 'ELILILLYANDCOMPANY'; compressed_b = 'HYBRITECHINCORPORATED'; year = 1985;output;
compressed_a = 'GENENTECH'; compressed_b = 'ABBOTTLABORATORIES'; year = 1990;output;
run;
data patent;
length firmname $24;
firmname = 'GENENTECH'; pnyear = 1985; IPC = 'B05C'; output;
firmname = 'ELILILLYANDCOMPANY'; pnyear = 1983; IPC = 'A01N'; output;
run;
data size_merge5;
if _n_=1 then do;
if 0 then set patent;
dcl hash H (dataset:'patent',multidata:'y') ;
h.definekey ("firmname") ;
h.definedata ("pnyear","ipc") ;
h.definedone () ;
end;
set deal;
IPC_A=0;
do rc=h.find(key:compressed_a) by 0 while(rc=0);
if pnyear < year then IPC_A =sum(IPC_A,1);
rc=h.find_next();
end;
IPC_B=0;
do rc=h.find(key:compressed_b) by 0 while(rc=0);
if pnyear < year then IPC_B =sum(IPC_B,1);
rc=h.find_next();
end;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.