Dear All,
I have used the following code to merge two data sets,
proc sql;
create table own.want
as select a.*, b.*
from own.sub_9110 as a left join own.dir_own as b
on a.cusip=b.cusip
and fy between intnx('year', yearfiled, -2, "sameday") and yearfiled;
quit;
I want to merge these two datasets by cusip (cusips are already in the same format) and pick up values from own.dir_own data( datasetb) which its fy(year variable in b) is exactly 2 years prior to the yearfield (year variable in a ).
in a, I have thecompany list with cusip and yearfield(like 2004, 2005,...not in month and day format) and in b I have all the director ownership for each company from 1992 till 2018, cusip and fy. So, this means that in b dataset I have more than one director ownership for each company;
Lets say company A in the data set a looks like:
name yearfield cusip .....
A 2004 12345678
company A in the data set b looks like:
name fy cusip dir_own ....
A 1992 12345678 45
A 1993 12345678 12
A 2002 12345678 20
A 2004 12345678 44
the final merged data set should look like this :
name cusip yearfiled fy dir_own
A 12345678 2004 2002 20
so it picked the director ownership which is 2 years prior to the yearfield;
but the code that i used grabbed all the observation, the final merged data looked like this :
name cusip yearfiled fy dir_own
A 12345678 2004 1992 45
A 12345678 2004 1993 12
A 12345678 2004 2002 20
A 12345678 2004 2004 44
I am not sure how to correct the code, hope you guys can help,
thanks in advance.
Ziba
First, your variables FY and YEARFILED do not appear to be sas date variables. Instead they look like 4-digit numbers (unless they are really sas date values, formatted as YEAR4.). So the INTNX function is useless. If will interpret the value 2004 as 27JUN1965 (2004 days after 01jan1960). And subtracting 2 years, via the INTNX function, will generate 27JUN1963. So you could replace
and fy between intnx('year', yearfiled, -2, "sameday") and yearfiled;
with
and fy between yearfiled -2 and yearfiled;
what if I formatted the date variables in Year4. format and should I still have to replace?
Which did you do?
I used the following code as I cant format the year variable in data set a. (they all became 1965 after I used year4. format ). this code generated record that are one year prior to yearfiled date.
proc sql;
create table own.own_patent_list
as select a.*, b.*
from own.sub_9110 as a left join own.dir_own as b
on a.cusip=b.cusip
and year=yearfiled-1;
quit;
but I still have one question: I also tried the second code you provided (year between yearfiled-3 and yearfiled-1;) to replace year=yearfiled-1, and then this time I have maximum of three records for each company. what if I want to pick the first nonzero value within those three rows of records for that company, and delete the rest?
I cant use if statement in pro sql, so any ideas ??
Thanks
No clue what your objective is, but looking only at your data and the want
data b;
input name $ fy cusip dir_own ;
cards ;
A 1992 12345678 45
A 1993 12345678 12
A 2002 12345678 20
A 2004 12345678 44
;
data a;
input name $ yearfield cusip ;
cards;
A 2004 12345678
;
proc sql;
create table want
as select a.*, fy,dir_own
from a as a left join b as b
on a.cusip=b.cusip and fy=yearfield-2;
quit;
The above will get you 2002 fy resulting in just 1 record
The below
on a.cusip=b.cusip and fy between yearfield -2 and yearfield;
will result in 2002 ,2004 with 2 records in result
You should show what the actual desired output should look like. You show what it shouldn't be but that doesn't say what should.
hey
sorry I didnt say it clearly
the is actual desired output:
name cusip yearfiled fy dir_own
A 12345678 2004 2002 20
hey ,
sorry I didnt say it clearly
the is the actual desired output:
name cusip yearfiled fy dir_own
A 12345678 2004 2002 20
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.