BookmarkSubscribeRSS Feed
Ziba
Calcite | Level 5

Dear All, 

I have used the following code to merge two data sets, 

proc sql;
create table own.want
as select a.*, b.*
from own.sub_9110 as a left join  own.dir_own as b
on a.cusip=b.cusip
and fy between intnx('year', yearfiled, -2, "sameday") and yearfiled;
quit;

I want to merge these two datasets by cusip (cusips are already in the same format) and pick up values from own.dir_own data( datasetb) which its fy(year variable in b) is exactly 2 years prior to the yearfield (year variable in a ). 

in a, I have thecompany list with cusip and yearfield(like 2004, 2005,...not in month and day format) and in b I have all the director ownership for each company from 1992 till 2018, cusip and fy. So, this means that in b dataset I have more than one director ownership for each company;

Lets say company A in the data set a looks like:

name       yearfield     cusip   .....

A                2004          12345678

company A in the data set b looks like:

 name          fy          cusip                dir_own ....

A                 1992      12345678            45

A                 1993      12345678            12

A                 2002     12345678            20

A                2004     12345678            44

 

the final merged data set should look like this :

name     cusip            yearfiled     fy         dir_own

A             12345678     2004         2002    20

so it picked the director ownership which is 2 years prior to the yearfield;

but the code that i used grabbed all the observation, the final merged data looked like this :

name     cusip            yearfiled      fy         dir_own

A             12345678     2004         1992    45

A             12345678     2004         1993    12

A             12345678     2004         2002    20

A             12345678     2004         2004    44

I am not sure how to correct the code, hope you guys can help, 

thanks in advance. 

Ziba 

 

8 REPLIES 8
mkeintz
PROC Star

First, your variables FY and YEARFILED do not appear to be sas date variables.  Instead they look like 4-digit numbers (unless they are really sas date values, formatted as YEAR4.).  So the INTNX function is useless.  If will interpret the value 2004 as 27JUN1965 (2004 days after 01jan1960).  And subtracting 2 years, via the INTNX function, will generate 27JUN1963.  So you could replace

 

and fy between intnx('year', yearfiled, -2, "sameday") and yearfiled;

with

and fy between yearfiled -2  and yearfiled;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ziba
Calcite | Level 5

what if I formatted the date variables in Year4. format and should I still have to replace?

 

mkeintz
PROC Star

Which did you do?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ziba
Calcite | Level 5

I used the following code as I cant format the year variable in data set a. (they all became 1965 after I used year4. format ). this code generated record that are one year prior to yearfiled date.

proc sql;
create table own.own_patent_list
as select a.*, b.*
from own.sub_9110 as a left join  own.dir_own as b
on a.cusip=b.cusip
and year=yearfiled-1;
quit;

but I still have one question: I also tried  the second code you provided (year between yearfiled-3 and yearfiled-1;) to replace year=yearfiled-1, and then this time I have maximum of three records for each company. what if I want to pick the first nonzero value within those three rows of records for that company, and delete the rest?

I cant use if statement in pro sql, so any ideas ?? 

Thanks 

novinosrin
Tourmaline | Level 20

No clue what your objective is, but looking only at your data and the want

 

data b;
input name   $       fy          cusip                dir_own ;
cards ;
A                 1992      12345678            45
A                 1993      12345678            12
A                 2002     12345678            20
A                2004     12345678            44
;

data a;
input name    $   yearfield     cusip  ;
cards;
A                2004          12345678
;

proc sql;
create table want
as select a.*, fy,dir_own
from a as a left join  b as b
on a.cusip=b.cusip and  fy=yearfield-2;
quit;

The above will get you 2002 fy resulting in just 1 record

 

The below 

on a.cusip=b.cusip and fy between yearfield -2  and yearfield;

will result in 2002 ,2004 with 2 records in result

ballardw
Super User

You should show what the actual desired output should look like. You show what it shouldn't be but that doesn't say what should.

Ziba
Calcite | Level 5

hey 

sorry I didnt say it clearly 

the is actual desired output:

name     cusip            yearfiled     fy         dir_own

A             12345678     2004         2002    20

Ziba
Calcite | Level 5

hey , 

sorry I didnt say it clearly 

the is the actual desired output:

name        cusip             yearfiled     fy           dir_own

A             12345678       2004           2002        20

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 756 views
  • 0 likes
  • 4 in conversation