BookmarkSubscribeRSS Feed
lixuan
Obsidian | Level 7

Hi, I have problem with date format. Here is my dataset and my code. I dont think I made any mistake, and I cant find the cause. Pls help me , thanks.

data WORK.ONE;
  infile datalines dsd truncover;
  input CUSIP:$8. g:32. yymm_a:YYMMN6.;
  format yymm_a YYMMN6.;
  label CUSIP="CUSIP/SEDOL" g="g";
datalines4;
00036020,22.25,199801
00036020,17,199801
00036020,12,199801
00036020,15,199801
00036020,17,199801
00036020,15,199801
00036020,17,199801
00036020,12,199801
00036020,15,199801
00036110,19.6,199705
00036110,19.6,199810
00036110,11,199705
00036110,11,199705
00036110,16.75,199810
00036110,11,199705
00036110,16.8,199705
00036110,16.8,199810
00036110,16.6,200009
00036110,16.6,199810
00036110,15,200009
;;;;
data WORK.TWO;
  infile datalines dsd truncover;
  input CUSIP:$8. g:32. yymm_e:YYMMN6.;
  format yymm_e YYMMN6.;
  label CUSIP="CUSIP/SEDOL" g="g";
datalines4;
00036020,25,199201
00036020,25,199201
00036020,25,199201
00036020,25,199201
00036110,16.75,199706
00036110,16.6,199706
00036110,15,199706
00036110,15,199810
00036110,12.5,201004
00036110,12.5,200803
00036110,17.5,200803
00036110,17.5,200701
00036110,15.7,201004
00036110,12,201004
00036110,13.4,200009
00036110,13.4,199810
00036110,14.38,200803
00036110,14.38,200701
00036110,15,200803
00036110,15,200701
;;;;

proc sql;
create table three as select a.*,b.g as g1,b.yymm_e
from one a,
	 two b
where a.cusip=b.cusip and
	 b.yymm_e<= intnx('month',yymm_a,12,'same');
quit;
21 REPLIES 21
Reeza
Super User

What's the issue?

lixuan
Obsidian | Level 7

Hi Reeza, my issue is about date format. I have changed my date to date format, but I can't get the result I want 

Reeza
Super User

Step back for a second. 

What does this information (below) tell me? The code works, so this is a logic issue. But you've only shown the code logic, which works so we can't know what you want. 

 

Hi, I have problem with date format. Here is my dataset and my code. I dont think I made any mistake, and I cant find the cause. Pls help me , thanks.

 

Similarly with the following:

 

The results are strange

 

WIth only the information you've posted there, what response are you looking for?

lixuan
Obsidian | Level 7

Reeza, I want to merge the two databases. but the only same variable is cusip, and I am sure that yymm_e is no more than 12m after yymm_a, so I do this.

Reeza
Super User

@lixuan wrote:

 I am sure that yymm_e is no more than 12m after yymm_a


But you've only stated that it needs to be less than a date value. 

 

 b.yymm_e<= intnx('month',yymm_a,12,'same');

You'll join to any record that's less then the 12 months of the variable, which will end up with a large data set. What do you expect as output?

 

Unfortunately I need to log out shortly for a few days (vacation!!!) so you should spend some time building an example that shows what you have and what you need. Using my 'mind reading machine', since you still haven't explained your problem, are you trying to merge the data with the closest date in the data set or with multiple records that are within a specified range? 

lixuan
Obsidian | Level 7

Hi Reeza, I will check my logic again, thank you and have a good vacation.

lixuan
Obsidian | Level 7

The result is very strang, and the same data I used before in the intnx function. The result was  right.

SAS Output



Obs CUSIP g yymm_a g1 yymm_e 1 2 3 4 5 6 7 8 9 10 11 12 13
0003602022.2519980125.00199201
0003602022.2519980125.00199201
0003602022.2519980125.00199201
0003602022.2519980125.00199201
0003602017.0019980125.00199201
0003602017.0019980125.00199201
0003602017.0019980125.00199201
0003602017.0019980125.00199201
0003602012.0019980125.00199201
0003602012.0019980125.00199201
0003602012.0019980125.00199201
0003602012.0019980125.00199201
0003602015.0019980125.00199201



 
ballardw
Super User

Since your data set two has the first four records identical then anything that matches the first will match all 4. Similarly all the other duplicates is two will generate additional duplicate records.

 

instead of Intnx perhaps you want to look at using the the two dates in an INTCK function call and see if the result is between 0 and 12?

without an example of the desired output I'm not sure of something more specific

Shmuel
Garnet | Level 18

@lixuan, try and change your last step to:

 

proc sql;
create table three as select a.*,b.g as g1,b.yymm_e
from one a,
	 two b
where a.cusip=b.cusip and
	 abs(intck('month',yymm_a,yymm_e))  <= 12;
quit;
lixuan
Obsidian | Level 7

Sure, I understand that my problem is not the matter of date format, and I use the code to get the result. But there are too many duplicate observations producted by the rule of proc sql.  I only have variables cusip and yymm_  which have some connections . I will be very appreciated if any suggestion. Thanks

a.yymm_a<= b.yymm_e<= intnx('month',a.yymm_a,12,'same');
Shmuel
Garnet | Level 18

You did not marked is my previous posted code give the wanted results, and if not - what is wrong with it ?!

 

You ave not posted desired or expected results in order to check suggested code. Please do it - 

otherwise we are just guessing.

 

One more guess - try next code:

data want;
  merge one two;
  by;  /* concatenates datasets observations horizontally one by one */
if yymm_a <= yymm_e <= intnx('month',yymm_a,12,'same');
run;

Kurt_Bremser
Super User

There are 9 observations from one and 4 observations from two that mutually fulfill the condition of being within 12 months, so you get 9*4 = 36 observations in the output. That's what an outer join in SQL does.

 

So you need to refine your rule. Start by formulating it in plain language, then exercise your rule logically, and once it works, convert it to code.

Or you show the intended result you expected (from the tow datasets as posted in your OP), so we can work on that.

lixuan
Obsidian | Level 7

Hi, Thank you all for your patience. Now the problem has become the matter of merge which had also puzzled me for a long time. The following is my example an the result I  want. My logic is very simple: I just wanna the company and date are same in both  datasets, and roe, roa are as they were. I always find it's hard to control the result in 'many-many' merge. thanks& regards.

data one;
input company$ roe date;
cards;
A 0.5 1995
A 0.2 1994
B 0.6 1998
;
data two;
input company$ roa date;
cards;
A 0.2 1995
A 0.3 1996
A 0.8 1995
B 0.7 1998
B 0.5 1998
C 0.2 1995
;
*result; data want; input company$ roe roa date; cards; A 0.5 0.2 1995 A . 0.8 1995 B 0.6 0.7 1998 B . 0.5 1998 ;
Kurt_Bremser
Super User

@lixuan wrote:

Hi, Thank you all for your patience. Now the problem has become the matter of merge which had also puzzled me for a long time. The following is my example an the result I  want. My logic is very simple: I just wanna the company and date are same in both  datasets, and roe, roa are as they were. I always find it's hard to control the result in 'many-many' merge. thanks& regards.

data one;
input company$ roe date;
cards;
A 0.5 1995
A 0.2 1994
B 0.6 1998
;
data two;
input company$ roa date;
cards;
A 0.2 1995
A 0.3 1996
A 0.8 1995
B 0.7 1998
B 0.5 1998
C 0.2 1995
;
*result; data want; input company$ roe roa date; cards; A 0.5 0.2 1995 A . 0.8 1995 B 0.6 0.7 1998 B . 0.5 1998 ;

This can be done with a little trickery in a data step:

proc sort data=one;
by company date;
run;

proc sort data=two;
by company date;
run;

data want;
format company roe roa date; * strictly for order of variables;
roe = .;
roa = .; * these two statements override the automatic retain of variables in datasets;
merge
  one (in=a)
  two (in=b)
;
by company date;
if a and b;
run;

proc print data=want noobs;
run;

Result:

company    roe    roa    date

   A       0.5    0.2    1995
   A        .     0.8    1995
   B       0.6    0.7    1998
   B        .     0.5    1998

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 1747 views
  • 2 likes
  • 5 in conversation