The subject of date format

lixuan · Posted 08-02-2017 04:25 PM

Hi, I have problem with date format. Here is my dataset and my code. I dont think I made any mistake, and I cant find the cause. Pls help me , thanks.

data WORK.ONE;
  infile datalines dsd truncover;
  input CUSIP:$8. g:32. yymm_a:YYMMN6.;
  format yymm_a YYMMN6.;
  label CUSIP="CUSIP/SEDOL" g="g";
datalines4;
00036020,22.25,199801
00036020,17,199801
00036020,12,199801
00036020,15,199801
00036020,17,199801
00036020,15,199801
00036020,17,199801
00036020,12,199801
00036020,15,199801
00036110,19.6,199705
00036110,19.6,199810
00036110,11,199705
00036110,11,199705
00036110,16.75,199810
00036110,11,199705
00036110,16.8,199705
00036110,16.8,199810
00036110,16.6,200009
00036110,16.6,199810
00036110,15,200009
;;;;
data WORK.TWO;
  infile datalines dsd truncover;
  input CUSIP:$8. g:32. yymm_e:YYMMN6.;
  format yymm_e YYMMN6.;
  label CUSIP="CUSIP/SEDOL" g="g";
datalines4;
00036020,25,199201
00036020,25,199201
00036020,25,199201
00036020,25,199201
00036110,16.75,199706
00036110,16.6,199706
00036110,15,199706
00036110,15,199810
00036110,12.5,201004
00036110,12.5,200803
00036110,17.5,200803
00036110,17.5,200701
00036110,15.7,201004
00036110,12,201004
00036110,13.4,200009
00036110,13.4,199810
00036110,14.38,200803
00036110,14.38,200701
00036110,15,200803
00036110,15,200701
;;;;

proc sql;
create table three as select a.*,b.g as g1,b.yymm_e
from one a,
	 two b
where a.cusip=b.cusip and
	 b.yymm_e<= intnx('month',yymm_a,12,'same');
quit;

Reeza · Posted 08-02-2017 04:28 PM

What's the issue?

lixuan · Posted 08-02-2017 04:31 PM

Hi Reeza, my issue is about date format. I have changed my date to date format, but I can't get the result I want

Reeza · Posted 08-02-2017 04:40 PM

Step back for a second.

What does this information (below) tell me? The code works, so this is a logic issue. But you've only shown the code logic, which works so we can't know what you want.

Hi, I have problem with date format. Here is my dataset and my code. I dont think I made any mistake, and I cant find the cause. Pls help me , thanks.

Similarly with the following:

The results are strange

WIth only the information you've posted there, what response are you looking for?

lixuan · Posted 08-02-2017 04:47 PM

Reeza, I want to merge the two databases. but the only same variable is cusip, and I am sure that yymm_e is no more than 12m after yymm_a, so I do this.

Reeza · Posted 08-02-2017 04:51 PM

@lixuan wrote:

I am sure that yymm_e is no more than 12m after yymm_a

But you've only stated that it needs to be less than a date value.

 b.yymm_e<= intnx('month',yymm_a,12,'same');

You'll join to any record that's less then the 12 months of the variable, which will end up with a large data set. What do you expect as output?

Unfortunately I need to log out shortly for a few days (vacation!!!) so you should spend some time building an example that shows what you have and what you need. Using my 'mind reading machine', since you still haven't explained your problem, are you trying to merge the data with the closest date in the data set or with multiple records that are within a specified range?

lixuan · Posted 08-02-2017 04:59 PM

Hi Reeza, I will check my logic again, thank you and have a good vacation.

lixuan · Posted 08-02-2017 04:37 PM

The result is very strang, and the same data I used before in the intnx function. The result was right.

SAS Output

Obs CUSIP g yymm_a g1 yymm_e 1 2 3 4 5 6 7 8 9 10 11 12 13

00036020	22.25	199801	25.00	199201
00036020	22.25	199801	25.00	199201
00036020	22.25	199801	25.00	199201
00036020	22.25	199801	25.00	199201
00036020	17.00	199801	25.00	199201
00036020	17.00	199801	25.00	199201
00036020	17.00	199801	25.00	199201
00036020	17.00	199801	25.00	199201
00036020	12.00	199801	25.00	199201
00036020	12.00	199801	25.00	199201
00036020	12.00	199801	25.00	199201
00036020	12.00	199801	25.00	199201
00036020	15.00	199801	25.00	199201

ballardw · Posted 08-02-2017 06:42 PM

Since your data set two has the first four records identical then anything that matches the first will match all 4. Similarly all the other duplicates is two will generate additional duplicate records.

instead of Intnx perhaps you want to look at using the the two dates in an INTCK function call and see if the result is between 0 and 12?

without an example of the desired output I'm not sure of something more specific

Shmuel · Posted 08-02-2017 10:25 PM

@lixuan, try and change your last step to:

proc sql;
create table three as select a.*,b.g as g1,b.yymm_e
from one a,
	 two b
where a.cusip=b.cusip and
	 abs(intck('month',yymm_a,yymm_e))  <= 12;
quit;

lixuan · Posted 08-03-2017 05:01 AM

Sure, I understand that my problem is not the matter of date format, and I use the code to get the result. But there are too many duplicate observations producted by the rule of proc sql. I only have variables cusip and yymm_ which have some connections . I will be very appreciated if any suggestion. Thanks

a.yymm_a<= b.yymm_e<= intnx('month',a.yymm_a,12,'same');

Shmuel · Posted 08-03-2017 05:26 AM

You did not marked is my previous posted code give the wanted results, and if not - what is wrong with it ?!

You ave not posted desired or expected results in order to check suggested code. Please do it -

otherwise we are just guessing.

One more guess - try next code:

data want;
  merge one two;
  by;  /* concatenates datasets observations horizontally one by one */
  if yymm_a <= yymm_e <= intnx('month',yymm_a,12,'same');
run;

h

Kurt_Bremser · Posted 08-03-2017 06:17 AM

There are 9 observations from one and 4 observations from two that mutually fulfill the condition of being within 12 months, so you get 9*4 = 36 observations in the output. That's what an outer join in SQL does.

So you need to refine your rule. Start by formulating it in plain language, then exercise your rule logically, and once it works, convert it to code.

Or you show the intended result you expected (from the tow datasets as posted in your OP), so we can work on that.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

lixuan · Posted 08-03-2017 08:39 AM

Hi, Thank you all for your patience. Now the problem has become the matter of merge which had also puzzled me for a long time. The following is my example an the result I want. My logic is very simple: I just wanna the company and date are same in both datasets, and roe, roa are as they were. I always find it's hard to control the result in 'many-many' merge. thanks& regards.

data one;
input company$ roe date;
cards;
A 0.5 1995
A 0.2 1994
B 0.6 1998
;
data two;
input company$ roa date;
cards;
A 0.2 1995
A 0.3 1996
A 0.8 1995
B 0.7 1998
B 0.5 1998
C 0.2 1995
;
*result;
data want;
input company$ roe roa date;
cards;
A 0.5 0.2 1995
A  .  0.8 1995
B 0.6 0.7 1998
B  .  0.5 1998
;

Kurt_Bremser · Posted 08-03-2017 09:41 AM

@lixuan wrote:

Hi, Thank you all for your patience. Now the problem has become the matter of merge which had also puzzled me for a long time. The following is my example an the result I want. My logic is very simple: I just wanna the company and date are same in both datasets, and roe, roa are as they were. I always find it's hard to control the result in 'many-many' merge. thanks& regards.
data one;
input company$ roe date;
cards;
A 0.5 1995
A 0.2 1994
B 0.6 1998
;
data two;
input company$ roa date;
cards;
A 0.2 1995
A 0.3 1996
A 0.8 1995
B 0.7 1998
B 0.5 1998
C 0.2 1995
;
*result;
data want;
input company$ roe roa date;
cards;
A 0.5 0.2 1995
A  .  0.8 1995
B 0.6 0.7 1998
B  .  0.5 1998
;

This can be done with a little trickery in a data step:

proc sort data=one;
by company date;
run;

proc sort data=two;
by company date;
run;

data want;
format company roe roa date; * strictly for order of variables;
roe = .;
roa = .; * these two statements override the automatic retain of variables in datasets;
merge
  one (in=a)
  two (in=b)
;
by company date;
if a and b;
run;

proc print data=want noobs;
run;

Result:

company    roe    roa    date

   A       0.5    0.2    1995
   A        .     0.8    1995
   B       0.6    0.7    1998
   B        .     0.5    1998

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

Re: The subject of date format

SAS Innovate 2025: Call for Content

Classroom Training Available!