BookmarkSubscribeRSS Feed
NN
Quartz | Level 8 NN
Quartz | Level 8
Guys,
I have 2 data sets. one contains ID , Indate , outdate
The other contains Leave_dates

I wish to find for each id in DATA1 the count of leave dates in data2 which are greater than or equal to indate and less than or equal to outdate.I could do this in PROC SQL bu the DATA size is very large and it takes hours to process hence i was hoping if this could be done in the data step to help reduce the processing time.
Please help

data1
ID INDATE OUTDATE
1 01oct2009 06oct2009
2 01oct2009 03oct2009

DATA2
LEAVE_DATES
01oct2009
02oct2009
04oct2009
05oct2009

Message was edited by: NN Message was edited by: NN
4 REPLIES 4
deleted_user
Not applicable
proc sort data=data2; by leavedate;

data counted;
set data1;

count = 0;
do i=1 to N;
set data2 nobs=N;
if indate <= leavedate <= outdate then count+1;
end;
output;
run;

Or something like that.
NN
Quartz | Level 8 NN
Quartz | Level 8
Thanks for the reply chuck
but is there something missing in the Above code bcause
The Output that i was looking for was

ID COUNT_OF_LEAVEDATES
1 4
2 2
DanielSantos
Barite | Level 11
Hello NN.

Being DATA1 the large table and DATA2 a very small one, you could force an index iteration between the the two and perform the count, like this:

* create "phoney" index on DATA2;
data data3 (index=(IDX));
set data2;
IDX=0;
run;

* do counts;
data counts;
set data1;
keep ID COUNT;
IDX=0; COUNT=0; * init index and counter;
do until (IDX eq 2); * do until no more matches;
set data3 key=IDX; * match;
if not _ERROR_ and LEAVE_DATES le OUTDATE then do;
if INDATE le LEAVE_DATES then COUNT+1; * indate/outdate match;
end;
else IDX+1; * reset phoney index;
end;
output; * output total;
_ERROR_=0; * reset _ERROR_ flag;
run;

This is a technique that I've been using for some years (before hashing appeared into SAS) with great success for a big table X very small table match scenario with N to N matching relationship. Do not try to do this with a DATA2 with more then 1000 rows, performance will be disastrous.

Another way of doing the same, would be to load the smaller table (DATA2) into a hash iterator object and iterate through it for each observation of the large dataset (DATA1).
Here you could have a DATA2 table up to 1.5M without compromising performance.

Check the online doc for the hash iterator object here:
http://support.sas.com/documentation/cdl/en/lrdict/62618/HTML/default/a002576871.htm

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
deleted_user
Not applicable
I didn't wrap the code in '\[pre\] ... \[/pre\]' so it got cut off in the posting.

[pre]
count = 0;
do i=1 to N;
set data2 nobs=N;
if indate LE leave_dates LE outdate then count+1;
end;
[/pre]

I was using angle brackets and equal signs, and they weren't making it through the html stuff, so I replaced them with 'LE'.

But this isn't perfect because you won't necessarily run all the way through data2 each time. But, this should point you in the right direction. You may need to use "FETCH" or observation pointers with data2 to insure stepping through it each time. REad through the documentation for base SAS statements. If you have PC SAS, then you can get there through "help" otherwise, you can get the documentation via the SAS support site. That's how I learn what I can do with SAS, and I've been programming in SAS since 1996, and am still picking up new stuff fairly regularly.

Message was edited by: Chuck Message was edited by: Chuck

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 675 views
  • 0 likes
  • 3 in conversation