BookmarkSubscribeRSS Feed
ari
Quartz | Level 8 ari
Quartz | Level 8

dataset 1:

IDDATE
110/13/2010
110/20/2010
110/27/2010
111/3/2010
21/31/2011
21/31/2011
22/2/2011
22/7/2011
22/7/2011
37/26/2011
37/26/2011
37/28/2011
38/1/2011
38/2/2011
38/2/2011
38/4/2011
38/8/2011
38/9/2011

 

dataset 2:

IDCountstart end
1110/18/201010/18/2010
212/1/20112/6/2011
317/27/20117/27/2011
338/2/2011 

 

want:

IDDATECount
110/13/2010 
110/20/2010 
110/27/2010 
111/3/2010 
21/31/2011 
21/31/2011 
22/2/20111
22/7/2011 
22/7/2011 
37/26/2011 
37/26/2011 
37/28/2011 
38/1/2011 
38/2/20113
38/2/20113
38/4/20113
38/8/20113
38/9/20113

 

I want to merge the dataset1 and dataset2 by id and date, if the date in dataset1 falls in between start and end date of dataset2. in case if the end date is missing in the dataset2 then the count value should be retained across the rest of the observations in dataset1.

Any help to achieve this in sas?

Thanks

4 REPLIES 4
anoopmohandas7
Quartz | Level 8

Do you have a field that's unique for each date range (in dataset2). I mean in the data below you gave has ID that looks like the key but is having same value (3) for two different date range ?

PGStats
Opal | Level 21

If I understand the question :

 

data d1;
input ID	DATE :mmddyy.;
format date yymmdd10.;
datalines;
1	10/13/2010
1	10/20/2010
1	10/27/2010
1	11/3/2010
2	1/31/2011
2	1/31/2011
2	2/2/2011
2	2/7/2011
2	2/7/2011
3	7/26/2011
3	7/26/2011
3	7/28/2011
3	8/1/2011
3	8/2/2011
3	8/2/2011
3	8/4/2011
3	8/8/2011
3	8/9/2011
;

data d2;
input ID	Count	(start 	end) (:mmddyy10.);
format start end yymmdd10.;
datalines;
1	1	10/18/2010	10/18/2010
2	1	2/1/2011	2/6/2011
3	1	7/27/2011	7/27/2011
3	3	8/2/2011	 .
;

data d3;
set d1;
n = _n_;
run;

proc sql;
create table want as
select d3.id, d3.date, 
    max(case when date >= start and date <= coalesce(end, '01jan2100'd) 
        then count
        else . end) as count
from d3 left join d2 on d3.id=d2.id
group by n, d3.id, date;
select * from want;
quit;

You could use SUM instead of MAX if you want the sum of counts when multiple ranges match for a given id. 

PG
ari
Quartz | Level 8 ari
Quartz | Level 8

@PGStats: thanks. when i run it on the complete dataset the following error occurs.

WARNING: A GROUP BY clause has been transformed into an ORDER BY clause because neither the
SELECT clause nor the optional HAVING clause of the associated table-expression
referenced a summary function

 

I have checked the date format of all variables in both input datasets but  not sure why  this error is occuring. Any ideas?

PGStats
Opal | Level 21

This warning indicates that your query didn't include a max or sum summary function. Please post the code that you ran. 

PG

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 2224 views
  • 0 likes
  • 3 in conversation