BookmarkSubscribeRSS Feed
Solph
Pyrite | Level 9

Hi,

I have two datasets. The main dataset contains start year and end year for a list of IDs and I need to 

1) identify for each ID if the person is present in a list of years I'm interested (thus the 2nd dataset) interest, and

2) then aggregate by year to count the number of persons present in a given year.

Step 1 is the one I have trouble with.

The example here is a simplified version: 1) the actual data has more than 100K cases, 2) the start and end points are actually in dates, not year, and 3) the list of years of interest has more than 150 date values. So I want to bring in those values from an external dataset (and run one at a time?)  

 

 

data have; input id startyr endyr;
datalines;
1 2015 2017
2 2019 2020
3 2017 2020
4 2018 2019
5 2019 2019
6 2017 2021
7 2020 2020
;
data year; input year;
datalines;
2016
2017
2019
2020
;

 

If I'm to write code one year at a time, then here it is, to pull a list of IDs whose start and end year fall between my year of interest (step 1)

 

 

%macro get (year);
proc sql; create table year2020 as select id, startyr, endyr
from have where startyr <= 2016 <= endyr;
quit;
%get(2016);
%get(2017);
%get(2020);

 

So for

with year= 2016, I'd pull out ID 1

with year= 2017, I'd pull out ID 1,3,6

with year= 2020, I'd pull out ID 2,3,6,7

 

Ultimately I'll aggregate each data to count and have my WANT data by year as the following:

 

Year  Count
2016   1
2017   3
2020   4

 

 

Thanks in advance for any tips you can give.

2 REPLIES 2
Peter_C
Rhodochrosite | Level 12
Might be a bit memory intensive for a full cartesian join in SQL, but hardly challenging for an array based approach in a data step
Kurt_Bremser
Super User

No need for a macro at all:

proc sql;
create table id_year as
  select have.id, year.year
  from have left join year
  on have.startyr le year.year le have.endyr
;
create table want as
  select year, count(id) as count
  from id_year
  group by year
;
quit;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 588 views
  • 0 likes
  • 3 in conversation