BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kb179208
Fluorite | Level 6

Hello all,

 

I am wondering if it is possible to select observations from a by group containing a specific set of dates. Take the following data for example...

 

ID    Date

1     2016

1     2017

1     2018

2     2016

2     2017

3     2016

3     2017

3     2018

 

Based on the data above, I want outputs where IDs can be matched to 2016, 2017, AND 2018 (See example output below)

 

ID    Date

1     2016

1     2017

1     2018

3     2016

3     2017

3     2018

 

 

Any help would be greatly appreciated. Thank you. 

 

P.S. I am using SAS EG version 7.1.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
input ID    Date;
cards;
1     2016
1     2017
1     2018
2     2016
2     2017
3     2016
3     2017
3     2018
;

proc sql;
create table want as
select *
from have
where id in (select id from have where date in (2016:2018) group by id having count(distinct date)=3);
quit;

View solution in original post

9 REPLIES 9
novinosrin
Tourmaline | Level 20
data have;
input ID    Date;
cards;
1     2016
1     2017
1     2018
2     2016
2     2017
3     2016
3     2017
3     2018
;

proc sql;
create table want as
select *
from have
where id in (select id from have where date in (2016:2018) group by id having count(distinct date)=3);
quit;
kb179208
Fluorite | Level 6

One additional question... say that I had duplicate values of some dates (e.g., ID:1 has 2 observations where the date is 2016. Is there a way to modify the code such that it works for these instances as well keeping duplicate values?

novinosrin
Tourmaline | Level 20

Hi @kb179208  Assuming what you mean is the following modified sample

data have;
input ID    Date;
cards;
1     2016
1	  2016
1     2017
1     2017
1     2018
1     2018
2     2016
2     2017
3     2016
3     2017
3     2018
;

The code doesn't exclude the dups in the final result. Is my understanding correct?

novinosrin
Tourmaline | Level 20

It's still the same code. Test and let me know

kb179208
Fluorite | Level 6

You are right, the code works for the example I provided so I should have been more clear. I have some IDs that may have 3 observations, but each with a date of 2016. I want to exclude these as they do not also contain date values of 2017 and 2018.

novinosrin
Tourmaline | Level 20

Basically the Idea to build the algorithm is simple:

PART1

1. Fiiter by subsetting ID's with dates falling in the range 2016:2018

2. 1 will have some ID's with all 3 or just 1 or 2 in the range . Example ID A may have 2016 10 times and only 2016. ID B may have all three occuring any number of times , ID C with 2017 and 2018 occuring any number of times

3. Within the subset, we do a "distinct count," i.e if the distinct count is equal 3 that means that ID must have all three 2016-2018 .

4. So this makes ID B our our target

PART2

5. Read all records from HAVE, the outer query and subset only those ID's that resulted from sub query(PART A)

 

I hope this helps you grasp the solution 🙂

 

 

kb179208
Fluorite | Level 6

I think I know what my problem is. I reformatted the DATE (mmddyyy) variable into a different variable DATE2, where it only shows the year. However, since I set DATE2=DATE it is still picking up the values as distinct.

 

Thank you once again for all the help!! I am not the best with SQL queries in sas.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1866 views
  • 4 likes
  • 2 in conversation