BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
SAS_SB
Obsidian | Level 7

Thank you @SASJedi !

 

I changed the time period including the months. So, I am having two time periods: (i) March to May 2018 and (ii) June to August 2018 and trying to find if the clients were newly enrolled from (i) to (ii), exited from (i) to (ii), and retained in (ii). I would be grateful if you could review the code. 

 

I am a beginner with SAS, so apologies on silly questions.

 

proc sql;
create table prepost_q1 as
select id
	   , SUM(year=2018 and (month=3 or month=5))=0 and SUM(year=2018 and (month=6 or month=8))>0 as preq2_newenrol
	   , SUM(year=2018 and (month=3 or month=5))>0 and SUM(year=2018 and (month=6 or month=8))=0 as preq2_exited
       , SUM(year=2018 and (month=3 or month=5))>0 and SUM(year=2019 and (month=6 or month=8))>0 as preq2_retained	
	from work.prepost
	group by id
;
quit;
SASJedi
Ammonite | Level 13

As the question evolved here, the name of your input data set keeps changing as do the input data set vales. It makes it hard to refined the answer from previous work. So I'm going to provide both sample data and commented code here to help you apply this to your actual data. 

This code creates the input data set:

data prepost;
input id$ year month;
datalines;
1 2018 2
1 2018 3
1 2018 4
1 2019 2
1 2019 3
1 2019 4
2 2018 8
2 2018 9
2 2018 10
3 2018 3
3 2018 4
3 2018 5
3 2018 8
3 2018 9
3 2018 10
;

 Which looks like this:

id year month
1 2018 2
1 2018 3
1 2018 4
1 2019 2
1 2019 3
1 2019 4
2 2018 8
2 2018 9
2 2018 10
3 2018 3
3 2018 4
3 2018 5
3 2018 8
3 2018 9
3 2018 10

This code does the summarization:

proc sql;
create table prepost_q1 as
select id
		  /* Each logical expression produces a 1 if true, 0 if false */
	   ,sum( month in (3,5))=0 /* Not enrolled first */ 
		 AND 
		 sum( month in (6,8))>0 /* Enrolled second */ as preq2_newenrol

	   ,sum( month in (3,5))>0 /* Enrolled first */ 
		 AND 
		 sum( month in (6,8))=0 /* Not enrolled second */ as preq2_exited

	   ,sum( month in (3,5))>0 /* Enrolled first */ 
		 AND 
		 sum( month in (6,8))>0 /* Enrolled second */ as preq2_retained
	from work.prepost
	where year=2018
	group by id
;
quit; 

Producing this result:

id preq2_newenrol preq2_exited preq2_retained
1 0 1 0
2 1 0 0
3 0 0 1

 

 

 

 
Check out my Jedi SAS Tricks for SAS Users
SAS_SB
Obsidian | Level 7

Thanks @SASJedi ! The code worked for my data. Just to confirm, although we have multiple observations for each individual, the report is producing the values (1, 0) for unique individuals, right? 

SASJedi
Ammonite | Level 13

In the result table in my example, there is only 1 row produced for each unique ID value. In the SQL query SELECT statement, the only column note defined by a summary function is ID, and ID is the GROUP BY value. So no, this query should never produce more than one row per ID.

 

Check out my Jedi SAS Tricks for SAS Users
SAS_SB
Obsidian | Level 7

Hi @SASJedi,

I wanted to check which months the logic was picking up, by including both year and month variables in the code as:

proc sql;
create table prepost_q1 as
select id, year, month
		  /* Each logical expression produces a 1 if true, 0 if false */
	   ,sum( month in (3,5))=0 /* Not enrolled first */ 
		 AND 
		 sum( month in (6,8))>0 /* Enrolled second */ as preq2_newenrol

	   ,sum( month in (3,5))>0 /* Enrolled first */ 
		 AND 
		 sum( month in (6,8))=0 /* Not enrolled second */ as preq2_exited

	   ,sum( month in (3,5))>0 /* Enrolled first */ 
		 AND 
		 sum( month in (6,8))>0 /* Enrolled second */ as preq2_retained
	from work.prepost
	where year=2018
	group by id
;
quit;

What I saw was the logic was assigning values to unwanted observations as well. For an example, month=2 was assigned 1 for preq2_exited for id=1. Similarly, months 9 and 10 were assigned the value of 1 for id = 2 and 3. We were looking for data between the time periods, May to June 2018 and July to August 2018 and do not want to use other months as they will be present in different time periods like September to Nov 2018 and so on. How could I solve this issue?

 

The log file was giving this message:

"NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Table WORK.PREPOST_Q1TT created, with 12 rows and 6 columns.

"

The table looks as:

SAS_SB_0-1675869833661.png

 

Is there anything that I missed or incorrectly wrote the code?

 

Many thanks!

 

 

SASJedi
Ammonite | Level 13

Any column in the SELECT list that is not being computed with a summary function must also be included in the GROUP BY clause, or you will get remerged results that are probably not what you desire.

Check out my Jedi SAS Tricks for SAS Users
SAS_SB
Obsidian | Level 7

Got it, thanks @SASJedi ! One quick question. Do "sum( month in (3,5))>0" and "sum( month in (3,4,5))" mean the same thing?

SASJedi
Ammonite | Level 13

>Do"sum( month in (3,4,5))>0" and "sum(month in (3,4,5))" mean the same thing?

No, they don't. The first will produce only a 1 (true) or a 0 (false) no matter how many rows were in the list, the second will provide a count of  number of rows in the group that met the criteria: Consider this demo data:

data demo;
	ID=1;
	month=1; output;
	month=2; output;
	do ID=2 to 3;
		do Month=1 to 6 by ID;
			output;
		end;
	end;
run;
Obs ID month
1 1 1
2 1 2
3 2 1
4 2 3
5 2 5
6 3 1
7 3 4

 

This SQL query demonstrates the difference:

ID sum(month in
(3,4,5))
sum(month in
(3,4,5))>0
1 0 0
2 2 1
3 1 1

 

This table kind of shows what's going on under the covers:

ID month In345 sum(month in (3,4,5)) sum(month in (3,4,5))>0 Comment
1 1 0      
1 2 0 0 0 No rows met the criteria for ID1
2 1 0      
2 3 1      
2 5 1 2 1 2 rows met the criteria for ID2
3 1 0      
3 4 1 1 1 1 row met the criteria for ID3

 

Check out my Jedi SAS Tricks for SAS Users
SAS_SB
Obsidian | Level 7

That was crystal clear! Thanks again, @SASJedi ! I encountered another challenge. I have two time periods: (i) Sep, Oct, Nov 2018 and (ii) Dec 2018, Jan, Feb 2019. I would also like to see the numbers of unique clients newly enrolled, exited, and retained from time period (i) to (ii). I used the following sample data:

 

data prepost;
input id$ year month;
datalines;
1 2018 10
1 2018 11
1 2018 12
1 2019 1
1 2019 2
2 2019 10
2 2019 11
2 2019 12
3 2018 3
3 2018 4
3 2018 5
3 2019 1
3 2019 12
3 2020 1
3 2020 2
4 2018 9
4 2018 10
4 2018 11
4 2018 12
4 2019 1
4 2019 2
5 2018 12
5 2019 1
5 2019 2
5 2019 3
5 2019 4
5 2019 5
5 2019 6
5 2019 7
5 2019 8
5 2019 9
5 2019 10
5 2019 11
5 2019 12
5 2020 1
5 2020 2
6 2019 1
6 2019 2
6 2019 3
6 2019 4
6 2019 5
6 2019 6
6 2019 7
6 2019 8
6 2019 9
6 2019 10
6 2019 11
6 2019 12
6 2020 1
6 2020 2
6 2020 3
6 2020 4
6 2020 5
6 2020 6
6 2020 7
6 2020 8
6 2020 9
6 2020 10
6 2020 11
6 2020 12
6 2021 1
6 2021 2
6 2021 3
6 2021 4
6 2021 5
6 2021 6
6 2021 7
6 2021 8
6 2021 19
6 2021 10
6 2021 11
6 2021 12
6 2022 1
6 2022 2
;

I modified the code you wrote to get the numbers but I could not get the desired result. The code is:

 

proc sql;
create table prepost_q6 as
select id
		  /* Each logical expression produces a 1 if true, 0 if false */
	   ,sum( month in (9,10,11))=0 AND year=2018 /* Not enrolled first */ 
		 AND 
		 (sum( month in (12))>0 AND year=2018
		 OR  
		sum ( month in (1,2))>0 AND year=2019) /* Enrolled second */ as preq6_newenrol
	   ,sum( month in (9,10,11))>0 AND year=2018 /* Enrolled first */ 
		 AND 
		 (sum( month in (12))=0 AND year=2018
			OR
		sum ( month in (1,2))=0 AND year=2019) /* Not enrolled second */ as preq6_exited

	   ,sum( month in (9,19,11))>0 /* Enrolled first */ 
		 AND 
		 (sum( month in (12))>0 AND year=2018
			OR
		sum ( month in (1,2))>0 AND year=2019) /* Enrolled second */ as preq6_retained
	from work.prepost
	group by id
;
quit; 

Could you please help me in getting the code correct to have the numbers of unique clients newly enrolled, exited, and retained from time period (i) to (ii)?

 

Thank you!

SAS_SB
Obsidian | Level 7

@SASJedi, I am waiting on your suggestions. I appreciate your help!

SASJedi
Ammonite | Level 13
I'm out of the country on holidays - will check back in after 3/13
Check out my Jedi SAS Tricks for SAS Users
PaigeMiller
Diamond | Level 26

Normally, it is frowned upon to post the same question more than once. Can someone please merge this with @SAS_SB other thread on this matter?

--
Paige Miller
SAS_SB
Obsidian | Level 7

I tried to merge with another thread, but was unable to do so. Is there a way we can merge threads? If you have any idea, please share. It would be helpful. Thanks.

sas-innovate-2026-white.png



April 27 – 30 | Gaylord Texan | Grapevine, Texas

Registration is open

Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!

Register now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 28 replies
  • 5620 views
  • 6 likes
  • 4 in conversation