BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
septemberbulb
Obsidian | Level 7

Hello, 

I am trying to do this: as long as each customer_ID has records (vertically) with segment meeting the logic below, then we can have a type associated with the customer_ID. 

For example A1 (in the table below) has different account ID (7 records)  and in the 7 records, the segment has P2 and no T so that the A1 is Type A; 

  • if a client ID has records with segment in ('D', 'M', 'P2', 'P1') and not equal to ‘T’-> Type A
  • if a client ID has records with segment equal to ‘T’ and none in ('D', 'M', 'P1', 'P2') -> Type B
  • if a client ID has records with segment equal to 'I'  and none in ('D', 'M', 'P1', 'P2',’T’) -> Type C
  • if a client ID has no records with segment in ('D', 'M', 'P2', 'P1',’T’, 'I') -> Other
  • if a client ID does not meet any of above requirements but revenue1 >0 and revenue2>0 ->Type A
  • if a client ID does not meet any of above requirements but has  revenue1 <=0 and revenue2<=0 -> Type B

I am not sure with two parts:

  1. how to write the 5/6 conditions (highlighted in orange) that illustrate the conditions that not meet any of above requirements
  2. I am using a case when but that one assigns a type for each row instead of looking at the records in a vertical view and then assign a type for client_ID

proc sql;
create table flag_testing as
select a.customer_ID
,case
when segment in ('D','M','P1','P2') and segment ne 'T' then 'Type A'
when segment eq 'T' and not in ('D', 'M', 'P1', 'P2') then 'Type B'
when segment eq 'I' and not in ('D', 'M', 'P1', 'P2','T') then 'Type C'
when segment not in ('D', 'M', 'P1', 'P2','T','I') then 'Other'
when segment not in ( )and revene1>0 and revene2>0 then 'Type A'
when segment not in ( )and revene1<=0 and revene2<=0 then 'Type B'
else ' '
end as type
from nodup_sub_testing a
group by customer_ID;
quit;

 

Customer_IDAccount_IDRecord_DateSegmentRevenue1Revenue2
A100117-Dec-18N00
A1001  00
A100217-Apr-19N05000
A100217-Feb-19M100004000
A10031-Jan-20P240103000
A10031-Feb-20N46004444
A10032-Feb-20P243303509
A20044-Apr-21I00
A200418-Oct-21I00
A300517-Aug-20 00
A300619-Oct-20N13561389
A300717-Dec-18SS00
A300821-Dec-20N00
B100915-Apr-19N89977689
B100919-Aug-19N758456
B100921-Oct-19N-869-467
B100916-Dec-19T00
B201116-Oct-17I-97-97
B201118-Dec-17N0-98
B201119-Feb-18Q00
B201121-Apr-20Q00
B201125-Jun-20Q00
B301125-Jun-20T00
B301216-Aug-21D-50-900
B301218-Oct-21T-90-89

 

desired result:

Customer_IDType
A1Type A
A2Type C
A3Other
B1Type B
B2Type C
B3Type B

Thanks a lot. 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Here's how I'd do it via a data step, leveraging BY group processing. 

 

1. Add an indicator variable for each of the TYPES, ('D', 'M', 'P1', 'P2','T','I'). As you iterate through each ID, set the values to 1 and retain it to the last row. So if one row as T then it will be 1 to the end of the ID.

2. Sum Revenue1 and Revenue 2 across the rows as well

3. On the last record of each ID, check your indicators and revenue variables for you criteria and set the type. 

4. Output your ID and Type. 

 

Give that a try and if you need help post the code you've managed to work out. 

 

 


@septemberbulb wrote:

Hello, 

I am trying to do this: as long as each customer_ID has records (vertically) with segment meeting the logic below, then we can have a type associated with the customer_ID. 

For example A1 (in the table below) has different account ID (7 records)  and in the 7 records, the segment has P2 and no T so that the A1 is Type A; 

  • if a client ID has records with segment in ('D', 'M', 'P2', 'P1') and not equal to ‘T’-> Type A
  • if a client ID has records with segment equal to ‘T’ and none in ('D', 'M', 'P1', 'P2') -> Type B
  • if a client ID has records with segment equal to 'I'  and none in ('D', 'M', 'P1', 'P2',’T’) -> Type C
  • if a client ID has no records with segment in ('D', 'M', 'P2', 'P1',’T’, 'I') -> Other
  • if a client ID does not meet any of above requirements but revenue1 >0 and revenue2>0 ->Type A
  • if a client ID does not meet any of above requirements but has  revenue1 <=0 and revenue2<=0 -> Type B

I am not sure with two parts:

  1. how to write the 5/6 conditions (highlighted in orange) that illustrate the conditions that not meet any of above requirements
  2. I am using a case when but that one assigns a type for each row instead of looking at the records in a vertical view and then assign a type for client_ID

proc sql;
create table flag_testing as
select a.customer_ID
,case
when segment in ('D','M','P1','P2') and segment ne 'T' then 'Type A'
when segment eq 'T' and not in ('D', 'M', 'P1', 'P2') then 'Type B'
when segment eq 'I' and not in ('D', 'M', 'P1', 'P2','T') then 'Type C'
when segment not in ('D', 'M', 'P1', 'P2','T','I') then 'Other'
when segment not in ( )and revene1>0 and revene2>0 then 'Type A'
when segment not in ( )and revene1<=0 and revene2<=0 then 'Type B'
else ' '
end as type
from nodup_sub_testing a
group by customer_ID;
quit;

 

Customer_ID Account_ID Record_Date Segment Revenue1 Revenue2
A1 001 17-Dec-18 N 0 0
A1 001     0 0
A1 002 17-Apr-19 N 0 5000
A1 002 17-Feb-19 M 10000 4000
A1 003 1-Jan-20 P2 4010 3000
A1 003 1-Feb-20 N 4600 4444
A1 003 2-Feb-20 P2 4330 3509
A2 004 4-Apr-21 I 0 0
A2 004 18-Oct-21 I 0 0
A3 005 17-Aug-20   0 0
A3 006 19-Oct-20 N 1356 1389
A3 007 17-Dec-18 SS 0 0
A3 008 21-Dec-20 N 0 0
B1 009 15-Apr-19 N 8997 7689
B1 009 19-Aug-19 N 758 456
B1 009 21-Oct-19 N -869 -467
B1 009 16-Dec-19 T 0 0
B2 011 16-Oct-17 I -97 -97
B2 011 18-Dec-17 N 0 -98
B2 011 19-Feb-18 Q 0 0
B2 011 21-Apr-20 Q 0 0
B2 011 25-Jun-20 Q 0 0
B3 011 25-Jun-20 T 0 0
B3 012 16-Aug-21 D -50 -900
B3 012 18-Oct-21 T -90 -89

 

desired result:

Customer_ID Type
A1 Type A
A2 Type C
A3 Other
B1 Type B
B2 Type C
B3 Type B

Thanks a lot. 


 

View solution in original post

4 REPLIES 4
Reeza
Super User
That code doesn't look correct for starters, are you sure that's the code you're running?
Are you set on using SQL or is a data step an option?
septemberbulb
Obsidian | Level 7

Hi, I know I did not complete the last two scenarios so just put a blank bracket there. a data step is also fine. 

ballardw
Super User

I am concerned about this:

as long as each customer_ID has records (vertically) with segment meeting the logic below

If "records (vertically)" means that processing has to be done in a specific order then SQL may not be the proper approach. SQL deals with sets of records without any guarantee of sequence of processing.

If the rule involved is more of (description text NOT code)

when ALL segment values in ('D','M','P1','P2') and NO segment values = 'T' then 'Type A'

that is quite different than the record by record value assigned by the Case. Group by expects something more like:

When you use a GROUP BY clause, you also use an aggregate function in the SELECT clause or in a HAVING clause to instruct PROC SQL in how to summarize the data for each group.
 
Your code is not doing any summarization by group values. (emphasis added for aggregate and summarize)
 
I think that you may want to consider a data step with By processing retaining some variables that let you know if the current record meets conditions and then set a final status at the end.
Reeza
Super User

Here's how I'd do it via a data step, leveraging BY group processing. 

 

1. Add an indicator variable for each of the TYPES, ('D', 'M', 'P1', 'P2','T','I'). As you iterate through each ID, set the values to 1 and retain it to the last row. So if one row as T then it will be 1 to the end of the ID.

2. Sum Revenue1 and Revenue 2 across the rows as well

3. On the last record of each ID, check your indicators and revenue variables for you criteria and set the type. 

4. Output your ID and Type. 

 

Give that a try and if you need help post the code you've managed to work out. 

 

 


@septemberbulb wrote:

Hello, 

I am trying to do this: as long as each customer_ID has records (vertically) with segment meeting the logic below, then we can have a type associated with the customer_ID. 

For example A1 (in the table below) has different account ID (7 records)  and in the 7 records, the segment has P2 and no T so that the A1 is Type A; 

  • if a client ID has records with segment in ('D', 'M', 'P2', 'P1') and not equal to ‘T’-> Type A
  • if a client ID has records with segment equal to ‘T’ and none in ('D', 'M', 'P1', 'P2') -> Type B
  • if a client ID has records with segment equal to 'I'  and none in ('D', 'M', 'P1', 'P2',’T’) -> Type C
  • if a client ID has no records with segment in ('D', 'M', 'P2', 'P1',’T’, 'I') -> Other
  • if a client ID does not meet any of above requirements but revenue1 >0 and revenue2>0 ->Type A
  • if a client ID does not meet any of above requirements but has  revenue1 <=0 and revenue2<=0 -> Type B

I am not sure with two parts:

  1. how to write the 5/6 conditions (highlighted in orange) that illustrate the conditions that not meet any of above requirements
  2. I am using a case when but that one assigns a type for each row instead of looking at the records in a vertical view and then assign a type for client_ID

proc sql;
create table flag_testing as
select a.customer_ID
,case
when segment in ('D','M','P1','P2') and segment ne 'T' then 'Type A'
when segment eq 'T' and not in ('D', 'M', 'P1', 'P2') then 'Type B'
when segment eq 'I' and not in ('D', 'M', 'P1', 'P2','T') then 'Type C'
when segment not in ('D', 'M', 'P1', 'P2','T','I') then 'Other'
when segment not in ( )and revene1>0 and revene2>0 then 'Type A'
when segment not in ( )and revene1<=0 and revene2<=0 then 'Type B'
else ' '
end as type
from nodup_sub_testing a
group by customer_ID;
quit;

 

Customer_ID Account_ID Record_Date Segment Revenue1 Revenue2
A1 001 17-Dec-18 N 0 0
A1 001     0 0
A1 002 17-Apr-19 N 0 5000
A1 002 17-Feb-19 M 10000 4000
A1 003 1-Jan-20 P2 4010 3000
A1 003 1-Feb-20 N 4600 4444
A1 003 2-Feb-20 P2 4330 3509
A2 004 4-Apr-21 I 0 0
A2 004 18-Oct-21 I 0 0
A3 005 17-Aug-20   0 0
A3 006 19-Oct-20 N 1356 1389
A3 007 17-Dec-18 SS 0 0
A3 008 21-Dec-20 N 0 0
B1 009 15-Apr-19 N 8997 7689
B1 009 19-Aug-19 N 758 456
B1 009 21-Oct-19 N -869 -467
B1 009 16-Dec-19 T 0 0
B2 011 16-Oct-17 I -97 -97
B2 011 18-Dec-17 N 0 -98
B2 011 19-Feb-18 Q 0 0
B2 011 21-Apr-20 Q 0 0
B2 011 25-Jun-20 Q 0 0
B3 011 25-Jun-20 T 0 0
B3 012 16-Aug-21 D -50 -900
B3 012 18-Oct-21 T -90 -89

 

desired result:

Customer_ID Type
A1 Type A
A2 Type C
A3 Other
B1 Type B
B2 Type C
B3 Type B

Thanks a lot. 


 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 319 views
  • 1 like
  • 3 in conversation