Hello,
I am trying to do this: as long as each customer_ID has records (vertically) with segment meeting the logic below, then we can have a type associated with the customer_ID.
For example A1 (in the table below) has different account ID (7 records) and in the 7 records, the segment has P2 and no T so that the A1 is Type A;
I am not sure with two parts:
proc sql;
create table flag_testing as
select a.customer_ID
,case
when segment in ('D','M','P1','P2') and segment ne 'T' then 'Type A'
when segment eq 'T' and not in ('D', 'M', 'P1', 'P2') then 'Type B'
when segment eq 'I' and not in ('D', 'M', 'P1', 'P2','T') then 'Type C'
when segment not in ('D', 'M', 'P1', 'P2','T','I') then 'Other'
when segment not in ( )and revene1>0 and revene2>0 then 'Type A'
when segment not in ( )and revene1<=0 and revene2<=0 then 'Type B'
else ' '
end as type
from nodup_sub_testing a
group by customer_ID;
quit;
Customer_ID | Account_ID | Record_Date | Segment | Revenue1 | Revenue2 |
A1 | 001 | 17-Dec-18 | N | 0 | 0 |
A1 | 001 | 0 | 0 | ||
A1 | 002 | 17-Apr-19 | N | 0 | 5000 |
A1 | 002 | 17-Feb-19 | M | 10000 | 4000 |
A1 | 003 | 1-Jan-20 | P2 | 4010 | 3000 |
A1 | 003 | 1-Feb-20 | N | 4600 | 4444 |
A1 | 003 | 2-Feb-20 | P2 | 4330 | 3509 |
A2 | 004 | 4-Apr-21 | I | 0 | 0 |
A2 | 004 | 18-Oct-21 | I | 0 | 0 |
A3 | 005 | 17-Aug-20 | 0 | 0 | |
A3 | 006 | 19-Oct-20 | N | 1356 | 1389 |
A3 | 007 | 17-Dec-18 | SS | 0 | 0 |
A3 | 008 | 21-Dec-20 | N | 0 | 0 |
B1 | 009 | 15-Apr-19 | N | 8997 | 7689 |
B1 | 009 | 19-Aug-19 | N | 758 | 456 |
B1 | 009 | 21-Oct-19 | N | -869 | -467 |
B1 | 009 | 16-Dec-19 | T | 0 | 0 |
B2 | 011 | 16-Oct-17 | I | -97 | -97 |
B2 | 011 | 18-Dec-17 | N | 0 | -98 |
B2 | 011 | 19-Feb-18 | Q | 0 | 0 |
B2 | 011 | 21-Apr-20 | Q | 0 | 0 |
B2 | 011 | 25-Jun-20 | Q | 0 | 0 |
B3 | 011 | 25-Jun-20 | T | 0 | 0 |
B3 | 012 | 16-Aug-21 | D | -50 | -900 |
B3 | 012 | 18-Oct-21 | T | -90 | -89 |
desired result:
Customer_ID | Type |
A1 | Type A |
A2 | Type C |
A3 | Other |
B1 | Type B |
B2 | Type C |
B3 | Type B |
Thanks a lot.
Here's how I'd do it via a data step, leveraging BY group processing.
1. Add an indicator variable for each of the TYPES, ('D', 'M', 'P1', 'P2','T','I'). As you iterate through each ID, set the values to 1 and retain it to the last row. So if one row as T then it will be 1 to the end of the ID.
2. Sum Revenue1 and Revenue 2 across the rows as well
3. On the last record of each ID, check your indicators and revenue variables for you criteria and set the type.
4. Output your ID and Type.
Give that a try and if you need help post the code you've managed to work out.
@septemberbulb wrote:
Hello,
I am trying to do this: as long as each customer_ID has records (vertically) with segment meeting the logic below, then we can have a type associated with the customer_ID.
For example A1 (in the table below) has different account ID (7 records) and in the 7 records, the segment has P2 and no T so that the A1 is Type A;
- if a client ID has records with segment in ('D', 'M', 'P2', 'P1') and not equal to ‘T’-> Type A
- if a client ID has records with segment equal to ‘T’ and none in ('D', 'M', 'P1', 'P2') -> Type B
- if a client ID has records with segment equal to 'I' and none in ('D', 'M', 'P1', 'P2',’T’) -> Type C
- if a client ID has no records with segment in ('D', 'M', 'P2', 'P1',’T’, 'I') -> Other
- if a client ID does not meet any of above requirements but revenue1 >0 and revenue2>0 ->Type A
- if a client ID does not meet any of above requirements but has revenue1 <=0 and revenue2<=0 -> Type B
I am not sure with two parts:
- how to write the 5/6 conditions (highlighted in orange) that illustrate the conditions that not meet any of above requirements
- I am using a case when but that one assigns a type for each row instead of looking at the records in a vertical view and then assign a type for client_ID
proc sql;
create table flag_testing as
select a.customer_ID
,case
when segment in ('D','M','P1','P2') and segment ne 'T' then 'Type A'
when segment eq 'T' and not in ('D', 'M', 'P1', 'P2') then 'Type B'
when segment eq 'I' and not in ('D', 'M', 'P1', 'P2','T') then 'Type C'
when segment not in ('D', 'M', 'P1', 'P2','T','I') then 'Other'
when segment not in ( )and revene1>0 and revene2>0 then 'Type A'
when segment not in ( )and revene1<=0 and revene2<=0 then 'Type B'
else ' '
end as type
from nodup_sub_testing a
group by customer_ID;
quit;
Customer_ID Account_ID Record_Date Segment Revenue1 Revenue2 A1 001 17-Dec-18 N 0 0 A1 001 0 0 A1 002 17-Apr-19 N 0 5000 A1 002 17-Feb-19 M 10000 4000 A1 003 1-Jan-20 P2 4010 3000 A1 003 1-Feb-20 N 4600 4444 A1 003 2-Feb-20 P2 4330 3509 A2 004 4-Apr-21 I 0 0 A2 004 18-Oct-21 I 0 0 A3 005 17-Aug-20 0 0 A3 006 19-Oct-20 N 1356 1389 A3 007 17-Dec-18 SS 0 0 A3 008 21-Dec-20 N 0 0 B1 009 15-Apr-19 N 8997 7689 B1 009 19-Aug-19 N 758 456 B1 009 21-Oct-19 N -869 -467 B1 009 16-Dec-19 T 0 0 B2 011 16-Oct-17 I -97 -97 B2 011 18-Dec-17 N 0 -98 B2 011 19-Feb-18 Q 0 0 B2 011 21-Apr-20 Q 0 0 B2 011 25-Jun-20 Q 0 0 B3 011 25-Jun-20 T 0 0 B3 012 16-Aug-21 D -50 -900 B3 012 18-Oct-21 T -90 -89
desired result:
Customer_ID Type A1 Type A A2 Type C A3 Other B1 Type B B2 Type C B3 Type B Thanks a lot.
Hi, I know I did not complete the last two scenarios so just put a blank bracket there. a data step is also fine.
I am concerned about this:
as long as each customer_ID has records (vertically) with segment meeting the logic below
If "records (vertically)" means that processing has to be done in a specific order then SQL may not be the proper approach. SQL deals with sets of records without any guarantee of sequence of processing.
If the rule involved is more of (description text NOT code)
when ALL segment values in ('D','M','P1','P2') and NO segment values = 'T' then 'Type A'
that is quite different than the record by record value assigned by the Case. Group by expects something more like:
When you use a GROUP BY clause, you also use an aggregate function in the SELECT clause or in a HAVING clause to instruct PROC SQL in how to summarize the data for each group.
Here's how I'd do it via a data step, leveraging BY group processing.
1. Add an indicator variable for each of the TYPES, ('D', 'M', 'P1', 'P2','T','I'). As you iterate through each ID, set the values to 1 and retain it to the last row. So if one row as T then it will be 1 to the end of the ID.
2. Sum Revenue1 and Revenue 2 across the rows as well
3. On the last record of each ID, check your indicators and revenue variables for you criteria and set the type.
4. Output your ID and Type.
Give that a try and if you need help post the code you've managed to work out.
@septemberbulb wrote:
Hello,
I am trying to do this: as long as each customer_ID has records (vertically) with segment meeting the logic below, then we can have a type associated with the customer_ID.
For example A1 (in the table below) has different account ID (7 records) and in the 7 records, the segment has P2 and no T so that the A1 is Type A;
- if a client ID has records with segment in ('D', 'M', 'P2', 'P1') and not equal to ‘T’-> Type A
- if a client ID has records with segment equal to ‘T’ and none in ('D', 'M', 'P1', 'P2') -> Type B
- if a client ID has records with segment equal to 'I' and none in ('D', 'M', 'P1', 'P2',’T’) -> Type C
- if a client ID has no records with segment in ('D', 'M', 'P2', 'P1',’T’, 'I') -> Other
- if a client ID does not meet any of above requirements but revenue1 >0 and revenue2>0 ->Type A
- if a client ID does not meet any of above requirements but has revenue1 <=0 and revenue2<=0 -> Type B
I am not sure with two parts:
- how to write the 5/6 conditions (highlighted in orange) that illustrate the conditions that not meet any of above requirements
- I am using a case when but that one assigns a type for each row instead of looking at the records in a vertical view and then assign a type for client_ID
proc sql;
create table flag_testing as
select a.customer_ID
,case
when segment in ('D','M','P1','P2') and segment ne 'T' then 'Type A'
when segment eq 'T' and not in ('D', 'M', 'P1', 'P2') then 'Type B'
when segment eq 'I' and not in ('D', 'M', 'P1', 'P2','T') then 'Type C'
when segment not in ('D', 'M', 'P1', 'P2','T','I') then 'Other'
when segment not in ( )and revene1>0 and revene2>0 then 'Type A'
when segment not in ( )and revene1<=0 and revene2<=0 then 'Type B'
else ' '
end as type
from nodup_sub_testing a
group by customer_ID;
quit;
Customer_ID Account_ID Record_Date Segment Revenue1 Revenue2 A1 001 17-Dec-18 N 0 0 A1 001 0 0 A1 002 17-Apr-19 N 0 5000 A1 002 17-Feb-19 M 10000 4000 A1 003 1-Jan-20 P2 4010 3000 A1 003 1-Feb-20 N 4600 4444 A1 003 2-Feb-20 P2 4330 3509 A2 004 4-Apr-21 I 0 0 A2 004 18-Oct-21 I 0 0 A3 005 17-Aug-20 0 0 A3 006 19-Oct-20 N 1356 1389 A3 007 17-Dec-18 SS 0 0 A3 008 21-Dec-20 N 0 0 B1 009 15-Apr-19 N 8997 7689 B1 009 19-Aug-19 N 758 456 B1 009 21-Oct-19 N -869 -467 B1 009 16-Dec-19 T 0 0 B2 011 16-Oct-17 I -97 -97 B2 011 18-Dec-17 N 0 -98 B2 011 19-Feb-18 Q 0 0 B2 011 21-Apr-20 Q 0 0 B2 011 25-Jun-20 Q 0 0 B3 011 25-Jun-20 T 0 0 B3 012 16-Aug-21 D -50 -900 B3 012 18-Oct-21 T -90 -89
desired result:
Customer_ID Type A1 Type A A2 Type C A3 Other B1 Type B B2 Type C B3 Type B Thanks a lot.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.