Solved: Re: Sincerely asking about how to define a variable when hit different...

septemberbulb · Posted 06-07-2022 11:30 AM

Hello,

I am trying to do this: as long as each customer_ID has records (vertically) with segment meeting the logic below, then we can have a type associated with the customer_ID.

For example A1 (in the table below) has different account ID (7 records) and in the 7 records, the segment has P2 and no T so that the A1 is Type A;

if a client ID has records with segment in ('D', 'M', 'P2', 'P1') and not equal to ‘T’-> Type A
if a client ID has records with segment equal to ‘T’ and none in ('D', 'M', 'P1', 'P2') -> Type B
if a client ID has records with segment equal to 'I' and none in ('D', 'M', 'P1', 'P2',’T’) -> Type C
if a client ID has no records with segment in ('D', 'M', 'P2', 'P1',’T’, 'I') -> Other
if a client ID does not meet any of above requirements but revenue1 >0 and revenue2>0 ->Type A
if a client ID does not meet any of above requirements but has revenue1 <=0 and revenue2<=0 -> Type B

I am not sure with two parts:

how to write the 5/6 conditions (highlighted in orange) that illustrate the conditions that not meet any of above requirements
I am using a case when but that one assigns a type for each row instead of looking at the records in a vertical view and then assign a type for client_ID

proc sql;
create table flag_testing as
select a.customer_ID
,case
when segment in ('D','M','P1','P2') and segment ne 'T' then 'Type A'
when segment eq 'T' and not in ('D', 'M', 'P1', 'P2') then 'Type B'
when segment eq 'I' and not in ('D', 'M', 'P1', 'P2','T') then 'Type C'
when segment not in ('D', 'M', 'P1', 'P2','T','I') then 'Other'
when segment not in ( )and revene1>0 and revene2>0 then 'Type A'
when segment not in ( )and revene1<=0 and revene2<=0 then 'Type B'
else ' '
end as type
from nodup_sub_testing a
group by customer_ID;
quit;

Customer_ID	Account_ID	Record_Date	Segment	Revenue1	Revenue2
A1	001	17-Dec-18	N	0	0
A1	001			0	0
A1	002	17-Apr-19	N	0	5000
A1	002	17-Feb-19	M	10000	4000
A1	003	1-Jan-20	P2	4010	3000
A1	003	1-Feb-20	N	4600	4444
A1	003	2-Feb-20	P2	4330	3509
A2	004	4-Apr-21	I	0	0
A2	004	18-Oct-21	I	0	0
A3	005	17-Aug-20		0	0
A3	006	19-Oct-20	N	1356	1389
A3	007	17-Dec-18	SS	0	0
A3	008	21-Dec-20	N	0	0
B1	009	15-Apr-19	N	8997	7689
B1	009	19-Aug-19	N	758	456
B1	009	21-Oct-19	N	-869	-467
B1	009	16-Dec-19	T	0	0
B2	011	16-Oct-17	I	-97	-97
B2	011	18-Dec-17	N	0	-98
B2	011	19-Feb-18	Q	0	0
B2	011	21-Apr-20	Q	0	0
B2	011	25-Jun-20	Q	0	0
B3	011	25-Jun-20	T	0	0
B3	012	16-Aug-21	D	-50	-900
B3	012	18-Oct-21	T	-90	-89

desired result:

Customer_ID	Type
A1	Type A
A2	Type C
A3	Other
B1	Type B
B2	Type C
B3	Type B

Thanks a lot.

Reeza · Posted 06-07-2022 03:19 PM

Here's how I'd do it via a data step, leveraging BY group processing.

1. Add an indicator variable for each of the TYPES, ('D', 'M', 'P1', 'P2','T','I'). As you iterate through each ID, set the values to 1 and retain it to the last row. So if one row as T then it will be 1 to the end of the ID.

2. Sum Revenue1 and Revenue 2 across the rows as well

3. On the last record of each ID, check your indicators and revenue variables for you criteria and set the type.

4. Output your ID and Type.

Give that a try and if you need help post the code you've managed to work out.

@septemberbulb wrote:

Hello,

I am trying to do this: as long as each customer_ID has records (vertically) with segment meeting the logic below, then we can have a type associated with the customer_ID.

For example A1 (in the table below) has different account ID (7 records) and in the 7 records, the segment has P2 and no T so that the A1 is Type A;

if a client ID has records with segment in ('D', 'M', 'P2', 'P1') and not equal to ‘T’-> Type A

if a client ID has records with segment equal to ‘T’ and none in ('D', 'M', 'P1', 'P2') -> Type B

if a client ID has records with segment equal to 'I' and none in ('D', 'M', 'P1', 'P2',’T’) -> Type C

if a client ID has no records with segment in ('D', 'M', 'P2', 'P1',’T’, 'I') -> Other

if a client ID does not meet any of above requirements but revenue1 >0 and revenue2>0 ->Type A

if a client ID does not meet any of above requirements but has revenue1 <=0 and revenue2<=0 -> Type B

I am not sure with two parts:

how to write the 5/6 conditions (highlighted in orange) that illustrate the conditions that not meet any of above requirements

I am using a case when but that one assigns a type for each row instead of looking at the records in a vertical view and then assign a type for client_ID

proc sql;
create table flag_testing as
select a.customer_ID
,case
when segment in ('D','M','P1','P2') and segment ne 'T' then 'Type A'
when segment eq 'T' and not in ('D', 'M', 'P1', 'P2') then 'Type B'
when segment eq 'I' and not in ('D', 'M', 'P1', 'P2','T') then 'Type C'
when segment not in ('D', 'M', 'P1', 'P2','T','I') then 'Other'
when segment not in ( )and revene1>0 and revene2>0 then 'Type A'
when segment not in ( )and revene1<=0 and revene2<=0 then 'Type B'
else ' '
end as type
from nodup_sub_testing a
group by customer_ID;
quit;

Customer_ID Account_ID Record_Date Segment Revenue1 Revenue2

A1 001 17-Dec-18 N 0 0

A1 001 0 0

A1 002 17-Apr-19 N 0 5000

A1 002 17-Feb-19 M 10000 4000

A1 003 1-Jan-20 P2 4010 3000

A1 003 1-Feb-20 N 4600 4444

A1 003 2-Feb-20 P2 4330 3509

A2 004 4-Apr-21 I 0 0

A2 004 18-Oct-21 I 0 0

A3 005 17-Aug-20 0 0

A3 006 19-Oct-20 N 1356 1389

A3 007 17-Dec-18 SS 0 0

A3 008 21-Dec-20 N 0 0

B1 009 15-Apr-19 N 8997 7689

B1 009 19-Aug-19 N 758 456

B1 009 21-Oct-19 N -869 -467

B1 009 16-Dec-19 T 0 0

B2 011 16-Oct-17 I -97 -97

B2 011 18-Dec-17 N 0 -98

B2 011 19-Feb-18 Q 0 0

B2 011 21-Apr-20 Q 0 0

B2 011 25-Jun-20 Q 0 0

B3 011 25-Jun-20 T 0 0

B3 012 16-Aug-21 D -50 -900

B3 012 18-Oct-21 T -90 -89

desired result:

Customer_ID Type

A1 Type A

A2 Type C

A3 Other

B1 Type B

B2 Type C

B3 Type B

Thanks a lot.

View solution in original post

Reeza · Posted 06-07-2022 11:43 AM

That code doesn't look correct for starters, are you sure that's the code you're running?
Are you set on using SQL or is a data step an option?

septemberbulb · Posted 06-07-2022 11:50 AM

Hi, I know I did not complete the last two scenarios so just put a blank bracket there. a data step is also fine.

ballardw · Posted 06-07-2022 11:45 AM

I am concerned about this:

as long as each customer_ID has records (vertically) with segment meeting the logic below

If "records (vertically)" means that processing has to be done in a specific order then SQL may not be the proper approach. SQL deals with sets of records without any guarantee of sequence of processing.

If the rule involved is more of (description text NOT code)

when ALL segment values in ('D','M','P1','P2') and NO segment values = 'T' then 'Type A'

that is quite different than the record by record value assigned by the Case. Group by expects something more like:

When you use a GROUP BY clause, you also use an aggregate function in the SELECT clause or in a HAVING clause to instruct PROC SQL in how to summarize the data for each group.

Your code is not doing any summarization by group values. (emphasis added for aggregate and summarize)

I think that you may want to consider a data step with By processing retaining some variables that let you know if the current record meets conditions and then set a final status at the end.

Reeza · Posted 06-07-2022 03:19 PM

Here's how I'd do it via a data step, leveraging BY group processing.

1. Add an indicator variable for each of the TYPES, ('D', 'M', 'P1', 'P2','T','I'). As you iterate through each ID, set the values to 1 and retain it to the last row. So if one row as T then it will be 1 to the end of the ID.

2. Sum Revenue1 and Revenue 2 across the rows as well

3. On the last record of each ID, check your indicators and revenue variables for you criteria and set the type.

4. Output your ID and Type.

Give that a try and if you need help post the code you've managed to work out.

@septemberbulb wrote:

Hello,

I am trying to do this: as long as each customer_ID has records (vertically) with segment meeting the logic below, then we can have a type associated with the customer_ID.

For example A1 (in the table below) has different account ID (7 records) and in the 7 records, the segment has P2 and no T so that the A1 is Type A;

if a client ID has records with segment in ('D', 'M', 'P2', 'P1') and not equal to ‘T’-> Type A

if a client ID has records with segment equal to ‘T’ and none in ('D', 'M', 'P1', 'P2') -> Type B

if a client ID has records with segment equal to 'I' and none in ('D', 'M', 'P1', 'P2',’T’) -> Type C

if a client ID has no records with segment in ('D', 'M', 'P2', 'P1',’T’, 'I') -> Other

if a client ID does not meet any of above requirements but revenue1 >0 and revenue2>0 ->Type A

if a client ID does not meet any of above requirements but has revenue1 <=0 and revenue2<=0 -> Type B

I am not sure with two parts:

how to write the 5/6 conditions (highlighted in orange) that illustrate the conditions that not meet any of above requirements

I am using a case when but that one assigns a type for each row instead of looking at the records in a vertical view and then assign a type for client_ID

proc sql;
create table flag_testing as
select a.customer_ID
,case
when segment in ('D','M','P1','P2') and segment ne 'T' then 'Type A'
when segment eq 'T' and not in ('D', 'M', 'P1', 'P2') then 'Type B'
when segment eq 'I' and not in ('D', 'M', 'P1', 'P2','T') then 'Type C'
when segment not in ('D', 'M', 'P1', 'P2','T','I') then 'Other'
when segment not in ( )and revene1>0 and revene2>0 then 'Type A'
when segment not in ( )and revene1<=0 and revene2<=0 then 'Type B'
else ' '
end as type
from nodup_sub_testing a
group by customer_ID;
quit;

Customer_ID Account_ID Record_Date Segment Revenue1 Revenue2

A1 001 17-Dec-18 N 0 0

A1 001 0 0

A1 002 17-Apr-19 N 0 5000

A1 002 17-Feb-19 M 10000 4000

A1 003 1-Jan-20 P2 4010 3000

A1 003 1-Feb-20 N 4600 4444

A1 003 2-Feb-20 P2 4330 3509

A2 004 4-Apr-21 I 0 0

A2 004 18-Oct-21 I 0 0

A3 005 17-Aug-20 0 0

A3 006 19-Oct-20 N 1356 1389

A3 007 17-Dec-18 SS 0 0

A3 008 21-Dec-20 N 0 0

B1 009 15-Apr-19 N 8997 7689

B1 009 19-Aug-19 N 758 456

B1 009 21-Oct-19 N -869 -467

B1 009 16-Dec-19 T 0 0

B2 011 16-Oct-17 I -97 -97

B2 011 18-Dec-17 N 0 -98

B2 011 19-Feb-18 Q 0 0

B2 011 21-Apr-20 Q 0 0

B2 011 25-Jun-20 Q 0 0

B3 011 25-Jun-20 T 0 0

B3 012 16-Aug-21 D -50 -900

B3 012 18-Oct-21 T -90 -89

desired result:

Customer_ID Type

A1 Type A

A2 Type C

A3 Other

B1 Type B

B2 Type C

B3 Type B

Thanks a lot.

Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

Re: Sincerely asking about how to define a variable when hit different logic scenarios

SAS Innovate 2025: Save the Date