BookmarkSubscribeRSS Feed
Sandeep77
Lapis Lazuli | Level 10

Hi, I have a dataset which has debt_code, icustomerid and datekey. I am using a case when statement using these three columns. I want to find if debt_code, icustomerid and datekey are same then 0 and if debt_code, icustomerid are same and datekey is different then 1 but I am not sure of using the multiple case. Can you please assist?

proc sql;
create table Multiple_entry as
select *,
case when debt_code = icustomerid = Datekey then 0 
else 1 end as Flag
from Previous_Status;
quit;
Error log:
29         proc sql;
30         create table Multiple_entry as
31         select *,
32         case when debt_code = icustomerid = Datekey then 0
33         else 1 end as Flag
34         from Previous_Status;
ERROR: Expression using equals (=) has components that are of different data types.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
35         quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      user cpu time       0.00 seconds
      system cpu time     0.01 seconds
      memory              5589.87k
      OS Memory           32348.00k
      Timestamp           10/20/2022 04:54:56 PM
      Step Count                        10  Switch Count  0

Sample dataset in Previous_status table:
rep_code	debt_code	icustomerid	accountstatusprevious	datekey
133	286338454	16418	133	20220406
133	295191266	16418	133	20220406
133	299889733	16418	133	20220406
133	266208776	21436	133	20190207
133	266208776	21436	133	20190911
133	266208776	21436	133	20200109
133	266208776	21436	133	20201207
133	291458610	21436	133	20190911
133	291458610	21436	133	20190917
133	291458610	21436	133	20200109
133	291458610	21436	133	20201207

I want to flag the 3 rows as 0 because the date is same and 4-7 rows as 1 as the debt code and icustomerid is same but date is different.
4 REPLIES 4
Reeza
Super User

= tests within the same row.
You are testing between rows.
I think you're more looking for a GROUP BY statement as well as CASE in this situation.

 

proc sql;
create table want as 
select *, 
    case when max(datekey)=min(datekey) then 0 
              else 1 end as flag
from have 
group by icustomerid, debt_code;
quit;
Sandeep77
Lapis Lazuli | Level 10

Thank you Reeza, It did gave me the 0 and 1 flag but what if I want 1 row per datekey for same icustomerid? By doing this I will be able to remove the duplicate records.  For e.g., the first 3 rows should show 1 record as the icustomerid and datekey is same. Row 4 to 7 seems to be good as the icustomerid is same but it shows the datekey different. Can you suggest? 

Below is the sample date after adding the flag.

rep_code	debt_code	icustomerid	accountstatusprevious	datekey	flag
133	286338454	16418	133	20220406	0
133	295191266	16418	133	20220406	0
133	299889733	16418	133	20220406	0
133	266208776	21436	133	20190207	1
133	266208776	21436	133	20190911	1
133	266208776	21436	133	20200109	1
133	266208776	21436	133	20201207	1

 

Reeza
Super User

Easy enough but not the question asked initially. Proc sort will keep only unique values based on a set of key variables when you specify the UNIQUEKEY option. It will keep the rest of the variables for that record as well, and this is easier to manage in PROC SORT than SQL. 

 

proc sort data=have out=want uniquekey;
by icustomerID debt_code datekey;
run;

 


@Sandeep77 wrote:

Thank you Reeza, It did gave me the 0 and 1 flag but what if I want 1 row per datekey for same icustomerid? By doing this I will be able to remove the duplicate records.  For e.g., the first 3 rows should show 1 record as the icustomerid and datekey is same. Row 4 to 7 seems to be good as the icustomerid is same but it shows the datekey different. Can you suggest? 

Below is the sample date after adding the flag.

rep_code	debt_code	icustomerid	accountstatusprevious	datekey	flag
133	286338454	16418	133	20220406	0
133	295191266	16418	133	20220406	0
133	299889733	16418	133	20220406	0
133	266208776	21436	133	20190207	1
133	266208776	21436	133	20190911	1
133	266208776	21436	133	20200109	1
133	266208776	21436	133	20201207	1

 




 

Kurt_Bremser
Super User

Have you verified that all three variables used in the comparison are of type character?

 

Edit:

from your example data, this condition can never be true. What you want is a comparison across multiple observations, not a comparison within a single observation.

proc sort data=have;
by debt_code icustomerid datekey;
run;

data want;
set have;
by debt_code icustomerid datekey;
flag = (first.datekey = last.datekey);
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 356 views
  • 2 likes
  • 3 in conversation