Hi, I have a dataset which has debt_code, icustomerid and datekey. I am using a case when statement using these three columns. I want to find if debt_code, icustomerid and datekey are same then 0 and if debt_code, icustomerid are same and datekey is different then 1 but I am not sure of using the multiple case. Can you please assist?
proc sql;
create table Multiple_entry as
select *,
case when debt_code = icustomerid = Datekey then 0
else 1 end as Flag
from Previous_Status;
quit;
Error log:
29 proc sql;
30 create table Multiple_entry as
31 select *,
32 case when debt_code = icustomerid = Datekey then 0
33 else 1 end as Flag
34 from Previous_Status;
ERROR: Expression using equals (=) has components that are of different data types.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
35 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.01 seconds
user cpu time 0.00 seconds
system cpu time 0.01 seconds
memory 5589.87k
OS Memory 32348.00k
Timestamp 10/20/2022 04:54:56 PM
Step Count 10 Switch Count 0
Sample dataset in Previous_status table:
rep_code debt_code icustomerid accountstatusprevious datekey
133 286338454 16418 133 20220406
133 295191266 16418 133 20220406
133 299889733 16418 133 20220406
133 266208776 21436 133 20190207
133 266208776 21436 133 20190911
133 266208776 21436 133 20200109
133 266208776 21436 133 20201207
133 291458610 21436 133 20190911
133 291458610 21436 133 20190917
133 291458610 21436 133 20200109
133 291458610 21436 133 20201207
I want to flag the 3 rows as 0 because the date is same and 4-7 rows as 1 as the debt code and icustomerid is same but date is different.
= tests within the same row.
You are testing between rows.
I think you're more looking for a GROUP BY statement as well as CASE in this situation.
proc sql;
create table want as
select *,
case when max(datekey)=min(datekey) then 0
else 1 end as flag
from have
group by icustomerid, debt_code;
quit;
Thank you Reeza, It did gave me the 0 and 1 flag but what if I want 1 row per datekey for same icustomerid? By doing this I will be able to remove the duplicate records. For e.g., the first 3 rows should show 1 record as the icustomerid and datekey is same. Row 4 to 7 seems to be good as the icustomerid is same but it shows the datekey different. Can you suggest?
Below is the sample date after adding the flag.
rep_code debt_code icustomerid accountstatusprevious datekey flag
133 286338454 16418 133 20220406 0
133 295191266 16418 133 20220406 0
133 299889733 16418 133 20220406 0
133 266208776 21436 133 20190207 1
133 266208776 21436 133 20190911 1
133 266208776 21436 133 20200109 1
133 266208776 21436 133 20201207 1
Easy enough but not the question asked initially. Proc sort will keep only unique values based on a set of key variables when you specify the UNIQUEKEY option. It will keep the rest of the variables for that record as well, and this is easier to manage in PROC SORT than SQL.
proc sort data=have out=want uniquekey;
by icustomerID debt_code datekey;
run;
@Sandeep77 wrote:
Thank you Reeza, It did gave me the 0 and 1 flag but what if I want 1 row per datekey for same icustomerid? By doing this I will be able to remove the duplicate records. For e.g., the first 3 rows should show 1 record as the icustomerid and datekey is same. Row 4 to 7 seems to be good as the icustomerid is same but it shows the datekey different. Can you suggest?
Below is the sample date after adding the flag. rep_code debt_code icustomerid accountstatusprevious datekey flag 133 286338454 16418 133 20220406 0 133 295191266 16418 133 20220406 0 133 299889733 16418 133 20220406 0 133 266208776 21436 133 20190207 1 133 266208776 21436 133 20190911 1 133 266208776 21436 133 20200109 1 133 266208776 21436 133 20201207 1
Have you verified that all three variables used in the comparison are of type character?
Edit:
from your example data, this condition can never be true. What you want is a comparison across multiple observations, not a comparison within a single observation.
proc sort data=have;
by debt_code icustomerid datekey;
run;
data want;
set have;
by debt_code icustomerid datekey;
flag = (first.datekey = last.datekey);
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.