Guys, I have a dataset i have and I want the output to look like this. I want the output to look like this
BBB B2546 C1254
BBB B4578 C12456
BBB B7995 C14576
BBB B1245 C1259
I want to build a logic where on my ID group if there is a match of account1 and account2 then I want that ID group to be deleted (eg: AAA A2345 A2345 & CCC D4568 D4568) and I want to have the ID group that has all different account1 as well as account2 . I tried the code below which didn't work. I appreciate any help on this in advance.
data check;
input id $ account1 $ account2 $ ;
cards;
AAA A1234 B1456
AAA A1234 B74156
AAA A2345 A2345
BBB B2546 C1254
BBB B4578 C12456
BBB B7995 C14576
BBB B1245 C1259
CCC D4568 F1254
CCC D4568 G1458
CCC D4568 D4568
;
RUN;
PROC SQL;
CREATE TABLE NEW AS
SELECT *
FROM CHECK
GROUP BY ID
HAVING ACCCOUNT1 NOT IN (SELECT DISTINCT ACCOUNT2 FROM CHECK );
QUIT;
please try the below code
proc sql;
create table want as select * from check where id not in ( select distinct id from check where account1=account2);
quit;
Almost identical to the logic in this question
https://communities.sas.com/t5/SAS-Programming/How-to-label-rows-that-don-t-match/m-p/643983#M192280
and the solution ought to work for you as well.
please try the below code
proc sql;
create table want as select * from check where id not in ( select distinct id from check where account1=account2);
quit;
Hi @Jagadishkatam Hope all's well and you are safe in among the world's Covid19 fight. I'm afraid your code will miss some intricacies should the data be slightly different like:
AAA A1234 B1456
AAA A1234 A2345
AAA A2345 B74156
However, I would have to agree with you that's not the case in the sample posted by @buddha_d . Nonetheless, this is a kind of situation I would often refer back to @ChrisNZ aka Mr. Perfect's teachings to never assume. On the other hand, I would also think genius @mkeintz would prefer a Hash check method solution as he deems a lot of Proc SQL is rather "ready meals" though fun and easy. 🙂 We have our banters every so often as it's been a privilege to learn from these marvelous people.
Therefore, something like the below would be needed to address in full
data check;
input id $ account1 $ account2 $ ;
cards;
AAA A1234 B1456
AAA A1234 A2345
AAA A2345 B74156
BBB B2546 C1254
BBB B4578 C12456
BBB B7995 C14576
BBB B1245 C1259
CCC D4568 F1254
CCC D4568 G1458
CCC D4568 D4568
;
RUN;
proc sql;
create table want as
select distinct a.*
from check a left join check b
on a.account1=b.account2
group by a.id
having not n(b.account2);
quit;
proc print noobs;run;
id | account1 | account2 |
---|---|---|
BBB | B1245 | C1259 |
BBB | B2546 | C1254 |
BBB | B4578 | C12456 |
BBB | B7995 | C14576 |
@novinosrin lol love your nicknames! 🙂 Thank you! 👍
A data step solution:
data want;
merge
check (
in=to_delete
where=(account1 = account2)
)
check
;
by id;
if not to_delete;
run;
I think a data step is the way to go:
data new;
merge
check(where=(account1=account2) in=duplicate)
check;
by id;
if not duplicate;
run;
You may get a message about more than one data set in the merge having repeat of BY values, but that is OK in this case.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.