Guys, I have a dataset i have and I want the output to look like this. I want the output to look like this
BBB B2546 C1254
BBB B4578 C12456
BBB B7995 C14576
BBB B1245 C1259
I want to build a logic where on my ID group if there is a match of account1 and account2 then I want that ID group to be deleted (eg: AAA A2345 A2345 & CCC D4568 D4568) and I want to have the ID group that has all different account1 as well as account2 . I tried the code below which didn't work. I appreciate any help on this in advance.
data check;
input id $ account1 $ account2 $ ;
cards;
AAA A1234 B1456
AAA A1234 B74156
AAA A2345 A2345
BBB B2546 C1254
BBB B4578 C12456
BBB B7995 C14576
BBB B1245 C1259
CCC D4568 F1254
CCC D4568 G1458
CCC D4568 D4568
;
RUN;
PROC SQL;
CREATE TABLE NEW AS
SELECT *
FROM CHECK
GROUP BY ID
HAVING ACCCOUNT1 NOT IN (SELECT DISTINCT ACCOUNT2 FROM CHECK );
QUIT;
please try the below code
proc sql;
create table want as select * from check where id not in ( select distinct id from check where account1=account2);
quit;
Almost identical to the logic in this question
https://communities.sas.com/t5/SAS-Programming/How-to-label-rows-that-don-t-match/m-p/643983#M192280
and the solution ought to work for you as well.
please try the below code
proc sql;
create table want as select * from check where id not in ( select distinct id from check where account1=account2);
quit;
Hi @Jagadishkatam Hope all's well and you are safe in among the world's Covid19 fight. I'm afraid your code will miss some intricacies should the data be slightly different like:
AAA A1234 B1456
AAA A1234 A2345
AAA A2345 B74156
However, I would have to agree with you that's not the case in the sample posted by @buddha_d . Nonetheless, this is a kind of situation I would often refer back to @ChrisNZ aka Mr. Perfect's teachings to never assume. On the other hand, I would also think genius @mkeintz would prefer a Hash check method solution as he deems a lot of Proc SQL is rather "ready meals" though fun and easy. 🙂 We have our banters every so often as it's been a privilege to learn from these marvelous people.
Therefore, something like the below would be needed to address in full
data check;
input id $ account1 $ account2 $ ;
cards;
AAA A1234 B1456
AAA A1234 A2345
AAA A2345 B74156
BBB B2546 C1254
BBB B4578 C12456
BBB B7995 C14576
BBB B1245 C1259
CCC D4568 F1254
CCC D4568 G1458
CCC D4568 D4568
;
RUN;
proc sql;
create table want as
select distinct a.*
from check a left join check b
on a.account1=b.account2
group by a.id
having not n(b.account2);
quit;
proc print noobs;run;
| id | account1 | account2 |
|---|---|---|
| BBB | B1245 | C1259 |
| BBB | B2546 | C1254 |
| BBB | B4578 | C12456 |
| BBB | B7995 | C14576 |
@novinosrin lol love your nicknames! 🙂 Thank you! 👍
A data step solution:
data want;
merge
check (
in=to_delete
where=(account1 = account2)
)
check
;
by id;
if not to_delete;
run;
I think a data step is the way to go:
data new;
merge
check(where=(account1=account2) in=duplicate)
check;
by id;
if not duplicate;
run;
You may get a message about more than one data set in the merge having repeat of BY values, but that is OK in this case.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.