Hi,
My data looks something like this:
ID A B C D E
1 a a b a a
1 a b b b b
2 c c b a a
2 b c c c a
I would like to output where the observations differ for any given column (A-E):
ID B D E
1 a a a
1 b b b
ID A C D
2 c b a
2 b c c
Could someone please suggest some code for this? Thanks.
Like this?
data have;
input ID (A B C D E) ($);
cards;
1 a a b a a
1 a b b b b
2 c c b a a
2 b c c c a
;
data temp;
length var $12;
set have;
array x $ _character_;
do i = 2 to dim(x);
var = vname(x{i});
value = x{i};
output;
end;
keep id var value;
run;
proc sql;
create table want as
select id, var
from temp
group by id, var
having count(distinct value) > 1;
quit;
Do you simply want to know which variables differ for each ID, or create one new dataset for each ID ?
I'd just like to know which variables differ for each ID. Thank you.
Like this?
data have;
input ID (A B C D E) ($);
cards;
1 a a b a a
1 a b b b b
2 c c b a a
2 b c c c a
;
data temp;
length var $12;
set have;
array x $ _character_;
do i = 2 to dim(x);
var = vname(x{i});
value = x{i};
output;
end;
keep id var value;
run;
proc sql;
create table want as
select id, var
from temp
group by id, var
having count(distinct value) > 1;
quit;
Great, thanks, I'll give this a try.
Assuming I understand what you mean.
data have;
input ID (A B C D E) ($);
cards;
1 a a b a a
1 a b b b b
2 c c b a a
2 b c c c a
;
run;
proc sql;
create table temp as
select id,
(count(distinct A)=count(*)) as A,
(count(distinct B)=count(*)) as B,
(count(distinct C)=count(*)) as C,
(count(distinct D)=count(*)) as D,
(count(distinct E)=count(*)) as E
from have
group by id;
quit;
proc transpose data=temp out=key(where=(col1=1));
by id;
run;
data _null_;
set key;
by id;
if first.id then call execute(cats('data want',id,';set have;if id=',id,';keep id '));
call execute(_name_);
if last.id then call execute(';run;');
run;
Wow, thank you. I'll give this a try. Can you suggest a shorthand notation for the explanatory (A-E) variables? I have about 100. Thanks.
Sure.
data have;
input ID (A B C D E) ($);
cards;
1 a a b a a
1 a b b b b
2 c c b a a
2 b c c c a
;
run;
proc transpose data=have(obs=0) out=vnames;
var _all_;
run;
proc sql noprint;
select catx(' ','(count(distinct',_name_,')=count(*)) as',_name_)
into : list separated by ','
from vnames
where upcase(_name_) ne 'ID';
create table temp as
select id,&list
from have
group by id;
quit;
proc transpose data=temp out=key(where=(col1=1));
by id;
run;
data _null_;
set key;
by id;
if first.id then call execute(cats('data want',id,';set have;if id=',id,';keep id '));
call execute(_name_);
if last.id then call execute(';run;');
run;
Wow, thank you very much Keshan.
This works, too! Thanks!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.