Solved
Contributor
Posts: 44

# Subset for multiple observations per subject

Hi,

My data looks something like this:

ID  A B  C D  E

1   a  a  b  a  a

1   a  b  b  b  b

2   c  c  b  a  a

2   b  c  c  c  a

I would like to output where the observations differ for any given column (A-E):

ID B D E

1  a a  a

1  b b  b

ID A C D

2  c  b  a

2  b  c  c

Could someone please suggest some code for this? Thanks.

Accepted Solutions
Solution
‎06-24-2016 02:12 PM
Posts: 5,533

## Re: Subset for multiple observations per subject

Like this?

``````data have;
input ID  (A B  C D  E) (\$);
cards;
1   a  a  b  a  a
1   a  b  b  b  b
2   c  c  b  a  a
2   b  c  c  c  a
;

data temp;
length var \$12;
set have;
array x \$ _character_;
do i = 2 to dim(x);
var = vname(x{i});
value = x{i};
output;
end;
keep id var value;
run;

proc sql;
create table want as
select id, var
from temp
group by id, var
having count(distinct value) > 1;
quit;
``````
PG

All Replies
Posts: 5,533

## Re: Subset for multiple observations per subject

Do you simply want to know which variables differ for each ID, or create one new dataset for each ID ?

PG
Contributor
Posts: 44

## Re: Subset for multiple observations per subject

I'd just like to know which variables differ for each ID. Thank you.

Solution
‎06-24-2016 02:12 PM
Posts: 5,533

## Re: Subset for multiple observations per subject

Like this?

``````data have;
input ID  (A B  C D  E) (\$);
cards;
1   a  a  b  a  a
1   a  b  b  b  b
2   c  c  b  a  a
2   b  c  c  c  a
;

data temp;
length var \$12;
set have;
array x \$ _character_;
do i = 2 to dim(x);
var = vname(x{i});
value = x{i};
output;
end;
keep id var value;
run;

proc sql;
create table want as
select id, var
from temp
group by id, var
having count(distinct value) > 1;
quit;
``````
PG
Contributor
Posts: 44

## Re: Subset for multiple observations per subject

Great, thanks, I'll give this a try.

Super User
Posts: 10,784

## Re: Subset for multiple observations per subject

Assuming I understand what you mean.

``````data have;
input ID  (A B  C D  E) (\$);
cards;
1   a  a  b  a  a
1   a  b  b  b  b
2   c  c  b  a  a
2   b  c  c  c  a
;
run;

proc sql;
create table temp as
select id,
(count(distinct A)=count(*)) as A,
(count(distinct B)=count(*)) as B,
(count(distinct C)=count(*)) as C,
(count(distinct D)=count(*)) as D,
(count(distinct E)=count(*)) as E
from have
group by id;
quit;
proc transpose data=temp out=key(where=(col1=1));
by id;
run;
data _null_;
set key;
by id;
if first.id then call execute(cats('data want',id,';set have;if id=',id,';keep id '));
call execute(_name_);
if last.id then call execute(';run;');
run;
``````
Contributor
Posts: 44

## Re: Subset for multiple observations per subject

Wow, thank you. I'll give this a try. Can you suggest a shorthand notation for the explanatory (A-E) variables? I have about 100. Thanks.

Super User
Posts: 10,784

## Re: Subset for multiple observations per subject

Sure.

``````data have;
input ID  (A B  C D  E) (\$);
cards;
1   a  a  b  a  a
1   a  b  b  b  b
2   c  c  b  a  a
2   b  c  c  c  a
;
run;

proc transpose data=have(obs=0) out=vnames;
var _all_;
run;

proc sql noprint;
select catx(' ','(count(distinct',_name_,')=count(*)) as',_name_)
into : list separated by ','
from vnames
where upcase(_name_) ne 'ID';

create table temp as
select id,&list
from have
group by id;
quit;
proc transpose data=temp out=key(where=(col1=1));
by id;
run;
data _null_;
set key;
by id;
if first.id then call execute(cats('data want',id,';set have;if id=',id,';keep id '));
call execute(_name_);
if last.id then call execute(';run;');
run;
``````
Contributor
Posts: 44

## Re: Subset for multiple observations per subject

Wow, thank you very much Keshan.

Contributor
Posts: 44

## Re: Subset for multiple observations per subject

This works, too! Thanks!

🔒 This topic is solved and locked.