DATA Step, Macro, Functions and more

Splitting a data set by unique values of a variable

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 76
Accepted Solution

Splitting a data set by unique values of a variable

I have a dataset with the following variables: ID (character), genotype (character). The dataste looks like this

 

     ID         gene

     A1        WT

     A2        Het

     A3        Hom

     B1        Het

     B2        Hom

     C1        WT

 

 

I want to create a table with the distinct values of gene as the column headers and the ID variables as observations, and then export to an excel document. How can I go about doing this?

 

     WT     Het     Hom

     A1      A2       A3

     C1      B1      B2

 

 

 

 


Accepted Solutions
Solution
‎05-18-2017 02:07 PM
PROC Star
Posts: 7,471

Re: Splitting a data set by unique values of a variable

I don't think @Jagadishkatam's code will work without some additional code, but here is a version using the same concept:

data have;
   input (ID gene) ($);
   cards;
     A1        WT
     A2        Het
     A3        Hom
     B1        Het
     B2        Hom
     C1        WT
     D1        WT
;

proc sort data=have out=need;
  by gene;
run;

data need;
  set need;
  by gene;
  if first.gene then _counter=1;
  else _counter+1;
run;

proc sort data=need;
  by _counter;
run;

proc transpose data=need out=want (drop=_:);
  by _counter;
  var id;
  id gene;
run;

Art, CEO, AnalystFinder.com

View solution in original post


All Replies
Trusted Advisor
Posts: 1,137

Re: Splitting a data set by unique values of a variable

This is an untested code which you could try

 

data have;
input ID $2. gene $;
cards;
A1 WT
A2 Het
A3 Hom
B1 Het
B2 Hom
C1 WT
;

proc transpose data=have out=want;
id gene;
var id ;
run;
Thanks,
Jag
Solution
‎05-18-2017 02:07 PM
PROC Star
Posts: 7,471

Re: Splitting a data set by unique values of a variable

I don't think @Jagadishkatam's code will work without some additional code, but here is a version using the same concept:

data have;
   input (ID gene) ($);
   cards;
     A1        WT
     A2        Het
     A3        Hom
     B1        Het
     B2        Hom
     C1        WT
     D1        WT
;

proc sort data=have out=need;
  by gene;
run;

data need;
  set need;
  by gene;
  if first.gene then _counter=1;
  else _counter+1;
run;

proc sort data=need;
  by _counter;
run;

proc transpose data=need out=want (drop=_:);
  by _counter;
  var id;
  id gene;
run;

Art, CEO, AnalystFinder.com

Frequent Contributor
Posts: 76

Re: Splitting a data set by unique values of a variable

Worked perfectly, thanks!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 329 views
  • 3 likes
  • 3 in conversation