DATA Step, Macro, Functions and more

Creating new dataset

Reply
Contributor
Posts: 49

Creating new dataset

[ Edited ]

I have data which contain 

No. Name Age Scheme
1    A         35      TVF
2    B         21      OPJ
3    C         54      OPJ
4    D         63     JKP
5    A         35     JKP

and now I want in this type
i.e.
if Name is repeated then that scheme flag(1 or 0) will be separate into a new column

 


like this,

 

No.Name Age      TVF OPJ JKP
 1      A        35       1     0      1
 2     B        21        0     1      0
 3     C        54        0     1      0
 4     D        63        0     0     1

 

please give me a suggestion .

 

 

Thank you very much. 

 

 

 

 

Super User
Super User
Posts: 7,392

Re: Creating new dataset

First suggestion would be to follow the guidance provided un the Post button n the new question page, and in the pop up which appears.  Post your test data in the form of a dataste (not done), and what the output shoudl look like.  Then clearly explain the logic between the two.  Why does B and C appear as 1 in col2 but D in col3?  There is no logical reason for this that I can see other than scheme=TVF=1, OPJ=2, JKP=3.  If that is indeed the case, then maybe clarify that simple by using a proc transpose (assumes sorted, and not tested as I am not here to type in test data):

proc transpose data=have out=want;
  by no name age;
  var scheme;
  idlabel scheme;
run;

This should create col1-3, with the label showing the scheme data, however if not sorted, you have multiples etc then it will fail.  If you need them in a particular order, then add to the data that order number, and use that as id <ordernumber>;

Super User
Posts: 9,662

Re: Creating new dataset

you are trying to get design matrix .

1) IML code

data have;
input No Name $ Age Scheme $;
cards;
1    A         35      TVF
2    B         21      OPJ
3    C         54      OPJ
4    D         63     JKP
5    A         35     JKP
;
run;

proc iml;
use have;
read all var{scheme};
close;
vname=unique(scheme);
design=design(scheme);
create want from design[c=vname];
append from design;
close;
quit;
data want;
 merge have want;
run;

2) PROC to get design matrix

data have;
input No Name $ Age Scheme $;
cards;
1    A         35      TVF
2    B         21      OPJ
3    C         54      OPJ
4    D         63     JKP
5    A         35     JKP
;
run;

data have;
 set have;
 dummy=1;
run;
proc logistic data=have outdesign=design outdesignonly;
class scheme/ param=glm;
model dummy=scheme;
run;
data want;
 merge have design(drop=dummy intercept);
run;
Super User
Posts: 5,069

Re: Creating new dataset

[ Edited ]

Close to what you ask, and relatively easy to get:

 

proc sort data=have;

by name age;

run;

 

data have2;

set have;

x=1;

run;

 

proc transpose data=have2;

by name age;

var x;

id scheme;

run;

 

This gets rid of the first column (No.), but you can always add that later.  And it leaves variables with a missing value instead of a 0 (again, you can always fix that later if you need to).  But it produces the right variables, and assigns them a value of 1 in the proper cases.

Ask a Question
Discussion stats
  • 3 replies
  • 146 views
  • 0 likes
  • 4 in conversation