BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ilikesas
Barite | Level 11

hi,

suppose I have the following small sample from a bigger data set:

IDCategory
1CC
2Z1
3B+
4CC
5Z2
6B

What I would like to have is for each unique category a unique variable category:

IDCategoryVar_Cat
1CC1
2Z12
3B+3
4CC1
5Z24
6B5

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Code: Program



data have;
infile cards truncover expandtabs;
input ID Category $;
cards;
1 CC
2 Z1
3 B+
4 CC
5 Z2
6 B
;
run;

data want;
if _n_ =1 then do;
if 0 then set have;
declare hash h();
h.definekey('Category');
h.definedata('n');
h.definedone();
end;
set have;
if h.find() ne 0 then do; count+1;n=count;h.add();end;
drop count;
run;

View solution in original post

5 REPLIES 5
Ksharp
Super User

Code: Program



data have;
infile cards truncover expandtabs;
input ID Category $;
cards;
1 CC
2 Z1
3 B+
4 CC
5 Z2
6 B
;
run;

data want;
if _n_ =1 then do;
if 0 then set have;
declare hash h();
h.definekey('Category');
h.definedata('n');
h.definedone();
end;
set have;
if h.find() ne 0 then do; count+1;n=count;h.add();end;
drop count;
run;
Astounding
PROC Star

Lots of questions need to be answered here ...

Is the "bigger" data set small enough that you can process it many times as needed to get the result?

Is the new variable supposed to be numeric or character?

If it is supposed to be character ... Why?  (What difference would it make using the original values vs. the new variable?)  Should "1" become "001" if there are hundreds of categories?

If it supposed to be numeric ... Do the numbers have to be consecutive, or would any unique numeric value be sufficient?  If they need to be consecutive, do they have to be assigned using the order that appears in the data, or can they be assigned using some other order (such as alphabetical order of CATEGORY)?

There are many ways to approach this, depending on the answers you provide.  For example, if the new variable should be numeric, can take on any integer value, but needs to be a unique match for the original value, you can do this in a DATA step with one statement:

var_cat = input(put(category, $hex4.), hex4.);

But the requirements have to be explained a bit more to make sure the solution matches what you need.

Good luck.

TomKari
Onyx | Level 15

Give this a try. If your data is intrinsically sorted by ID, just get rid of all the "SeqNo" stuff and use ID instead.

Tom

data have;
   set have;
   SeqNo = _n_;
run;

proc sort data=have;
   by Category;
run;

data have;
   set have;
   by Category;
   retain Var_Cat 0;

   if first.Category then
      Var_Cat = Var_Cat + 1;
run;

proc sort data=have out=have(drop=SeqNo);
   by SeqNo;
run;

ballardw
Super User

If you have a smallish number of values you could also make an informat that could create the new variable.

proc format;

invalue Var_cat

'CC' = 1

'Z1' = 2

'B+' = 3;

run;


data want ;

     set have;

     var_cat = input(category,var_cat.);

run;

slchen
Lapis Lazuli | Level 10

data want;

   do until (last);

   set have end=last;

   array temp [1000]$ _temporary_;

   if whichc(category,of temp(*))=0 then do;

       n+1;

       temp(n)=category;

   end;

   end;

   do p=1 to nobs;

   set have point=p nobs=nobs;

   count=whichc(category,of temp(*));

   output;

   end;

   stop;

   drop n;

run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1845 views
  • 9 likes
  • 6 in conversation