BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jacob_klimek
Obsidian | Level 7

Hello,

 

I am trying to create a class variable based on a variable that is currently character. By class variable I mean a variable that is populated by integers which can be linked to a certain character string, but there are over 100 character strings and they change monthly so I would like to not have to manually write out lines for every one.

 

My data looks like this:

 

id_var       program_id

1                 ABC

2                 DEF

3                 ABC

4                 GHI

5                 GHI

 

and I want it to look like this:

 

id_var       program_id     prog_ind

1                 ABC                     1

2                 DEF                     2

3                 ABC                     1

4                 GHI                      3

5                 GHI                      3

 

 

In the past, I had just been doing them by hand like so:

 

data want; set have;

if program_id="ABC" then prog_ind=1;

run;

 

But as I said, if there is a way to make the process go quicker (in coding, not in computation speed) that would be really helpful.

 

Thank You

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Here is an example using your data to create an Informat and associated format.

 

proc sql;
   create table work.temp as
   select distinct program_id as Start
   from Have;
quit;

data tempcntlin;
   set work.temp;
   length FmtName $ 10.  ;
   FmtName='Program';
   label=put(_n_,best4.);
   type='I';
   output;
   FmtName='ProgramTxt';
   label=Start;
   start=put(_n_,best4.);
   type='N';
   output;
run;
proc sort data=tempcntlin; 
  by Type FmtName start;
run;

Proc format library=work cntlin=tempcntlin out=work.cntlout;
run;

data want;
   set have;
   Prog_ind = input(program_id,Program.);
run;

proc print data=want;
   var Prog_ind;
   format Prog_ind Programtxt.;
run;

NOTE: this approach replaces the format constantly so you should not use it on any previously created data set.

 

I have provided an associated format to display the original value as needed.

A side effect of the Proc SQL step to select the names is that they will come out in alphabetic order and hence 1 will be assigne to the first ordered program_id. If that is not the desired result you will need to specify such.

 

 

I don't know how you are using this but you may not need the numeric variable at all.

View solution in original post

6 REPLIES 6
subin
Calcite | Level 5
You can create a format on the program_id .Use the cntlin option to make formats out of that dataset.

Now apply the format on program_id variable on have dataset like..

Proc sql;
Select (program_I'd,$newfmt.) As program_id.....

subin
Calcite | Level 5
Proc sql;
Select (program_I'd,$newfmt.) As prog_id...
ballardw
Super User

Here is an example using your data to create an Informat and associated format.

 

proc sql;
   create table work.temp as
   select distinct program_id as Start
   from Have;
quit;

data tempcntlin;
   set work.temp;
   length FmtName $ 10.  ;
   FmtName='Program';
   label=put(_n_,best4.);
   type='I';
   output;
   FmtName='ProgramTxt';
   label=Start;
   start=put(_n_,best4.);
   type='N';
   output;
run;
proc sort data=tempcntlin; 
  by Type FmtName start;
run;

Proc format library=work cntlin=tempcntlin out=work.cntlout;
run;

data want;
   set have;
   Prog_ind = input(program_id,Program.);
run;

proc print data=want;
   var Prog_ind;
   format Prog_ind Programtxt.;
run;

NOTE: this approach replaces the format constantly so you should not use it on any previously created data set.

 

I have provided an associated format to display the original value as needed.

A side effect of the Proc SQL step to select the names is that they will come out in alphabetic order and hence 1 will be assigne to the first ordered program_id. If that is not the desired result you will need to specify such.

 

 

I don't know how you are using this but you may not need the numeric variable at all.

Rick_SAS
SAS Super FREQ

You example seems to indicate that you want the prog_ind variable to be the rank of the sorted values of the program_id variable, regardless of its contents. If you don't mind sorting the data, do this:

 

data Have;
input id_var program_id $;
datalines;
1                 ABC
2                 DEF
3                 ABC
4                 GHI
5                 GHI
;

proc sort data=have out=want;
by program_id;
run;

data want;
set want;
by program_id;
if first.program_id then 
   prog_ind + 1;
run;

proc print; run;

If you don't want to leave the data in sorted order, you can restore the original order by resorting on the id_var variable.

Ksharp
Super User
I don't understand your logic . 
Why  GHI    is 3 ?

Ksharp
Super User
OK. Assuming I understand what you mean.




data have;
input id_var       program_id $;
cards;
1                 ABC
2                 DEF
3                 ABC
4                 GHI
5                 GHI
;
run;
data want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h();
  h.definekey('program_id');
  h.definedata('n');
  h.definedone();
 end;
 set have;
 if h.find() ne 0 then do;
  count+1;n=count;h.add();
 end;
 drop count;
run;
  
 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2878 views
  • 3 likes
  • 5 in conversation