trying to convert a Character variables to numeric for proc corr analysis

New Contributor
Posts: 4

I am working on a SAS dataset with 31 variables and 164,450 observations.

Two or three of the variables are character, and I am trying to convert them to numeric so that I can check correlations using PROC CORR.
The code I have written is very basic; could you suggest a smarter approach?
Any suggestions will be highly appreciated.

Thanks kindly
 
 
Here is my code:
 
data all;
set all;
if  STATECOD ="AE" then state =1 ;
if  STATECOD ="AK" then state =2 ;
if  STATECOD ="AL" then state =3 ;
if  STATECOD ="AP" then state =4 ;
if  STATECOD ="AR" then state =5 ;
if  STATECOD ="AZ" then state =6 ;
if  STATECOD ="CA" then state =7 ;
if  STATECOD ="CO" then state =8 ;
if  STATECOD ="CT" then state =9 ;
if  STATECOD ="DC" then state =10 ;
if  STATECOD ="DE" then state =11 ;
if  STATECOD ="FL" then state =12 ;
if  STATECOD ="GA" then state =13 ;
if  STATECOD ="HI" then state =14 ;
if  STATECOD ="IA" then state =15 ;
if  STATECOD ="ID" then state =16 ;
if  STATECOD ="IL" then state =17 ;
if  STATECOD ="IN" then state =18 ;
if  STATECOD ="KS" then state =19 ;
if  STATECOD ="KY" then state =20 ;
if  STATECOD ="LA" then state =21 ;
if  STATECOD ="MA" then state =22 ;
if  STATECOD ="MD" then state =23 ;
if  STATECOD ="ME" then state =24 ;
if  STATECOD ="MI" then state =25 ;
if  STATECOD ="MN" then state =26 ;
if  STATECOD ="MO" then state =27 ;
if  STATECOD ="MS" then state =28 ;
if  STATECOD ="MT" then state =29 ;
if  STATECOD ="NC" then state =30 ;
if  STATECOD ="ND" then state =31 ;
if  STATECOD ="NE" then state =32 ;
if  STATECOD ="NH" then state =33 ;
if  STATECOD ="NJ" then state =34 ;
if  STATECOD ="NM" then state =35 ;
if  STATECOD ="NV" then state =36 ;
if  STATECOD ="NY" then state =37 ;
if  STATECOD ="OH" then state =38 ;
if  STATECOD ="OK" then state =39 ;
if  STATECOD ="OR" then state =40 ;
if  STATECOD ="PA" then state =41 ;
if  STATECOD ="PR" then state =42 ;
if  STATECOD ="RI" then state =43 ;
if  STATECOD ="SC" then state =44 ;
if  STATECOD ="SD" then state =45 ;
if  STATECOD ="TN" then state =46 ;
if  STATECOD ="TX" then state =47 ;
if  STATECOD ="UT" then state =48 ;
if  STATECOD ="VA" then state =49 ;
if  STATECOD ="VI" then state =50 ;
if  STATECOD ="VT" then state =51 ;
if  STATECOD ="WA" then state =52 ;
if  STATECOD ="WI" then state =53 ;
if  STATECOD ="WV" then state =54 ;
if  STATECOD ="WY" then state =55 ;
run;
 
 
data all;
set all;
if  NTITLE ="Mr" then title =1;
if  NTITLE ="Mrs" then title =2;
if  NTITLE ="Ms" then title =3;
if  NTITLE ="None" then title =4;
run;
Super User
Posts: 5,884

Re: trying to convert a Character variables to numeric for proc corr analysis

Posted in reply to azhar7860
Create lookup tables with these mappings.
You can then either join them with your original table, or use them as the source for creating SAS formats, which you can later use to transform your values.
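
As a rough sketch of the join option, using the NTITLE mapping from the question: the names title_map and all_coded are made up for illustration, and the code assumes NTITLE in the original table is spelled exactly as shown (the comparison is case-sensitive) and that the table has no TITLE variable yet.

```sas
/* Hypothetical lookup table: one row per NTITLE value */
data title_map;
  length ntitle $4;
  input ntitle $ title;
  datalines;
Mr 1
Mrs 2
Ms 3
None 4
;
run;

/* Left join keeps every row of the original table;
   unmatched NTITLE values get a missing TITLE */
proc sql;
  create table all_coded as
  select a.*, m.title
  from work.all as a
       left join title_map as m
       on a.ntitle = m.ntitle;
quit;
```

Adding a new title then means adding one data line rather than another IF statement.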
Data never sleeps
Super User
Posts: 13,583

Re: trying to convert a Character variables to numeric for proc corr analysis

Posted in reply to azhar7860

See if

 

State= stfips(statecod);

 

looks helpful.

Though I am wondering what "state" is represented by "AE", "AP" ???

 

FIPS is a US standard for certain types of coding and SAS has a number of functions that support such things.

Also this function will accept mixed case in case some of your values are "Al" instead of "AL".

 

The values won't exactly align with yours, but they have the added value of being a known standard, and you can retrieve information from them with functions such as FIPNAME, FIPNAMEL or FIPSTATE.
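
A small sketch of how these functions fit together, run on a few made-up input rows (the dataset name fips_demo and the variables state_name and postal are illustrative only). It is worth checking the log to see how STFIPS treats codes such as AE and AP, which are not in the sample below.

```sas
/* Sketch: postal code -> numeric FIPS code -> name and postal code */
data fips_demo;
  input statecod $2.;
  state      = stfips(statecod);   /* numeric FIPS state code */
  state_name = fipnamel(state);    /* state name in mixed case */
  postal     = fipstate(state);    /* two-letter postal code   */
  datalines;
AL
Al
NC
;
run;

proc print data=fips_demo;
run;
```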

Super User
Posts: 10,279

Re: trying to convert a Character variables to numeric for proc corr analysis

Posted in reply to azhar7860

What @LinusH means might look like this:

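data myformat;
/* one row per mapping: START is the raw value, LABEL is what it maps to */
input start $ label $;
fmtname = 'mystates'; /* name of the format to be built */
type = 'C';           /* 'C' = character format */
datalines;
AE 1
AK 2
AL 3
;
run;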

You can join on start, or use the dataset as the CNTLIN= input for PROC FORMAT; note how the datalines reduce your typing effort.
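
A possible way to carry this through to a numeric variable is sketched below. It is a variant of the dataset above, not the same thing: it builds a numeric informat (TYPE='I') instead of a character format, so that INPUT returns a number directly. The names state_map, statenum and want are made up, and only the first three codes are listed.

```sas
/* Control dataset for a numeric informat (assumed name: statenum) */
data state_map;
  retain fmtname 'statenum' type 'I';  /* 'I' = numeric informat */
  input start $ label $;
  datalines;
AE 1
AK 2
AL 3
;
run;

/* Build the informat from the control dataset */
proc format cntlin=state_map;
run;

/* Apply it; ?? suppresses log notes for codes that are not in the lookup */
data want;
  set all;
  state = input(statecod, ?? statenum.);
run;
```

Codes missing from the lookup should come out as missing values rather than stopping the step, but that is worth confirming on your own data.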

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
New Contributor
Posts: 4

Re: trying to convert a Character variables to numeric for proc corr analysis

Posted in reply to KurtBremser

I modified my previous code and divided the states into 4 regions, but it is generating missing character values.

Any input will be appreciated. Also, can I change the new variable STATECOD_1 to numeric type?

Thanks kindly

 

 

 

data sas_pr.merge_4;
set sas_pr.merge_3;
length STATECOD_1 $ 14;
if STATECOD = in('CT', 'ME','MA','NH','RI','VT','NJ','NY','PA')
THEN STATECOD_1 = 'Region_1';
if STATECOD = in('IL','IN','MI','OH','WI','IA','KS','MN','MO','NE','ND','SD')
THEN STATECOD_1 = 'Region_2';
if STATECOD = in('DE','FL','GA','MD','NC','SC','VA','DC','WV','AL','KY','MS','TN','AR','LA','OK','TX')
THEN STATECOD_1 = 'Region_3';
if STATECOD = in('AZ','CO','ID','MT','NV','NM','UT','WY','AK','CA','HI','OR','WA')
THEN STATECOD_1 = 'Region_4';
RUN;

Esteemed Advisor
Posts: 5,540

Re: trying to convert a Character variables to numeric for proc corr analysis

Posted in reply to azhar7860

There should be no equal sign between STATECOD and IN(...).

 

if STATECOD IN(....) then ....;
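
Applied to the posted step, and storing the region as a number instead of text, this could look like the sketch below. The variable name region is an assumption, as is the choice of ELSE IF so that evaluation stops at the first match.

```sas
data sas_pr.merge_4;
  set sas_pr.merge_3;
  /* IN is an operator, so there is no equal sign in front of it */
  if STATECOD in ('CT','ME','MA','NH','RI','VT','NJ','NY','PA')
    then region = 1;
  else if STATECOD in ('IL','IN','MI','OH','WI','IA','KS','MN','MO','NE','ND','SD')
    then region = 2;
  else if STATECOD in ('DE','FL','GA','MD','NC','SC','VA','DC','WV','AL','KY','MS','TN','AR','LA','OK','TX')
    then region = 3;
  else if STATECOD in ('AZ','CO','ID','MT','NV','NM','UT','WY','AK','CA','HI','OR','WA')
    then region = 4;
  else region = .;   /* any code not listed stays missing */
run;
```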

PG
New Contributor
Posts: 4

Re: trying to convert a Character variables to numeric for proc corr analysis

Thanks kindly

Super User
Posts: 13,583

Re: trying to convert a Character variables to numeric for proc corr analysis

Posted in reply to azhar7860

I notice that AE, AP, VI and PR are no longer involved ...

New Contributor
Posts: 4

Re: trying to convert a Character variables to numeric for proc corr analysis

Much appreciated.
Thanks kindly
Super User
Posts: 10,279

Re: trying to convert a Character variables to numeric for proc corr analysis

Posted in reply to azhar7860

Once again, create a lookup dataset and a format from that, and the cards section will look like

cards;
CT Region_1
ME Region_1
.....
IL Region_2
IN Region_2

and so on.
