BookmarkSubscribeRSS Feed
azhar7860
Obsidian | Level 7
   I am working on SAS  dataset consists of 31 variables and 164,450 obs.
 
And there are 2 to 3 character variables, iam trying to convert them into numeric to check correlation using proc cor.
i have written a code its very generic would you suggest any smart code.
Any suggestion will be highly appreciated
 
Thanks Kindly
 
 
Here is my code:
 
data all;
set all;
if  STATECOD ="AE" then state =1 ;
if  STATECOD ="AK" then state =2 ;
if  STATECOD ="AL" then state =3 ;
if  STATECOD ="AP" then state =4 ;
if  STATECOD ="AR" then state =5 ;
if  STATECOD ="AZ" then state =6 ;
if  STATECOD ="CA" then state =7 ;
if  STATECOD ="CO" then state =8 ;
if  STATECOD ="CT" then state =9 ;
if  STATECOD ="DC" then state =10 ;
if  STATECOD ="DE" then state =11 ;
if  STATECOD ="FL" then state =12 ;
if  STATECOD ="GA" then state =13 ;
if  STATECOD ="HI" then state =14 ;
if  STATECOD ="IA" then state =15 ;
if  STATECOD ="ID" then state =16 ;
if  STATECOD ="IL" then state =17 ;
if  STATECOD ="IN" then state =18 ;
if  STATECOD ="KS" then state =19 ;
if  STATECOD ="KY" then state =20 ;
if  STATECOD ="LA" then state =21 ;
if  STATECOD ="MA" then state =22 ;
if  STATECOD ="MD" then state =23 ;
if  STATECOD ="ME" then state =24 ;
if  STATECOD ="MI" then state =25 ;
if  STATECOD ="MN" then state =26 ;
if  STATECOD ="MO" then state =27 ;
if  STATECOD ="MS" then state =28 ;
if  STATECOD ="MT" then state =29 ;
if  STATECOD ="NC" then state =30 ;
if  STATECOD ="ND" then state =31 ;
if  STATECOD ="NE" then state =32 ;
if  STATECOD ="NH" then state =33 ;
if  STATECOD ="NJ" then state =34 ;
if  STATECOD ="NM" then state =35 ;
if  STATECOD ="NV" then state =36 ;
if  STATECOD ="NY" then state =37 ;
if  STATECOD ="OH" then state =38 ;
if  STATECOD ="OK" then state =39 ;
if  STATECOD ="OR" then state =40 ;
if  STATECOD ="PA" then state =41 ;
if  STATECOD ="PR" then state =42 ;
if  STATECOD ="RI" then state =43 ;
if  STATECOD ="SC" then state =44 ;
if  STATECOD ="SD" then state =45 ;
if  STATECOD ="TN" then state =46 ;
if  STATECOD ="TX" then state =47 ;
if  STATECOD ="UT" then state =48 ;
if  STATECOD ="VA" then state =49 ;
if  STATECOD ="VI" then state =50 ;
if  STATECOD ="VT" then state =51 ;
if  STATECOD ="WA" then state =52 ;
if  STATECOD ="WI" then state =53 ;
if  STATECOD ="WV" then state =54 ;
if  STATECOD ="WY" then state =55 ;
run;
 
 
data all;
set all;
if  NTITLE ="Mr" then title =1;
if  NTITLE ="Mrs" then title =2;
if  NTITLE ="Ms" then title =3;
if  NTITLE ="None" then title =4;
run;
9 REPLIES 9
LinusH
Tourmaline | Level 20
Create look up tables with these mappings.
You can use that either to join with your original table, or as source for creating SAS formats, which you later can use to transform your values.
Data never sleeps
ballardw
Super User

See if

 

State= stfips(statecod);

 

looks helpful.

Though I am wondering what "state" is represented by "AE", "AP" ???

 

FIPS is a US standard for certain types of coding and SAS has a number of functions that support such things.

Also this function will accept mixed case in case some of your values are "Al" instead of "AL".

 

The values won't exactly align but have the added value that they are a known standard and you can retrieve information from then with functions such as FIPNAME or FIPNAMEL or FIPSTATE.

Kurt_Bremser
Super User

What @LinusH means might look like this:

data myformat;
input start $ label $
fmtname = 'mystates';
type = 'C';
datalines;
AE 1
AK 2
AL 3
;
run;

You can join along start, or use it as a cntlin for proc format; note how the datalines reduce your typing effort.

azhar7860
Obsidian | Level 7

 

 

 

I mdofied my previous codes and divided the states into  4 Regions, but its generating a character missing values.

any inputs will be appreciated and also can i change the new variable STATECOD_1 to Numeric Type.

 

Thanks kindly

 

 

 

data sas_pr.merge_4;
set sas_pr.merge_3;
length STATECOD_1 $ 14;
if STATECOD = in('CT', 'ME','MA','NH','RI','VT','NJ','NY','PA')
THEN STATECOD_1 = 'Region_1';
if STATECOD = in('IL','IN','MI','OH','WI','IA','KS','MN','MO','NE','ND','SD')
THEN STATECOD_1 = 'Region_2';
if STATECOD = in('DE','FL','GA','MD','NC','SC','VA','DC','WV','AL','KY','MS','TN','AR','LA','OK','TX')
THEN STATECOD_1 = 'Region_3';
if STATECOD = in('AZ','CO','ID','MT','NV','NM','UT','WY','AK','CA','HI','OR','WA')
THEN STATECOD_1 = 'Region_4';
RUN;

PGStats
Opal | Level 21

There should be no equal sign between STATECOD and IN(...).

 

if STATECOD IN(....) then ....;

PG
ballardw
Super User

@azhar7860 wrote:

 

 

 

I mdofied my previous codes and divided the states into  4 Regions, but its generating a character missing values.

any inputs will be appreciated and also can i change the new variable STATECOD_1 to Numeric Type.

 

Thanks kindly

 

 

 

data sas_pr.merge_4;
set sas_pr.merge_3;
length STATECOD_1 $ 14;
if STATECOD = in('CT', 'ME','MA','NH','RI','VT','NJ','NY','PA')
THEN STATECOD_1 = 'Region_1';
if STATECOD = in('IL','IN','MI','OH','WI','IA','KS','MN','MO','NE','ND','SD')
THEN STATECOD_1 = 'Region_2';
if STATECOD = in('DE','FL','GA','MD','NC','SC','VA','DC','WV','AL','KY','MS','TN','AR','LA','OK','TX')
THEN STATECOD_1 = 'Region_3';
if STATECOD = in('AZ','CO','ID','MT','NV','NM','UT','WY','AK','CA','HI','OR','WA')
THEN STATECOD_1 = 'Region_4';
RUN;


I notice AE AP VI PR  are no longer involved ...

Kurt_Bremser
Super User

@azhar7860 wrote:

 

 

 

I mdofied my previous codes and divided the states into  4 Regions, but its generating a character missing values.

any inputs will be appreciated and also can i change the new variable STATECOD_1 to Numeric Type.

 

Thanks kindly

 

 

 

data sas_pr.merge_4;
set sas_pr.merge_3;
length STATECOD_1 $ 14;
if STATECOD = in('CT', 'ME','MA','NH','RI','VT','NJ','NY','PA')
THEN STATECOD_1 = 'Region_1';
if STATECOD = in('IL','IN','MI','OH','WI','IA','KS','MN','MO','NE','ND','SD')
THEN STATECOD_1 = 'Region_2';
if STATECOD = in('DE','FL','GA','MD','NC','SC','VA','DC','WV','AL','KY','MS','TN','AR','LA','OK','TX')
THEN STATECOD_1 = 'Region_3';
if STATECOD = in('AZ','CO','ID','MT','NV','NM','UT','WY','AK','CA','HI','OR','WA')
THEN STATECOD_1 = 'Region_4';
RUN;


Once again, create a lookup dataset and a format from that, and the cards section will look like

cards;
CT Region_1
ME Region_1
.....
IL Region_2
IN Region_2

and so on.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1564 views
  • 0 likes
  • 5 in conversation