BookmarkSubscribeRSS Feed
azhar7860
Obsidian | Level 7
   I am working on SAS  dataset consists of 31 variables and 164,450 obs.
 
And there are 2 to 3 character variables, iam trying to convert them into numeric to check correlation using proc cor.
i have written a code its very generic would you suggest any smart code.
Any suggestion will be highly appreciated
 
Thanks Kindly
 
 
Here is my code:
 
data all;
set all;
if  STATECOD ="AE" then state =1 ;
if  STATECOD ="AK" then state =2 ;
if  STATECOD ="AL" then state =3 ;
if  STATECOD ="AP" then state =4 ;
if  STATECOD ="AR" then state =5 ;
if  STATECOD ="AZ" then state =6 ;
if  STATECOD ="CA" then state =7 ;
if  STATECOD ="CO" then state =8 ;
if  STATECOD ="CT" then state =9 ;
if  STATECOD ="DC" then state =10 ;
if  STATECOD ="DE" then state =11 ;
if  STATECOD ="FL" then state =12 ;
if  STATECOD ="GA" then state =13 ;
if  STATECOD ="HI" then state =14 ;
if  STATECOD ="IA" then state =15 ;
if  STATECOD ="ID" then state =16 ;
if  STATECOD ="IL" then state =17 ;
if  STATECOD ="IN" then state =18 ;
if  STATECOD ="KS" then state =19 ;
if  STATECOD ="KY" then state =20 ;
if  STATECOD ="LA" then state =21 ;
if  STATECOD ="MA" then state =22 ;
if  STATECOD ="MD" then state =23 ;
if  STATECOD ="ME" then state =24 ;
if  STATECOD ="MI" then state =25 ;
if  STATECOD ="MN" then state =26 ;
if  STATECOD ="MO" then state =27 ;
if  STATECOD ="MS" then state =28 ;
if  STATECOD ="MT" then state =29 ;
if  STATECOD ="NC" then state =30 ;
if  STATECOD ="ND" then state =31 ;
if  STATECOD ="NE" then state =32 ;
if  STATECOD ="NH" then state =33 ;
if  STATECOD ="NJ" then state =34 ;
if  STATECOD ="NM" then state =35 ;
if  STATECOD ="NV" then state =36 ;
if  STATECOD ="NY" then state =37 ;
if  STATECOD ="OH" then state =38 ;
if  STATECOD ="OK" then state =39 ;
if  STATECOD ="OR" then state =40 ;
if  STATECOD ="PA" then state =41 ;
if  STATECOD ="PR" then state =42 ;
if  STATECOD ="RI" then state =43 ;
if  STATECOD ="SC" then state =44 ;
if  STATECOD ="SD" then state =45 ;
if  STATECOD ="TN" then state =46 ;
if  STATECOD ="TX" then state =47 ;
if  STATECOD ="UT" then state =48 ;
if  STATECOD ="VA" then state =49 ;
if  STATECOD ="VI" then state =50 ;
if  STATECOD ="VT" then state =51 ;
if  STATECOD ="WA" then state =52 ;
if  STATECOD ="WI" then state =53 ;
if  STATECOD ="WV" then state =54 ;
if  STATECOD ="WY" then state =55 ;
run;
 
 
data all;
set all;
if  NTITLE ="Mr" then title =1;
if  NTITLE ="Mrs" then title =2;
if  NTITLE ="Ms" then title =3;
if  NTITLE ="None" then title =4;
run;
9 REPLIES 9
LinusH
Tourmaline | Level 20
Create look up tables with these mappings.
You can use that either to join with your original table, or as source for creating SAS formats, which you later can use to transform your values.
Data never sleeps
ballardw
Super User

See if

 

State= stfips(statecod);

 

looks helpful.

Though I am wondering what "state" is represented by "AE", "AP" ???

 

FIPS is a US standard for certain types of coding and SAS has a number of functions that support such things.

Also this function will accept mixed case in case some of your values are "Al" instead of "AL".

 

The values won't exactly align but have the added value that they are a known standard and you can retrieve information from then with functions such as FIPNAME or FIPNAMEL or FIPSTATE.

Kurt_Bremser
Super User

What @LinusH means might look like this:

data myformat;
input start $ label $
fmtname = 'mystates';
type = 'C';
datalines;
AE 1
AK 2
AL 3
;
run;

You can join along start, or use it as a cntlin for proc format; note how the datalines reduce your typing effort.

azhar7860
Obsidian | Level 7

 

 

 

I mdofied my previous codes and divided the states into  4 Regions, but its generating a character missing values.

any inputs will be appreciated and also can i change the new variable STATECOD_1 to Numeric Type.

 

Thanks kindly

 

 

 

data sas_pr.merge_4;
set sas_pr.merge_3;
length STATECOD_1 $ 14;
if STATECOD = in('CT', 'ME','MA','NH','RI','VT','NJ','NY','PA')
THEN STATECOD_1 = 'Region_1';
if STATECOD = in('IL','IN','MI','OH','WI','IA','KS','MN','MO','NE','ND','SD')
THEN STATECOD_1 = 'Region_2';
if STATECOD = in('DE','FL','GA','MD','NC','SC','VA','DC','WV','AL','KY','MS','TN','AR','LA','OK','TX')
THEN STATECOD_1 = 'Region_3';
if STATECOD = in('AZ','CO','ID','MT','NV','NM','UT','WY','AK','CA','HI','OR','WA')
THEN STATECOD_1 = 'Region_4';
RUN;

PGStats
Opal | Level 21

There should be no equal sign between STATECOD and IN(...).

 

if STATECOD IN(....) then ....;

PG
ballardw
Super User

@azhar7860 wrote:

 

 

 

I mdofied my previous codes and divided the states into  4 Regions, but its generating a character missing values.

any inputs will be appreciated and also can i change the new variable STATECOD_1 to Numeric Type.

 

Thanks kindly

 

 

 

data sas_pr.merge_4;
set sas_pr.merge_3;
length STATECOD_1 $ 14;
if STATECOD = in('CT', 'ME','MA','NH','RI','VT','NJ','NY','PA')
THEN STATECOD_1 = 'Region_1';
if STATECOD = in('IL','IN','MI','OH','WI','IA','KS','MN','MO','NE','ND','SD')
THEN STATECOD_1 = 'Region_2';
if STATECOD = in('DE','FL','GA','MD','NC','SC','VA','DC','WV','AL','KY','MS','TN','AR','LA','OK','TX')
THEN STATECOD_1 = 'Region_3';
if STATECOD = in('AZ','CO','ID','MT','NV','NM','UT','WY','AK','CA','HI','OR','WA')
THEN STATECOD_1 = 'Region_4';
RUN;


I notice AE AP VI PR  are no longer involved ...

Kurt_Bremser
Super User

@azhar7860 wrote:

 

 

 

I mdofied my previous codes and divided the states into  4 Regions, but its generating a character missing values.

any inputs will be appreciated and also can i change the new variable STATECOD_1 to Numeric Type.

 

Thanks kindly

 

 

 

data sas_pr.merge_4;
set sas_pr.merge_3;
length STATECOD_1 $ 14;
if STATECOD = in('CT', 'ME','MA','NH','RI','VT','NJ','NY','PA')
THEN STATECOD_1 = 'Region_1';
if STATECOD = in('IL','IN','MI','OH','WI','IA','KS','MN','MO','NE','ND','SD')
THEN STATECOD_1 = 'Region_2';
if STATECOD = in('DE','FL','GA','MD','NC','SC','VA','DC','WV','AL','KY','MS','TN','AR','LA','OK','TX')
THEN STATECOD_1 = 'Region_3';
if STATECOD = in('AZ','CO','ID','MT','NV','NM','UT','WY','AK','CA','HI','OR','WA')
THEN STATECOD_1 = 'Region_4';
RUN;


Once again, create a lookup dataset and a format from that, and the cards section will look like

cards;
CT Region_1
ME Region_1
.....
IL Region_2
IN Region_2

and so on.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1028 views
  • 0 likes
  • 5 in conversation