Phamhhm
Fluorite | Level 6

Hi

I have a problem recoding a character variable into a numeric categorical variable.

 

DATA Bla;
SET Bla;
IF      area = "Chicago"              THEN area = 1;
ELSE IF area = "New York City"        THEN area = 2;
ELSE IF area = "DC/Maryland/Virginia" THEN area = 3;
ELSE                                       area = .;
RUN;

PROC FREQ DATA = Bla;
TABLE area;
RUN;

 

I don't get the categories in my table, just one row:

area    Frequency    Percent    Cumulative Frequency    Cumulative Percent
.       50000        100.00     50000                   100.00

 

 

Thanks

18 REPLIES
Reeza
Super User
You cannot change a variable type like that. And you destroyed your data so you need to start over and recreate the bla file first.

You cannot recode a variable from character (area) to numeric (area). Change the variable name for one and it will be fine.
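
A minimal sketch of that suggestion, assuming area really is a character variable. The small bla data set built here is made-up sample data standing in for the real one, and the names bla_recoded and narea are only illustrative:

```sas
/* Hypothetical sample data standing in for the real bla data set */
data bla;
    length area $30;
    infile datalines truncover;
    input area $30.;
    datalines;
Chicago
New York City
DC/Maryland/Virginia
Dallas
;
run;

/* Read bla, write a NEW data set, and put the codes in a NEW numeric variable. */
/* The character variable area is compared but never assigned to.              */
data bla_recoded;
    set bla;
    if      area = "Chicago"              then narea = 1;
    else if area = "New York City"        then narea = 2;
    else if area = "DC/Maryland/Virginia" then narea = 3;
    else                                       narea = .;
run;
```

Because narea does not exist in bla, SAS creates it as numeric from the first assignment, so no type conversion should be needed, and bla itself is left unchanged.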
Phamhhm
Fluorite | Level 6

I have a copy of the data.

I did change the name of the variable, but it still doesn't work.

 

narea = 1;

narea =2;

.....

 

 

But for my other variable, I have no problem coding it this way:

 

DATA Bla;

SET Bla;

IF           x_type = "AAA" THEN x_type=1;

ELSE IF x_type = "BBB" THEN x_type=2;

ELSE                                         x_type= .;

       

Reeza
Super User
Check your log. I bet there's a note in there about type conversion. I suspect your area variable is already numeric with a format applied; check it via PROC CONTENTS.
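
For example, something like the following lists each variable with its type, length, and format. It assumes the data set is still named Bla in the current library, as in the question:

```sas
/* Assumes the data set is named Bla, as in the question */
proc contents data=Bla;
run;
```

In the variable list, the Type column shows Char or Num for area, and the Format column shows whether a format is attached.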
Phamhhm
Fluorite | Level 6

Original data = Bla

var = area : Char

 

 

Data = Blabla

new var = narea : Num

Reeza
Super User
I'm not sure what you're trying to say. The variables appear to have the correct type, but if it's not working and you don't show your code, I can't help beyond what I've already stated. Good Luck.
Phamhhm
Fluorite | Level 6

I started over: I deleted the data set, imported it again, and named the new variable narea, and it still does not work.

 

Reeza
Super User
Post your full new code and the log then.
'does not work' does not provide any information that we can help you with. You need to provide explicit details.
ballardw
Super User

You should look at the log from your data step. I bet it has a note somewhere saying that character values were converted to numeric.

I think that your actual data in BLA before this step had a numeric variable named area. In that case, when you wrote

if area='Chicago' then

SAS attempted to convert the text "Chicago" to a numeric value in order to compare it with the value of the variable area. Since none of the strings in the comparisons shown can be converted to a number, every comparison was false and the only assignment that could run was the final assignment to missing.

 

 

Perhaps you had a different variable intended to compare such as CITY or similar?

But as @Reeza says the original data set BLA has been overwritten and may not be useable for your purpose.

 

Especially when recoding an existing variable, the structure

Data bla;
    set bla;

is extremely dangerous, because logic errors such as the one in this example can destroy your data.

It is best to create a new data set.

Repeated use of this structure also makes it extremely difficult to debug where something went wrong, because you overwrite the complete data set each time.
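
One way to check a recode that was written to a new data set is to cross-tabulate the old variable against the new one. This is only a sketch: bla_recoded and narea are hypothetical names for the new data set and the new numeric variable.

```sas
/* bla_recoded and narea are hypothetical names; substitute your own */
proc freq data=bla_recoded;
    /* LIST prints one row per area/narea combination;      */
    /* MISSING keeps rows where narea is missing in the table */
    tables area*narea / list missing;
run;
```

Any value of area that ends up paired with a missing narea did not match one of the quoted strings in the IF conditions.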

Phamhhm
Fluorite | Level 6

I also changed the data set name, and it still doesn't work.

 

DATA Blabla;

SET Bla;

 

 

 

Shmuel
Garnet | Level 18

Please check your input by running

     proc freq data=Bla; table area; run;

 

Are the areas you are looking for there? Do they match exactly, including upper and lower case letters?

And, as said before:

  - make the output data set name different from the input name

  - give the category variable a name different from the variables already in the data set

Phamhhm
Fluorite | Level 6

 

 

1). SAS doesn't care about upper case or lower case.

2). Giving the data set a new name only prevents losing the original data set.

3). Giving a variable a new name after a transformation only prevents losing the old variable.

None of these three details is the source of my problem. I have tried them all, and you can try it too.

Anyway, here is the code, with the log and the output.

Thanks for the help. People are not working all the time; they also go out and eat.

 

 

 

 

 

***********************************************************************************************************

DATA PHAM.Blabla;
SET  PHAM.Bla;
IF      div_type = "BTH" THEN div_type = 1;
ELSE IF div_type = "LDD" THEN div_type = 2;
ELSE IF div_type = "LTD" THEN div_type = 3;
ELSE                          div_type = .;
RUN;

 

***************************************************************************************************************

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      93:42   94:42   95:42   96:42
NOTE: There were 50000 observations read from the data set PHAM.BLA.
NOTE: The data set PHAM.BLABLA has 50000 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

 

***************************************************************************************************************

 

PROC FREQ DATA = PHAM.Blabla;
TABLE div_type;
RUN;

***************************************************************************************************************

 

SAS Output
The SAS System

The FREQ Procedure

(the div_type value column was not preserved in the paste)

Frequency    Percent    Cumulative Frequency    Cumulative Percent
40721        81.44      40721                   81.44
963          1.93       41684                   83.37
7855         15.71      49539                   99.08
461          0.92       50000                   100.00

 

 

 

***************************************************************************************************************************************

 

DATA PHAM.Blabla;
SET  PHAM.Bla;
IF        area = "Atlantic South"        THEN narea = 1;
ELSE IF   area = "California North"      THEN narea = 2;
ELSE IF   area = "Central/South Texas"   THEN narea = 3;
ELSE IF   area = "Chicago"               THEN narea = 4;
ELSE IF   area = "Dallas"                THEN narea = 5;
ELSE IF   area = "DC/Maryland/Virginia"  THEN narea = 6;
ELSE IF   area = "Great Lakes"           THEN narea = 7;
ELSE IF   area = "Houston"               THEN narea = 8;
ELSE IF   area = "Los Angeles"           THEN narea = 9;
ELSE IF   area = "Midwest"               THEN narea = 10;
ELSE IF   area = "New England"           THEN narea = 11;
ELSE IF   area = "New York City"         THEN narea = 12;
ELSE IF   area = "North Florida"         THEN narea = 13;
ELSE IF   area = "Northwest/Rocky Mountain" THEN narea = 14;
ELSE IF   area = "Ohio"                     THEN narea = 15;
ELSE IF   area = "Philadelphia"             THEN narea = 16;
ELSE IF   area = "South Florida"            THEN narea = 17;
ELSE IF   area = "Southwest"                THEN narea = 18;
ELSE IF   area = "Tennessee"                THEN narea = 19;
ELSE                                             narea = .;
RUN;
 
******************************************************************************************************************
 
NOTE: There were 50000 observations read from the data set PHAM.BLA.
NOTE: The data set PHAM.BLABLA has 50000 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds
 
********************************************************************************************************************
 

PROC FREQ DATA = PHAM.Blabla;
TABLE narea;
RUN;
 
 
*********************************************************************************************************************
 
SAS Output
The SAS System

The FREQ Procedure
    
 
 
***********************************************************************************************************************************
 
 
PROC CONTENTS DATA = PHAM.Bla;
RUN;
 
********************************************************************************************************************************
 
SAS Output
The SAS System

The CONTENTS Procedure

Data Set Name        PHAM.BLA                     Observations           50000
Member Type          DATA                         Variables              2
Engine               V9                           Indexes                0
Created              2018-11-15 19:15:41          Observation Length     33
Last Modified        2018-11-15 19:15:41          Deleted Observations   0
Protection                                        Compressed             NO
Data Set Type                                     Sorted                 NO
Label
Data Representation  WINDOWS_64
Encoding             wlatin1 Western (Windows)

Engine/Host Dependent Information
Data Set Page Size          65536
Number of Data Set Pages    26
First Data Page             1
Max Obs per Page            1977
Obs in First Data Page      1940
Number of Data Set Repairs  0
ExtendObsCounter            YES
Filename                    C:\Program Files\Oxford\HEC\Techniques Exploitation Données\Travail2\bla.sas7bdat
Release Created             9.0401M5
Host Created                X64_10HOME
Owner Name                  DESKTOP-MSHHIQ0\Minh Pham
File Size                   2MB
File Size (bytes)           1769472

Alphabetic List of Variables and Attributes
Variable    Type    Len    Format    Informat    Label
area        Char    30     $F30.     $F30.       Area
div_type    Char    3      $F3.      $3.         Division Type Code
 
 
 
************************************************************************************************************************************
 
 
PROC CONTENTS DATA = PHAM.Blabla;
RUN;
 
 
************************************************************************************************************************************
 
 
SAS Output
The CONTENTS Procedure

Data Set Name        PHAM.BLABLA                  Observations           50000
Member Type          DATA                         Variables              3
Engine               V9                           Indexes                0
Created              2018-11-15 19:25:27          Observation Length     48
Last Modified        2018-11-15 19:25:27          Deleted Observations   0
Protection                                        Compressed             NO
Data Set Type                                     Sorted                 NO
Label
Data Representation  WINDOWS_64
Encoding             wlatin1 Western (Windows)

Engine/Host Dependent Information
Data Set Page Size          65536
Number of Data Set Pages    37
First Data Page             1
Max Obs per Page            1361
Obs in First Data Page      1333
Number of Data Set Repairs  0
ExtendObsCounter            YES
Filename                    C:\Program Files\Oxford\HEC\Techniques Exploitation Données\Travail2\blabla.sas7bdat
Release Created             9.0401M5
Host Created                X64_10HOME
Owner Name                  DESKTOP-MSHHIQ0\Minh Pham
File Size                   2MB
File Size (bytes)           2490368

Alphabetic List of Variables and Attributes
Variable    Type    Len    Format    Informat    Label
area        Char    30     $F30.     $F30.       Area
div_type    Char    3      $F3.      $3.         Division Type Code
narea       Num     8
 
 
 
 
 
 
 

 

 

 

 

 
Shmuel
Garnet | Level 18

@Phamhhm wrote

1).  SAS doesn't care about upper case or lower case.

That's right when dealing with variable names.

That's wrong when dealing with variable values: comparisons of character values are case-sensitive.
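
A small sketch of the difference, using a made-up value rather than the real data:

```sas
data _null_;
    length area $30;
    area = "chicago";   /* made-up value that differs from "Chicago" only in case */

    /* Exact comparison: the two strings differ in case, so this is false */
    if area = "Chicago" then put "exact match";
    else put "no exact match";

    /* Normalizing case (and leading/trailing blanks) before comparing */
    if upcase(strip(area)) = "CHICAGO" then put "case-insensitive match";
run;
```

This writes "no exact match" and then "case-insensitive match" to the log. If stored values can differ from the quoted strings in case or in leading blanks, comparing against UPCASE(STRIP(area)) is one possible way to make the IF conditions tolerant of that.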

 

Phamhhm
Fluorite | Level 6

The dataset is available, the code and the log output are also posted as requested .

Does anyone know how to solve my problem? I am a beginner in SAS.

 

 

Phamhhm
Fluorite | Level 6

Solution is found, problem is solved.

 

Thank you for all your advice.
