Cooksam13
Fluorite | Level 6

I want to categorize data into two ranges, but either every row ends up missing or the missing data is wrongly categorized. This is my code:

 

data new;

set old;

if WHt='.' then Obese1=.;
else if  Wht< 30 then Obese1="N";
else Obese1="Y";

run;

 

This results in the following notes over and over again, and every row of Obese1 ends up as '.':

 

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
103:17 103:34 107:9 108:34 109:18 118:4
NOTE: Variable un is uninitialized.
NOTE: Invalid numeric data, 'N' , at line 108 column 34.
race_eth=Hispanic (any race) ID=2012000176 MotherHeight=05:07 PriorWeight=170 PREV_ALIVE=2 RFDiabGest= EstGest=39 EstGestOb=
EstGestClin=True Plurality=1 MatEthnicity=210 MatRace=01 M_AGE=31 GestDiab=N prev_alive2=2 Ht_Inches=67 PRiorweight2=170
BMI=26.698596569 ObesePrior=. un=. Preterm=N _ERROR_=1 _N_=1
NOTE: Invalid numeric data, 'Y' , at line 109 column 18.
race_eth=Hispanic (any race) ID=2012000324 MotherHeight=05:02 PriorWeight=220 PREV_ALIVE=1 RFDiabGest=True EstGest=38 EstGestOb=True
EstGestClin= Plurality=1 MatEthnicity=280 MatRace=01 M_AGE=32 GestDiab=Y prev_alive2=1 Ht_Inches=62 PRiorweight2=220
BMI=40.348595213 ObesePrior=. un=. Preterm=N _ERROR_=1 _N_=2
 
 

When I rearrange the code:

if Wht < 30 then Obese1="N"; else if Wht='.' then Obese1=.;
else Obese1="Y";

 

there are no warnings or errors, BUT the missing "Wht" values are categorized as "N" in the Obese1 column.

What is happening here?

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Your first statement starts you off on the wrong path.  Using a . as the value is the correct way to refer to a missing value for a numeric variable.  However, since you want ObesePrior to be a character value taking on values like "Y" and "N", change the code to make it a character variable:

if missing(BMI) then ObesePrior=" ";
else if BMI < 30 then ObesePrior="N"; 
else ObesePrior= 'Y';
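
As a fuller sketch of the same idea: a DATA step fixes a variable's type the first time the variable appears, so declaring ObesePrior with a LENGTH statement before any assignment guarantees it is character. The data set names old and new are taken from the question. The sketch assumes that old does not already contain a numeric ObesePrior; if it does, that variable would need to be dropped or renamed on the SET statement first.

```sas
data new;
  set old;
  /* Assumption: ObesePrior is not already a numeric variable in OLD. */
  length ObesePrior $1;                    /* declare as character before first use */
  if missing(BMI) then ObesePrior = " ";   /* blank is the character missing value */
  else if BMI < 30 then ObesePrior = "N";
  else ObesePrior = "Y";
run;
```

Testing for missing first matters here, because a missing BMI would otherwise satisfy BMI < 30 and be labeled "N".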


7 REPLIES 7
ballardw
Super User

Copy the entire log including the data step code when you have questions about anything in the log.

 

In this case: WHT does not exist in your data. I know that because this error message:

NOTE: Invalid numeric data, 'N' , at line 108 column 34.
race_eth=Hispanic (any race) ID=2012000176 MotherHeight=05:07 PriorWeight=170 PREV_ALIVE=2 RFDiabGest= EstGest=39 EstGestOb=
EstGestClin=True Plurality=1 MatEthnicity=210 MatRace=01 M_AGE=31 GestDiab=N prev_alive2=2 Ht_Inches=67 PRiorweight2=170
BMI=26.698596569 ObesePrior=. un=. Preterm=N _ERROR_=1 _N_=1

includes the values of every single variable assigned at the time the error occurs. You also have no variable named Obese1.

Your code does not show a variable UN used, but the cause would be listing a variable in a statement but not assigning a value to it.

 

Do not test a numeric variable for missing with '.', a quoted period. That is a character value. Do not test character variables for missing with '.' either, because that is an actual value. It is better to use the MISSING function, since it works with both types of variables:

if missing(somevariablename) then do <whatever>;

Numeric missing values are less than any number. So if Var has a missing value, Var < 30 is true.
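
A minimal sketch of that behavior, using a throwaway variable x that is not part of the original data: both conditions below are true for a missing value, which is why the test for missing has to come before the range test.

```sas
data _null_;
  x = .;                                   /* numeric missing value */
  if x < 30 then put "x < 30 is true for a missing x";
  if missing(x) then put "missing(x) is true";
run;
```

Both messages are written to the log.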

 

You cannot assign character values, i.e. 'Y' or 'N' to numeric variables. You can determine the characteristics for your variables by running:

Proc contents data=<yourdatasetname>;
run;

If this shows the variable as numeric but you see Y and N, then a format has been assigned, and it will show up in the Proc Contents output. You will need to assign the numeric value associated with the format. At a guess, 1 = Y and 0 = N, but other codes may be used.

 

Cooksam13
Fluorite | Level 6

Sorry for the confusion. I changed "BMI" to Wht and "ObesePrior" to Obese1 in the question to satisfy my professor's requirement not to post my exact code.

Cooksam13
Fluorite | Level 6

I tried your suggestion:

if missing(BMI) then ObesePrior=.;
else if BMI < 30 then ObesePrior="N";
else ObesePrior= 'Y';

and I got the same error.

Rebecca7
Calcite | Level 5

Hello all,

I am new to the SAS community. I am preparing for the SAS Base certification and came across this question (I am expected to fix the errors in this code and run the program).

I have used the example someone posted in this community, yet I am still getting errors in my code.

This is the code.

data work.lowchol work.highchol; 

  set sashelp.heart;

   if cholesterol lt 200 output work.lowchol;

  if cholesterol ge 200 output work.highchol;

  if cholesterol is missing output work.misschol;

run;
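
A hedged sketch of one way to repair the step above, not necessarily the answer the exam expects. It reflects three points: every data set named in an OUTPUT statement must also be listed on the DATA statement, IF needs THEN, and `is missing` is a WHERE-expression operator rather than something an IF statement accepts, so the MISSING function is used instead. The missing test comes first because a missing cholesterol value would otherwise satisfy `lt 200`.

```sas
data work.lowchol work.highchol work.misschol;
  set sashelp.heart;
  if missing(cholesterol) then output work.misschol;    /* test missing first */
  else if cholesterol lt 200 then output work.lowchol;
  else output work.highchol;                             /* non-missing and >= 200 */
run;
```

With an explicit OUTPUT statement in each branch, each observation is written to exactly one of the three data sets.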

 

 

PaigeMiller
Diamond | Level 26

Probably best to start a new thread. Also, when you get errors in the log, show us the ENTIRE log (not just the error messages).

--
Paige Miller
Rebecca7
Calcite | Level 5
1) When I ran a PROC PRINT of the initial data set, it kept running for a long time, and when it finally stopped it gave me the log below.
 
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 data xyz;
74 set sashelp.heart;
75 run;
 
NOTE: There were 5209 observations read from the data set SASHELP.HEART.
NOTE: The data set WORK.XYZ has 5209 observations and 17 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 2330.21k
OS Memory 28332.00k
Timestamp 09/06/2023 10:18:11 PM
Step Count 60 Switch Count 2
Page Faults 0
 
2) When I tried to categorize the data set into 3 categories using IF-THEN-ELSE statements, followed by a PROC SORT, this is what I got in the log, but I got my output, and it was well categorized.
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73
74 data work.lowchol work.highchol;
75 set sashelp.heart;
76 if cholesterol < 200 then output = 'work.lowchol';
77 else if cholesterol ge 200 then output = 'work.highchol';
78 else if cholesterol = '.' then output = 'work.misschol';
79 run;
 
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
78:26
NOTE: There were 5209 observations read from the data set SASHELP.HEART.
NOTE: The data set WORK.LOWCHOL has 5209 observations and 18 variables.
NOTE: The data set WORK.HIGHCHOL has 5209 observations and 18 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 3835.15k
OS Memory 31408.00k
Timestamp 09/06/2023 10:29:25 PM
Step Count 96 Switch Count 4
Page Faults 0
Page Reclaims 690
Page Swaps 0
Voluntary Context Switches 23
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 4120
3) When I tried to do a PROC PRINT of the data that I categorized and sorted, so I could answer the questions that followed, I did not get any result from the PROC PRINT, but this is what I got in the log.
 
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 proc sort data = work.lowchol work.highchol;
_____________
22
201
ERROR 22-322: Syntax error, expecting one of the following: ;, (, ASCII, BUFFNO, DANISH, DATA, DATECOPY, DETAILS, DIAG, DUPOUT,
EBCDIC, EQUALS, FINNISH, FORCE, IN, ISA, L, LEAVE, LIST, MESSAGE, MSG, NATIONAL, NODUP, NODUPKEY, NODUPKEYS,
NODUPLICATE, NODUPLICATES, NODUPREC, NODUPRECS, NODUPS, NOEQUALS, NORWEGIAN, NOTHREADS, NOUNIKEY, NOUNIKEYS,
NOUNIQUEKEY, NOUNIQUEKEYS, NOUNIQUEREC, NOUNIQUERECS, NOUNIREC, NOUNIRECS, OSA, OUT, OVERWRITE, PAGESIZE, PRESORTED,
PSIZE, REVERSE, SIZE, SORTSEQ, SORTSIZE, SORTWKNO, SWEDISH, T, TAGSORT, TECH, TECHNIQUE, TESTHSI, THREADS, UNIOUT,
UNIQUEOUT, WKNO, WORKNO.
ERROR 201-322: The option is not recognized and will be ignored.
74 by output;
75 run;
 
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 1354.84k
OS Memory 28840.00k
Timestamp 09/06/2023 10:34:52 PM
Step Count 103 Switch Count 0
Page Faults 0
Page Reclaims 247
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 0
 
 
 
76 proc print data = work.lowchol work.highchol;
_____________
22
201
ERROR 22-322: Syntax error, expecting one of the following: ;, (, BLANKLINE, CONTENTS, DATA, DOUBLE, GRANDTOTAL_LABEL,
GRANDTOT_LABEL, GRAND_LABEL, GTOTAL_LABEL, GTOT_LABEL, HEADING, LABEL, N, NOOBS, NOSUMLABEL, OBS, ROUND, ROWS, SPLIT,
STYLE, SUMLABEL, UNIFORM, WIDTH.
ERROR 201-322: The option is not recognized and will be ignored.
77 run;
 
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
user cpu time 0.01 seconds
system cpu time 0.00 seconds
memory 1405.15k
OS Memory 28840.00k
Timestamp 09/06/2023 10:34:52 PM
Step Count 104 Switch Count 0
Page Faults 0
Page Reclaims 252
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 0
 
78
79 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 
 
 
80
81 proc sort data = work.lowchol work.highchol;
_____________
22
201
ERROR 22-322: Syntax error, expecting one of the following: ;, (, ASCII, BUFFNO, DANISH, DATA, DATECOPY, DETAILS, DIAG, DUPOUT,
EBCDIC, EQUALS, FINNISH, FORCE, IN, ISA, L, LEAVE, LIST, MESSAGE, MSG, NATIONAL, NODUP, NODUPKEY, NODUPKEYS,
NODUPLICATE, NODUPLICATES, NODUPREC, NODUPRECS, NODUPS, NOEQUALS, NORWEGIAN, NOTHREADS, NOUNIKEY, NOUNIKEYS,
NOUNIQUEKEY, NOUNIQUEKEYS, NOUNIQUEREC, NOUNIQUERECS, NOUNIREC, NOUNIRECS, OSA, OUT, OVERWRITE, PAGESIZE, PRESORTED,
PSIZE, REVERSE, SIZE, SORTSEQ, SORTSIZE, SORTWKNO, SWEDISH, T, TAGSORT, TECH, TECHNIQUE, TESTHSI, THREADS, UNIOUT,
UNIQUEOUT, WKNO, WORKNO.
ERROR 201-322: The option is not recognized and will be ignored.
82 by output;
83 run;
 
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 1356.71k
OS Memory 29608.00k
Timestamp 09/06/2023 10:29:25 PM
Step Count 97 Switch Count 0
Page Faults 0
Page Reclaims 241
Page Swaps 0
Voluntary Context Switches 0
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 0
 
84
85 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 
 
 
3) When I tried to do a PROC SORT of the data I categorized above, I could see values for work.lowchol
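
In the PROC SORT and PROC PRINT logs above, ERROR 22-322 is underlining the second data set name: the DATA= option on these procedures takes a single data set. A minimal sketch of handling one data set per step follows. The output name work.lowchol_sorted, the BY variable cholesterol, and the OBS=10 limit are illustrative choices, not taken from the original post.

```sas
/* Sort one data set per PROC SORT step; OUT= keeps the input unchanged. */
proc sort data=work.lowchol out=work.lowchol_sorted;
  by cholesterol;
run;

/* Print one data set per PROC PRINT step; OBS=10 limits the listing. */
proc print data=work.lowchol_sorted(obs=10);
run;
```

The same two steps would be repeated for work.highchol. Note also that in the earlier DATA step, `output = 'work.lowchol'` is an assignment to a new variable named output rather than an OUTPUT statement, which is consistent with the log reporting 18 variables and all 5209 observations in both data sets.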
