- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello all,
I am having some trouble converting my character variables to numerics. I have about 13 variables to convert so I was trying to use an array.
I currently have this array set up:
Array X[13] $ zipcode survdate q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 COVIDscore;
Array Y[13] Newzipcode Newsurvdate Newq1 Newq2 Newq3 Newq4 Newq5 Newq6 Newq7 Newq8 Newq9 Newq10 NewCOVIDscore;
Do I=1 to 13;
Y(I) = input(X(I),best32.);
End;
Drop I
But I keep getting warning statements in my log:
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to
missing values.
Each place is given by: (Number of times) at (Line):(Column).
4052 at 686:18
686 Y(I) = input(X(I),best32.);
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Let's look at some details of your existing data:
zipcode=92365-4856 q1=1 q2=1 q3=1 q4=1 q5=1 q6=1 q7=0 q8=0 q9=0 q10=0 COVIDscore=86 survdate=11/02/2020
Zipcode = 92365-4856 is not a number. So attempting to convert it using any of the numeric informats fails. The - either has to precede digits to indicate negative values, -12345 OR in scientific notation to precede the exponent to indicate decimal places 1.5E-5 for example.
f you want Zipcode to be numeric you need to make a decision of what you expect for a numeric value and write specific code for that. Possibly : newzip = input(scan(zipcode,1,'-'),f5.) would be one way to create a five-digit numeric zip.
Similarly survdate is not a simple number because of the / characters and can't be dealt with the same numeric informat. That would take a date informat such as mmddyy10. (and should have a date format associated so people can understand what is stored).
I
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You need to look at that messages before that in the log. Basically it is saying the string does not contain a number.
28 data test; 29 array X $32 string; 30 array Y number; 31 input string ; 32 y[1]=input(x[1],32.); 33 cards; NOTE: Invalid argument to function INPUT at line 32 column 8. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 35 abc _I_=. string=abc number=. _ERROR_=1 _N_=2 NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values. Each place is given by: (Number of times) at (Line):(Column). 1 at 32:8 NOTE: The data set WORK.TEST has 2 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds 36 ;
If you don't care about the invalid strings and just want to create missing values without SAS writing notes into the SAS log you can use the ? or ?? modifier in front of the informat in the
37 data test; 38 array X $32 string; 39 array Y number; 40 input string ; 41 y[1]=input(x[1],?32.); 42 cards; RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 44 abc _I_=. string=abc number=. _ERROR_=1 _N_=2 NOTE: The data set WORK.TEST has 2 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds 45 ; 46 data test; 47 array X $32 string; 48 array Y number; 49 input string ; 50 y[1]=input(x[1],??32.); 51 cards; NOTE: The data set WORK.TEST has 2 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds 54 ;
Note there is not a separate informat named BEST. If you ask for it you just get the normal numeric informat.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The log messages above what I initially posted are:
id=3276 city=Otside La County age=50-59 gender= zipcode=92365-4856 q1=1 q2=1 q3=1 q4=1 q5=1 q6=1 q7=0 q8=0 q9=0 q10=0 COVIDscore=86
survdate=11/02/2020 Newgender= Newcity=Outside of Los Angeles Newq1=. Newq2=. Newq3=. Newq4=. Newq5=. Newq6=. Newq7=. Newq8=.
Newq9=. Newq10=. NewCOVIDscore=. Newsurvdate=. I=14 var1=. var2=. var3=1 var4=1 var5=1 var6=1 var7=1 var8=1 var9=0 var10=0 var11=0
var12=0 var13=86 _ERROR_=1 _N_=19
NOTE: Invalid argument to function INPUT at line 686 column 18.
NOTE: Invalid argument to function INPUT at line 686 column 18.
It does that a bunch of times. Do you think its because the format of my zipcode and survdate? (zipcode in the dataset is 9 digits but I was instructed to change it to 5, and survdate in the dataset is MMDDYY10, but I was instructed to change it to weekday.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Let's look at some details of your existing data:
zipcode=92365-4856 q1=1 q2=1 q3=1 q4=1 q5=1 q6=1 q7=0 q8=0 q9=0 q10=0 COVIDscore=86 survdate=11/02/2020
Zipcode = 92365-4856 is not a number. So attempting to convert it using any of the numeric informats fails. The - either has to precede digits to indicate negative values, -12345 OR in scientific notation to precede the exponent to indicate decimal places 1.5E-5 for example.
f you want Zipcode to be numeric you need to make a decision of what you expect for a numeric value and write specific code for that. Possibly : newzip = input(scan(zipcode,1,'-'),f5.) would be one way to create a five-digit numeric zip.
Similarly survdate is not a simple number because of the / characters and can't be dealt with the same numeric informat. That would take a date informat such as mmddyy10. (and should have a date format associated so people can understand what is stored).
I
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Ah I see, so for the variables zipcode and survdate I would have to use the input function to convert it to a new variable. I cannot use the array for those 2? But for q1-q10 I can use the array?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@shortyofhb wrote:
Ah I see, so for the variables zipcode and survdate I would have to use the input function to convert it to a new variable. I cannot use the array for those 2? But for q1-q10 I can use the array?
Correct.
SAS is very nice when you have sequentially numbered variables to place in an array. You can use shorthand like: array <name> q1-q10. And for your new variables if you name an array like:
Array newq(10) that will create 10 numeric variables named newq1, newq2, newq3 etc. It is a good idea to understand the individual naming though to get used to thinking of order of definition is important.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you so much for the help!!
So I converted created a new variable for zipcode and survdate with the input function. That clarifies it up alot!
One more quick question (I hope). How would I go about using an array to recode a variable to saying something else? For example, in my Proc Format I have 1="Yes" and 0="No" my instructions are however are to "use an array to recode 0="No" to 2="No."
Would I create an array and have it so when 0 is shown it gets replaced with 2? And then use a format to change 2 to "No"?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Maybe this is homework, if it is ignore the following.
There is just one way to solve problems like yours: fix the data import!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It is a homework assignment unfortunately haha, the dataset is meant to be a mess!