BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ody
Quartz | Level 8 Ody
Quartz | Level 8

Hi folks,

 

I have mixed numeric data formatted as character. The data values range from the single digits into the billions. Some of the data is null or missing, some has decimal places, some of the numeric characters have commas, etc... 

 

I'm having a really hard time converting these values to numeric with the correct format. I've tried almost all the numeric formats I can think of within an input statement. 

 

As an example:

value1a = input(value1,24.2);

value1a = input(value1,comma24.2);

value1a = input(value1,??24.2);

value1a = input(value1,??24.);

 

Some of these formats don't produce results for very large numbers, some don't produce results at all. The closest I've gotten was using input(value1,??24.) but this is adding decimals to all the values turing a value like 1280 into 12.80.

 

Any thoughts on this? 

 

Thanks!

 

 

data test;

infile datalines dsd truncover;

input

value1:$200.

value2:$200.

value3:$200.

value4:$200.     

;

 

datalines;

'1001322', '48,371,804.50', '91662', '2,295,968.99'

'1280', '73,393.39', '143, 3,656.94'

'2', 222, , '59888.99'

'255287', '22,401,559.53', '5432618', '210,129,208.53'

'735034', '34,031,056.78', '89640',

'1992923', '104,877,814.20', '5614063', '1,214,793,943.92'

;;;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

It looks like some of the data was read in incorrectly (previously).  For example, it looks like on observation 2, VALUE3 should be 143 and VALUE4 should be 3,656.94. 

 

Once that is cleared up, the comma24. informat should work just fine.

View solution in original post

7 REPLIES 7
Reeza
Super User

Try using the BEST32 informat instead and let SAS make a guess. I suspect it may do a better job.

Note that when you specify a informat of 24.2 it expects all values to have two decimal places, so 100 becomes 1.00

 


@Ody wrote:

Hi folks,

 

I have mixed numeric data formatted as character. The data values range from the single digits into the billions. Some of the data is null or missing, some has decimal places, some of the numeric characters have commas, etc... 

 

I'm having a really hard time converting these values to numeric with the correct format. I've tried almost all the numeric formats I can think of within an input statement. 

 

As an example:

value1a = input(value1,24.2);

value1a = input(value1,comma24.2);

value1a = input(value1,??24.2);

value1a = input(value1,??24.);

 

Some of these formats don't produce results for very large numbers, some don't produce results at all. The closest I've gotten was using input(value1,??24.) but this is adding decimals to all the values turing a value like 1280 into 12.80.

 

Any thoughts on this? 

 

Thanks!

 

 

data test;

infile datalines dsd truncover;

input

value1:$200.

value2:$200.

value3:$200.

value4:$200.     

;

 

datalines;

'1001322', '48,371,804.50', '91662', '2,295,968.99'

'1280', '73,393.39', '143, 3,656.94'

'2', 222, , '59888.99'

'255287', '22,401,559.53', '5432618', '210,129,208.53'

'735034', '34,031,056.78', '89640',

'1992923', '104,877,814.20', '5614063', '1,214,793,943.92'

;;;

 


 

Ody
Quartz | Level 8 Ody
Quartz | Level 8

Thanks for the feedback.

 

I tried using best32. and while some of the data converted accurately much of it did not. 

 

I seem to be running into an issue where data is missing or the numbers are large with decimals or commas.

 

Any other suggestions?

Astounding
PROC Star

Try the comma24. informat.  The ?? is optional.

Ody
Quartz | Level 8 Ody
Quartz | Level 8

Thanks for the reply,

 

The format comma24. doesn't seem to work either. I tried that earlier just didnt list it as one of the option I tried.

 

Here is a copy of my log using the best32. format. 

 

Appreciate any insight into this. Thanks.

 


data test1;
set test;

value1a = input(value1,best32.);
value2a = input(value2,best32.);
value3a = input(value3,best32.);
value4a = input(value4,best32.);

run;


479  data test1;
480  set test;
481
482  value1a = input(value1,best32.);
483  value2a = input(value2,best32.);
484  value3a = input(value3,best32.);
485  value4a = input(value4,best32.);
486
487  run;

NOTE: Invalid argument to function INPUT at line 483 column 11.
NOTE: Invalid argument to function INPUT at line 485 column 11.
value1=1001322 value2=48,371,804.50 value3=91662 value4=2,295,968.99 value1a=1001322 value2a=.
value3a=91662 value4a=. _ERROR_=1 _N_=1
NOTE: Invalid argument to function INPUT at line 483 column 11.
NOTE: Invalid argument to function INPUT at line 484 column 11.
value1=1280 value2=73,393.39 value3=143, 3,656.94 value4=  value1a=1280 value2a=. value3a=.
value4a=. _ERROR_=1 _N_=2
NOTE: Invalid argument to function INPUT at line 483 column 11.
NOTE: Invalid argument to function INPUT at line 485 column 11.
value1=255287 value2=22,401,559.53 value3=5432618 value4=210,129,208.53 value1a=255287
value2a=. value3a=5432618 value4a=. _ERROR_=1 _N_=4
NOTE: Invalid argument to function INPUT at line 483 column 11.
value1=735034 value2=34,031,056.78 value3=89640 value4=  value1a=735034 value2a=. value3a=89640
value4a=. _ERROR_=1 _N_=5
NOTE: Invalid argument to function INPUT at line 483 column 11.
NOTE: Invalid argument to function INPUT at line 485 column 11.
value1=1992923 value2=104,877,814.20 value3=5614063 value4=214,793,943.92 value1a=1992923
value2a=. value3a=5614063 value4a=. _ERROR_=1 _N_=6
NOTE: Mathematical operations could not be performed at the following places. The results of
      the operations have been set to missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      5 at 483:11   1 at 484:11   3 at 485:11
NOTE: There were 6 observations read from the data set WORK.TEST.
NOTE: The data set WORK.TEST1 has 6 observations and 8 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds


NOTE: Remote submit to EMSVR complete.
Astounding
PROC Star

It looks like some of the data was read in incorrectly (previously).  For example, it looks like on observation 2, VALUE3 should be 143 and VALUE4 should be 3,656.94. 

 

Once that is cleared up, the comma24. informat should work just fine.

novinosrin
Tourmaline | Level 20
data test;

infile datalines dsd truncover;

input

value1:$200.

value2:$200.

value3:$200.

value4:$200.     

;

 

datalines;
'1001322', '48,371,804.50', '91662', '2,295,968.99'
'1280', '73,393.39', '143, 3,656.94'
'2', 222, , '59888.99'
'255287', '22,401,559.53', '5432618', '210,129,208.53'
'735034', '34,031,056.78', '89640',
'1992923', '104,877,814.20', '5614063', '1,214,793,943.92'
;;;


data want;
set test;
v1=input(value1,comma32.);
v2=input(value2,comma32.);
v3=input(value3,comma32.);
v4=input(value4,comma32.);
run;

 
Ody
Quartz | Level 8 Ody
Quartz | Level 8

You're correct. I forget a few single quotes when supplying my test data.

 

I've changed that and reran the code using both comma24. and comma32. as a format. The results are what I would expect, which is good. This seems to have solved my problem, thanks. Not really sure why when I tried that format earlier I was having issues.

 

Any way, thanks.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 3001 views
  • 2 likes
  • 4 in conversation