DATA Step, Macro, Functions and more

Format issue with character variable

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

Format issue with character variable

I have a variable that I am importing from a flat file. It contains spaces, alpha characters and numbers. I am importing it using input with a length of 32 ($32).

My problem is that some of the numbers are showing up in exponential format (see examples below). Anyone have a suggestion on how to correct this, so that all numerics show the full number?

Acct_Nbr

1.00009E+12

7200025802

1.01018E+12

6581202097

6.06005E+13

7441258501

5.20004E+11

900-161-6121

323553729

2.00004E+12

4.07028E+19

662631175

1.53495E+11

662631175

4127400093

ABC 123324

Thanks in advance for the assistance!


Accepted Solutions
Solution
‎10-05-2011 09:40 AM
PROC Star
Posts: 7,474

Re: Format issue with character variable

Posted in reply to Mr_Breeze

I would think that you are trying to accomplish something like the following:

data TC1;

  infile "path\today.csv"

  delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2;

  informat Beneficiary_Account $32. ;

  format Beneficiary_Account $32. ;

  input  Beneficiary_Account;

  if not missing(input(Beneficiary_Account, ?? best32.))

   then Beneficiary_Account=strip(put(input(

     Beneficiary_Account,best32.),best32.));

run;

View solution in original post


All Replies
Occasional Contributor
Posts: 7

Format issue with character variable

Posted in reply to Mr_Breeze

Mr Breeze,

Could you post some input data and the sas code you are using..?

Regards,

Chendhil

PROC Star
Posts: 7,474

Format issue with character variable

Posted in reply to Mr_Breeze

It would help if you provide some example incoming data and the code you are using to import the file.  You may only have to apply a numeric format (like best32.).

Occasional Contributor
Posts: 9

Format issue with character variable

Posted in reply to Mr_Breeze

  Here's part of my code:

data TC1;

  infile "path\today.csv"

  delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2;

  informat Beneficiary_Account $32. ;

  format Beneficiary_Account $32. ;

input  Beneficiary_Account $;

run;

Here's an example of the incoming data (csv file):

27569276

27569276

5.102E+11

LU520023190087756900

8377715

11937232

1.0023E+11

501-408899-3

8.01411E+13

2323339

5103366633

1.6023E+11

1.00021E+13

820930

1700008145

2.72304E+13

7881283120

8.98002E+11

5221725656

CM128804

Super User
Super User
Posts: 7,050

Format issue with character variable

Posted in reply to Mr_Breeze

Character strings don't know or care if the string looks like a number in scientific notation.

So what is it that you want to change? 

Do you want to convert the scientific notation character strings in the input CSV file into some other format?  What format do you want. 

Sounds like the real problem might be in whatever process generated the CSV file.  Do you have control over that process?

Solution
‎10-05-2011 09:40 AM
PROC Star
Posts: 7,474

Re: Format issue with character variable

Posted in reply to Mr_Breeze

I would think that you are trying to accomplish something like the following:

data TC1;

  infile "path\today.csv"

  delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2;

  informat Beneficiary_Account $32. ;

  format Beneficiary_Account $32. ;

  input  Beneficiary_Account;

  if not missing(input(Beneficiary_Account, ?? best32.))

   then Beneficiary_Account=strip(put(input(

     Beneficiary_Account,best32.),best32.));

run;

Valued Guide
Posts: 765

Re: Format issue with character variable

Hi ... just a suggestion about Art's code, this

Beneficiary_Account=strip(put(input(Beneficiary_Account,best32.),best32.));


can be replaced by

Beneficiary_Account=cats(input(Beneficiary_Account,best32.));

Regular Contributor
Posts: 184

Re: Format issue with character variable

Also get rid of the "best" in the informat. It's just an alias, one which reinforces the misconception that something is being optimized.

Beneficiary_Account=cats(input(Beneficiary_Account,32.));


MikeZdeb wrote:

Hi ... just a suggestion about Art's code, this

Beneficiary_Account=strip(put(input(Beneficiary_Account,best32.),best32.));


can be replaced by

Beneficiary_Account=cats(input(Beneficiary_Account,best32.));

PROC Star
Posts: 7,474

Re: Format issue with character variable

Agreed!  Based on user notes like http://support.sas.com/kb/19/665.html I always presumed that it did do some kind of optimizing.  However, based on your post I tried to look it up and discovered that you are indeed correct .. as usual!

p.s. I've just started reading your book and, while I'm only at the beginning, I already like what I see!

Occasional Contributor
Posts: 9

Format issue with character variable

Posted in reply to Mr_Breeze

Art, that took care of it.

Thanks everyone for the help!!

Super User
Super User
Posts: 7,050

Re: Format issue with character variable

Posted in reply to Mr_Breeze

You should still double check what is causing the underlying issue.  If your account numbers have more than 12 significant digits or require leading zeros then you will not be able to convert the scientific notation generated by the BEST12. format back to the original string.

Valued Guide
Posts: 765

Re: Format issue with character variable

Posted in reply to Mr_Breeze

Hi ... here's something new in V9.3, creating an  INFORMAT (or FORMAT) from a user-defined function.

The MISSING and FIND in the SELECT statement is 'situation-specific' since this ...

when (^missing(input(a, best32.))) c = a;

currently does not work (produces errors).  Maybe someone else knows a good way to determine if a character string

can be converted to a number without using an INPUT function.

See also ... "Example 12: Creating a Function to Use as a Format"

http://support.sas.com/documentation/cdl/en/proc/63079/HTML/default/viewer.htm#p1gg77jyhc9s42n1f1vjy...


proc fcmp outlib=work.functions.math;

  function acc(a $) $;

  select;

    when (^missing(compress(a,,'d')) and ^find(a,'E+')) c = a;

    otherwise c = cats(input(a,best32.));

  end;

  return(c);

  endsub;

quit;

options cmplib=work.functions;

proc format;

invalue $acc other = [acc()];

run;

data new;

input x : & $acc32. @@;

datalines;

27569276   27569276   5.102E+11   LU520023190087756900  

8377715   11937232

1.0023E+11   4.07028E+19   662631175   1.53495E+11  

662631175   4127400093   ABC 123324

;

run;


results ...

27569276

27569276

510200000000

LU520023190087756900

8377715

11937232

100230000000

40702800000000000000

662631175

153495000000

662631175

4127400093

ABC 123324


Super User
Super User
Posts: 7,050

Re: Format issue with character variable

I do not have 9.3, but did you try adding '??' to the input function?

when (^missing(input(a, ??best32.))) c = a;

PROC Star
Posts: 7,474

Re: Format issue with character variable

Tom,  That is what I did in my original suggestion.

Valued Guide
Posts: 765

Re: Format issue with character variable

Hi ... here's why ...

32   proc fcmp outlib=work.functions.math;

33     function acc(a $) $;

34     select;

35       when (^missing(input(a, ?? best32.))) c = a;

                                 -

                                 22

                                 200

ERROR 22-322: Expecting a format name.

ERROR 200-322: The symbol is not recognized and will be ignored.

36       otherwise c = cats(input(a,best32.));

37     end;

38     return(c);

39     endsub;

40   quit;


You get the same error with ...

when (^missing(input(a, ? best32.))) c = a;

So, after a few email exchanges, I learned that ? and ?? might be available in PROC FCMP in a later version of V9.3.

ps  Another variation ...

http://www.sascommunity.org/wiki/Tips:Create_an_Informat_from_a_User-Defined_Function

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 2406 views
  • 0 likes
  • 6 in conversation