anjgupta
Calcite | Level 5

Hello,

I'd like to read in a variable and convert it from character to numeric while retaining the same name. The issue I'm facing is that when a character value of .U or .N is converted to numeric, I lose the special missing type and every such value becomes a plain dot (.).

If this isn't possible I can create new variables and drop the character variables - but if there is any way ....

I'm reading in variables as character and applying the format:

proc format;
    invalue chtono
    'Y'=        1
    'N'=        2
    'X'=        .N
    other=      .U;
run;

This results in character values of '1', '2', 'N', and 'U'. When I attempt to change these to numbers, I lose the Ns and Us and only have 'generic' missings for them, regardless of an initial character length of 1 or 2.

I change them via:

data new(drop=x);
  set birthdata01 (rename=(&name.=x));
  &name.=input(x, 2.);
run;

Most likely there isn't a way for SAS to translate a character 'U' to a missing numeric value of .U but thought to ask.
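For reference, a minimal sketch of one way to keep the variable name while preserving the special missing values: apply the user-defined informat directly to the original character values inside the rename step, rather than converting to '1', '2', 'N', 'U' first and then reading with 2. The data set birthdata01 and the variable ever_mar come from the thread; the sample values are assumptions made up for illustration.

```sas
/* Sketch only: the sample values below are assumed, not from the real file. */
proc format;
  invalue chtono
    'Y'   = 1
    'N'   = 2
    'X'   = .N
    other = .U;
run;

data birthdata01;                 /* stand-in for the real character data */
  input ever_mar $1.;
  datalines;
Y
N
X
Q
;
run;

data new (drop=x);
  /* rename the character variable out of the way ... */
  set birthdata01 (rename=(ever_mar=x));
  /* ... and recreate the same name as numeric via the informat */
  ever_mar = input(x, chtono1.);
run;

proc print data=new;
run;
```

Because the informat itself returns .N and .U, there is no intermediate character 'N' or 'U' that a plain numeric informat would then turn into an ordinary missing value.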

1 ACCEPTED SOLUTION
art297
Opal | Level 21

Yes, the 1 indicates the length. It can be specified either as the default when you create the informat or, as I did, when you use the informat.

I had to guess at the new informat you introduced, but the following reads the kind of data that I think you are dealing with:

proc format;
  invalue chtono
    Y=1
    N=2
    X=.N
    other=.U;
  invalue degf
    H=1
    A=2
    B=3
    M=4
    X=.N
    other=.U;
run;

data want;
  input @1 (fdobmo fdobdy) (2.)
       (ever_mar married pat_ack) (chtono1.)
        mat_deg degf1.;
  cards;
0315YYYH
0617XXUB
1002NN2M
12142NXX
;
run;
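The other option mentioned above, setting the length as the default when the informat is created, could look roughly like the following. This is a sketch under assumptions: the three-column sample data are invented, and only the DEFAULT= informat option of the INVALUE statement is being illustrated.

```sas
/* Sketch only: sample data are assumed for illustration. */
proc format;
  /* default=1 makes chtono. read exactly one column when no width is given */
  invalue chtono (default=1)
    'Y'   = 1
    'N'   = 2
    'X'   = .N
    other = .U;
run;

data want;
  /* no explicit width here; the informat's default length applies */
  input @1 (ever_mar married pat_ack) (chtono.);
  datalines;
YYY
XXU
NN2
;
run;
```

With the default set in one place, generated INPUT statements can keep using chtono. without a trailing width.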


20 REPLIES
art297
Opal | Level 21

Wouldn't you get what you want, directly, if you apply the informat you created when you input the data?  e.g.,

proc format;
    invalue chtono
    'Y'=        1
    'N'=        2
    'X'=        .N
    other=      .U;
run;

data test;
  informat x chtono.;
  input x;
  cards;
1
Y
X
N
2
;

anjgupta
Calcite | Level 5

I can try - but the cards would be 'Y', 'N', 'X'.

The raw data are character and I'm trying to use the same variable name and end up with 1,2,.N and .U.

Thank you

art297
Opal | Level 21

I'm not sure what you mean.  If you are saying that the values in the datalines have quotes around them, just include an infile statement with the dsd option, e.g.,

proc format;
    invalue chtono
    'Y'=        1
    'N'=        2
    'X'=        .N
    other=      .U;
run;

data test;
  informat x chtono.;
  infile cards dsd;
  input x;
  cards;
'1'
'Y'
'X'
'N'
'2'
;

anjgupta
Calcite | Level 5

Thank you.  I think that will work - just need to test it with how I'm reading in the raw data.  I think I need to avoid infiling the data as character initially.

ag

anjgupta
Calcite | Level 5

Hi,

If I need to input the data and include the variables' lengths - how can I incorporate the suggested solution?

355  data stout.&state._RawBirthData ;          /* data read in to become formatted data */

356     infile stbirth LRECL=&birthreclength linesize=&birthlinesize

357     N=&birthLinesPerObs missover;

358     %create_code_statements(birthfmt, pramsvars) /*informats for formatting birth file */

...

MPRINT(CREATE_CODE_STATEMENTS):   ever_mar = input(ever_mar, chtono.);

MPRINT(CREATE_CODE_STATEMENTS):   married = input(married, chtono.);

...

MPRINT(CREATE_INPUT_STRING_D3):   #1 @0090 ever_mar 1.

MPRINT(CREATE_INPUT_STRING_D3):   #1 @0091 married 1.

...

NOTE: Invalid data for ever_mar in line 2 90-90.

NOTE: Invalid data for married in line 2 91-91.

Thank you,

Anjali

art297
Opal | Level 21

It looks like you are only reading in 1 character for ever_mar and 1 character for married.  As such, I would go back to my originally suggested proc format.  As for your errors, your macro appears to be assigning informats of 1. when you probably need the name of the informat you created (e.g., chtono1.).

anjgupta
Calcite | Level 5

Thank you for the response.  Why is the length of 1 problematic?   I have put the formats in a file that I include at the start of the program.  I think you're suggesting #1 @0090 ever_mar chtono.  Is that correct?  I will do so.  Guess I need a bit more clarity - sorry for the hassle.

Anjali

art297
Opal | Level 21

A length of 1, in itself, isn't problematic.  However, since you don't have delimiters between fields, you have to account for that fact.  There is probably a modifier that I can't think of at the moment.  If I were pressed to read the data immediately, I would read the fields in as characters and convert them.  E.g., the following would provide (I think) what you expect:

proc format;
    invalue chtono
    'Y'=        1
    'N'=        2
    'X'=        .N
    other=      .U;
run;

data want (drop=in_:);
  input in_x $ 1-1 in_y $ 2-2;
  x=input(in_x,chtono.);
  y=input(in_y,chtono.);
  cards;
12
YN
XY
NX
21
;
run;

anjgupta
Calcite | Level 5

It *seems* to be working without designating the length.  Will test more. 

Previously, I was reading in the raw data with the $char1.  or 1. informat.   Then I was trying to apply my user-defined formats and ran into the char vs numeric issue.

Your suggestion to use my user-defined informats makes so much sense.  And much less code - as I can read and format the vars in 1 step:

MPRINT(CREATE_INPUT_STRING_D3):   #1 @0090 ever_mar chtono.

MPRINT(CREATE_INPUT_STRING_D3):   #1 @0091 married chtono.

MPRINT(CREATE_INPUT_STRING_D3):   #1 @0092 pat_ack chtono.

Fingers crossed - seems to work.

Many thanks.

art297
Opal | Level 21

I think that your code will produce the wrong results except for the last field and any field that is followed by a delimiter (e.g., a blank).

anjgupta
Calcite | Level 5

Thanks for the heads up.  I'll check.  Maybe the @0090 etc will help designate the start of a new var ...  probably not!

Just trying to avoid the double name convention - reading in 1 var as character and renaming it to a numeric.

anjgupta
Calcite | Level 5

Yeah - back to the drawing board.  Argh.    I do have the start and end column documented for each variable and can subtract them +1 to get the length.


Basically - am I correct that as soon as you designate length you also imply if it is char or numeric?  And then my attempt falls apart?   I've been struggling with this for weeks - maybe time to quit the attempt for finesse.   

It's so close - the initially char variables are indeed ending up as numeric.  Just an issue with reading in too much data due to the lack of a length.

Anjali

art297
Opal | Level 21

I think that the following might work for you:

proc format;
    invalue chtono
    Y=        1
    N=        2
    X=        .N
    other=      .U;
run;

data want;
  input #1 @1 (x y) (chtono1.)
        #2 z $;
  cards;
12
xx
YN
xx
XY
xx
NX
xx
21
xx
;
run;

anjgupta
Calcite | Level 5

Would you mind dissecting/explaining it a bit?   Is chtono1. equivalent to chtono. above?  The '1' doesn't indicate length, does it?  Are the lengths implied in the code above?

My raw data is all on 1 line per client, non delimited.  And the SAS code is created from a spreadsheet containing:

           
varName   fromData  startPointer  endPointer  Row  nformat
fdobmo    birth     0085          0086        1    2.
fdobdy    birth     0087          0088        1    2.
ever_mar  birth     0090          0090        1    chtono.
married   birth     0091          0091        1    chtono.
pat_ack   birth     0092          0092        1    chtono.
mat_deg   birth     0093          0093        1    degf.

At least that's from my last attempt at changing nformat to include some created informats as well as lengths with char and numeric designations.
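As an illustration of how the start and end pointers could supply the missing width, here is a hedged sketch that computes endPointer - startPointer + 1 for each row and appends it to user-defined informat names, writing the resulting INPUT fragments to the log. The data set name spec and the variables width, informat_w, and piece are hypothetical, and the digit test assumes informat names that contain no digits of their own; the real macro that builds the INPUT statement may work quite differently.

```sas
/* Sketch only: spec, width, informat_w and piece are hypothetical names. */
data spec;
  length varName $32 nformat $32 informat_w $40 piece $100;
  input varName $ startPointer endPointer nformat $;

  /* width of the field in columns */
  width = endPointer - startPointer + 1;

  /* user-defined informats (no digits, e.g. chtono.) get the width inserted
     before the period, e.g. chtono1.; numeric ones such as 2. are kept */
  if anydigit(nformat) = 0 then
    informat_w = cats(substr(nformat, 1, length(nformat) - 1), width, '.');
  else
    informat_w = nformat;

  /* one fragment of the INPUT statement: pointer, variable, informat */
  piece = catx(' ', cats('@', startPointer), varName, informat_w);
  put piece;
  datalines;
fdobmo   85 86 2.
fdobdy   87 88 2.
ever_mar 90 90 chtono.
married  91 91 chtono.
pat_ack  92 92 chtono.
mat_deg  93 93 degf.
;
run;
```

Deriving the width from the documented pointers keeps the spreadsheet as the single source of truth, so the nformat column only needs the informat name.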

Ever grateful,

Anjali
