BookmarkSubscribeRSS Feed
LostInTheCity
Calcite | Level 5

Hi All,

I've come across a challenge I'm struggling to resolve. 

I have a SAS dataset created from an MS SQL Server table which has a field stored in SQL as varchar(500). This field shows as character length 500 in the SAS dataset as expected.  Values showing in this field are simply single digits (0,1,2,etc.) and I need to convert these to number.  So normally should be case of using an input function with the correct format however when I do this I get the following error message:

data newdata;

set olddata;

format newvar 3.;

newvar=input(oldvar,3.);

run;

NOTE: Invalid numeric data

      '30002020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202

      02020202020'X

I think this means the data is stored in hex format.  However I'm confused as to why it is showing in the data s 0,1,2, etc even when I've removed all formats using proc datasets.

If it helps the value I need in this case is always the second character of the above string.

So my questions are:

- is this really a hex format or something else?

- and how do I convert it to the numeric value of what I'm seeing in the data?

Incidentally, the alternative approach of doing this on the SQL side using cast or convert isn't an option as I don't have the permissions to modify the data.  Hence trying to do the conversion in SAS once the data is extracted.

Any help gratefully received!

Thanks all

4 REPLIES 4
Tom
Super User Tom
Super User

Looks like the data is stored with a binary zero at the second position that is causing the issue. Try changing to:

newvar=input(scan(oldvar,1,'00'x),3.);


Check if the data is actually stored as the hex representation or if it just regular characters with $HEX format attached.  If it really is stored as '3000...' instead of '3000...'X then you will need to also add step to convert from hex characters to the actual values.  INPUT(oldvar,$HEX1000.)

TomKari
Onyx | Level 15

All data is in effect hexadecimal, but reading hex data is a bit of a lost art.

Your string is a byte "30", followed by "00", then by 498 "20" characters. The "30" is the digit zero as character, and the "20" is the character for space, so these are fine in the context of your input statement. As Tom says, the "00", which is a "NUL" character and is frequently used to terminate strings and records, is the problem. Somehow this data was improperly translated from SQL Server to SAS.

Tom's solution, of taking all of the characters up to the "00" character, should work very nicely.

Astounding
PROC Star

Based on all the comments so far, this might be the right combination:

newvar = input(scan(oldvar,1,'00'x), hex6.);

It assumes that the proper value in the data will be contain no more than 3 digits.  If it could be more, a wider informat than hex6 would be needed.

LostInTheCity
Calcite | Level 5

Thanks all for the help.

The final solution is slightly more complex but the responses gave me a really good steer.  Essentially it turns out that a (small) number of records have more than one byte so using '00'x as a delimiter risked cutting off some values where the final character was 0.  So I substituted this with '002'x to generate the string containing true values then created an array to loop through this taking every odd character to create the final string to translate to a number.  Not very elegant but works and can also be applied to other fields where I have the same issue.

So much for hoping there was a simple format to do this!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1068 views
  • 6 likes
  • 4 in conversation