
    Hi all,

 

I know I don't have much experience with the SAS environment or with SAS programming. Still, since I started working with SAS EG projects and reading the SAS Community and the wider web, I keep seeing questions (and a lot of confusion, my own included) about "type casting". In other languages, like C, this area causes no confusion at all: nearly all mature languages have their own typecast functions, explicit (Java, C, etc.) or implicit (Python), which "take care" of giving the data a "safe landing" in the target variables (or throw errors if the conversion is impossible or leads to data loss).

 

What I mean is: if a topic needs, apart from the main documentation, such a broad set of additional "guides" and documents to explain it (something I have never seen, in all my life, in other languages), there must be something wrong in the way it was designed. Apart from my initial read of the docs, I never needed to read further about how to cast an int into a long, or to personally take care of how var(string_value) decided to work on "3,14" to have 3.14 assigned to my num_short_pi numeric variable.

 

While using SAS I'm constantly banging my head against "typecast walls", always having to PUT something into my INPUT to have it converted, etc. So, after some time (as I initially thought it was my problem), I asked myself: "why not have an "intelligent typecast" function, which reads the source variable's type and, working behind the scenes, converts it into the data type I want for my target variable?"

 

Thus this proposal. I know there is a CAST() function already but, from the little I have read, it seems more related to PROC SQL (and not even all of it: it works in FedSQL). The proposal is to have it available "everywhere" (so even outside PROC SQL), in Base SAS, as an alternative to PUT/INPUT. Then, with time, if the SAS devs see users using it more than the original PUT/INPUT, they could decide to keep CAST only... or to remove it if, on the contrary, it is not used at all.

 

Syntax: CAST(source variable, target variable, target variable desired type, [target variable format]);

 

source variable: any type (numeric or char) source variable name

target variable: target variable name

target variable desired type: wanted target type (numeric or char), if not defined before

(optional) target variable format: the target format to apply to target variable desired type

 

As I'm not a SAS expert, even though I believe there shouldn't be problems converting char-to-num or num-to-char, some problems could arise. This is why I wrote "target variable desired type": if the desired type can't be achieved cleanly, e.g. there is data loss in the conversion, a warning should be raised; if the conversion can't take place at all, an error should be raised.

As I said, I'm no expert, so feel free to improve this proposal with your expertise, which is surely far greater than mine.

9 Comments
Tom
Super User

SAS already has implicit type casting. Try:

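data test;
   length string $12 num1 num2 8;
   num1=3.4;
   string=num1;      /* implicit numeric-to-character conversion (BEST12. format) */
   num2=string;      /* implicit character-to-numeric conversion */
   put (_all_) (=);  /* write every variable to the log */
run;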

Since BASE SAS only has two data types (fixed length character strings and floating point numbers) it does not add much value to have a multipurpose CAST() function.  You can cast from numeric to character with PUT() or PUTN() and cast from character to numeric with INPUT() or INPUTN().  
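
As a minimal sketch of the explicit route (the data set and variable names are made up for illustration, and COMMAX8. is just one informat that treats the comma as the decimal separator), a conversion in each direction might look like this:

```sas
data explicit;
   length pi_text pi_back $8;
   pi_text = '3,14';                    /* character value with a decimal comma */
   pi_num  = input(pi_text, commax8.);  /* character -> numeric, read as 3.14 */
   pi_back = put(pi_num, 8.2);          /* numeric -> character, right-aligned */
   put pi_text= pi_num= pi_back=;       /* show the three values in the log */
run;
```

The informat or format is where you state how the text should be read or written, which is the part a generic CAST() would still have to be told.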

 

What is the purpose of the TARGET VARIABLE argument in your proposed function?  How is it different than the DESIRED TYPE argument?

 

lc_isp
Quartz | Level 8

In short, it is to avoid getting an error when a function requires a specific data type and one has to explicitly use PUT or INPUT (or both, in some cases):

 

CAST(num1, string): if the "string" variable has already been defined as character, CAST() uses a PUT(). It could even be written as string = CAST(num1), in the simplest form. So, in your example: string = CAST(num1); and num2 = CAST(string);

 

In case string and num2 weren't already defined, they need a target type for the definition (unless raising a "must be defined before" error is the wanted behavior), so:

string = CAST(num1, $12);
num2 = CAST(string, 8);

or

CAST(num1, string, $12);
CAST(string, num2, 8);

as preferred

 

extended format could be:

string = CAST(num1, 8, $20);

(which pads with spaces, in this case)

num2 = CAST(string, 8, BEST8.1);

or

CAST(num1, string, $20);
CAST(string, num2, 8, BEST8.1);

as preferred.

 

It's mostly to avoid having to think every time (and, at least in my case, usually failing and getting errors) about whether to use PUT or INPUT: CAST should be a utility/support function (it could be a macro too, which correctly translates to PUT or INPUT as needed, together with their formats).
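
A rough sketch of what such helper macros could look like is below. The names %to_num and %to_char are hypothetical (they are not part of SAS), and the sketch assumes that CATS() is acceptable for turning any argument into text and that BEST32. is wide enough for the values being read:

```sas
/* Hypothetical helpers, not part of SAS: each one hides PUT/INPUT behind one name. */
%macro to_num(var);
   input(cats(&var), ?? best32.)
%mend to_num;

%macro to_char(var);
   cats(&var)
%mend to_char;

data test;
   length string $12;
   num1   = 3.4;
   string = %to_char(num1);   /* numeric -> character, left-aligned */
   num2   = %to_num(string);  /* character -> numeric */
   bad    = %to_num('abc');   /* not a number: missing value; ?? suppresses the log error */
   put (_all_) (=);
run;
```

Note that the caller still picks the direction (%to_num or %to_char), and a numeric argument passed to %to_num makes a round trip through text, which can lose precision for values with many significant digits.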

 

The problem I'm addressing is not that PUT or INPUT don't work well or as expected. The problem is that many of us (me included) work in SAS with little (or no) training: I personally was trained, and only with basic training, more than a year after I had started actively using SAS EG. That generates a lot of confusion. Sure, it's nothing a SAS expert experiences but, as I said, I have read a ton of threads where the experts just suggest using PUT or INPUT correctly. So why keep having (less trained) people, who have to work in SAS with the knowledge they (we) have at the moment, struggling daily with such errors, instead of "just writing a helper function"?

PaigeMiller
Diamond | Level 26

How would telling someone to use this great new CAST() function be better than telling the same person to use PUT()?

 

A long time ago, I tried to learn R. R has many different data types and data structures; if you picked the wrong one you couldn't do what you wanted. Other languages require the proper data type/data structure as well. If you are going to get really proficient at a language, you have to learn how it works.

lc_isp
Quartz | Level 8

@PaigeMiller TYVM for sharing your thoughts. I know you're right but, like 99% of the workers out here, "I do what I can, not what I want". 😉

ballardw
Super User

I had to look up stuff related to CAST functions to make sure I understood, at least a bit.

 

From what I picked up, one of the main concerns with at least part of CAST is to ensure that proper conversion rules for values are met, such as when an integer needs to be converted to some sort of double-precision float value.

Since foundation SAS and the basic SAS data set supports exactly one type of numeric value then that sort of conversion "help" isn't really helpful.

 

The internal conversion implied by Casting character to numeric is apparently a poor subset of the SAS INPUT function, at least from a brief study. I can see Cast to convert a character value like '1234.56' to a numeric. But the SAS INPUT function will convert text like "MALE" to a numeric 1 with the proper informat, and can make sure the value is upcased prior to the comparison with the list of values. Not to mention the assignment of special missing values for specific text like 'Not entered', 'Not collected', 'Missing' or what have you, which lets you distinguish a simple conversion error resulting in missing from a value that identifies specifically why known cases are missing.

 

SAS formats for PUT and informats for INPUT also allow the designer of the format/informat to decide how to handle Other values (exceptions as treated by most other languages).
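
A minimal sketch of that kind of informat follows. The informat name SEXCODE, the codes, and the choice of .E and .C as special missing values are all assumptions made for illustration:

```sas
proc format;
   /* Hypothetical informat: name, codes and special missings are illustrative. */
   invalue sexcode (upcase)             /* upcase the text before comparing */
      'MALE'          = 1
      'FEMALE'        = 2
      'NOT ENTERED'   = .E              /* special missing: value was not entered */
      'NOT COLLECTED' = .C              /* special missing: value was not collected */
      other           = _error_;        /* anything else is flagged as invalid */
run;

data coded;
   input raw $20.;
   code = input(raw, sexcode.);         /* character -> numeric via the informat */
   put raw= code=;
   datalines;
male
Female
not entered
unknown
;
run;
```

Here the OTHER range decides what happens to unexpected text; replacing _error_ with a plain missing value would accept such rows silently instead.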

lc_isp
Quartz | Level 8

Quoting ballardw: Since foundation SAS and the basic SAS data set supports exactly one type of numeric value then that sort of conversion "help" isn't really helpful.

Does that mean CAST() could be helpful in some other scenarios too (where more data types are supported; I'm not aware of which data types each SAS product supports)?

 

Quoting ballardw: The internal conversion implied by Casting character to numeric is apparently a poor subset of the SAS INPUT function, at least from a brief study. I can see Cast to convert a character value like (...) missing values.

Could that mean that CAST(), if extended beyond PROC FedSQL to foundation/basic SAS, could be recoded to work better? (Maybe it has not been done so far because PUT/INPUT already exist?)

 

Quoting ballardw: SAS formats for PUT and informats for INPUT also allow the designer of the format/informat how to handle Other values

My proposal wasn't meant to replace PUT/INPUT with CAST. It should offer an alternative for less knowledgeable people (like me) who had little luck with company training, or were given no time to be properly trained, and so on. If the SAS devs think it could help people (like me, and the many I read in the forums and on the web), a helper function should be welcome (it could also reduce the number of "why am I getting an error when ...?" questions, which usually involve PUT/INPUT conversions).

 

Side story: from time to time, I've noticed that experts think about "how things should be" (and they certainly can, given their deep knowledge), but sometimes it could be an option to "just give up (on high theories) and do what (all) the people expect", even if the experts are right and everyone else is wrong. Just to give an example: in one of the recent OS updates, Apple changed the way "precise (virtual) keyboard input" was managed on iPhones/iPads, because their experts developed an "intelligent feature" which "understood" where the user wanted to place the cursor when editing text. That's the theory. In fact "a bomb exploded", because the "intelligent feature" was placing the cursor where it thought the user wanted it... which, many times, was not where the user really wanted it, and the previous (oh so useful!) "magnifying glass" was no longer there to help. I read a ton of users whining in the forums and, as far as I can see, in the last OS update I got from Apple a kind of the old magnifying glass was back: that's pragmatism, IMHO.

PaigeMiller
Diamond | Level 26

I disagree with your point that the situation can be remedied by adding a new function like CAST(). The problem some people have is not that they don't know how to use INPUT or PUT, but that they don't even realize that they are using a numeric variable where only character variables should be used, or that they are using a character variable when they need a numeric variable. They don't realize they have the wrong data type. This is the problem. Adding a working CAST() function in SAS does not eliminate the above problem: people who don't realize they have the wrong data type will not know to use INPUT or PUT, and they will not know to use CAST() either.
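
For anyone unsure what type a variable has, a minimal sketch of two ways to check is below. It uses the SASHELP.CLASS sample data set; the variable names type_of_age and type_of_name are made up for illustration:

```sas
/* Option 1: list every variable's type and length. */
proc contents data=sashelp.class;
run;

/* Option 2: ask for the type inside a DATA step. */
data _null_;
   set sashelp.class(obs=1);
   type_of_age  = vtype(age);    /* 'N' for a numeric variable */
   type_of_name = vtype(name);   /* 'C' for a character variable */
   put type_of_age= type_of_name=;
run;
```

Checking the type first tells you which direction of conversion, if any, is needed.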

LinusH
Tourmaline | Level 20

CAST is a supported function in PROC FedSQL.

So it would make sense to make it available at least in PROC SQL for conformity, even if it doesn't add more functionality compared to PUT/INPUT functions.
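
For reference, an untested sketch of the FedSQL form, assuming the SASHELP.CLASS sample data set is reachable from FedSQL and that VARCHAR(10) is an acceptable target type there (the column alias age_text is made up):

```sas
proc fedsql;
   /* Convert the numeric AGE column to text with CAST. */
   select name,
          cast(age as varchar(10)) as age_text
   from sashelp.class;
quit;
```

In PROC SQL or a DATA step the same conversion would be written with PUT(), for example put(age, 3.).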

Quentin
Super User

I understand that someone coming from SQL background would expect there to be a CAST function.  But I think it's fine for there to be a CAST function in  PROC FEDSQL without there being one in PROC SQL.  We know that PROC FEDSQL and PROC SQL are different languages.  I would expect PROC SQL is essentially 'functionally stable.' If CAST were added to PROC SQL, as a convenience, they would probably need to add it to DATA step as well.  Given that there are likely limited resources for coding enhancements to BASE SAS (i.e. DATA step language and PROC SQL language), I don't think the benefit of having a CAST function as a convenience would outweigh the opportunity costs.