BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
somebody
Lapis Lazuli | Level 10

I have a CRSP dataset and one of the variables (RET) is RETURN which has a numeric format 11.6. However, there are observations that has character values( B and C character) for this variable ( which does not make sense). I cannot query to choose or remove these observations. I have tried " if Ret = 'C' then .... " and if Ret = C then ....." and don't. Is there a way to work around this?

Thanks,

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

1. Make sure it was imported correctly - so it would need to be a character variable. If it's not, then you've already lost that information. 

 

If it is numeric and is showing values of B/C that likely means you're using a format on that variable or special missing values. 

if this is the case, then try if Ret = .C  or if missing(RET)

View solution in original post

7 REPLIES 7
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Check out the properties for the dataset, is it telling you that variable is really numeric?  If so then:

where not missing(ret)...

Should work.  If it is character, then what you have should work.  If its got a format applied then check out what the format is.  Really hard for me to guess without any information.  Use this post to show some test data:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

Reeza
Super User

1. Make sure it was imported correctly - so it would need to be a character variable. If it's not, then you've already lost that information. 

 

If it is numeric and is showing values of B/C that likely means you're using a format on that variable or special missing values. 

if this is the case, then try if Ret = .C  or if missing(RET)

somebody
Lapis Lazuli | Level 10

This works. thanks

 

ballardw
Super User

Check the data set properties to see what the actual format assigned is. If it is actually numeric someone may have assigned SAS special missing values. In some views they special value shows but is excluded for calculations.

 

data example;
   input numeric;
datalines;
1
2
.
.B
;
run;

Proc freq;run;
proc print;run;
Astounding
PROC Star

Have you done any research into why you see a "B" or "C" in a supposedly numeric field?  That might be important to know.

 

At any rate, you will likely need RET to be numeric.  If you can see a "B" or a "C", it's likely character.  (If you are seeing "B" or "C" after using PROC IMPORT, a common tool with CRSP data, then it's definitely character.  But PROC CONTENTS will confirm that.)  So you will need to do something like this:

 

data want;

set have;

ret_numeric = input(ret, ??10.);

if ret_numeric = . then delete;

drop ret;

rename ret_numeric = ret;

run;

somebody
Lapis Lazuli | Level 10

i used to download CRSP data in CSV format and then import it. But this time, I chose the dataset tobe in SAS dataset format and hence the issue. It turns out missing value can take other values too, not just "." so in this case it is .B and .C

Reeza
Super User

So these are the special missing values I mentioned earlier, you will likely find some notes in the documentation on what they mean specifically, ie .B may mean no answer and .C may mean not asked (I made those up, check the docs!). 

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 3884 views
  • 4 likes
  • 5 in conversation