
Reading a dataset where a variable contains letters or numbers

Super Contributor
Posts: 336


I need to read a dataset that contains both ICD-9 and ICD-10 codes. My understanding of them is very limited.

So I am thinking of writing a program that uses the ANYALPHA and ANYDIGIT functions, but I am not clear on how to do it.

The rule would be: if the first character of the ICD code is numeric, then it is ICD-9;

if it is "V" or "E", it is also ICD-9;

otherwise it is ICD-10.

Does anyone have experience with this?

Any tips on how to use the ANYALPHA and ANYDIGIT functions?

Thanks.

Super User
Posts: 5,093


These statements could go inside a DATA step:

 

length type $ 6;

if icd > ' ' then do;

   if upcase(left(icd)) in : ('V', 'E', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9') then type='icd-9';

   else type='icd-10';

end;
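Since the question asked about ANYALPHA and ANYDIGIT, here is a minimal sketch of the same first-character rule written with those functions. The data set names HAVE and CLASSIFIED and the helper variable CH1 are assumptions for illustration; only ICD and TYPE come from the thread.

```sas
data classified;
   set have;                                 /* HAVE is a hypothetical input data set */
   length type $ 6 ch1 $ 1;
   ch1 = upcase(substr(left(icd), 1, 1));    /* first non-blank character, upper-cased */
   if ch1 = ' ' then type = ' ';             /* blank code: leave TYPE missing */
   else if anydigit(ch1) = 1 or ch1 in ('V', 'E') then type = 'icd-9';
   else if anyalpha(ch1) = 1 then type = 'icd-10';
   else type = 'other';                      /* first character is neither a digit nor a letter */
   drop ch1;
run;
```

ANYDIGIT and ANYALPHA return the position of the first digit or letter in the string, or 0 if there is none, so on a one-character value a result of 1 means "this character is a digit (or a letter)". Keep in mind that ICD-10 also has codes that begin with E and V, so the first character alone cannot settle those cases.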

Super Contributor
Posts: 336



Thank you.

When I think it over, I feel my knowledge is not enough.

The data is not restricted, so there should be other possibilities: a value might be ICD-9, ICD-10, both, or something else.

Using your code, I find that around 60% of observations use ICD-10 and 40% use ICD-9.

Now I would like advice on handling the four options listed above rather than only two.

I do not know what a code looks like when it is "both", or how to write a program to read that. Any advice?

Thanks.

If the value of ICD has more than 5 characters but fewer than 8, I can call it ICD-10;

if the value has fewer than 3 characters or more than 8 characters, I will call it "other"

...

Super User
Posts: 5,093


When it comes to questions about what is in your data, and how you should interpret it, I'm not sure I can help.  For example, I'm not sure when "both" would be appropriate.  But the issues you are talking about are cases that SAS can handle.

 

Notice that  your code contains a logical error.  ELSE applies to only the previous IF/THEN statement, not to both as a group.  It would be improved by changing the second IF/THEN statement to say:

 

else if upcase(left(icd)) in : ("A", "B", "C" ...) then type='icd-10';

 

It would help to know whether ICD is already left-hand justified or not.  If it is, you can eliminate the LEFT function wherever any of the sample code uses it.

 

To measure and use the length of ICD, these statements might be appropriate:

 

len = length(left(icd));

if len > 8 then type='other';

else if len < 3 then type='other';

else if len in (6, 7) then type='icd-10';
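Put together in one DATA step, the length rules and the first-character rule might look like the sketch below. The data set names HAVE and CLASSIFIED are assumptions, and so is the order of the checks: length is tested first, and codes of length 3 to 5 or exactly 8 fall back to the first-character rule because the thread does not say how to treat them.

```sas
data classified;
   set have;                          /* HAVE is a hypothetical input data set */
   length type $ 6;
   len = lengthn(strip(icd));         /* LENGTHN returns 0 for a blank value; LENGTH would return 1 */
   if len = 0 then type = ' ';        /* blank code: leave TYPE missing */
   else if len < 3 or len > 8 then type = 'other';
   else if len in (6, 7) then type = 'icd-10';
   else if upcase(strip(icd)) in : ('V', 'E', '0', '1', '2', '3', '4',
                                    '5', '6', '7', '8', '9') then type = 'icd-9';
   else type = 'icd-10';
run;

/* Check how the rules distribute the codes before trusting them */
proc freq data=classified;
   tables len*type / missing;
run;
```

The cross-tabulation shows which lengths end up in which category, which is a quick way to spot a rule that catches more (or fewer) codes than intended.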

 

There are many ways to set up these statements.  None of them are complex, but it takes some attention to the details to decide on the proper set of rules.

Super User
Posts: 17,912


I'm going to suggest a different approach: build a library of ALL ICD-9 and ICD-10 codes, preferably each as a format. Then check each value against those lists, see whether it is present in either, both, or neither, and classify accordingly. Lists of each set of codes are available online.
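A rough sketch of that idea, under several assumptions: the reference lists have already been loaded into two data sets, here called ICD9_CODES and ICD10_CODES, each with one row per code in a character variable CODE, with no duplicates, at most 8 characters, and written the same way as in your data (for example, with or without the decimal point). The macro name MAKE_FMT, the format names, and the data set names are all hypothetical.

```sas
%macro make_fmt(ds=, fmt=);
   /* Turn a list of codes into a character format: listed code -> 'Y', anything else -> 'N' */
   data _cntl;
      set &ds end=last;
      length fmtname $ 32 type $ 1 start $ 8 label $ 1 hlo $ 1;
      fmtname = "&fmt";
      type    = 'C';
      start   = upcase(strip(code));
      label   = 'Y';
      hlo     = ' ';
      output;
      if last then do;                /* catch-all OTHER row for codes not in the list */
         start = ' ';
         label = 'N';
         hlo   = 'O';
         output;
      end;
      keep fmtname type start label hlo;
   run;

   proc format cntlin=_cntl;
   run;
%mend make_fmt;

%make_fmt(ds=icd9_codes,  fmt=icd9f)
%make_fmt(ds=icd10_codes, fmt=icd10f)

data classified;
   set have;                          /* HAVE is a hypothetical input data set */
   length type $ 7 key $ 8;
   key  = upcase(strip(icd));
   in9  = (put(key, $icd9f.)  = 'Y'); /* 1 if the code is in the ICD-9 list */
   in10 = (put(key, $icd10f.) = 'Y'); /* 1 if the code is in the ICD-10 list */
   if key = ' ' then type = ' ';
   else if in9 and in10 then type = 'both';
   else if in9          then type = 'icd-9';
   else if in10         then type = 'icd-10';
   else                      type = 'neither';
   drop key;
run;
```

With this setup, "both" simply means the same string appears in both reference lists, and "neither" flags values that are probably typos or not diagnosis codes at all.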

 

 
