- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello
I am trying to filter a data using a set of ICD codes. In my sas dataset, there are 18 diagnosis-related variables (1 primary and 17 secondary; character and $7) and I have a set of ICD codes/conditions I am interested in. I want to tell SAS to examine all 18 variables and pick out the observations that match one of the ICD codes. Here's the ICD codes I am interested in:
O36.4XX0-036.4XX9
Z37.1 Z37.3 Z37.4
Z37.6
Z37.60
Z37.61
Z37.62
Z37.63
Z37.64
Z37.69
Z37.7
I think I can use CATX but not sure how to go about it. Any comments would be greatly appreciated. Thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The way that I've seen this done before:
- Use a SAS array to hold the values of each diagnosis code variable in your record
- Then use the IN operator to check for each of your potential DX codes.
See this paper for some examples.
Excerpt:
data ExampleA2;
set TableA;
array AllDiagCodes {25} $ DgCd1 – DgCd25;
RespInd = 0;
do i = 1 to 25;
if AllDiagCodes{i} IN ('A100','A304','A503')
then RespInd = 1;
end;
drop i;
run;
A faster, and probably more robust method will involve the hash object. But hash objects have a bit of a learning curve.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The first thing we need is an example of your dataset. See my footnotes for how to convert a dataset to data step code and how to post that code.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for your reply. Unfortunately I don't have access to the data at the moment. What I posted is a summary of the instruction I got and I was trying to figure out a way to do it before I receive the data.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You can use catx() to concatenate all the character variables and then use index() to see if a comparison string is present in there.
If you need to check for several strings, you can either use a macro %do loop to repetitively create code, or store your comparison strings into a temporary array you can use in a (datastep code) do loop.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The way that I've seen this done before:
- Use a SAS array to hold the values of each diagnosis code variable in your record
- Then use the IN operator to check for each of your potential DX codes.
See this paper for some examples.
Excerpt:
data ExampleA2;
set TableA;
array AllDiagCodes {25} $ DgCd1 – DgCd25;
RespInd = 0;
do i = 1 to 25;
if AllDiagCodes{i} IN ('A100','A304','A503')
then RespInd = 1;
end;
drop i;
run;
A faster, and probably more robust method will involve the hash object. But hash objects have a bit of a learning curve.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
1) You wrote: "there are 18 diagnosis-related variables ..." - please post your dataset format, to show which variables are there,
and what do you mean by primary and secondary icodes. It is enougth to display the releant variables.
2) Is the set you are looking for fix or do you preffer to have it dymaic, easy to change ?
3) You mentiond that ICOD is a CHAR type $7 - then what do you mean by:
O36.4XX0-036.4XX9 /* is this a rage of ICODES */ Z37.1 Z37.3 Z37.4 /* is this a list of ICODES ? */
Is the ICODE aligned to left or it can contain leading spaces ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I clearly confused a lot of people... Hope this clarifies a few things
1) I haven't received the data yet I just know there are 18 diagnosis-related variables I can use and the format(size) of these variables are txt (7)
2) Fixed
3) Yes, those are the ICD codes we are interested in.
As far as I know they do not have leading spaces and they are aligned to left.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Kiko I concur with @ChrisHemedinger on hash solution. If you could provide us a clear HAVE sample and a WANT sample, I would like to attempt the hash solution and that helps not going back and forth for further clarity. Thanks
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you are looking to actually learn something, rather than accepting a solution you barely understand, here's what I would suggest. (Other solutions are perfectly fine, but you have to appreciate that the background of the posters varies quite a bit and what might work best is not necessarily the best choice for you.)
(1) On your own, write a program that would check just one diagnosis code against your list of possible values.
(2) To check all the diagnosis codes instead of just one, apply the arrays suggestion.
(3) Make one small change to it, to speed things up slightly:
do i = 1 to 25 until (RespInd=1);
That way, the program will stop checking the rest of the ICD codes as soon as it finds a match.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content