- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi everyone,
I'm trying to figure out the most efficient way to see if any one of several columns might contain an alphanumeric code of interest from a fairly long list. (Let's say my variables are called cause1-cause20 and death1.) The alphanumeric codes start with one letter and end with two or three numbers (e.g., X81, Y350, W19). For the subsets I'm trying to create, I might be dealing with 50 or more different alphanumeric codes of interest. Depending on the question being asked, the list may change, so I have to establish conditions for each subset.
For simpler situations using one variable (death1) for example, if I'm looking to see if my columns have a code in the range of W00 to W19, I approach it this way and there are no issues:
proc sql;
create table injury.deaths as
select &listx
from injury.deathfile
where death1 like 'W0%' or
death1 like 'W1%';
quit;
What would be an efficient approach when I have to search through several columns for instances of codes from a very long list of alphanumeric codes? Should I create macros for the list of codes? Is there a way to use wildcards or underscores in a macro? Would I use an array and do loop and macros all at once? (I seem to get tripped up when combining things.) Any advice would be greatly appreciated. When I have experimented with arrays in data steps, I end up with all or none of the observations, which I know is incorrect. Thank you.
I'm using SAS x9.4.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use two arrays one for the list of values you want to check and one for list of variables to check. Use WHILE to shortcut the loop once you find a condition.
Untested:
data want;
set have;
array dx_want (3) _Temporary_ $8. ('Z123', 'Z185', 'Z245');
array dx_have(25) dx_prim dx2-dx25;
flag=0;
do i=1 to dim(dx_have) while(flag=0);
do j=1 to dim(dx_want) while(flag=0) ;
if find( dx_have(i), dx_want(j))>0 then flag=1;
end;
end;
run;
@monstera wrote:
Hi everyone,
I'm trying to figure out the most efficient way to see if any one of several columns might contain an alphanumeric code of interest from a fairly long list. (Let's say my variables are called cause1-cause20 and death1.) The alphanumeric codes start with one letter and end with two or three numbers (e.g., X81, Y350, W19). For the subsets I'm trying to create, I might be dealing with 50 or more different alphanumeric codes of interest. Depending on the question being asked, the list may change, so I have to establish conditions for each subset.
For simpler situations using one variable (death1) for example, if I'm looking to see if my columns have a code in the range of W00 to W19, I approach it this way and there are no issues:
proc sql; create table injury.deaths as select &listx from injury.deathfile where death1 like 'W0%' or death1 like 'W1%'; quit;
What would be an efficient approach when I have to search through several columns for instances of codes from a very long list of alphanumeric codes? Should I create macros for the list of codes? Is there a way to use wildcards or underscores in a macro? Would I use an array and do loop and macros all at once? (I seem to get tripped up when combining things.) Any advice would be greatly appreciated. When I have experimented with arrays in data steps, I end up with all or none of the observations, which I know is incorrect. Thank you.
I'm using SAS x9.4.
Or you can also join with a column with like.
proc sql;
create table injury.deaths as
select &listx
from injury.deathfile
left join lookupTable as lt
where death1 like 'W0%' or
death1 like lt.variableCompare;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use two arrays one for the list of values you want to check and one for list of variables to check. Use WHILE to shortcut the loop once you find a condition.
Untested:
data want;
set have;
array dx_want (3) _Temporary_ $8. ('Z123', 'Z185', 'Z245');
array dx_have(25) dx_prim dx2-dx25;
flag=0;
do i=1 to dim(dx_have) while(flag=0);
do j=1 to dim(dx_want) while(flag=0) ;
if find( dx_have(i), dx_want(j))>0 then flag=1;
end;
end;
run;
@monstera wrote:
Hi everyone,
I'm trying to figure out the most efficient way to see if any one of several columns might contain an alphanumeric code of interest from a fairly long list. (Let's say my variables are called cause1-cause20 and death1.) The alphanumeric codes start with one letter and end with two or three numbers (e.g., X81, Y350, W19). For the subsets I'm trying to create, I might be dealing with 50 or more different alphanumeric codes of interest. Depending on the question being asked, the list may change, so I have to establish conditions for each subset.
For simpler situations using one variable (death1) for example, if I'm looking to see if my columns have a code in the range of W00 to W19, I approach it this way and there are no issues:
proc sql; create table injury.deaths as select &listx from injury.deathfile where death1 like 'W0%' or death1 like 'W1%'; quit;
What would be an efficient approach when I have to search through several columns for instances of codes from a very long list of alphanumeric codes? Should I create macros for the list of codes? Is there a way to use wildcards or underscores in a macro? Would I use an array and do loop and macros all at once? (I seem to get tripped up when combining things.) Any advice would be greatly appreciated. When I have experimented with arrays in data steps, I end up with all or none of the observations, which I know is incorrect. Thank you.
I'm using SAS x9.4.
Or you can also join with a column with like.
proc sql;
create table injury.deaths as
select &listx
from injury.deathfile
left join lookupTable as lt
where death1 like 'W0%' or
death1 like lt.variableCompare;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you so much, Reeza. Your reply was really helpful to me as a newer user. I never would have figured out how to do this on my own with the books I have. You also gave me new directions for learning, since I'm unfamiliar with flags and joining tables for this purpose. Thanks again for taking the time to write a helpful response with an explanation.