I'm trying to create a new variable called topbreed in a data step, based on the breed column of an existing data set. The values in the breed column vary, e.g. English Bulldog, American Bulldog, Labrador Mix, Labrador Retriever, Jack Russel Terrier, etc.
I want to write IF-THEN code using INDEX so that topbreed = 1 if breed contains Bulldog, Lab, Terrier, Beagle, or Shepherd, and topbreed = 0 otherwise. So far I have only figured out the code for a single breed. Is there any way to test for multiple breeds with INDEX within the same data step?
Current code:
data new_variable;
set old_variables;
if index(breed, 'Bulldog') then topbreed = 1;
else topbreed = 0;
run;
Trying to get: if index(breed, 'Bulldog', 'Lab', 'Terrier', 'Shepherd', 'Beagle') then .... with no success
In general, though, cleaning fuzzy data is a bit of an iterative process. Things you need to consider, which you currently are not, are case differences, spelling differences, partial matches, and fuzzy matching, which is likely what you need. I suspect using LIKE in a SQL query may be more efficient overall but less accurate.
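As a starting point, here is a minimal sketch of one way to test several keywords in a single data step. INDEX accepts only one search string per call, so the idea is to combine several calls with OR. The sketch uses FIND with the 'i' modifier instead of INDEX so that case differences are ignored. The data set names old_variables and new_variable are placeholders, and the keyword list is copied from the question.

```sas
/* Sketch only: old_variables and new_variable are placeholder names. */
data new_variable;
    set old_variables;
    /* FIND returns the position of the match, or 0 if there is none. */
    /* The 'i' modifier makes the search case-insensitive.            */
    if find(breed, 'Bulldog', 'i')
       or find(breed, 'Lab', 'i')
       or find(breed, 'Terrier', 'i')
       or find(breed, 'Beagle', 'i')
       or find(breed, 'Shepherd', 'i') then topbreed = 1;
    else topbreed = 0;
run;

/* Alternative: one regular expression covering all keywords. */
data new_variable;
    set old_variables;
    topbreed = (prxmatch('/bulldog|lab|terrier|beagle|shepherd/i', breed) > 0);
run;
```

Both versions are plain substring matches, so a short keyword like 'Lab' will also flag any breed name that merely contains those letters, and misspellings will be missed. It is worth running a PROC FREQ on breed by topbreed afterwards to check what was actually caught.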