/* In the list of cards, the first entry in each group is the correct word; the entries that follow contain typos. */
data CATEGORY;
length Column2 $20; /* empty target column for the UPDATE statements below */
input MODE :$20.;
cards;
A.VELOCITY
A.VALOSITY
A.VELOCIY
B.HYDROCHLORIC
B.HYDRRACLORIK
B.HYDROCLOKIK
C.GEOMETRY
C.GEMENTRY
run;
/* The dataset above is just an example. */
/* I have a SQL table with the same kind of problem; for now, just keep an eye on the cards data. */
/* QUIT, not RUN, ends PROC SQL; =* is the sounds-like operator */
proc sql;
update CATEGORY
set Column2 = 'physics'
where MODE =* 'A.VELOCITY';
update CATEGORY
set Column2 = 'chemistry'
where MODE =* 'B.HYDROCHLORIC';
update CATEGORY
set Column2 = 'maths'
where MODE =* 'C.GEOMETRY';
quit;
/* Is there a more effective way to handle this in SAS? */
/* I don't care about the initial prefix (A., B., C.); I need to focus on the terms (velocity, geometry, hydrochloric).
MY MAJOR CONCERN IS CATEGORIZING THE TYPOS. */
You need to standardize your data with some mechanism that groups (clusters) values with similar strings.
There are quite a few discussions and solution approaches around such "typo" problems in this forum.
To start with, use search terms like SPEDIS, COMPGED, FUZZY... I'm sure the posts you find with these terms will give you ideas for other search terms as well.
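As a rough sketch of the COMPGED idea (one possible approach, not a definitive solution): assuming you can build a small lookup table of the correctly spelled terms, you could score every MODE value against every term and keep the closest one. The REFERENCE and CATEGORIZED tables and the TERM, SUBJECT and DISTANCE columns are names made up for illustration; CATEGORY and MODE come from your example.

```sas
/* Hypothetical lookup table: correctly spelled terms and their categories */
data REFERENCE;
input TERM :$20. SUBJECT :$20.;
cards;
VELOCITY physics
HYDROCHLORIC chemistry
GEOMETRY maths
;
run;

/* Compare each MODE (without its "A." style prefix) to every reference term */
/* and keep, per MODE, the term with the smallest generalized edit distance. */
proc sql;
create table CATEGORIZED as
select c.MODE,
       r.TERM,
       r.SUBJECT,
       compged(upcase(scan(c.MODE, 2, '.')), upcase(strip(r.TERM))) as DISTANCE
from CATEGORY as c, REFERENCE as r
group by c.MODE
having calculated DISTANCE = min(calculated DISTANCE);
quit;
```

Ties produce more than one row per MODE, and a value that matches nothing well still gets its nearest term, so in practice you would probably add a cutoff on DISTANCE that you tune against your own data. SPEDIS or COMPLEV could be swapped in for COMPGED in the same query.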