Good day all,
I borrowed an idea from another user on this forum, but I ran into an issue that I don't know how to fix.
What I expect is that the program changes every Name to A DRIVING SCHOOL whenever a row passes the test (in this case, all rows should pass).
However, it fails: lag(Name) is taken at the beginning of the step, and when the program changes Name, lag(Name) does not refresh.
Is there a way to enhance this program so that, when the program changes Name, lag(Name) picks up the changed value?
data testing;
infile datalines dlm="\";
input Name :$80. Category :$40.;
datalines;
A DRIVING SCHOOL\service
A DRIVING\service
A DRIVINGA\service
A DRIVINGb\service
A DRIVINGc\service
;
run;
data testing2;
set testing;
_Name = lag(Name);
_Category = lag(Category);
dif=compged(Name, _Name);
dif2=compged(Category, _Category);
if dif<=100 and dif2 <=100 then
do;
Name = _Name;
Category = _Category;
match = "ok";
end;
run;
The final output I expect is:
Name Category
A DRIVING SCHOOL service
A DRIVING SCHOOL service
A DRIVING SCHOOL service
A DRIVING SCHOOL service
A DRIVING SCHOOL service
thanks in advance
Harry
Not sure if this will really help you with your real data but I believe it implements more or less what you had in mind.
The LAG() function is not working for you because it just always picks the value from the previous source row. I made the assumption that you want to RETAIN the value in case there is a "match".
data have;
infile datalines dlm="\";
input Name :$80. Category :$40.;
datalines;
A DRIVING SCHOOL\service
A DRIVING\service
A DRIVINGA\service
A DRIVINGb\service
A DRIVINGc\service
;
data want;
set have;
length want_name $80 want_category $40;
retain want_name want_category;
min_len=min(lengthn(name),lengthn(want_name));
dif=compged(substrn(Name,1,min_len), substrn(want_name,1,min_len));
dif2=compged(Category, want_category);
if not (dif<=100 and dif2 <=100) then
do;
want_name = Name;
want_category = Category;
end;
run;
proc print data=want;
run;
The LAG function returns the value a variable had in the previous observation of the dataset you are reading (strictly, the value it stored the last time it was called), not the value that is written to the output dataset. So I think that @Patrick is on the right track: you don't need LAG, you need RETAIN.
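To make the difference concrete, here is a minimal sketch on made-up data. The dataset name demo and the variables x, prev_in and prev_out are only for illustration and are not part of the original program. It changes x after LAG has been called, and separately keeps the changed value in a retained variable.

```sas
data demo;
  input x;
  retain prev_out;     /* retained: keeps its value from one iteration to the next */
  prev_in = lag(x);    /* LAG queues x as it is right now, before the change below */
  x = x * 10;          /* change x after LAG has already stored the original value */
  output;              /* write the row before updating the retained value */
  prev_out = x;        /* remember the changed value for the next row */
  datalines;
1
2
3
;
run;

proc print data=demo;
run;
```

In this sketch prev_in should come out as missing, 1, 2 (the values as read), while prev_out should come out as missing, 10, 20 (the values as changed). That is the same reason lag(Name) in the original step never sees the corrected Name.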
thanks all