Good day,
Here is my program.
I am wondering why the result is always 40 whenever a row and the row above it are exactly the same.
Is something wrong with my program? The word "checking" is only 8 characters long.
data testing2;
  input name $40.;
  infile datalines dlm=',';
  datalines;
checking
checking
checking
checking
checking1
checking2
;
run;
data testing3;
  set testing2;
  name2=compress(name);
  lag_name=compress(lag(name));
run;
data testing4;
  set testing3;
  if substr(compress(name2),1,40) = substr(compress(lag_name),1,40) then Length_match_Lag ="40";
  else if substr(compress(name2),1,9) = substr(compress(lag_name),1,9) then Length_match_Lag ="9";
  else if substr(compress(name2),1,8) = substr(compress(lag_name),1,8) then Length_match_Lag ="8";
run;
thanks in advance
harry
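compress() removes all blanks, so the string it returns is only 8 or 9 characters long. Asking substr() for 40 characters (in all cases) or 9 characters (in most cases) is therefore out of range. The substr() function then writes a NOTE about the invalid third argument to the log (Maxim 2: Read the Log) and, per its documentation, returns whatever is left of the string. Two identical values therefore still compare equal in the check for 40, which is why every exact repeat gets "40".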
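To see this for yourself, here is a minimal sketch. The variable names compressed_len and piece are made up for illustration. It prints the length of the compressed value and the result of an out-of-range substr() call; run it and look at the log for the NOTE.

```sas
data _null_;
  length name $40;
  name = 'checking';
  /* number of characters left once compress() has removed the blanks */
  compressed_len = length(compress(name));
  /* a length of 40 is more than the compressed value holds: check the log for a NOTE */
  piece = substr(compress(name), 1, 40);
  put compressed_len= piece=;
run;
```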
You are better off using a do loop to determine the match length:
data testing2;
  input name $40.;
  infile datalines dlm=',';
  datalines;
checking
checking
checking
checking
checking1
checking2
checking2
;
run;
data testing4;
  set testing2;
  name2=compress(name);
  lag_name=compress(lag(name));
  /* compare ever longer prefixes; the last i that matches is the match length */
  do i = 1 to length(name2);
    if substr(name2,1,i) = substr(lag_name,1,i) then Length_match_Lag = i;
  end;
  drop i;
run;
I added a second "checking2" value to get one result of 9.
What is the test you are trying to do? Why are you using COMPRESS()? Are you trying to remove the embedded spaces from the values? Why do you keep calling it after you have already removed them? Did you expect them to re-appear somehow?
data test;
  input name $40.;
  lag_name = lag(name);
  match40 = name = lag_name;
  match9 = substr(name,1,9) = substr(lag_name,1,9);
  match8 = substr(name,1,8) = substr(lag_name,1,8);
datalines;
checking
checking
checking
checking
checking1
checking2
checking2
;
proc print;
run;
Obs    name         lag_name     match40    match9    match8
  1    checking                     0          0         0
  2    checking     checking        1          1         1
  3    checking     checking        1          1         1
  4    checking     checking        1          1         1
  5    checking1    checking        0          0         1
  6    checking2    checking1       0          0         1
  7    checking2    checking2       1          1         1
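If what you really want is the number of leading characters that match, rather than separate flags for 40, 9 and 8, one possible sketch uses the COMPARE function. COMPARE returns 0 when the two strings are equal and otherwise the signed position of the first character that differs. The data set name test2 and the variables diff_pos and match_len are made up for this illustration, and the sketch assumes the values contain no embedded blanks that you want to ignore.

```sas
data test2;
  input name $40.;
  lag_name = lag(name);
  /* 0 means equal; otherwise the (signed) position of the first difference */
  diff_pos = compare(name, lag_name);
  /* equal strings match over the full non-blank length; lengthn() returns 0 for a blank value */
  if diff_pos = 0 then match_len = lengthn(name);
  else match_len = abs(diff_pos) - 1;
datalines;
checking
checking
checking
checking
checking1
checking2
checking2
;
proc print;
run;
```

This avoids both the loop and the out-of-range SUBSTR calls, because the two variables are compared directly at their full stored length.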
Thank you all.