Hello again everyone,
I have another question, this time about creating a binary variable. I am working on a project with a data set that contains a large volume of addresses. Some of these addresses are on a street segment, while others are on street corners. How can I write logic for a binary variable that returns 0 for a street segment and 1 for a street corner? Thank you.
How do you distinguish a street corner from a street segment?
A street segment has an address like "123 Main Street", while a corner is listed in the dataset like "Main St Fake St". I tried to write a logic statement that would flag any entry under my variable containing two occurrences of a word like "St" or "Ave" as a street corner, but I cannot get the IF THEN statement correct.
See, the thing is, I am not really sure what else I can provide, but I'll give it a go. In my data set, the variable I am working with is called "Location". The location is an address that falls either on a street segment (123 Main Street) or on a street corner (Corner of Fake Street and Main Street). Street segments and street corners are mutually exclusive. What I am trying to accomplish is to create a dummy variable that returns 0 for an entry under the "Location" variable on a street segment and 1 for an entry on a street corner. I think I can write some sort of conditional statement that will accomplish this, but I am not entirely sure how.
This is an example of what my data set looks like:
Location:
123 Main St
12 Third St
Fake St Main St (This is how a street corner was entered into the data set, it was not my doing)
Pennsylvania Ave Main St
What I've Tried:
The only way I could think to write a conditional expression is something like this:
IF Location contains "St" or "Ave" or "Dr" or "Blvd" (and so on) GREATER THAN OR EQUAL TO 2 times THEN dummyvariable = 1.
I do not know if this helped you understand my problem any better, but it is the best I can do.
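/* Sample data: one street corner and one street segment */
data have;
   location = 'Fake St Main St';
   output;
   location = '123 Main St';
   output;
run;

data want;
   set have;
   count = 0;                          /* reset for each observation */
   do i = 1 to countw(location);       /* loop over the words in location */
      /* count each word that is a street-type suffix */
      if scan(location,i) in ('Ave','St','Dr','Blvd') then count + 1;
   end;
   dummyvar = count > 1;               /* 1 = corner (two or more suffixes), 0 = segment */
   keep location dummyvar;
run;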
I think Kurt's solution above will work. One addition I would suggest is to convert everything to uppercase before comparing, so that "St", "ST", and "st" all match. You can use the UPCASE function for that.
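A minimal sketch of that suggestion, assuming the same suffix list as in the solution above. The mixed-case sample rows and the LENGTH statement are my own additions for illustration; the only change to the logic is wrapping SCAN in UPCASE and writing the suffix list in uppercase.

data have;
   length location $40;   /* assumed length, long enough for the sample values */
   location = 'Fake St Main St';          output;
   location = '123 MAIN ST';              output;
   location = 'pennsylvania ave main st'; output;
run;

data want;
   set have;
   count = 0;
   do i = 1 to countw(location);
      /* uppercase each word so the comparison ignores case */
      if upcase(scan(location,i)) in ('AVE','ST','DR','BLVD') then count + 1;
   end;
   dummyvar = count > 1;   /* 1 = corner, 0 = segment */
   keep location dummyvar;
run;

With these sample rows, the first and third locations contain two suffixes and get dummyvar = 1, while '123 MAIN ST' gets 0. If the real data also spells suffixes out (for example "Street"), those spellings would need to be added to the list.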