Hello again everyone,
I have another question, this time about creating a binary variable. I am working on a project with a data set that contains a large volume of addresses. Some of these addresses are on a street segment, while others are on street corners. How can I write logic for a binary variable that returns 0 for a street segment and 1 for a street corner? Thank you.
How do you distinguish a street corner from a street segment?
A street segment has an address like "123 Main Street", while a corner is listed in the data set like "Main St Fake St". I tried to write a logic statement that would flag any entry under my variable with 2 uses of a word like "St" or "Ave" as a street corner, but I cannot get the IF THEN statement correct.
See, the thing is, I am not really sure what else I can provide, but I'll give it a go. In my data set, the variable I am working with is called "Location". The location is an address that falls either on a street segment (123 Main Street) or on a street corner (Corner of Fake Street and Main Street). Street segments and street corners are mutually exclusive. What I am trying to accomplish is to create a dummy variable that returns 0 for an entry under the "Location" variable on a street segment, and 1 for an entry on a street corner. I think I can write some sort of conditional statement that will accomplish this, but I am not entirely sure how.
This is an example of what my data set looks like:
Location:
123 Main St
12 Third St
Fake St Main St (this is how a street corner was entered into the data set; it was not my doing)
Pennsylvania Ave Main St
What I've Tried:
The only way I could think to write a conditional expression is something like this:
IF Location contains "St" or "Ave" or "Dr" or "Blvd" (and so on) GREATER THAN OR EQUAL TO 2 times THEN dummyvariable = 1.
I do not know if this helps you understand my problem any better, but it is the best I can do.
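data have;
  /* Two sample rows: a street corner and a street segment */
  location = 'Fake St Main St';
  output;
  location = '123 Main St';
  output;
run;

data want;
  set have;
  /* Count the words in location that are street-type suffixes */
  count = 0;
  do i = 1 to countw(location);
    if scan(location,i) in ('Ave','St','Dr','Blvd') then count + 1;
  end;
  /* Two or more suffixes means a corner (1); otherwise a segment (0) */
  dummyvar = count > 1;
  keep location dummyvar;
run;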
I think Kurt's solution above will work. One suggestion to add on would be to upcase everything for comparison. You can use the upcase function for that.
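For illustration, here is a minimal sketch of that upcase suggestion applied to the data step above. The sample rows come from the original post, except the last two mixed-case rows, which are hypothetical and only there to show why the case-insensitive comparison matters. The list of suffixes is the same short list as above and would need extending for real data.

```sas
/* Sample data: the last two rows are hypothetical mixed-case entries */
data have;
  length location $40;
  input location $40.;
  datalines;
123 Main St
12 Third St
Fake St Main St
Pennsylvania Ave Main St
123 main st
FAKE ST MAIN ST
;
run;

data want;
  set have;
  count = 0;
  do i = 1 to countw(location);
    /* Upcase each word so 'St', 'st' and 'ST' all match 'ST' */
    if upcase(scan(location, i)) in ('AVE','ST','DR','BLVD') then count + 1;
  end;
  /* 1 = street corner (two or more suffixes), 0 = street segment */
  dummyvar = (count > 1);
  keep location dummyvar;
run;
```

With this change, the lowercase and uppercase rows should be classified the same way as their mixed-case equivalents. Note that an address such as "123 St Marks Ave" would still be counted as a corner, so the result is worth spot-checking against the real data.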