Hi,
I have a large health dataset with a character variable indicating the location of a health treatment. I am trying to create a dummy variable for certain cities.
Let's say:
data old;
set new;
nyc=0;
if location="New York" then nyc=1;
output;
run;
The problem is that it doesn't seem to recognize either "New York" or New York without quotation marks, and I get all zeros. (I know for a fact that at least 5% of the sample was in fact treated in NYC.) What could be the problem?
Thanks
Since you did not post any sample data, this is just a guess.
But perhaps your character variable is cased differently than "New York". A common way to deal with this issue is to use the UPCASE function, so that your program looks like this:
data old;
set new;
nyc=0;
if UPCASE(location) = "NEW YORK" then nyc=1;
output;
run;
Great! It worked! Thanks
btw you definitely need the quotation marks 🙂
Text strings must match EXACTLY.
1. Run a PROC FREQ on the location variable and check the values. Note that string comparisons are case sensitive.
2. Use a HEX format to display the variable and look for non-printing characters or unexpected blanks.
3. Use FIND/INDEX to search a string for partial values.
And it goes without saying: check your log for Notes/Warnings/Errors.
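The three checks above could be sketched roughly as follows. This is only an illustration: the dataset name new and the variable location are taken from the original question, while the obs=10 limit and the $HEX40. width are arbitrary choices of mine.

```sas
/* 1. List the distinct values actually stored in LOCATION */
proc freq data=new;
  tables location / missing;
run;

/* 2. Write the first few values to the log as text and as hexadecimal. */
/*    In the hex output, 20 is an ordinary space; 09 (tab) or A0        */
/*    (non-breaking space) would be invisible in a normal listing.      */
data _null_;
  set new (obs=10);
  put location= location= $hex40.;
run;

/* 3. Flag any value that contains "new york", ignoring case. */
/*    The "i" modifier makes FIND case-insensitive.            */
data old;
  set new;
  nyc = (find(location, "new york", "i") > 0);
run;
```

Keep in mind that step 3 is a partial match, so it would also flag a value such as "New York Mills". If that matters, compare the whole cleaned-up string instead, for example with UPCASE and STRIP.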
A quick and simple way to code a dataset, and to save you typing, if you just want a numeric code for each location:
proc sort data=your_dataset out=codelist (keep=location) nodupkey;
  by location;
run;
data codelist;
  set codelist;
  code=_n_;
run;
proc sql;
  create table WANT as
  select A.*, B.CODE
  from YOUR_DATASET A
  left join CODELIST B
  on A.LOCATION=B.LOCATION;
quit;
What this does is create a table of distinct locations, assign each one a code from 1 to x based on sort order (obviously you could change the sort if you wanted), and then merge that back onto your original data. This way you don't need to type each IF statement, and you don't need to worry about "your string" not being the same as "data string", since both come from the same data.