GKati
Pyrite | Level 9

Hi,

 

I have a large health dataset with a character variable indicating the location of a health treatment. I am trying to create a dummy variable for certain cities.

Let's say:

 

data old;
    set new;
    nyc=0;
    if location="New York" then nyc=1;
    output;
run;

 

The problem is that it doesn't seem to recognize either "New York" or New York without quotation marks, and I get all zeros. (I know for a fact that at least 5% of the sample was in fact treated in NYC.) What could be the problem?

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Since you did not post any sample data, this is just a guess.

 

Perhaps the values in your character variable are cased differently from "New York". A common way to deal with this is to apply the UPCASE function to the variable and compare against an all-uppercase literal, so that your program looks like this:

 

data old;
    set new;
    nyc=0;
    /* compare an uppercased copy of the value so that case differences do not matter */
    if UPCASE(location) = "NEW YORK" then nyc=1;
    output;
run;
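
A slightly more defensive variant of the comparison, offered only as a sketch: SAS ignores trailing blanks when comparing character values, but leading blanks still cause a mismatch, so STRIP can be combined with UPCASE. The dataset and variable names are those from the question; whether leading blanks are actually present in this data is an assumption.

```sas
data old;
    set new;
    /* STRIP removes leading and trailing blanks; UPCASE removes case differences. */
    /* The comparison itself evaluates to 1 when true and 0 when false. */
    nyc = (upcase(strip(location)) = "NEW YORK");
run;
```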


5 REPLIES 5
GKati
Pyrite | Level 9

Great! It worked! Thanks

PeterClemmensen
Tourmaline | Level 20

btw you definitely need the quotation marks 🙂

Reeza
Super User

Text strings must match EXACTLY. 

 

1. Run a PROC FREQ on the location variable and look at the values. Note that string comparisons are case sensitive.

2. Use a HEX format to display the variable and look for non-printing characters.

3. Use FIND/INDEX to search a string for partial values. 

 

And it goes without saying: check your log for Notes/Warnings/Errors.
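
The three checks above could be sketched roughly as follows. This is only an illustration, assuming the dataset is called new and the variable location, as in the original question; the output dataset names (check, flagged), the helper variable location_hex, and the width of 200 are made up for the example.

```sas
/* 1. List the distinct values actually stored in LOCATION */
proc freq data=new;
    tables location / missing;
run;

/* 2. Show values next to their hex representation to expose
      non-printing characters (an ordinary blank is 20 in ASCII) */
data check;
    set new (obs=20);                       /* first 20 rows are enough for a look */
    length location_hex $200;               /* assumed width: 2 hex digits per character */
    location_hex = put(location, $hex200.);
run;

proc print data=check;
    var location location_hex;
run;

/* 3. Flag rows where the text appears anywhere in the value;
      the 'i' modifier makes FIND ignore case */
data flagged;
    set new;
    nyc = (find(location, 'new york', 'i') > 0);
run;
```

A partial match like the one in step 3 will also flag values such as "New York City", which may or may not be what is wanted.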

 

RW9
Diamond | Level 26

A quick and simple way to code a dataset, and to save you typing, if you just want a numeric code for each location:

proc sort data=your_dataset out=codelist (keep=location) nodupkey;
  by location;
run;

data codelist;
  set codelist;
  code=_n_;
run;

proc sql;
  create table WANT as
  select  A.*,
          B.CODE
  from    YOUR_DATASET A
  left join CODELIST B
  on      A.LOCATION=B.LOCATION;
quit;

What this does is create a table of distinct locations and assign each one a code from 1 to x based on sort order (obviously you could change the sort if you wanted), then merge that back onto your original data. This way you don't need to type each IF statement, and you don't need to worry about "your string" not being the same as "data string", since both come from the same source.
