Refer to a character value when creating a new variable...

Contributor
Posts: 47

Hi,

 

I have a large health dataset with a character variable indicating the location of a health treatment. I am trying to create a dummy variable for certain cities.

Let's say:

 

data old;

    set new;

nyc=0;

if location="New York" then nyc=1;

output;

run;

 

The problem is that it doesn't seem to recognize either "New York" or New York without quotation marks, and I get all zeros. (I know for a fact that at least 5% of the sample was in fact treated in NYC.) What could be the problem?

 

Thanks


Accepted Solutions
Solution
‎01-31-2017 05:27 AM
PROC Star
Posts: 554

Re: Refer to a character value when creating a new variable...

Since you did not post any sample data, this is just a guess.

 

But perhaps your character variable is cased differently than "New York". A common way to deal with this issue is to use the UPCASE function, such that your program looks like

 

data old;
    set new;
    nyc=0;
    if UPCASE(location) = "NEW YORK" then nyc=1;
    output;
run;
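
To see why normalizing the value matters, here is a small self-contained sketch. The dataset names demo and flagged and the sample values are made up for illustration; they are not from the poster's data. STRIP is added on the assumption that leading blanks might also be present, which the original question does not confirm.

```sas
/* Hypothetical sample data: the same city with inconsistent case and leading blanks */
data demo;
    length location $20;
    infile datalines truncover;
    input location $char20.;   /* $CHAR keeps leading blanks */
    datalines;
New York
NEW YORK
new york
  New York
Boston
;
run;

data flagged;
    set demo;
    /* A comparison evaluates to 1 (true) or 0 (false), so it can be assigned directly */
    nyc_exact = (location = "New York");
    nyc_clean = (upcase(strip(location)) = "NEW YORK");
run;

proc print data=flagged;
run;
```

In the printed output, nyc_exact should be 1 only for the row that matches "New York" exactly, while nyc_clean should be 1 for all four New York variants.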



All Replies
Contributor
Posts: 47

Re: Refer to a character value when creating a new variable...

Great! It worked! Thanks

PROC Star
Posts: 554

Re: Refer to a character value when creating a new variable...

By the way, you definitely need the quotation marks. Without them, SAS reads New and York as variable names rather than as a character value.

Super User
Posts: 17,912

Re: Refer to a character value when creating a new variable...

Text strings must match EXACTLY. 

 

1. Run PROC FREQ on the location variable to see the values it actually contains. Note that string comparisons are case sensitive.

2. Use a $HEX format to display the variable and look for non-printing characters.

3. Use FIND/INDEX to search a string for partial values.

 

And, it goes without saying, check your log for Notes/Warnings/Errors.
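
The three checks above could be sketched roughly as follows, assuming the input dataset is named new as in the original question. The output dataset name check and the $HEX40. width (enough for the first 20 characters) are arbitrary choices for illustration.

```sas
/* 1. List every distinct value of location with its frequency */
proc freq data=new;
    tables location / missing;
run;

/* 2. Show the distinct values next to their hex representation.        */
/*    On ASCII systems an ordinary blank is 20; anything else in a      */
/*    "blank" position (for example 09 for a tab or A0 for a            */
/*    non-breaking space) will make an exact comparison fail.           */
proc sql;
    select distinct location,
           put(location, $hex40.) as location_hex
    from new;
quit;

/* 3. Flag any value that contains "new york"; the 'i' modifier ignores case */
data check;
    set new;
    nyc = (find(location, "new york", "i") > 0);
run;
```

Note that the FIND approach matches partial values, so it would also flag something like "New York City"; whether that is desirable depends on the data.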

 

Super User
Posts: 7,417

Re: Refer to a character value when creating a new variable...

A quick and simple way to code the dataset and save some typing, if all you want is a numeric code for each location:

proc sort data=your_dataset out=codelist (keep=location) nodupkey;
  by location;
run;

data codelist;
  set codelist;
  code=_n_;
run;

proc sql;
  create table WANT as
  select  A.*,
          B.CODE
  from    YOUR_DATASET A
  left join CODELIST B
  on      A.LOCATION=B.LOCATION;
quit;

This creates a table of distinct locations and assigns each one a code from 1 to x based on sort order (you could change the sort if you wanted), then merges that back onto your original data. This way you don't need to type each IF statement, and you don't need to worry about "your string" differing from "data string", since both come from the same source.
