BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robulon
Quartz | Level 8

Hi all,

 

I've done quite a bit of searching online around this and can't find anything. I'm using Enterprise Guide version 7.15 on a Windows PC.

 

I'm trying to create a flag variable in a proc sql program to identify whether a specific character variable contains the word 'unsold'. Here is my code: -

 

proc sql;
create table work.test as
select
*
,case when findw(stop_category_description,'unsold','','i') then 1 else 0 end as unsold
from work.t1;
quit;

 

When I check the output dataset, there are no observations populated with 1 for the unsold variable. If however I view the output dataset in EG and click 'where' and paste the 'findw(stop_category_description,'unsold','','i')' code in, it identifies 31 observations, all of which I can see do contain the work 'unsold' in them but that are populated with a 0 for the unsold variable.

 

I did see a previous post from a couple of years ago that someone had put about the findw function and the user was recommended to contact SAS Technical Services. I have a feeling this may also need their attention but thought I'd ask the community first in case there's something I'm missing.

 

As always, any help will be gratefully received.


Thanks,

Rob

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Your problem comes from the third parameter in findw. '' , with no space, means there are NO characters in the list. If you want default delimiters don't put the string at all, if you want a blank as the only delimiter you must provide a blank in the parameter.

Otherwise NO characters are used for delimiters so your whole string is the only word.

 

Here is a minimum sort of test case:

data example;
   infile datalines missover;
   input var $25.;
datalines;
contains unsold
does not have word
;

proc sql;
create table work.test as
select
*
,case when findw(var,'unsold',' ','i') then 1 else 0 end as unsold
from work.example;
quit;

that does set the flag.

 

Also, you need to remember that this forum reformats the text in the main message window and so code pasted there may not actually be what was submitted when white space characters are removed or the other reformatting applied.

View solution in original post

12 REPLIES 12
PaigeMiller
Diamond | Level 26

Please provide a small portion (that illustrates the problem) of the data set used (WORK.T1) following these instructions (and not via any other method)

https://blogs.sas.com/content/sastraining/2016/03/11/jedi-sas-tricks-data-to-data-step-macro/

--
Paige Miller
robulon
Quartz | Level 8

I'm going to struggle to do that I'm afraid as the dataset contains sensitive customer data (apologies, I had meant to include that information in my original post).

 

I had hoped that I had provided enough information to demonstrate that the function worked within the where facility in the EG viewer but not in the code itself. I've also just tried doing the same in the source dataset (work.t1) and it has correctly identified the correct observation there as well.

 

Not to worry, I can easily work around it for the time being anyway, I just thought it might be something that someone had come across before. If I can think of a way to provide the data in the required format, I will do.

 

Thanks anyway 

robulon
Quartz | Level 8

Just to add, I've also coded this up in a data step: -

 

data work.t3;
set work.t1;
unsold = 0;
if findw(stop_category_description,'unsold','','i') then unsold = 1;
run;

 

and this has worked perfectly so it seems to be something to do with how proc sql is implementing the function.

Tom
Super User Tom
Super User

No need to supply the real data. Just some example data that exhibits the issue. 

For example you could subset to just the variables needed for the FINDW() call and one or two observations where it fails (and if there are any then one or two where it works).  If the values of that variable are sensitive then change some of the values to make them sharable.

You might find that you can can find the solution yourself in the process of creating that test data.

mkeintz
PROC Star

I don't think you can get much help on this question without data that illustrates the problem.  Why not make some synthetic data that shows the issue?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
robulon
Quartz | Level 8

Thanks for the suggestions. I've not been able to download the data2datastep zip file as my work pc will not allow it but have looked to recreate the issue using some datalines code (hope that's ok): -

 

data l1;
input stop_category_description $30.;
datalines;
Covid Customer Care
Post Covid Arrangement Agreed
Covid Customer Care
New VT Unsold
Covid Customer Care
Covid Customer Care
Post Covid
Covid Customer Care
Post Covid Arrangement Agreed
Post Covid
Covid Customer Care
Covid Customer Care
New Repossession Unsold
New Repossession Unsold
New VT Unsold
Post Covid Arrangement Agreed
Covid Customer Care
Post Covid
Covid Customer Care
Covid Customer Care
Covid Customer Care
Covid Customer Care
Covid Customer Care
Covid Customer Care
Covid Customer Care
Covid Customer Care
Post Covid Arrangement Agreed
New VT Unsold
Post Covid Arrangement Agreed
New IVA
;
run;

proc sql;
create table l2 as
select
stop_category_description
,case when findw(stop_category_description,'unsold','','i') then 1 else 0 end as unsold
from l1;
quit;

data l3;
set l1;
if findw(stop_category_description,'unsold','','i') then unsold = 1;
else unsold = 0;
run;

 

when I run these three programs, the proc sql one does not populate the 'unsold' variable with any 1 values whereas when I do the same in the data step, it correctly identifies them.

 

As I say, I can work around it anyway but I like to understand why things don't work as for the most part, it's because I'm doing something incorrectly and want to try to learn the right way of doing things but in this case, I'm not sure that is what's happening.

 

Thanks again

ballardw
Super User

Your problem comes from the third parameter in findw. '' , with no space, means there are NO characters in the list. If you want default delimiters don't put the string at all, if you want a blank as the only delimiter you must provide a blank in the parameter.

Otherwise NO characters are used for delimiters so your whole string is the only word.

 

Here is a minimum sort of test case:

data example;
   infile datalines missover;
   input var $25.;
datalines;
contains unsold
does not have word
;

proc sql;
create table work.test as
select
*
,case when findw(var,'unsold',' ','i') then 1 else 0 end as unsold
from work.example;
quit;

that does set the flag.

 

Also, you need to remember that this forum reformats the text in the main message window and so code pasted there may not actually be what was submitted when white space characters are removed or the other reformatting applied.

robulon
Quartz | Level 8

That's brilliant, thank you.

 

I was only including the third parameter as my understanding is if you are applying a modifier, you also have to apply a delimiter parameter so that SAS knows you are intending to use the 'i' as a modifier and not a delimiter, however I hadn't appreciated that I would also need to include a space.

 

I should have known it was me doing the wrong thing (again) but that's been really helpful.

 

Thanks ballardw and thanks to everyone for looking.

 

Rob

robulon
Quartz | Level 8

Having said that, I'm still a bit confused as to why I don't need a space when doing it in a data step but I do in proc sql but I'll worry about that another time!

Tom
Super User Tom
Super User

There is a difference between data step and SQL that are impacting this.

In a data step there is no way to express a literal string of length zero.  If you code two quote character next to each other the result is a string of length 1 with a single space character.  You can generate one using the TRIMN() function.

Test:

data example;
   input var $25.;
   datastep1=findw(var,'unsold',' ','i') ;
   datastep2=findw(var,'unsold','','i') ;
   datastep3=findw(var,'unsold',trimn(' '),'i') ;
datalines;
unsold
contains unsold
does not have word
;

proc sql;
create table work.test as
select *
     , findw(var,'unsold',' ','i') as sql1
     , findw(var,'unsold','','i') as sql2
     , findw(var,'unsold',trimn(' '),'i') as sql3
from work.example
;
quit;

Results:

Obs    var                   datastep1    datastep2    datastep3    sql1    sql2    sql3

 1     unsold                     1            1           0          1       0       0
 2     contains unsold           10           10           0         10       0       0
 3     does not have word         0            0           0          0       0       0

 

robulon
Quartz | Level 8
Legend! Thanks Tom
FreelanceReinh
Jade | Level 19

Interestingly, with PROC SQL's FEEDBACK option the log incorrectly states that a single blank was interpreted as a null string, whereas the log of a DATA step correctly describes the opposite interpretation of a null string in a WHERE condition:

648  proc sql feedback;
649  create table selsql as
650  select var from example
651  where findw(var,'unsold',' ','i');
NOTE: Statement transforms to:

        select EXAMPLE.var
          from WORK.EXAMPLE
         where FINDW(EXAMPLE.var, 'unsold', '', 'i');

NOTE: Table WORK.SELSQL created, with 2 rows and 1 columns.

652  quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.13 seconds
      cpu time            0.12 seconds


653
654  data selds;
655  set example;
656  where findw(var,'unsold','','i');
657  run;

NOTE: There were 2 observations read from the data set WORK.EXAMPLE.
      WHERE FINDW(var, 'unsold', ' ', 'i');
NOTE: The data set WORK.SELDS has 2 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.04 seconds

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1305 views
  • 5 likes
  • 6 in conversation