BookmarkSubscribeRSS Feed
frupaul
Quartz | Level 8

Hi Everyone,

 

A challenging one and apologies if its a bit of a read. Just want to give you as much information as possible to enable you figure out.

 

I have the following data sets (codes follow below);

 

Screen Shot 2018-07-25 at 19.57.03.png

 

Created with the below codes respectively:

data Master_File;
infile datalines dlm=',';
length Survey_ref $8 Question1 $90;
input Survey_ref $ Question1 $;
cards;
PPTV,How Likely are you to recommend [Staff_Name] from [Company_Name]?
JJNT,How Likely are you to recomment [Company_Name] to a Friend?
;

data Customer_Survey;
infile datalines dlm=',';
length Survey_ref $8 TQuestion1 $90 TStaff_Name $20 TCompany_Name $15;
input Survey_ref $ TQuestion1 $ TStaff_Name $ TCompany_Name $;
cards;
PPTV, How Likely are you to recommend Alex Jefferson from Morrisons?,Alex Jefferson,Morrisons
JJNT, How Likely are you to recommend Morrisons to a Friend?,Pablo, Morrisons
;

 

OBJECTIVE:

I wish to compare survey questions on the different Files (Question1 Vs TQuestion1) for each survey_ref. To make this possible, go through the below steps;

A. Try to Standardize both files e.g. by replacing the Staff_Name (value: Alex Jefferson) in TQuestion1 with [Staff_Name]  and also replace company name (Value: Morrisons) with [Company_Name] (These value substitutions are done in step C but writing it here gives a bit more context). Also, these value substitutions work  if I use actually values assigned to the Macro variables but doesn't work when i use the macro variables themselves.

B. Merge both Files

C. Compare values

 

A.

data Master_File_simplified;
set Master_File;

/* Extract words in square brackets in Question 1*/


Inserted_word1 = scan(Question1,2,'[]');
Inserted_Word2 = scan(Question1,4,'[]');

/*Store Extracted words in Macro variables*/
call symput('Inserted_word1',Inserted_word1);
call symput('Inserted_Word2',Inserted_Word2);
run;

 

B.

/*Merge Files - For simplicity, filtering for survey_ref='PPTV' only*/

 

data Mapped_file;
merge Master_File_Simplified(in=M where=(survey_ref='PPTV'))
Customer_Survey(in=T where=(survey_ref='PPTV'));
by survey_ref;
if M and T;
run;

 

table from this is:

Screen Shot 2018-07-25 at 20.08.41.png

 

C.

 data Final;
set Mapped_file;
if not missing(&Inserted_word1) then do;
TTQuestion1=transtrn(TQuestion1,&Inserted_word1.,"[&Inserted_word1.]");
if not missing(&Inserted_Word2) then TTQuestion1=transtrn(TTQuestion1,&Inserted_word2.,"[&Inserted_word2.]");
end;
run;

 

Error generated is:

Screen Shot 2018-07-25 at 21.05.37.png

 

 

How can i successfully make these comparisons without getting this error please. Mind you this is a simplification of the actual problem. The real problem has many different survey questions in that master File, with different sorts of hardcoded values in square brackets.

10 REPLIES 10
Reeza
Super User

Post the exact log. It usually does a really good job of indicating where the errors are. 

If you're using nested functions it also helps to unnest it for testing.

 

And add a period after the macro variable name so SAS knows when the macro variable ends.

 

TTQuestion1=transtrn(TQuestion1,&Inserted_word1,"[&Inserted_word1.]"); 
frupaul
Quartz | Level 8

I have added the full stop at the end of the macro variables and added updated the post with the full log

Astounding
PROC Star

Try adding this statement at the beginning, then rerun:

 

options mprint symbolgen;

 

The results will likely make debugging easy (although there might be a longer log to sift through).

ballardw
Super User

Please do not post pictures of code or log. It makes it very hard to reference things. If you had pasted the code from the log into a code box as opened using the {I} icon then we could copy a few relative lines to highlight issues.

 

You claimed that the code was

if not missing(&Inserted_Word2) then TTQuestion1=transtrn(TTQuestion1,&Inserted_word2.,"[&Inserted_word2.]");

But if you look at the line with the error you will see that there are no ( ) around &inserted_word2.

 

data junk;
   x = 3;
   if missing x then put "missing x";
   else put "not missing x";
run;

generates the same error for x. Compare to:

 

data junk;
   x = 3;
   if missing ( x ) then put "missing x";
   else put "not missing x";
run;

Also in the construct "if not missing(&Inserted_Word2) then " &inserted_word would be expected to be an existing variable in the dataset MAPPED_FILE. But I don't think there is one as you don't show where it would be created. There is TCompany_name but no company_name variable. So Company_name would be created in the use of missing(company_name), be numeric and have a missing value.

 

frupaul
Quartz | Level 8

Apologies in the course of recreating the problem i missed a where option.

 

data Master_File_simplified;
set Master_File(where=(survey_ref='PPTV'));
Inserted_word1 = scan(Question1,2,'[]');
Inserted_Word2 = scan(Question1,4,'[]');
call symput('Inserted_word1',Inserted_word1);
call symput('Inserted_Word2',Inserted_Word2);
run;

 

With this, the code runs without errors but doesn't generate the expected table after running the last step.

Tom
Super User Tom
Super User

 

What is the purpose of macro variables in your program?

It looks like you already have the data in real dataset variables.  Just use those variables.

data want ;
  merge Master_File Customer_Survey ;
  by Survey_ref ;
  length want $200 ;
  want=Question1;
  if not missing(TStaff_Name) then want=transtrn(want,"[Staff_Name]",trimn(TStaff_Name));
  if not missing(TCompany_Name) then want=transtrn(want,"[Company_Name]",trimn(TCompany_Name));
run;
57    data _null_;
58      set;
59      put (_all_) (=/) / ;
60    run;


Survey_ref=JJNT
Question1=How Likely are you to recomment [Company_Name] to a Friend?
TQuestion1=How Likely are you to recommend Morrisons to a Friend?
TStaff_Name=Pablo
TCompany_Name=Morrisons
want=How Likely are you to recomment Morrisons to a Friend?


Survey_ref=PPTV
Question1=How Likely are you to recommend [Staff_Name] from [Company_Name]?
TQuestion1=How Likely are you to recommend Alex Jefferson from Morrisons?
TStaff_Name=Alex Jefferson
TCompany_Name=Morrisons
want=How Likely are you to recommend Alex Jefferson from Morrisons?
NOTE: There were 2 observations read from the data set WORK.WANT.

 

frupaul
Quartz | Level 8

The examples I have provided here are just a few survey questions. The Macro variables are to enable flexibility as I am trying to build a macro that tests the survey questions (currently 20 surveys being built) against the survey questions in the Master doc to see if the analyst output file is/isnt consistent with the Master doc. Some survey questions have [staff_name], some have [Company_Names] and some have others variables that have, in the master doc, been hardcoded in square brackets.

 

Doing these manual changes will defeat the purpose of creating the macro

Tom
Super User Tom
Super User

I am still not sure how macro variables help this problem in any way.

Just restructure the data so that you have a list of QUESTIONS and their associate FIELDS.  Should be easy enough to get the field names from the questions using regular expressions if you have to.  So you would have something like this:

data Master_File;
  infile datalines dlm='|' dsd truncover ;
  length Survey_ref Question $90 Field $32;
  input Survey_ref Question Field;
cards;
JJNT|How Likely are you to recomment [Company_Name] to a Friend?|Company_Name
PPTV|How Likely are you to recommend [Staff_Name] from [Company_Name]?|Company_Name
PPTV|How Likely are you to recommend [Staff_Name] from [Company_Name]?|Staff_Name
;

Then you want another data file that has all of the values of the Fields for each target that you want generate questions text for.  You might even have values for fields you don't need for this survey, like Phone_Number in this example.  Perhaps you just need to use PROC TRANSPOSE on your existing data?

data Customer_Survey;
  infile datalines dlm='|' dsd truncover ;
  length Survey_ref $8 Field $32 Value $50 ;
  input Survey_ref Field Value;
cards;
JJNT|Staff_Name|Pablo
JJNT|Company_Name|Morrisons
JJNT|Phone_Number|XXX-555-1234
PPTV|Staff_Name|Alex Jefferson
PPTV|Company_Name|Morrisons
PPTV|Phone_Number|XXX-555-1234
;

Then combine the two.

proc sql noprint;
create table survey_question_fields as
  select a.*,b.value
  from Master_File a
  left join Customer_Survey b
    on a.Survey_ref = b.Survey_ref
   and a.Field = b.Field
  order by a.Survey_ref,a.question,a.field 
;
quit;

And then collapse back to a single row per question while making the text replacements.

data WANT ;
  do until (last.question);
    set survey_question_fields ;
    by survey_ref question ;
    if first.question then new_question=question ;
    if not missing(value) then new_question=transtrn(new_question,cats('[',field,']'),trimn(value));
  end;
  keep survey_ref new_question ;
run;

image.png

 

 

frupaul
Quartz | Level 8

Thanks for the response. I think this works if one does all the transposing. However, I have not been able to figure out how to transpose character fields in SAS. Any clue how to transpose the customer table i provided to arrive at yours?

 

Tom
Super User Tom
Super User

It is no harder to transpose character variable, although you are REQUIRED to use the VAR statement.

data Customer_Survey;
infile datalines dlm=',';
length Survey_ref $8 TQuestion1 $90 TStaff_Name $20 TCompany_Name $15;
input Survey_ref $ TQuestion1 $ TStaff_Name $ TCompany_Name $;
cards;
PPTV, How Likely are you to recommend Alex Jefferson from Morrisons?,Alex Jefferson,Morrisons
JJNT, How Likely are you to recommend Morrisons to a Friend?,Pablo, Morrisons
;

proc transpose data=Customer_Survey out=want ;
  by Survey_ref TQuestion1 notsorted ;
  var TStaff_Name TCompany_Name ;
run;

image.png

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 561 views
  • 0 likes
  • 5 in conversation