dbhade
Calcite | Level 5

I am currently working on a program that is supposed to loop through the array and check whether each element is in the bookcount list. I then want the output to show the "printer" value and its associated value from the PROC FORMAT I created for genrefs, authorfs, pagesfs, etc. However, the output does not pick up all the occurrences, and the output for genrefs, authorfs, and pagesfs is all the same, as is the output for genrels, authorls, and pagesls. I don't get any errors or warnings from my code, so I am confused about why the output isn't correct.

ballardw
Super User

No real reason to make that code an attachment. Just open a text box with the </> icon above the message box and paste:

Proc format;		
value $genre		
'P1'	=	1	
'P2'	=	2	
'P3'	=	2	
'P4'	=	3	
'P5'	=	3	;
value $pages	
'P1'	=	5
'P2'	=	2	
'P3'	=	5	
'P4'	=	1	
'P5'	=	9	;		
value $author		
'P1'	=	1	
'P2'	=	1	
'P3'	=	1	
'P4'	=	2	
'P5'	=	2	;
%macro bookcount;
	if 	printer in	 
	('P1', 'P2', 'P3','P4','P5')
	then do;
%mend;

data out.favebooks
(keep= author pubdate publisher authorcount reader
       genrefs authorfs pagesfs genrels authorls pagesls); 
merge out.reading (in=ina)
    writing (keep=efamid enrolid year dx1-dx15 admdate in=in_i);
;
by author pubdate publisher;
   	if ina=1;
retain authorcount reader 0;
		if first.person then do;
           authorcount=0;
        end; 
        retain authorcount;

array printer read1-read4;
		do over printer;
             %bookcount;
              authorcount+1;
          	  end;
		   if authorcount=1 then do;
           retain genrefs authorfs pagesfs genrels authorls pagesls;
              genrefirst=put(printer,$genrefs.);
              authorfirst=put(printer,$authorfs.);
			  pagesfirst=put(printer,$pagesfs.);
		   end;
           genrelast=put(printer,$genrels.);
           authorlast=put(printer,$authorls.);
		   pageslast=put(printer,$pagesls.);
         
   		end;
	if last.person then do; 
		if authorcount>0 then reader=1; else reader=0;
        output;
    end;
run;

It is generally poor practice to have an unclosed macro such as your %bookcount. It opens a DO block but does not provide its own END, so the flow of the program can be misleading to read. For debugging I would remove that macro (which I am not really seeing a need for at this point) and hard code that bit instead. Get the code working with one case first, and only then try to generalize, automate, or make it flexible, or whatever the reason for the macro might be.

 

You need to provide some example data in the form of a data step and what you expect as the result for that data.

 

You probably should comment out the KEEP = options for the output data set so you can see all of the temporary variables. They may tell you where things are going wrong. You may also want to write out all the records instead of filtering the output to LAST.Person. If you are accumulating a total of some sort and the total is incorrect you likely need to identify where the first accumulation is incorrect and that is very likely not to be only on the LAST record for the Person.

This sort of test should restrict the PERSON to a few where you are sure that the result is incorrect and trace down the reason.

 

Some things I might look for are values of Printer other than the 5 you list. If some of your data has "p1" instead of "P1", then it would not be found and/or the formats would have no value to work with.
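One way to check for stray or mixed-case values is sketched below. The sample rows are made up for illustration; only the READ1-READ4 names and the list of five codes come from the posted code. PROC FREQ lists every distinct value, and the second step counts matches after normalizing case with UPCASE and STRIP.

```sas
/* Hypothetical sample rows; READ1-READ4 follow the names in the posted code */
data work.check;
  input read1 $ read2 $ read3 $ read4 $;
  datalines;
P1 p2 P9 .
p4 P5 P3 P1
;

/* List every distinct value, including missing ones */
proc freq data=work.check;
  tables read1-read4 / missing;
run;

/* Count matches after upper-casing and stripping blanks */
data work.check_counts;
  set work.check;
  array printer {*} read1-read4;
  hits = 0;
  do i = 1 to dim(printer);
    if upcase(strip(printer{i})) in ('P1','P2','P3','P4','P5')
      then hits = hits + 1;
  end;
  drop i;
run;
```

If the PROC FREQ tables show values like "p2", comparing on the upper-cased value (or cleaning the data first) is one option; whether that is appropriate depends on what the codes mean in the real data.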

Also, none of the formats used in this bit of code are defined in the PROC FORMAT step shown:

             genrefirst=put(printer,$genrefs.);
              authorfirst=put(printer,$authorfs.);
			  pagesfirst=put(printer,$pagesfs.);
		   end;
           genrelast=put(printer,$genrels.);
           authorlast=put(printer,$authorls.);
		   pageslast=put(printer,$pagesls.);
  

So if your log is showing messages about undefined formats that would be why. If not, then you need to add the definitions for those formats to the code.
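As a rough sketch, you can list what a format actually contains with the FMTLIB option and then apply it with PUT. This assumes, purely for illustration, that the suffixed names were meant to refer to a format like $genre; the WORK.FORMAT_DEMO data set and the quoted labels are my own additions.

```sas
/* Character format modeled on $genre from the posted code (labels quoted here) */
proc format;
  value $genre
    'P1'       = '1'
    'P2', 'P3' = '2'
    'P4', 'P5' = '3';
run;

/* Print the stored definition to confirm the format exists in WORK */
proc format library=work fmtlib;
  select $genre;
run;

/* Apply the format to a few codes */
data work.format_demo;
  length code $2 genre $1;
  do code = 'P1', 'P3', 'P5';
    genre = put(code, $genre.);
    output;
  end;
run;
```

If the intended formats really are separate ones named $genrefs, $genrels and so on, each would need its own VALUE statement in the same way.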

 

 

Tom
Super User

Show some example data.

I suspect the issue is you have the PUT function calls in the wrong place. 

Your current code should generate an error, like in this example.

2010  data test;
2011    array x x1-x3 ;
2012    do over x;
2013    end;
2014    y = put(x,5.);
2015  run;

ERROR: Array subscript out of range at line 2014 column 11.
_I_=4 x1=. x2=. x3=. y=  _ERROR_=1 _N_=1
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TEST may be incomplete.  When this step was stopped there were 0 observations and 4 variables.
WARNING: Data set WORK.TEST was not replaced because this step was stopped.

I suspect you want them inside the IF/THEN block that is checking for valid value of READ1 to READ4.

array printer read1-read4;
do over printer;
  if printer in ('P1', 'P2', 'P3','P4','P5') then do;
    authorcount+1;
    if authorcount=1 then do;
      genrefirst=put(printer,$genrefs.);
      authorfirst=put(printer,$authorfs.);
      pagesfirst=put(printer,$pagesfs.);
    end;
    genrelast=put(printer,$genrels.);
    authorlast=put(printer,$authorls.);
    pageslast=put(printer,$pagesls.);
  end;
end;
Kurt_Bremser
Super User

You make your life unnecessarily hard by writing code in such an ugly and unreadable way.

  • use consistent indentation
  • use only one RETAIN statement, and never try to use it conditionally (as it is a declarative statement)
  • use macros only when needed; code that is not dynamic and used only once does not need a macro AT ALL
  • either put THEN DO in the same line as the IF, or in the next line, but do it CONSISTENTLY
  • use at least one line per dataset option to make them easier to detect

This is your code, made readable:

data out.favebooks (
  keep=
    author pubdate publisher authorcount reader
    genrefs authorfs pagesfs genrels authorls pagesls
);
merge
  out.reading (in=ina)
  writing (
    in=in_i
    keep=efamid enrolid year dx1-dx15 admdate
  )
;
by author pubdate publisher;
if ina;
retain
  authorcount
  reader 0
  genrefs
  authorfs
  pagesfs
  genrels
  authorls
  pagesls
;
array printer read1-read4;
if first.person then authorcount = 0;
do over printer;
  if printer in ('P1','P2','P3','P4','P5')
  then authorcount + 1;
  if authorcount = 1
  then do;
    genrefirst = put(printer,$genrefs.);
    authorfirst = put(printer,$authorfs.);
    pagesfirst = put(printer,$pagesfs.);
  end;
  genrelast = put(printer,$genrels.);
  authorlast = put(printer,$authorls.);
  pageslast = put(printer,$pagesls.);
end;
if last.person
then do; 
  if authorcount > 0
  then reader = 1;
  else reader = 0;
  output;
end;
run;

Why do you RETAIN reader, when you either set it to 1 or 0 when you OUTPUT? Or did you intend to increment there? In the latter case, use this simple sum statement:

reader + (authorcount > 0);

You also RETAIN variables which are never created or used, and create variables in the DO OVER loop which are not RETAINed, so most of the time, they won't have your intended values at LAST.PERSON.

And your step will fail anyway because of

  • "Array subscript out of range", as @Tom already mentioned.
  • use of FIRST.PERSON and LAST.PERSON, but PERSON is not in the BY statement
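A minimal sketch of how the step could look with those points addressed is shown here. Everything about the data is hypothetical (the WORK.READING rows, a single PERSON BY variable, and only the genre lookup), so treat it as an illustration of the pattern rather than the fix for the real tables: an explicit array index instead of DO OVER, the PUT calls inside the IF block, the result variables RETAINed and reset at FIRST.PERSON, and PERSON in the BY statement.

```sas
/* Hypothetical lookup format, modeled on $genre from the posted code */
proc format;
  value $genre
    'P1'       = '1'
    'P2', 'P3' = '2'
    'P4', 'P5' = '3';
run;

/* Hypothetical example data: one or more rows per PERSON, sorted by PERSON */
data work.reading;
  input person $ read1 $ read2 $ read3 $ read4 $;
  datalines;
A P1 X9 P4 .
A P5 . . .
B X1 X2 . .
;

data work.favebooks (keep=person authorcount reader genrefirst genrelast);
  set work.reading;
  by person;
  length genrefirst genrelast $1;
  /* One RETAIN for everything that must survive across rows of a person */
  retain authorcount genrefirst genrelast;
  array printer {*} read1-read4;

  /* Reset the retained values at the start of each person */
  if first.person then do;
    authorcount = 0;
    call missing(genrefirst, genrelast);
  end;

  /* Explicit index, so the array is only referenced inside the loop */
  do i = 1 to dim(printer);
    if printer{i} in ('P1','P2','P3','P4','P5') then do;
      authorcount + 1;
      if authorcount = 1 then genrefirst = put(printer{i}, $genre.);
      genrelast = put(printer{i}, $genre.);
    end;
  end;

  /* One output row per person */
  if last.person then do;
    reader = (authorcount > 0);
    output;
  end;
run;
```

The author and pages lookups would follow the same pattern as the genre lookup, each with its own format.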

Maxim 2: Read Your Log.

 

Please post usable example datasets (in data steps with datalines), and show us the expected result. Make sure that your example data covers all eventual data combinations you expect to get.
