HeatherNewton
Quartz | Level 8
data acct;
set new;
array pay(12) pay01-pay12;
do i=1 to 12;
if total_ref=0 then pay(i)=.;
else
pay(i)=pay(i)*default_amt/total_ref;
end;
run;

Is this strange? If I create an array, what is the original value of each pay(i)?

Is this code correct?

PaigeMiller
Diamond | Level 26

The original value of PAY01 to PAY12 is probably determined by what is in the data set named NEW. These variables probably have values in that data set.

--
Paige Miller
Tom
Super User

Are you asking to explain the purpose of doing something like that?

It looks like the goal is to rescale the PAYxx variables from absolute values, using the ratio of DEFAULT_AMT to TOTAL_REF.

 

Note that you can use the DIVIDE() function instead of the IF/THEN/ELSE block. Also, you don't need to count the array elements manually; you can let SAS count how many variables you listed in the ARRAY statement and use DIM() as the loop bound.

data acct;
  set new;
  array pay pay01-pay12;
  do index=1 to dim(pay);
    pay[index]=divide(pay[index]*default_amt,total_ref);
  end;
run;
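
One difference between the two approaches is worth checking in your own session. As far as I know, DIVIDE() returns a special missing value (such as .I) for a nonzero numerator over zero, rather than the ordinary missing value the IF/THEN/ELSE block assigns. Both count as missing, but they print differently. A minimal sketch with made-up values (the variable names scaled_if and scaled_div are only for illustration):

```sas
/* Hypothetical one-row test; values chosen only for illustration */
data _null_;
  default_amt = 100;
  pay01       = 50;
  total_ref   = 0;

  /* IF/THEN/ELSE guard from the original question */
  if total_ref = 0 then scaled_if = .;
  else scaled_if = pay01 * default_amt / total_ref;

  /* DIVIDE() handles the zero denominator itself */
  scaled_div = divide(pay01 * default_amt, total_ref);

  /* Compare how the two results are written to the log */
  put scaled_if= scaled_div=;
run;
```

If downstream code tests for missing with the MISSING() function, both results are treated the same way.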
Kurt_Bremser
Super User

An array in SAS is, in most cases, not a data structure in itself, but a series of references to other data objects (variables). These must be of the same type, but can have different attributes (length, format, label). Elements are scattered more or less randomly in physical memory.

In other languages, the individual items have no name of their own, they can only be addressed by array name and index. They also have the same attributes (size!) throughout the array.

In your given example, array element pay{1} is in fact variable pay01. From the code, it is assumed that the individual variables already exist in the PDV through the SET statement and their presence in dataset new.

The exception in SAS is a temporary array. Here the elements have no individual names (need no PDV entries), are located in direct sequence in RAM, and have the same length. They will also never appear in output datasets. Addressing an element is therefore much faster than with a "normal" array.
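
A small sketch of the difference, using made-up names and values (q1-q3, w, and score are assumptions for illustration only). The regular array elements are ordinary variables and end up in the output dataset; the temporary array elements do not:

```sas
data demo;
  /* Regular array: each element is the named variable q1, q2 or q3 */
  array q{3} q1-q3 (10 20 30);

  /* Temporary array: elements have no variable names and are never output */
  array w{3} _temporary_ (0.5 0.3 0.2);

  /* Weighted sum of the regular array using the temporary weights */
  score = 0;
  do i = 1 to dim(q);
    score = score + q{i} * w{i};
  end;
  drop i;
run;

/* The printed dataset should show q1-q3 and score, but no w variables */
proc print data=demo;
run;
```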

HeatherNewton
Quartz | Level 8
data acct;
merge acct_new refinance_acct_pay (Drop=org_code);
by refinance_acct;
array pay(12) pay01-pay12;
do I=1 to 12;
if total_ref_amt=0 then Pay(I)=.;
else
pay(I)=pay(I)*default_amt/total_ref_amt;
end;
run;

This is the actual code, so pay01-pay12 are in the result dataset after acct_new is merged with refinance_acct_pay. There are delinquency_1 to delinquency_12 in acct_new, so probably pay01=delinquency_1, pay02=delinquency_2, etc.? There is no other variable that looks like pay01-pay12...

This is so strange. Why does it not say exactly which variables it refers to?

What if there is another set called default_1 to default_12...

 

Tom
Super User

It is referring to the variables PAY01, PAY02, through PAY12 by using the variable list PAY01-PAY12.
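
If you want SAS to tell you which variable each array element points to, the VNAME function returns the name behind an array reference. A minimal sketch, with initial values of 100 made up purely so there is something to print (varname is a hypothetical helper variable):

```sas
data _null_;
  /* Hypothetical values; in the real step these come from the input data */
  array pay{12} pay01-pay12 (12*100);
  length varname $32;

  do i = 1 to dim(pay);
    /* VNAME returns the name of the variable behind this element */
    varname = vname(pay{i});
    put i= varname= pay{i}=;
  end;
run;
```

The names written to the log come only from the variable list in the ARRAY statement, not from the array name.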

 

It is unclear from your comment whether the PAYxx variables are coming into the data step from one of the two input datasets or not. They will definitely be in the output dataset, since the ARRAY statement will create them if they do not already exist. If they are not in either input dataset, then the DO loop accomplishes nothing: the new variables start out missing, and the loop can only assign missing values to them again.

 

We really cannot answer whether the DELINQUENCY variables should be used instead of the PAY variables.  We don't know your datasets.  But if you decide you want to make that change then the only thing that needs to change is the ARRAY statement.

array pay delinquency_1 - delinquency_12 ;

 

AMSAS
SAS Super FREQ

@HeatherNewton 

This simple example code might help you understand what is going on with arrays:

/* 
	Create sample data containing 1 observation and 18 variables:
		pay01-pay05
		default01-default10 
		any 
		name
		variable
*/
data have ;
	array pay{5} pay01-pay05 (11 12 13 14 15) ;
	array default{10} default01-default10 (21 22 23 24 25 26 27 28 29 30) ;
	/* the variables do not have to be related to the array name */
	array crazy{3} any name variable (1 2 3) ;
	output ;
run; 


data want ;
	/* read the sample dataset */
	set have ;
	/* Create an array "pay" with 10 elements */
	array pay{10} pay01-pay10 ;
	/* Now look at the contents of all the variables read from have */
	put "*************************" ;
	put "All Variables : " ;
	put ;
	put _all_ ;
	put ;
	put "Content of Array variables : " ;
	put ;
	/* Now look at the contents of the array */
	do i=1 to dim(pay) ;
		put i= pay{i}= ;
	end ;
run ;

I'd also recommend you review the ARRAY Statement documentation and examples 

Kurt_Bremser
Super User

If the PAYnn variables are not in one of the incoming datasets, then this statement is bullshit:

pay(I)=pay(I)*default_amt/total_ref_amt;

as the variables in the array will always stay missing anyway.
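
One way to settle this is to list the variables in each incoming dataset. A sketch using tiny stand-in datasets (the contents below are invented; run the two PROC CONTENTS steps against your real acct_new and refinance_acct_pay instead):

```sas
/* Hypothetical stand-ins for the real input datasets */
data acct_new;
  refinance_acct = 1; default_amt = 500; delinquency_1 = 10;
run;

data refinance_acct_pay;
  refinance_acct = 1; total_ref_amt = 1000; pay01 = 200;
run;

/* SHORT prints just the variable names; look for PAY01-PAY12 in either list */
proc contents data=acct_new short;
run;

proc contents data=refinance_acct_pay short;
run;
```

If no PAYnn names show up in either list, the ARRAY statement is creating them and they will stay missing.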
