BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Thomas_mp
Obsidian | Level 7

Hello,

I have SAS 9.4.

I have multiple “measures” in columns that start with the same prefix. 

For instance, I have 50 columns of the “measure” (roa) in different years, written as follows: roa_n1 ,  roa_n2, roa_n3 etc,    until ....roa_n50). 

I have another set of measures (ast), also in 50 columns:  ast_n1, ast_n2, ast_n3 etc,…… until ast_n50).

I want to create another measure (another 50 columns) by dividing the roa/ast that  have the same suffix.  This is I want o create:

ratio_n1 = roa_n1/ast_n1,     ratio_n2 =  roa_n2/ast_n2,       ratio_n3 = roa_n3/ast_n3, etc…  until ratio_n50 = roa_n50/ast_n50.

Could you please help me with this?

Thank you !!!

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Trasnsposed tables are not ideal to work with  You would be better with 3 columns one for each measure, then have rows for the data.

As for your issue, you could use arrays:

data want;
  set have;
  array roa_n{50};
  array ast_n{50};
  array ratio_n{50};
  do i=1 to 50;
    ratio{i}=roa_n{i}/ast_n{i};
  end;
run;

However you would still be better with a data model:

NUM   ROA   AST  RATIO

1         x         y       x/y

2         x         y       x/y

...

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Trasnsposed tables are not ideal to work with  You would be better with 3 columns one for each measure, then have rows for the data.

As for your issue, you could use arrays:

data want;
  set have;
  array roa_n{50};
  array ast_n{50};
  array ratio_n{50};
  do i=1 to 50;
    ratio{i}=roa_n{i}/ast_n{i};
  end;
run;

However you would still be better with a data model:

NUM   ROA   AST  RATIO

1         x         y       x/y

2         x         y       x/y

...

Thomas_mp
Obsidian | Level 7
Thank you for your quick response and please, bear with me (I only know elementary SAS).

The arrays created have missing observations. It is creating 21 arrays because I have a column (column 1, named DealNumber) with identifies . If I delete the first column, it works well, but I need this column.

How can I get SAS from considering column when creating the arrays?



Thank you again !!



Attached is a small version with 20 columns of each measure.

The code I used is


data delete;
infile 'H:\deletehk.csv' delimiter="," lrecl=100000 TRUNCOVER dsd ;
input DealNumber AcqIBCOMPUSTAT_1 AcqIBCOMPUSTAT_2 AcqIBCOMPUSTAT_3 AcqIBCOMPUSTAT_4
AcqIBCOMPUSTAT_5 AcqIBCOMPUSTAT_6 AcqIBCOMPUSTAT_7 AcqIBCOMPUSTAT_8 AcqIBCOMPUSTAT_9 AcqIBCOMPUSTAT_10
AcqIBCOMPUSTAT_11 AcqIBCOMPUSTAT_12 AcqIBCOMPUSTAT_13 AcqIBCOMPUSTAT_14 AcqIBCOMPUSTAT_15
AcqIBCOMPUSTAT_16 AcqIBCOMPUSTAT_17 AcqIBCOMPUSTAT_18 AcqIBCOMPUSTAT_19 AcqIBCOMPUSTAT_20
AcqAssetsCOMPUSTAT_1 AcqAssetsCOMPUSTAT_2 AcqAssetsCOMPUSTAT_3 AcqAssetsCOMPUSTAT_4 AcqAssetsCOMPUSTAT_5
AcqAssetsCOMPUSTAT_6 AcqAssetsCOMPUSTAT_7 AcqAssetsCOMPUSTAT_8 AcqAssetsCOMPUSTAT_9 AcqAssetsCOMPUSTAT_10
AcqAssetsCOMPUSTAT_11 AcqAssetsCOMPUSTAT_12 AcqAssetsCOMPUSTAT_13 AcqAssetsCOMPUSTAT_14
AcqAssetsCOMPUSTAT_15 AcqAssetsCOMPUSTAT_16 AcqAssetsCOMPUSTAT_17 AcqAssetsCOMPUSTAT_18
AcqAssetsCOMPUSTAT_19 AcqAssetsCOMPUSTAT_20

;
run;

data want;
set delete;
array AcqIBCOMPUSTAT_{20};
array AcqAssetsCOMPUSTAT_{20};
array ratio_{20};
do i=1 to 20;
ratio_{i}=AcqIBCOMPUSTAT_{i}/AcqAssetsCOMPUSTAT_{i};
end;
run;

proc print data=want (obs=10); run;




RW9
Diamond | Level 26 RW9
Diamond | Level 26

Please provide some test data, I can't tell from that.  Arrays do not delete any variables, therefore something else is happening.  Arrays are temporary references to variables, they can create variables if they don't exist.  You can get some test data by following:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

Your code looks ok, so I assume the data is at fault, but I cant see this.

Thomas_mp
Obsidian | Level 7

It works perfectly , thank you !!!

Not sure why did not work before... 

 

I have one more question that  is the next step in the task I am trying to do.

 

Once I have the 50 ratios :  ratio_n1, ratio_n2... ratio_n50, in 50 different columns, I need to get some statistics of all and of different sub-sets of these observations.

For instance the astandard deviation of the first 10 observations.

 

stdv(ratio_n1, ratio_n2..... ratio_n10)

 

A second step, which may be more difficult, is to compute the standard deviation only if there are >= 5  non-missing observations.

 

 

Thank you again.

 

 

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well nmiss() can count missings,:

num_missing=nmiss(of ratio_n{*});

I don't believe there is a standard deviation function however.  Again, this is where normalised data is far simpler to work with:

NUM   ROA   AST  RATIO

1         x         y       x/y

2         x         y       x/y

...

So:

proc means data=have;
  by num;
  var ratio;
  output out=want stddev=stddev;
run;

 

ballardw
Super User

@Thomas_mp wrote:

It works perfectly , thank you !!!

Not sure why did not work before... 

 

I have one more question that  is the next step in the task I am trying to do.

 

Once I have the 50 ratios :  ratio_n1, ratio_n2... ratio_n50, in 50 different columns, I need to get some statistics of all and of different sub-sets of these observations.

For instance the astandard deviation of the first 10 observations.

 

stdv(ratio_n1, ratio_n2..... ratio_n10)

 

A second step, which may be more difficult, is to compute the standard deviation only if there are >= 5  non-missing observations.

 

 

Thank you again. 


It is very poor planning to use a rule like "standard deviation of the first 10 observations". Reason: data sets get sorted all the time and the output of some procedures may reorder the data in some fashion. If the rule does not involve the value of one or more variables such as "idvar is value 3" or "idvar in the range 2 to 5 and timevar < 25" then the programming is going to be obnoxious at best and fragile, hard to maintain, hard to understand and possibly just plain wrong in an unobvious manner at worst. Proc means, summary, tabulate and report can do standard deviations and other statistics.

 

From your original post:  I have 50 columns of the “measure” (roa) in different years: As @RW9 mentioned you might be better to have that actual YEAR value then that could be used as one of the groups. In fact if your groups are all based on that original year it may well be that this whole exercise could likely be cleaned up by maintaining year and grouping by year values.

 

As far as your requirement to calculate when there are 5 or more nonmissing values when you use Proc Means or summary you 1) request an associated N statistic and 2) filter or set to missing based on the value of that n.

 

 

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2132 views
  • 0 likes
  • 3 in conversation