stevo642
Obsidian | Level 7

I'm going to select N subsets from a matrix in a DO loop.

N is variable; let's say 5.

Any tips on how to name the subsets: SUB1, SUB2, SUB3, SUB4, SUB5 ?

1 ACCEPTED SOLUTION

Ksharp
Super User

proc iml;
x=j(10,5);
call randgen(x,'normal');

names='sub1':'sub5';
do i=1 to ncol(x);
 call valset(names[i],x[,i]);
end;

show names;
print sub1,sub2,sub3,sub4,sub5;
quit;
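
A possible follow-up sketch, not part of the original answer: once the matrices exist, the VALUE function can fetch each one by the name stored in NAMES, so later code does not need to hard-code SUB1 through SUB5. The vector colMeans and the choice of the mean as the statistic are illustrative assumptions.

```sas
proc iml;
x = j(10,5);
call randgen(x, 'normal');

names = 'sub1':'sub5';
do i = 1 to ncol(names);
   call valset(names[i], x[,i]);   /* create sub1..sub5 as in the answer above */
end;

colMeans = j(1, ncol(names), .);   /* pre-allocated result vector */
do i = 1 to ncol(names);
   s = value(names[i]);            /* fetch the matrix whose name is in names[i] */
   colMeans[i] = mean(s);          /* s is a column vector, so this is a scalar */
end;
print colMeans[colname=names];
quit;
```

Looking matrices up by name this way keeps the loop bound tied to NAMES, so changing the number of subsets only requires changing that one vector.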



5 REPLIES
PeterClemmensen
Tourmaline | Level 20

What do you specifically mean by a subset from a matrix? 🙂

stevo642
Obsidian | Level 7
RE: SUBSET

The full matrix (say, ALLDAT) is numerical stock data by stock & date: maybe 250K rows and 75 columns.

E.g., 2500 rows by 75 columns of data for "AAPL", one row for each date // 2500 rows of data for "AXP" // ...

A separate 250K X 1 stock tickers array (say, ALLTKR) tells me which rows of ALLDAT correspond to which stocks. So the 1st 2500 rows are "AAPL" data, the next 2500 rows are "AXP" data, etc.


I'm testing a stock trading strategy on a subset of ALLDAT. Let's say the subset is the 30 stocks currently in the Dow Jones Industrials Average (DJIA).

I'd specify the 30 tickers of the DJIA: {"AAPL" "AXP" "BA" "CAT" "CSCO" "CVX" "DD" "DIS" "GE" "GS" "HD" "IBM" "INTC" "JNJ" "JPM" "KO" "MCD" "MMM" "MRK" "MSFT" "NKE" "PFE" "PG" "TRV" "UNH" "UTX" "V" "VZ" "WMT" "XOM"}

Then I'd like to extract the rows of ALLDAT for each of those tickers into a matrix named as the ticker: extract 2500 "AAPL" rows into a matrix named AAPL, 2500 "AXP" rows into AXP, etc.

Within a DO loop over the dates I would perform calculations, compare, and select stocks for a portfolio.


I could probably skip the creation of the 30 submatrices by using pointers in ALLTKR to reference rows of ALLDAT but for debugging purposes having separate matrices may be very helpful.
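
A minimal sketch of the "pointers" idea described above, assuming tiny made-up stand-ins for ALLTKR and ALLDAT (two rows per ticker, two columns). The LOC function returns the row numbers where ALLTKR matches a ticker, and those row numbers index ALLDAT directly. The names tickers, idx, and sub are hypothetical.

```sas
proc iml;
/* hypothetical toy data: 2 rows per ticker, 2 columns */
ALLTKR = {"AAPL","AAPL","AXP","AXP","BA","BA"};
ALLDAT = {1 2, 3 4, 5 6, 7 8, 9 10, 11 12};
tickers = {"AAPL" "BA"};             /* subset of interest */

do i = 1 to ncol(tickers);
   idx = loc(ALLTKR = tickers[i]);   /* row numbers for this ticker */
   if ncol(idx) > 0 then do;         /* skip tickers that have no rows */
      sub = ALLDAT[idx, ];           /* temporary submatrix, reused each iteration */
      print (tickers[i])[label="Ticker"], sub;
   end;
end;
quit;
```

For debugging, printing the temporary submatrix inside the loop gives much of the visibility that separately named matrices would.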
Rick_SAS
SAS Super FREQ

A useful approach is to read each stock, analyze each stock, and then go on to the next stock. After reading all the data and computing statistics for each stock, you can do additional analysis that compares different stocks.  

 

For example, the following loop reads one stock at a time into X. It computes various statistics for each stock. After the loop is over, I have a matrix called RESULTS that contains all the information that I need to build my portfolio. The techniques used are described in the articles linked in the program comments:

proc iml;
StockNames = {"IBM" "Intel" "Microsoft"}; /*{"AAPL" "AXP" ... "WMT" "XOM"}*/
results = j(3, ncol(StockNames), .);
mattrib results rowname={"Mean" "Stddev" "MaxVol"}
                colname=StockNames;
use Sashelp.stocks;
/* http://blogs.sas.com/content/iml/2013/01/21/reading-big-data.html */
do ID = 1 to ncol(StockNames);
   /* read in data for each stock */
   /* http://blogs.sas.com/content/iml/2016/04/04/where-clause-in-sasiml.html */
   read all var _NUM_ into X[colname=varNames]
            where(Stock=(StockNames[ID]));
   /* http://blogs.sas.com/content/iml/2012/10/01/access-rows-or-columns-of-a-matrix-by-names.html */
   results["Mean", ID] = mean(X[,"Close"]);  
   results["Stddev", ID] = std(X[,"Close"]);  
   results["MaxVol", ID] = max(X[,"Volume"]);  
end;
close;

print results;
quit;
Rick_SAS
SAS Super FREQ

I rarely name the subsets. In a loop you can extract the first subset, do what you want with it (for example, compute its mean) and store the result. Then you extract the next subset and store its results, etc, until all subsets have been handled.  

 

For a reminder that you should pre-allocate the results matrix for efficiency, see 

http://blogs.sas.com/content/iml/2015/02/16/friends-dont-let-friends-concatenate-results-inside-a-lo...

 
