What's the difference between date10. and date9. and between cards and cards4?
new1, new2, … etc. calculate the difference between returns from usa_firms table and average_ret from Benchmark table. I want perform the same task for another dataset (say uk_firms). After performing the same operation this newdataset will also show the same variables: new1, new2, … etc. I need to merge these two datasets: usa_firms and uk_firms. It's a problematic since both datasets have same variable names (new1, new2, … ) but those are from two different countries? How can I resolve this?
If you look at the dataset usa_firms, you will see the price data for US firms and US firms are identified as . There are price data for 61 US firms. From these price data, returns are calculated (ret1, ret2, …. (shown in the right end of the dataset)) for those 61 firms. By looking at ret1, ret2 variables I cannot understand which firms' return they are. Is it possible to use firm identity instead of ret1, ret2 variable names? I need to identify the returns for each firm because I need to do the same task for another dataset (uk_firms) and need to merge these two datasets.
If you want to continue to use his horizontal format instead of a vertical format then you will need to continually be creating new variable names.
You can use some tricks to help you though. For example below SQL will create two macro variables with space delimited lists of variable names that would be very suited for use in a ARRAY statement. The first list is just the variables of interest and the second is the same but with _RETURN appended to the names.
proc sql noprint ;
select name
, cats(name,'_return')
into varlist separated by ' '
, retlist separated by ' '
from dictionary.columns
where libname = 'WORK' and memname = 'USA_FIRMS'
and indexw('US4583341098 US09061G1013',upcase(name))
;
quit;
data .... ;
array firms &varlist;
array ret &retlist;
....
run;
Don't I need to define the variables of interest in the beginning such as?
%let varlist = CA45245E1097 ... US98974P1003 ; /***I wrote all the variable names***/
%let nvar = %sysfunc(countw(&varlist,%str( )));
data usa_firms;
set Benchmar.P20040by0 (keep = date &varlist);
array firms (&nvar) &varlist;
.
.
array ret (&nvar)
And don't I need to define the variables for varlist and retlist in your following codes?
array firms &varlist;
array ret &retlist;
I don't understand this line: and indexw('US4583341098 US09061G1013',upcase(name))
Do I need to write the names of all the variables here? What is upcase(name)?
Some how you would have to define the list of variables and you obviously found where I listed the two variable names in the simple example I posted.
If you have them in a macro variable already then you could replace the hard coded string 'US4583341098 US09061G1013' with "&VARLIST".
You need to get used to reading the SAS manuals . Google is your friend
INDEXW() SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition
UPCASE() SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition
Dictionary.Columns SAS(R) 9.4 SQL Procedure User's Guide
Also just run some code and see what it does. Put in some proc prints and PUT statements to see what is happening.
Thank you very much. You are guiding me in a very helpful way. I will put all the codes together and then let you know.
Please correct the following codes. The intention is to calculate the return of some selected firms (listed in the %let varlist) from P2004 dataset (attached). There are couple of equations but ret(i) = (rvar(i)*lgvar(i))/rgla is the ultimate objective to achieve. I want that return variables are named similar to firm names instead of ret1, ret2 etc.
%let varlist= US0453271035 US16411R2085 US1844961078 US2067081099 US2296781071 US38388F1084 US4282911084 US74835Y1010 US8803491054 US90385D1072;
%let nvar = %sysfunc(countw(&varlist,%str( )));
proc sql noprint ;
select name
,cats(name,'_return')
into :varlist separated by ''
,:retlist separated by ''
from dictionary.columns
where libname='WORK' and memname='P20040BY0'
and indexw('&varlist',upcase(name))
;
quit;
data P20040by0;
set work.P2004 (keep = date &varlist);
array price (&nvar) &varlist;
array rvar (&nvar) ;
array gvar (&nvar) ;
array lgvar (&nvar) ;
array ret (&nvar) ;
do i = 1 to &nvar;
rvar(i) = (price(i)-lag(price(i)))/lag(price(i));
gvar(i) = rvar(i)+1;
lgvar(i) = lag(gvar(i));
end;
rgla = mean( of lgvar(*) );
do i = 1 to &nvar;
ret(i) = (rvar(i)*lgvar(i))/rgla;
end;
run;
By the by, instead of writing so many equations, is it possible to combine those equations into a single equation?
Macro expressions are not evaluated inside of single quotes. Change '&varlist' to "&varlist".
What about other lines of the codes? Are these ok? Do I need to write variable names instead of 'name' and how should I define retlist?
You generated the names with _RETURN on the end into a macro variable and then didn't use them anywhere.
Most likely you want to modify the array definition for RET to include the list of names just like you did for the array PRICE.
Yes, but I cannot do that. It would be a great help if you correct my codes and write the codes. Thanks a lot.
Part of programming is pattern recognition.
The SQL generated macro variables
into :varlist separated by ''
,:retlist separated by ''
The data step referenced one of them here.
array price (&nvar) &varlist;
So modify this line to match the same pattern.
array ret (&nvar) ;
Hi,
What is 'diff' in your code?
The name of the variable that you wanted to copy from the other table and subtract from every value.
You will probably get results faster if you normalize your data.
So if you currently have data with 200 columns that represent PRICE for 200 firms you should convert it into 200 rows with one column for the FIRM and another for the PRICE.
proc transpose data=have(keep=date firm1-firm200) out=price(rename=(col1=PRICE _name_=FIRM)) ;
by price;
var firm1-firm200;
run;
proc sort; by firm date; run;
Then if you have another variable that also had collected for each FIRM/DATE combination you could easily merge them and do calculations.
data gain;
merge price basis ;
by firm date ;
gain = price - basis ;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.