Hi Experts,
Time series of price data of firms are in file "p2004_usa". The file "decile1_2004" contains my selected firms (named 'code'). I need to have only those selected firms from in the "p2004_usa".
How can I do that? I couldnot upload sas files.
Second question is: how can I use the file "decile1_2004" in a datastep?
Abu
My "p2004_usa" file looks the following:
Date | US1667641005 | US91913Y1001 | US7427181091 | US4781601046 | US20825C1045 | US2605431038 | US58933Y1055 | US0028241000 | US2635341090 | US8816091016 | US88579Y1010 |
01/01/2004 | 43.195 | 10.5824 | 49.94 | 51.66 | 24.9927 | 41.57 | 46.2 | 20.8533 | 45.89 | 7.285 | 85.03 |
02/01/2004 | 42.945 | 10.5093 | 49.495 | 51.66 | 24.9584 | 41.13 | 47.05 | 20.9204 | 45.51 | 7.405 | 83.37 |
05/01/2004 | 43.775 | 10.7331 | 49.74 | 51.96 | 25.633 | 41.91 | 48 | 20.9204 | 46 | 7.7 | 84.96 |
06/01/2004 | 43.535 | 10.8244 | 49.49 | 51.73 | 25.3548 | 41.43 | 47.77 | 20.7683 | 45.67 | 7.585 | 84.7 |
07/01/2004 | 42.98 | 10.708 | 49.05 | 51.75 | 25.2214 | 41.52 | 47.83 | 20.9875 | 45.09 | 7.45 | 83.24 |
08/01/2004 | 43.025 | 10.6075 | 49.565 | 51.99 | 25.0803 | 42.39 | 48.08 | 20.37 | 45.26 | 7.25 | 82.68 |
09/01/2004 | 42.56 | 10.6372 | 49.82 | 51.4 | 24.9393 | 41.08 | 47.13 | 20.106 | 44.19 | 7.39 | 82.4 |
And "decile1_2004" file looks the following:
Code
US0010311035
US00163U1060
US00508B1026
US0097281069
Let's ASSUME that this is a rename problem, but my assumptions have been wrong in the past.
As mentioned earlier you can avoid all of this by using a long data structure rather than wide. You should really reconsider changing your data structure. If it was long you could just do the following and be done.
proc sql;
create table want as
select * from p2004_usa
where code in (select code from decile1_2004);
quit;
Instead here's how to rename your variables.
*Create a mapping of AR1, AR2, AR3 to new variable names;
data rename_data;
set decile1_2004;
original_name = catt('ar', put(_n_, 3. -l));
run;
*Create macro variable to set AR1=NEW_NAME into a long list;
proc sql noprint;
select catx("=", original_name, code) into :rename_list separated by " " from rename_Data;
quit;
*Rename variables;
data want;
set p2004_usa;
rename &rename_list;
run;
What kind of output do you want ?
data p2004_usa; input Date :$20. US1667641005 US91913Y1001 US7427181091 US4781601046 US20825C1045 US2605431038 US58933Y1055 US0028241000 US2635341090 US8816091016 US88579Y1010 ; cards; 01/01/2004 43.195 10.5824 49.94 51.66 24.9927 41.57 46.2 20.8533 45.89 7.285 85.03 02/01/2004 42.945 10.5093 49.495 51.66 24.9584 41.13 47.05 20.9204 45.51 7.405 83.37 05/01/2004 43.775 10.7331 49.74 51.96 25.633 41.91 48 20.9204 46 7.7 84.96 06/01/2004 43.535 10.8244 49.49 51.73 25.3548 41.43 47.77 20.7683 45.67 7.585 84.7 07/01/2004 42.98 10.708 49.05 51.75 25.2214 41.52 47.83 20.9875 45.09 7.45 83.24 08/01/2004 43.025 10.6075 49.565 51.99 25.0803 42.39 48.08 20.37 45.26 7.25 82.68 09/01/2004 42.56 10.6372 49.82 51.4 24.9393 41.08 47.13 20.106 44.19 7.39 82.4 ; run; data decile1_2004; input Code $40.; cards; US0010311035 US00163U1060 US00508B1026 US0097281069 US1667641005 ; run; proc sql; select distinct code into : list separated by ' ' from decile1_2004; quit; options dkricond=nowarn; data want; set p2004_usa(keep=date &list); run;
Xia Keshan
Thanks Xia. I wanted to have price data of only "decile1_2004" firms in "p2004_usa". Your code works for my first question.
But for the second question, I want to use "decile1_2004" file to rename the variables (ar1, ar2, .....) in the following file.
Date ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8 ar9 ar10
02/01/2004 | 0.018773201 | 0.008640321 | -0.02604183 | 0.027888172 | -0.002423762 | -0.027671163 | -0.016468611 | 0.0511859098 | 0.0044592593 | -0.014969492 |
05/01/2004 | 0.0076545455 | -0.009288519 | -0.016130551 | -0.057366847 | 0.0031403398 | 0.0088767919 | 0.0565801653 | -0.025166568 | 0.0678269113 | -0.024272603 |
06/01/2004 | 0.0245047281 | -0.013276251 | 0.0051666667 | 0.0028365134 | -0.020075851 | -0.007333791 | 0.0039945055 | -0.016326916 | -0.035302817 | 0.0000396459 |
07/01/2004 | 0.0271059908 | 0.0625342602 | -0.010622517 | -0.054086356 | 0.0113785489 | -0.015736047 | -0.019300546 | 0.1112048261 | -0.030822157 | -0.009736116 |
08/01/2004 | -0.002130726 | -0.028946369 | 0.0006222222 | -0.0066 | 0.0170893204 | 0.0349639609 | 0.0021680355 | 0.0042047637 | 0.0982532055 | -0.041607558 |
Something like 99% of all questions of this type befits from transposing your data. Why?
So, for question two, transpose p2004_usa data set, optimally set a sequence no within each data group.
Then join with the meta data for decile1_2004 (dictionary.columns if you are in SQL, sashelp.vcolumn elsewhere).
And the, don't transpose it back until the creation of the final report.
Hi Xia, Reeza and all Experts,
I am waiting for your reply.
The file "decile1_2004" contains my selected firms (named 'code'). I need to have only those selected firms from in the "p2004_usa".
Create a macro variable that contains the list of firms from decile data set. You haven't indicated if the name in the list is the whole variable name or only a part so I'm going to assume its the whole variable name. If it's not you'll need to look into the sashelp.vcolumn table or transpose the data. Linus has outlined the steps for you above.
Here's the code if code is the whole variable name:
proc sql;
select code into :keep_list separated by " "
from decile1_2004;
quit;
data want;
set have;
where year(date)=2004;
keep date &keep_list;
run;
Hi Reeza,
If I run your code then 'want' dataset has only 'date' variable. Other variables are gone. It shows the following messages in the log file:
WARNING: The variable CH0018666781 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable US00208J1088 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable US0038301067 in the DROP, KEEP, or RENAME list has never been referenced.
...
...
...
Is CH0018666781 a variable in the dataset?
Yes, these are variables in "decile1_2004" dataset.
Is it a variable or observation in decile1_2004?
My understanding was that decile1_2004 was a list of variables to extract from p2004_usa. If not you need to clarify the question, please include a detailed example.
Sorry, these are observations under the variable 'code'. You understood correctly: those are a list of variables extracted from other datasets. Now these are observations in "decile1_2004" dataset under the variable named 'code'. This 'code' is the only variable in "decile1_2004" dataset.
You need to clarify the question, please include a detailed example.
Ok. Following is my dataset. The variable names are ar1, ar2, ar3 etc.
Date ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8 ar9 ar10
02/01/2004 | 0.018773201 | 0.008640321 | -0.02604183 | 0.027888172 | -0.002423762 | -0.027671163 | -0.016468611 | 0.0511859098 | 0.0044592593 | -0.014969492 |
05/01/2004 | 0.0076545455 | -0.009288519 | -0.016130551 | -0.057366847 | 0.0031403398 | 0.0088767919 | 0.0565801653 | -0.025166568 | 0.0678269113 | -0.024272603 |
06/01/2004 | 0.0245047281 | -0.013276251 | 0.0051666667 | 0.0028365134 | -0.020075851 | -0.007333791 | 0.0039945055 | -0.016326916 | -0.035302817 | 0.0000396459 |
07/01/2004 | 0.0271059908 | 0.0625342602 | -0.010622517 | -0.054086356 | 0.0113785489 | -0.015736047 | -0.019300546 | 0.1112048261 | -0.030822157 | -0.009736116 |
08/01/2004 | -0.002130726 | -0.028946369 | 0.0006222222 | -0.0066 | 0.0170893204 | 0.0349639609 | 0.0021680355 | 0.0042047637 | 0.0982532055 | -0.041607558 |
And "decile1_2004" dataset looks the following:
Code
US0010311035
US00163U1060
US00508B1026
US0097281069
decile1_2004 is the list of variables. I want to use these variable names (instead of ar1, ar2, ......) into the previous dataset.
Let's ASSUME that this is a rename problem, but my assumptions have been wrong in the past.
As mentioned earlier you can avoid all of this by using a long data structure rather than wide. You should really reconsider changing your data structure. If it was long you could just do the following and be done.
proc sql;
create table want as
select * from p2004_usa
where code in (select code from decile1_2004);
quit;
Instead here's how to rename your variables.
*Create a mapping of AR1, AR2, AR3 to new variable names;
data rename_data;
set decile1_2004;
original_name = catt('ar', put(_n_, 3. -l));
run;
*Create macro variable to set AR1=NEW_NAME into a long list;
proc sql noprint;
select catx("=", original_name, code) into :rename_list separated by " " from rename_Data;
quit;
*Rename variables;
data want;
set p2004_usa;
rename &rename_list;
run;
Thank you very much Reeza, Xia, Linus and all experts.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.