Choosing selected variables

Accepted Solution Solved
Reply
Regular Contributor
Posts: 183
Accepted Solution

Choosing selected variables

Hi Experts,

Time series of price data of firms are in file "p2004_usa". The file "decile1_2004" contains my selected firms (named 'code'). I need to have only those selected firms from in the "p2004_usa".

How can I do that? I couldnot upload sas files.

Second question is: how can I use the file "decile1_2004" in a datastep?

Abu

My "p2004_usa" file looks the following:

                                                                                   

DateUS1667641005US91913Y1001US7427181091US4781601046US20825C1045US2605431038US58933Y1055US0028241000US2635341090US8816091016US88579Y1010
01/01/200443.19510.582449.9451.6624.992741.5746.220.853345.897.28585.03
02/01/200442.94510.509349.49551.6624.958441.1347.0520.920445.517.40583.37
05/01/200443.77510.733149.7451.9625.63341.914820.9204467.784.96
06/01/200443.53510.824449.4951.7325.354841.4347.7720.768345.677.58584.7
07/01/200442.9810.70849.0551.7525.221441.5247.8320.987545.097.4583.24
08/01/200443.02510.607549.56551.9925.080342.3948.0820.3745.267.2582.68
09/01/200442.5610.637249.8251.424.939341.0847.1320.10644.197.3982.4

And "decile1_2004" file looks the following:

Code

US0010311035

US00163U1060

US00508B1026

US0097281069


Accepted Solutions
Solution
‎03-30-2015 02:24 PM
Super User
Posts: 17,748

Re: Choosing selected variables

Let's ASSUME that this is a rename problem, but my assumptions have been wrong in the past.

As mentioned earlier you can avoid all of this by using a long data structure rather than wide.  You should really reconsider changing your data structure. If it was long you could just do the following and be done.

proc sql;

create table want as

select * from p2004_usa

where code in (select code from decile1_2004);

quit;

Instead here's how to rename your variables.

*Create a mapping of AR1, AR2, AR3 to new variable names;

data rename_data;

set decile1_2004;

original_name = catt('ar', put(_n_, 3. -l));

run;

*Create macro variable to set AR1=NEW_NAME into a long list;

proc sql noprint;

select catx("=", original_name, code)  into :rename_list separated by " " from rename_Data;

quit;

*Rename variables;

data want;

set p2004_usa;

rename &rename_list;

run;

View solution in original post


All Replies
Super User
Posts: 9,662

Re: Choosing selected variables

What kind of output do you want ?

data p2004_usa;
input Date :$20.     US1667641005     US91913Y1001     US7427181091     US4781601046     US20825C1045     US2605431038     US58933Y1055     US0028241000     US2635341090     US8816091016     US88579Y1010 ;
cards;
01/01/2004     43.195     10.5824     49.94     51.66     24.9927     41.57     46.2     20.8533     45.89     7.285     85.03
02/01/2004     42.945     10.5093     49.495     51.66     24.9584     41.13     47.05     20.9204     45.51     7.405     83.37
05/01/2004     43.775     10.7331     49.74     51.96     25.633     41.91     48     20.9204     46     7.7     84.96
06/01/2004     43.535     10.8244     49.49     51.73     25.3548     41.43     47.77     20.7683     45.67     7.585     84.7
07/01/2004     42.98     10.708     49.05     51.75     25.2214     41.52     47.83     20.9875     45.09     7.45     83.24
08/01/2004     43.025     10.6075     49.565     51.99     25.0803     42.39     48.08     20.37     45.26     7.25     82.68
09/01/2004     42.56     10.6372     49.82     51.4     24.9393     41.08     47.13     20.106     44.19     7.39     82.4
;
run;

data decile1_2004;
input Code $40.;
cards;
US0010311035
US00163U1060
US00508B1026
US0097281069
US1667641005
;
run;


proc sql;
 select distinct code into : list separated by ' ' from decile1_2004;
quit;
options dkricond=nowarn;
data want;
 set p2004_usa(keep=date &list);
run;

Xia Keshan

Regular Contributor
Posts: 183

Re: Choosing selected variables

Thanks Xia. I wanted to have price data of only "decile1_2004" firms in "p2004_usa". Your code works for my first question.

But for the second question, I want to use "decile1_2004" file to rename the variables (ar1, ar2, .....) in the following file.

     Date            ar1                  ar2                  ar3                  ar4                ar5              ar6                   ar7                ar8                   ar9                 ar10             

02/01/20040.0187732010.008640321-0.026041830.027888172-0.002423762-0.027671163-0.0164686110.05118590980.0044592593-0.014969492
05/01/20040.0076545455-0.009288519-0.016130551-0.0573668470.00314033980.00887679190.0565801653-0.0251665680.0678269113-0.024272603
06/01/20040.0245047281-0.0132762510.00516666670.0028365134-0.020075851-0.0073337910.0039945055-0.016326916-0.0353028170.0000396459
07/01/20040.02710599080.0625342602-0.010622517-0.0540863560.0113785489-0.015736047-0.0193005460.1112048261-0.030822157-0.009736116
08/01/2004-0.002130726-0.0289463690.0006222222-0.00660.01708932040.03496396090.00216803550.00420476370.0982532055-0.041607558
Super User
Posts: 5,254

Re: Choosing selected variables

Something like 99% of all questions of this type befits from transposing your data. Why?

  • It's easier to manipulate data rather that meta data (i.e. column names/labels/types)
  • A majority of the reporting and analysis features in SAS is adopted to data where the data is normalised. In this case, have a column specifying the relative year.

So, for question two, transpose p2004_usa data set, optimally set a sequence no within each data group.

Then join with the meta data for decile1_2004 (dictionary.columns if you are in SQL, sashelp.vcolumn elsewhere).

And the, don't transpose it back until the creation of the final report.

Data never sleeps
Regular Contributor
Posts: 183

Re: Choosing selected variables

Hi Xia, Reeza and all Experts,

I am waiting for your reply.

Super User
Posts: 17,748

Re: Choosing selected variables

The file "decile1_2004" contains my selected firms (named 'code'). I need to have only those selected firms from in the "p2004_usa".

Create a macro variable that contains the list of firms from decile data set. You haven't indicated if the name in the list is the whole variable name or only a part so I'm going to assume its the whole variable name. If it's not you'll need to look into the sashelp.vcolumn table or transpose the data. Linus has outlined the steps for you above.

Here's the code if code is the whole variable name:

proc sql;

select code into :keep_list separated by " "

from decile1_2004;

quit;

data want;

set have;

where year(date)=2004;


keep date &keep_list;

run;


Regular Contributor
Posts: 183

Re: Choosing selected variables

Hi Reeza,

If I run your code then 'want' dataset has only 'date' variable. Other variables are gone. It shows the following messages in the log file:

WARNING: The variable CH0018666781 in the DROP, KEEP, or RENAME list has never been referenced.

WARNING: The variable US00208J1088 in the DROP, KEEP, or RENAME list has never been referenced.

WARNING: The variable US0038301067 in the DROP, KEEP, or RENAME list has never been referenced.

...

...

...

Super User
Posts: 17,748

Re: Choosing selected variables

Is CH0018666781 a variable in the dataset?

Regular Contributor
Posts: 183

Re: Choosing selected variables

Yes, these are variables in "decile1_2004" dataset.

Super User
Posts: 17,748

Re: Choosing selected variables

Is it a variable or observation in decile1_2004?

My understanding was that decile1_2004 was a list of variables to extract from p2004_usa. If not you need to clarify the question, please include a detailed example.

Regular Contributor
Posts: 183

Re: Choosing selected variables

Sorry, these are observations under the variable 'code'. You understood correctly: those are a list of variables extracted from other datasets. Now these are observations in "decile1_2004" dataset under the variable named 'code'. This 'code' is the only variable in "decile1_2004" dataset.

Super User
Posts: 17,748

Re: Choosing selected variables

You need to clarify the question, please include a detailed example.
Regular Contributor
Posts: 183

Re: Choosing selected variables

Ok. Following is my dataset. The variable names are ar1, ar2, ar3 etc.

Date            ar1                  ar2                  ar3                  ar4                ar5              ar6                   ar7                ar8                   ar9                 ar10            

02/01/20040.0187732010.008640321-0.026041830.027888172-0.002423762-0.027671163-0.0164686110.05118590980.0044592593-0.014969492
05/01/20040.0076545455-0.009288519-0.016130551-0.0573668470.00314033980.00887679190.0565801653-0.0251665680.0678269113-0.024272603
06/01/20040.0245047281-0.0132762510.00516666670.0028365134-0.020075851-0.0073337910.0039945055-0.016326916-0.0353028170.0000396459
07/01/20040.02710599080.0625342602-0.010622517-0.0540863560.0113785489-0.015736047-0.0193005460.1112048261-0.030822157-0.009736116
08/01/2004-0.002130726-0.0289463690.0006222222-0.00660.01708932040.03496396090.00216803550.00420476370.0982532055-0.041607558

And "decile1_2004" dataset looks the following:

Code

US0010311035

US00163U1060

US00508B1026

US0097281069

decile1_2004 is the list of variables. I want to use these variable names (instead of ar1, ar2, ......) into the previous dataset.

Solution
‎03-30-2015 02:24 PM
Super User
Posts: 17,748

Re: Choosing selected variables

Let's ASSUME that this is a rename problem, but my assumptions have been wrong in the past.

As mentioned earlier you can avoid all of this by using a long data structure rather than wide.  You should really reconsider changing your data structure. If it was long you could just do the following and be done.

proc sql;

create table want as

select * from p2004_usa

where code in (select code from decile1_2004);

quit;

Instead here's how to rename your variables.

*Create a mapping of AR1, AR2, AR3 to new variable names;

data rename_data;

set decile1_2004;

original_name = catt('ar', put(_n_, 3. -l));

run;

*Create macro variable to set AR1=NEW_NAME into a long list;

proc sql noprint;

select catx("=", original_name, code)  into :rename_list separated by " " from rename_Data;

quit;

*Rename variables;

data want;

set p2004_usa;

rename &rename_list;

run;

Regular Contributor
Posts: 183

Re: Choosing selected variables

Thank you very much Reeza, Xia, Linus and all experts.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 22 replies
  • 586 views
  • 11 likes
  • 5 in conversation