BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AbuChowdhury
Fluorite | Level 6

Hi Experts,

Time series of price data of firms are in file "p2004_usa". The file "decile1_2004" contains my selected firms (named 'code'). I need to have only those selected firms from in the "p2004_usa".

How can I do that? I couldnot upload sas files.

Second question is: how can I use the file "decile1_2004" in a datastep?

Abu

My "p2004_usa" file looks the following:

                                                                                   

DateUS1667641005US91913Y1001US7427181091US4781601046US20825C1045US2605431038US58933Y1055US0028241000US2635341090US8816091016US88579Y1010
01/01/200443.19510.582449.9451.6624.992741.5746.220.853345.897.28585.03
02/01/200442.94510.509349.49551.6624.958441.1347.0520.920445.517.40583.37
05/01/200443.77510.733149.7451.9625.63341.914820.9204467.784.96
06/01/200443.53510.824449.4951.7325.354841.4347.7720.768345.677.58584.7
07/01/200442.9810.70849.0551.7525.221441.5247.8320.987545.097.4583.24
08/01/200443.02510.607549.56551.9925.080342.3948.0820.3745.267.2582.68
09/01/200442.5610.637249.8251.424.939341.0847.1320.10644.197.3982.4

And "decile1_2004" file looks the following:

Code

US0010311035

US00163U1060

US00508B1026

US0097281069

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Let's ASSUME that this is a rename problem, but my assumptions have been wrong in the past.

As mentioned earlier you can avoid all of this by using a long data structure rather than wide.  You should really reconsider changing your data structure. If it was long you could just do the following and be done.

proc sql;

create table want as

select * from p2004_usa

where code in (select code from decile1_2004);

quit;

Instead here's how to rename your variables.

*Create a mapping of AR1, AR2, AR3 to new variable names;

data rename_data;

set decile1_2004;

original_name = catt('ar', put(_n_, 3. -l));

run;

*Create macro variable to set AR1=NEW_NAME into a long list;

proc sql noprint;

select catx("=", original_name, code)  into :rename_list separated by " " from rename_Data;

quit;

*Rename variables;

data want;

set p2004_usa;

rename &rename_list;

run;

View solution in original post

22 REPLIES 22
Ksharp
Super User

What kind of output do you want ?

data p2004_usa;
input Date :$20.     US1667641005     US91913Y1001     US7427181091     US4781601046     US20825C1045     US2605431038     US58933Y1055     US0028241000     US2635341090     US8816091016     US88579Y1010 ;
cards;
01/01/2004     43.195     10.5824     49.94     51.66     24.9927     41.57     46.2     20.8533     45.89     7.285     85.03
02/01/2004     42.945     10.5093     49.495     51.66     24.9584     41.13     47.05     20.9204     45.51     7.405     83.37
05/01/2004     43.775     10.7331     49.74     51.96     25.633     41.91     48     20.9204     46     7.7     84.96
06/01/2004     43.535     10.8244     49.49     51.73     25.3548     41.43     47.77     20.7683     45.67     7.585     84.7
07/01/2004     42.98     10.708     49.05     51.75     25.2214     41.52     47.83     20.9875     45.09     7.45     83.24
08/01/2004     43.025     10.6075     49.565     51.99     25.0803     42.39     48.08     20.37     45.26     7.25     82.68
09/01/2004     42.56     10.6372     49.82     51.4     24.9393     41.08     47.13     20.106     44.19     7.39     82.4
;
run;

data decile1_2004;
input Code $40.;
cards;
US0010311035
US00163U1060
US00508B1026
US0097281069
US1667641005
;
run;


proc sql;
 select distinct code into : list separated by ' ' from decile1_2004;
quit;
options dkricond=nowarn;
data want;
 set p2004_usa(keep=date &list);
run;

Xia Keshan

AbuChowdhury
Fluorite | Level 6

Thanks Xia. I wanted to have price data of only "decile1_2004" firms in "p2004_usa". Your code works for my first question.

But for the second question, I want to use "decile1_2004" file to rename the variables (ar1, ar2, .....) in the following file.

     Date            ar1                  ar2                  ar3                  ar4                ar5              ar6                   ar7                ar8                   ar9                 ar10             

02/01/20040.0187732010.008640321-0.026041830.027888172-0.002423762-0.027671163-0.0164686110.05118590980.0044592593-0.014969492
05/01/20040.0076545455-0.009288519-0.016130551-0.0573668470.00314033980.00887679190.0565801653-0.0251665680.0678269113-0.024272603
06/01/20040.0245047281-0.0132762510.00516666670.0028365134-0.020075851-0.0073337910.0039945055-0.016326916-0.0353028170.0000396459
07/01/20040.02710599080.0625342602-0.010622517-0.0540863560.0113785489-0.015736047-0.0193005460.1112048261-0.030822157-0.009736116
08/01/2004-0.002130726-0.0289463690.0006222222-0.00660.01708932040.03496396090.00216803550.00420476370.0982532055-0.041607558
LinusH
Tourmaline | Level 20

Something like 99% of all questions of this type befits from transposing your data. Why?

  • It's easier to manipulate data rather that meta data (i.e. column names/labels/types)
  • A majority of the reporting and analysis features in SAS is adopted to data where the data is normalised. In this case, have a column specifying the relative year.

So, for question two, transpose p2004_usa data set, optimally set a sequence no within each data group.

Then join with the meta data for decile1_2004 (dictionary.columns if you are in SQL, sashelp.vcolumn elsewhere).

And the, don't transpose it back until the creation of the final report.

Data never sleeps
AbuChowdhury
Fluorite | Level 6

Hi Xia, Reeza and all Experts,

I am waiting for your reply.

Reeza
Super User

The file "decile1_2004" contains my selected firms (named 'code'). I need to have only those selected firms from in the "p2004_usa".

Create a macro variable that contains the list of firms from decile data set. You haven't indicated if the name in the list is the whole variable name or only a part so I'm going to assume its the whole variable name. If it's not you'll need to look into the sashelp.vcolumn table or transpose the data. Linus has outlined the steps for you above.

Here's the code if code is the whole variable name:

proc sql;

select code into :keep_list separated by " "

from decile1_2004;

quit;

data want;

set have;

where year(date)=2004;


keep date &keep_list;

run;


AbuChowdhury
Fluorite | Level 6

Hi Reeza,

If I run your code then 'want' dataset has only 'date' variable. Other variables are gone. It shows the following messages in the log file:

WARNING: The variable CH0018666781 in the DROP, KEEP, or RENAME list has never been referenced.

WARNING: The variable US00208J1088 in the DROP, KEEP, or RENAME list has never been referenced.

WARNING: The variable US0038301067 in the DROP, KEEP, or RENAME list has never been referenced.

...

...

...

Reeza
Super User

Is CH0018666781 a variable in the dataset?

AbuChowdhury
Fluorite | Level 6

Yes, these are variables in "decile1_2004" dataset.

Reeza
Super User

Is it a variable or observation in decile1_2004?

My understanding was that decile1_2004 was a list of variables to extract from p2004_usa. If not you need to clarify the question, please include a detailed example.

AbuChowdhury
Fluorite | Level 6

Sorry, these are observations under the variable 'code'. You understood correctly: those are a list of variables extracted from other datasets. Now these are observations in "decile1_2004" dataset under the variable named 'code'. This 'code' is the only variable in "decile1_2004" dataset.

Reeza
Super User
You need to clarify the question, please include a detailed example.
AbuChowdhury
Fluorite | Level 6

Ok. Following is my dataset. The variable names are ar1, ar2, ar3 etc.

Date            ar1                  ar2                  ar3                  ar4                ar5              ar6                   ar7                ar8                   ar9                 ar10            

02/01/20040.0187732010.008640321-0.026041830.027888172-0.002423762-0.027671163-0.0164686110.05118590980.0044592593-0.014969492
05/01/20040.0076545455-0.009288519-0.016130551-0.0573668470.00314033980.00887679190.0565801653-0.0251665680.0678269113-0.024272603
06/01/20040.0245047281-0.0132762510.00516666670.0028365134-0.020075851-0.0073337910.0039945055-0.016326916-0.0353028170.0000396459
07/01/20040.02710599080.0625342602-0.010622517-0.0540863560.0113785489-0.015736047-0.0193005460.1112048261-0.030822157-0.009736116
08/01/2004-0.002130726-0.0289463690.0006222222-0.00660.01708932040.03496396090.00216803550.00420476370.0982532055-0.041607558

And "decile1_2004" dataset looks the following:

Code

US0010311035

US00163U1060

US00508B1026

US0097281069

decile1_2004 is the list of variables. I want to use these variable names (instead of ar1, ar2, ......) into the previous dataset.

Reeza
Super User

Let's ASSUME that this is a rename problem, but my assumptions have been wrong in the past.

As mentioned earlier you can avoid all of this by using a long data structure rather than wide.  You should really reconsider changing your data structure. If it was long you could just do the following and be done.

proc sql;

create table want as

select * from p2004_usa

where code in (select code from decile1_2004);

quit;

Instead here's how to rename your variables.

*Create a mapping of AR1, AR2, AR3 to new variable names;

data rename_data;

set decile1_2004;

original_name = catt('ar', put(_n_, 3. -l));

run;

*Create macro variable to set AR1=NEW_NAME into a long list;

proc sql noprint;

select catx("=", original_name, code)  into :rename_list separated by " " from rename_Data;

quit;

*Rename variables;

data want;

set p2004_usa;

rename &rename_list;

run;

AbuChowdhury
Fluorite | Level 6

Thank you very much Reeza, Xia, Linus and all experts.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 22 replies
  • 1519 views
  • 11 likes
  • 5 in conversation