BookmarkSubscribeRSS Feed
Eileen1496
Obsidian | Level 7

Hi all,

I'm trying to reshape a table in sas. 

I have column called gvkey, fyear, each gvkey-fyear is an identifier (unique). I have other columns called rcid_gvkey, rcid_cusip, rcid_cik, frcid_gvkey, frcid_cusip, frcid_gvkey.
How can I reshape it to have one rcid column with rcid_gvkey, rcid_cusip, rcid_cik of the same observation (gvkey-year) in different rows, and a new column called rcid_source indicates the sources(so if it is rcid_gvkey, it should have "gvkey" as the value of this column). I also want one column called frcid with frcid_gvkey, frcid_cusip, frcid_gvkey of the same observations in different rows, and a new column called frcid_source indicating the sources. 
 
I attach a simple example below (should also have frcid_gvkey, frcid_cusip, frcid_cik columns, but to keep it simple I did not generate the sample)
I try the following but it does not work...
proc transpose data=temp.comp_rcid out=temp.comp_rcid_rshp 
(rename=(col1=rcid_source col2=frcid_source));
  by gvkey fyear;
  var rcid_gvkey rcid_cusip rcid_cik frcid_gvkey frcid_cusip frcid_cik;
run;
 
Can you help me ? thank you!
5 REPLIES 5
Reeza
Super User
If you only have a few, a data step with explicit output statements is a better approach. If you have more then this process could be more automated with arrays and macros.
Reeza
Super User
data temp.comp_rcid_rshp;
set temp.comp_rcid;
keep gvkey fyear cusip cik;
length source $12.;
source = 'original';
output;
source = 'rcid';
gvkey = rcid_gvkey;
cusip = rcid_cusip;
cik = rcid_cik;
output;
source = 'frcid';
gvkey = frcid_gvkey;
cusip = frcid_cusip;
cik = frcid_cik;
output;
run;

Something along the lines above, assuming I somewhat understand your data (hard without seeing an example)

 


@Eileen1496 wrote:

Hi all,

I'm trying to reshape a table in sas. 

I have column called gvkey, fyear, each gvkey-fyear is an identifier (unique). I have other columns called rcid_gvkey, rcid_cusip, rcid_cik, frcid_gvkey, frcid_cusip, frcid_gvkey.
How can I reshape it to have one rcid column with rcid_gvkey, rcid_cusip, rcid_cik of the same observation (gvkey-year) in different rows, and a new column called rcid_source indicates the sources(so if it is rcid_gvkey, it should have "gvkey" as the value of this column). I also want one column called frcid with frcid_gvkey, frcid_cusip, frcid_gvkey of the same observations in different rows, and a new column called frcid_source indicating the sources. 
 
I try the following but it does not work...
proc transpose data=temp.comp_rcid out=temp.comp_rcid_rshp 
(rename=(col1=rcid_source col2=frcid_source));
  by gvkey fyear;
  var rcid_gvkey rcid_cusip rcid_cik frcid_gvkey frcid_cusip frcid_cik;
run;
 
Can you help me ? thank you!

 

Eileen1496
Obsidian | Level 7

Hi Reeza,

 

Thanks for the reply. This is not what I mean. I just uploaded an excel with example, could you help me see again? I really appreciate it.

ballardw
Super User

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

mkeintz
PROC Star

You didn't provide sample data in the form of a working data step, so this program isn't (fully) tested:

 

data want (drop=_: rcid_: frcid_:);
  id=_n_;
  length source $5;
  set have;
  by gvkey year;
  retain      _cik 'cik  '  _cusip 'cusip'  _gvkey 'gvkey' ;
  array  src  _cik          _cusip          _gvkey ;

  array rcid_data   rcid_cik   rcid_cusip   rcid_gvkey ;
  array frcid_data  frcid_cik  frcid_cusip  frcid_gvkey ;

  do over src;
    source=src;
    rcid=rcid_data;
    frcid=frcid_data;
    output;
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 742 views
  • 0 likes
  • 4 in conversation