02-18-2013 06:19 PM
Dear All, thanks in advance for reading my question and trying to help.
I'm trying to transpose a dataset which originally has 250 columns and 0.5 million rows. The first column has daily dates for 30 years and after that top row of each column represents a company and each row has a unique value for that particular date. Below is a micro example of dataset i have (
date companyA companyB CompanyC Contd...
1-jan NA NA NA
2-jan 1 0.5 NA
3-jan NA 2 3
Using Transpose function, I need it to look like this so that it has only 3 columns now rather than 250 and they look like this:
date company value
1-jan companyA NA
1-jan companyB NA
1-jan companyC NA
2-jan companyA 1
2-jan companyB 0.5
2-jan companyC NA
3-jan companyA NA
3-jan companyB 2
3-jan companyC 3
I know the code and it works fine. The problem is that I have to enter all company names variable names (CompanyA, CompanyB etc) in the code. If I have more than 300 companies, SAS code wont accept it as it goes over the limit. So, here is the example of code I'm using. (As you can see, I have to use all company codes after VAR). Is there a way we can tell SAS to transpose without giving all variable names?
proc transpose data=work.Ph1a out=work.Ph1at;
VAR IN_MNM IN_CEC IN_3II IN_3MI IN_EOI IN_PXX IN_ABN IN_ACE IN_SUJ IN_AOG IN_AEC IN_AJB IN_AKC IN_AKG IN_APR IN_AN IN_GIC IN_AVS IN_A2Z IN_AVI IN_ADU IN_AE IN_ANY IN_ADP IN_DRG IN_AID IN_AVD IN_ASS IN_AAH IN_ASU IN_AAI IN_ABR IN_AAW IN_ABM IN_ALC IN_ABB IN_KNO IN_ANB IN_BEX IN_BCI IN_ABF IN_AHV IN_ABG IN_ABV IN_AHK IN_FBA IN_ABT IN_ABK IN_ASC IN_AEI IN_ANS IN_HWY IN_EPP IN_UCA IN_ETU IN_AC1 IN_ACS IN_AZZ IN_ACK IN_AFP IN_ACR IN_ACW IN_ARY IN_ANE IN_ACT IN_ADM IN_ADA IN_ADE IN_AAR IN_ANW IN_ACH IN_RSH IN_KRO IN_ALI IN_ADI IN_AFR IN_ADH IN_ASY IN_HEF IN_ABD IN_AIM IN_AEX IN_ALO IN_AXP IN_BCC IN_TYA IN_IRS IN_AFG IN_AYY IN_ADT IN_AAT IN_ALM IN_ADO IN_AFH IN_AMU IN_JBA IN_ADW IN_ADL IN_MUT IN_AP IN_AWR IN_DVA IN_AMV IN_RMH IN_ADV IN_AVV IN_AVK IN_AEZ IN_AEG IN_AEN IN_AES IN_AFI IN_AI2 IN_TLC IN_KPL IN_AGX IN_AGR IN_ADF IN_ITA IN_AT IN_GNH IN_AHL IN_AHD IN_AHB IN_EEL IN_AFO IN_CTE IN_AIA IN_APS _141777 IN_AII IN_AWT IN_AJN IN_AJS IN_AJG IN_AJE IN_AJR IN_AWW IN_KZX IN_AS _148773 IN_AAK IN_AKE IN_AUD IN_ICI IN_AHU IN_AGM IN_ALB IN_AOR IN_AHS IN_AL1 IN_LTY IN_AOB IN_LEM IN_ALE IN_ALX IN_LFC IN_AXZ IN_ALF IN_VRS IN_AHR IN_ENK IN_AI1 IN_UCE IN_AEK IN_AAC IN_LTL IN_ASA IN_ALN IN_CLG IN_AR IN_AIN IN_AED IN_ADG IN_RSN IN_AEO IN_ISA IN_AZD IN_LPR IN_AXT IN_LAT _147634 IN_ALG IN_AFU IN_AHG IN_LPI IN_ACV IN_LPN IN_AI IN_ALD IN_RVI IN_LA IN_ACN IN_LSA IN_ALS IN_AOL IN_LUF IN_PLS IN_AD IN_AMA IN_AMR IN_YCO IN_AXS IN_ADY IN_APY IN_AR1 IN_AMT IN_AJO IN_GEY IN_AMB IN_AGB IN_APN IN_ACM IN_ABU IN_AA IN_AUJ _148848 IN_CEM IN_AET IN_BUJ IN_AZI IN_OMC IN_ATQ IN_ATX IN_AMC IN_AFE IN_WWS IN_AMN IN_NE IN_AFD IN_AAL IN_RNA IN_AS1 IN_ASP IN_LSL IN_APD IN_EEP IN_URT IN_AGC;
02-19-2013 05:14 PM
Thanks Tom and Arthur for replying.
No, some of the codes start with numbers as you can see them hidden in that large code. So, I'm not sure what Tom suggested will work. Also, in this case, many codes start with IN_. But what if the variable names are unique?
My friend who knows sas more than me told me that if the variables are numeric, there is no need to write all codes. Just simple transpose function is enough, but he says that since in my data, variables are characters, I have to write all names. But I feel there should be an easier way to do this than writing all variables names as above.
02-19-2013 05:34 PM
All of the codes in your example appear to start with the string IN_
Regardless, your friend didn't quite advise you correctly. With proc transpose, all of the numeric variables will be included if there is no var statement. However, you can always specify var=_character_;
02-19-2013 06:04 PM
Do you just want to transpose everything? Then use VAR _ALL_. You can eliminate the extra row with the value of DATE using data set options.
proc transpose data=have
out=new (rename=(col1=value _name_=company)
where =(upcase(company) ne 'DATE')
proc print data=new; run;
02-19-2013 05:46 PM
Thanks Arthur. No, there are few codes like _141777 which dont start with IN_
So, basically what's the solution? I mean lets say we have the same variable names as above, what should be the code?
02-19-2013 05:55 PM
I'm assuming you're planning to store characters and numbers in one column based on your description above. That will mean they'll all have to be character in SAS because a column can be only one type.
Given that I'd run proc transpose twice, once for all numeric and once for all character.
Then stack the two datasets, converting the numbers to characters.
proc transpose data=work.Ph1a out=work.Ph1at1;
proc transpose data=work.Ph1a out=work.Ph1at2;
02-19-2013 06:03 PM
Or, even easier if you have both numeric and character fields (but not tested code so there could be typing errors):
proc transpose data=work.Ph1a
out=work.Ph1at1 (where=(_name_ ne 'date'));