I have a dataset that contain data related to a particular company in a row, over the period of time, say 20 years. So it has observed values of say 10 variables over period of 20 years.
So Like a1-a20 variables for 20 years data and then say b1-b20 variables for 20 years and so on. Like this there are rows related to n companies.
I want to transpose in such a way that data related to "a" transpose and keeps under "a" for 20 years.then next variable "b" will have 20 years of data transposed and so on. The company say C1 will get repeated 20 times for 20 years of data.
Suggestions?
Let me know if you have any questions.
Sorry, test data and output would probably clarify somewhat. You want something like:
id a1 a2 a3...a20 b1 b2...b20 c1..
If so then transpose data for the A variables, then the B variables, then the C variables and merge them together.
Transposing all variables using n datasteps for n variables and then merging is quite some time taking task, which I have on my last option. I was looking for any better method.
Well, the only other thing I can think of is array processing with by groups. So:
data want;
set have;
by id;
array a{20};
array b{20};
...
process result into next available array based on spec for a or b etc.
if last.id then output;
run;
Here's a good reference for that.
Example 3 is your situation.
SAS Learning Module: Reshaping wide to long using a data step
I am not entirely sure if the example you have given reflects the actual state of play in regards to your data or whether it is merely a best case scenario. The following will work if each set of variable follows the convention you have displayed in your example, chiefly that there will always be 20 variables for each grouping A1-20 B1-20 and so on.
PROC SQL NOPRINT;
SELECT
DISTINCT SUBSTR(NAME,1,1) INTO : OUTVARS SEPARATED BY " "
FROM DICTIONARY.COLUMNS
WHERE UPCASE(MEMNAME) = "HAVE"
AND NAME NOT IN ("COMPANY")
;
SELECT
DISTINCT NAME INTO : VARNAMES SEPARATED BY " "
FROM DICTIONARY.COLUMNS
WHERE UPCASE(MEMNAME) = "HAVE"
AND NAME NOT IN ("COMPANY")
ORDER BY NAME
;
QUIT;
%LET DIM1 = %SYSFUNC(LENGTH(%SYSFUNC(COMPRESS(&OUTVARS.))));
%LET DIM2 = %EVAL(%SYSFUNC(COUNTW(&VARNAMES.))/&DIM1.);
DATA WANT;
RETAIN COMPANY PERIOD;
LENGTH PERIOD $2;
SET HAVE;
ARRAY OUTVARS (&DIM1.) &OUTVARS.;
ARRAY INVARS (&DIM1.,&DIM2.) &VARNAMES.;
DO J = 1 TO &DIM2.;
DO I = 1 TO &DIM1.;
OUTVARS{I} = INVARS{I,J};
IF I = 1 THEN PERIOD = COMPRESS(VNAME(INVARS{I,J}),'0123456789','K');
END;
OUTPUT;
END;
KEEP COMPANY PERIOD &OUTVARS.;
RUN;
If the variables are all of the same type (numeric vs character) then just use a two dimensional array.
So if you know the variables are a1-a10, b1-b10, c1-c10 then code like this will work.
data want ;
set have ;
array inx(3,10) a1-a10 b1-b10 c1-c10 ;
array outx a b c ;
do year = 1 to 10 ;
do var=1 to 3;
outx(var) = inx(var,year) ;
end;
output;
end;
keep company year a b c ;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.