Hi, all !
I have a large data set with the following variables:
Firmid (character) year segmentID segmentsales siccode
For each firmid year combination, I need to create a vector that contains its individual segment sales as well as sales of segments the firm does not have
To do this,
1. first, I need to count the total number of distinct siccode in the entire dataset, in the sample data given below, there are 7 unique siccode
2. my vector will have 7 elements,
e.g. for firmid=1000, year=1999, the vector (20,10,0,0,0,0,0), since the firm year, has only 2 segments with nonzero sales.
for firmid=1000, year=2000, the vector (15,12,13,0,0,0,0),
How do I program this? can I do this without proc iml ?
Firmid year segmentID segmentsales siccode
1001 1999 1 20 4100
1001 1999 2 10 4200
1001 2000 1 15 4100
1001 2000 2 12 4200
1001 2000 3 13 5100
1002 1999 1 20 4100
1002 1999 2 13 9100
1002 2001 1 20 2867
1002 2001 2 13 3200
1002 2000 3 20 7100
Sincerely,
Lan
Looks to me like you are trying to perform a transposition. In the data step there is no concept of vector. That data construct is left to IML. You can use array's instead to probably reach whatever it is that your end goal entails.
data foo;
input firmid year segmentid segmentsales siccode;
cards;
1001 1999 1 20 4100
1001 1999 2 10 4200
1001 2000 1 15 4100
1001 2000 2 12 4200
1001 2000 3 13 5100
1002 1999 1 20 4100
1002 1999 2 13 9100
1002 2001 1 20 2867
1002 2001 2 13 3200
1002 2000 3 20 7100
;
run;
proc sort data=foo; by firmid year; run;
options missing=0;
proc transpose data=foo out=bar(drop=_:) prefix=segmentid;
by firmid year;
var segmentsales;
id segmentid;
run;
data use_array;
set bar;
array seg
do i=1 to dim(seg);
*something to array elements;
end;
run;
Building on AYBiBTU's solution we can avoid the TRANSPOSE step by transposing in the DATA step, which becomes:
data use_array;
set foo;
by firmid year;
array seg[5] _temporary_; /* make the dimension > largest number of segment id values */
if first.year then call missing(of seg{*});
seg{segmentid} = segmentsales;
if last.year then do i=1 to dim(seg);
*something to array elements;
end;
run;
Thank you so much, Art and AYBIBTU.
Lan
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.