Hi everybody! i'am new here, sorry if i am not very clear!
I have a problem with ARRAY:
i have data with firms identifiers, salaries and number of staff for each trimester of each year since 2000 to 2005.
SO I want to create a new table wich gives to me a unique line for each firm and in each i would have the salary by trimester per year (like 2000T1 2000T2 2000T3...2000T5) so i have to do this for the salary per trimester year and the number of staff per trimester year...
But i cant use ARRAY AND RETAIN,
if anybody has an idea, i thanks him !
Faced with doing this for multiple variables, I would code it by arrays because I can. But if you can't, then you may not have any confidence your solution is correct, and I would suggest you leave arrays alone.
Instead, look up the Factual Material on the Transpose Procedure. There are quite a few very well documented examples for rearranging rows of data into columns.
As a hint, you will have to transpose once each for salary and staff count, and then match merge the results together.
You might want to be very careful about your time line too. I thought a trimester was a three part period, with the third trimester being the last. That being so, I don't see how 2000 can have five trimesters. Perhaps it is a mistake, but it is mistakes like that, which would kill your use of an array.
can you be little more specific about your data structure like columns and rows or can u write sample dataset for which u want solution.
because untill have an idia about data its not easy to give solution.
it looks like that:
the columns: variables:
x1 /1200 /2 /2000/1
x1 /2100 /4 /2000/2
x1 /3440 /6 /2000/3
...for each ident as x we have observations for salary and staff per year trimester...
x1 /2300/ 6 /2005/4
etc and we have many ident from x1 to Xn...
so i want to create a new table which gives me
Only one ident per line but all staff and salaries per each year trimester
at least i have to create a new table with 4 trimesters per each of my 6 years but two times one for the staff an other for the salaries so 48 variables more ident and a anothre which doesnt matter.
Observe the example using the BY statement and how that relates to your IDENT variable.
Now observe the VAR statement and how that relates to your SALARY and STAFF variables.
Finally, observe that the ID statement will identify the variable that names the transposed columns.
Note that the PREFIX statement will identify the prefix to place before the value of your ID statement to name the column. So, with a prefix value of SAL and an ID of PERIOD you will have columns named SAL20001. For naming purposes you are better off constructing a variable called PERIOD that concatenates YEAR and "Trimester" into a value that does not contain characters you cannot use in variable naming.
Do the transposition first for SALARY and then for STAFF so that you can set the prefix appropriately for each VAR as you transpose it.
Then match merge the two tables together to give you a single record for each Ident value.
Message was edited by: dkvj
okay, i understood the problem, but i have really to do this with ARRAY and RETAIN.
I will try to TRANSPOSE the data table and i will see what it does. But i have to finish with array et retain!
I think i have to use it because there are too many variables to create.
They give to me this example, so i have to use it for my data...
input ID TIME SCORE;
01 1 3
01 2 4
01 3 5
02 1 7
02 3 9
03 1 6
03 2 5
03 3 4
proc sort data=MANYPER; by ID TIME; run;
array S S1-S3;
If first.id then do i=1 to 3;
If last.id then output;
keep id S1-S3;
the problem with that program is that mine have more conditions:
i have trimesters (4) per years (6) and i have to build a data which gives me salary and the staff for every variable...
i really dont know how to do...
array salary salary1-salary24;
array staff staff1-staff24;
if first.ident then do i=1 to 24;
if last.ident then output;
keep ident APE salary1-salary24 staff1-staff24;