Hello,
I have the following dataset:
value | string |
-0.04 | 000R201406 |
-0.03 | 000R201409 |
0.13 | 000R201412 |
0.14 | 000R201503 |
0.02 | 000R201509 |
0.01 | 000R201512 |
0.07 | 000R201603 |
0.15 | 000R201606 |
0.13 | 000R201609 |
0.1 | 000R201612 |
0.12 | 000R201612 |
0.2 | 000R201703 |
0.08 | 000R201706 |
0.08 | 000R201709 |
0.1 | 000R201712 |
0.17 | 000R201803 |
and I want to receive the following data set:
value_2 | value_1 | string |
-0.04 | 000R201406 | |
-0.03 | 000R201409 | |
0.13 | 000R201412 | |
0.14 | 000R201503 | |
0.02 | 000R201509 | |
0.01 | 000R201512 | |
0.07 | 000R201603 | |
0.15 | 000R201606 | |
0.13 | 000R201609 | |
0.12 | 0.1 | 000R201612 |
0.12 | 0.1 | 000R201612 |
0.2 | 000R201703 | |
0.08 | 000R201706 | |
0.08 | 000R201709 | |
0.1 | 000R201712 | |
0.17 | 000R201803 |
Meaning, if there are two rows with identical "string" the value of the first row will enter a variable called: value_1
and the value of the second row will enter a variable called: value_2.
Does anyone have an idea what code to use?
Thank you,
Lior
I don't understand why you would do that.
I could understand transposing the data so that groups with multiple observations get multiple values.
But then why would you write the same observation out twice? (or N times if the there are N observations in the group).
data have;
input value string :$10.;
cards;
-0.04 000R201406
-0.03 000R201409
0.13 000R201412
0.14 000R201503
0.02 000R201509
0.01 000R201512
0.07 000R201603
0.15 000R201606
0.13 000R201609
0.1 000R201612
0.12 000R201612
0.2 000R201703
0.08 000R201706
0.08 000R201709
0.1 000R201712
0.17 000R201803
;
proc transpose data=have out=want(drop=_name_) prefix=value_ ;
by string;
var value;
run;
proc print data=want;
run;
Result:
Obs string value_1 value_2 1 000R201406 -0.04 . 2 000R201409 -0.03 . 3 000R201412 0.13 . 4 000R201503 0.14 . 5 000R201509 0.02 . 6 000R201512 0.01 . 7 000R201603 0.07 . 8 000R201606 0.15 . 9 000R201609 0.13 . 10 000R201612 0.10 0.12 11 000R201703 0.20 . 12 000R201706 0.08 . 13 000R201709 0.08 . 14 000R201712 0.10 . 15 000R201803 0.17 .
The most intuitive solution is probably this double loop, the first with an embedded SET statement, the second with an embedded OUTPUT:
data want (drop=value i n);
do n=1 by 1 until (last.string);
set have;
by string;
array val {2} value1 value2;
val{n}=value;
end;
do i=1 to n;
output;
end;
run;
This depends crucially on there never being more than two observations for a given string value.
More compact, but more dense, is this use of self-merge with offset:
data want (drop=nxt_:);
merge have (rename=(value=value1))
have (rename=(string=nxt_string value=value2) firstobs=2);
value1=ifn(string=lag(string),lag(value1),value1);
value2=ifn(string=lag(string),lag(value2),ifn(string^=nxt_string,.,value2));
run;
Hi,
Thank you very much for your answer.
When I run your code on the "example figures", it works perfectly, however, in my real data I received the following error message in the log:
ERROR: Array subscript out of range at line 40 column 5.
n=3 last.ticker_period=1 CURR=USD MEASURE=EPS PDICITY=QTR UNITS=P/S value_forcast=0.17 ticker_period=02U7202109
FIRST.ticker_period=0 value1=0.26 value2=0.23 i=. _ERROR_=1 _N_=439
This is the code I run:
data number_value (drop=i n);
do n=1 by 1 until (last.ticker_period);
set FEW_GUIDANCE;
by ticker_period;
array val {2} value1 value2;
val{n}=value_forcast;
end;
do i=1 to n;
output;
end;
run;
I also export the SAS data file into excel and add it to this message.
I don't understand why I receive this error message..
Thank you,
Lior
I don't understand why you would do that.
I could understand transposing the data so that groups with multiple observations get multiple values.
But then why would you write the same observation out twice? (or N times if the there are N observations in the group).
data have;
input value string :$10.;
cards;
-0.04 000R201406
-0.03 000R201409
0.13 000R201412
0.14 000R201503
0.02 000R201509
0.01 000R201512
0.07 000R201603
0.15 000R201606
0.13 000R201609
0.1 000R201612
0.12 000R201612
0.2 000R201703
0.08 000R201706
0.08 000R201709
0.1 000R201712
0.17 000R201803
;
proc transpose data=have out=want(drop=_name_) prefix=value_ ;
by string;
var value;
run;
proc print data=want;
run;
Result:
Obs string value_1 value_2 1 000R201406 -0.04 . 2 000R201409 -0.03 . 3 000R201412 0.13 . 4 000R201503 0.14 . 5 000R201509 0.02 . 6 000R201512 0.01 . 7 000R201603 0.07 . 8 000R201606 0.15 . 9 000R201609 0.13 . 10 000R201612 0.10 0.12 11 000R201703 0.20 . 12 000R201706 0.08 . 13 000R201709 0.08 . 14 000R201712 0.10 . 15 000R201803 0.17 .
Hi,
You are right.
The new code that you wrote is making more sense. I will use this code instead of the first one.
Thank a lot!
Lior
Hi,
Could you please assist in adjusting the code if I want also a variable for value_last?
Thanks,
Lior
@lioradam wrote:
Hi,
Could you please assist in adjusting the code if I want also a variable for value_last?
Thanks,
Lior
If you have the output of the proc transpose and you know how many VALUE_xx variables were created then coalesce() will do it. For example if you ended up creating 13 VALUE_xx variables then the code might look like:
data new;
set want;
value_last = coalesce(of value_13-value_1);
run;
If you don't know how many there were you might try just re-merging with the original dataset. In fact you might want to use the UPDATE statement instead of the MERGE statement as that will automatically collapse to only one observation per group and will also handle ignoring missing values.
proc transpose data=have out=want(drop=_name_) prefix=value_ ;
by string;
var value;
run;
data want;
update want have(keep=string value rename=(value=value_last));
by string;
run;
Hi Tom,
I used the second code you wrote and it worked perfectly.
Thank you!
Lior
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.