BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
lioradam
Obsidian | Level 7

Hello,

 

I have the following dataset:

value string
-0.04 000R201406
-0.03 000R201409
0.13 000R201412
0.14 000R201503
0.02 000R201509
0.01 000R201512
0.07 000R201603
0.15 000R201606
0.13 000R201609
0.1 000R201612
0.12 000R201612
0.2 000R201703
0.08 000R201706
0.08 000R201709
0.1 000R201712
0.17 000R201803

 

and I want to receive the following data set:

 

value_2 value_1 string
  -0.04 000R201406
  -0.03 000R201409
  0.13 000R201412
  0.14 000R201503
  0.02 000R201509
  0.01 000R201512
  0.07 000R201603
  0.15 000R201606
  0.13 000R201609
0.12 0.1 000R201612
0.12 0.1 000R201612
  0.2 000R201703
  0.08 000R201706
  0.08 000R201709
  0.1 000R201712
  0.17 000R201803

 

Meaning, if there are two rows with identical  "string" the value of the first row will enter a variable called: value_1

and the value of the second row will enter a variable called: value_2.

Does anyone have an idea what code to use?

 

Thank you,

Lior

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

I don't understand why you would do that.

I could understand transposing the data so that groups with multiple observations get multiple values.

But then why would you write the same observation out twice? (or N times if the there are N observations in the group).

data have;
  input value string :$10.;
cards;
-0.04 000R201406
-0.03 000R201409
0.13 000R201412
0.14 000R201503
0.02 000R201509
0.01 000R201512
0.07 000R201603
0.15 000R201606
0.13 000R201609
0.1 000R201612
0.12 000R201612
0.2 000R201703
0.08 000R201706
0.08 000R201709
0.1 000R201712
0.17 000R201803
;

proc transpose data=have out=want(drop=_name_) prefix=value_ ;
  by string;
  var value;
run;

proc print data=want;
run;

Result:

Obs      string      value_1    value_2

  1    000R201406     -0.04        .
  2    000R201409     -0.03        .
  3    000R201412      0.13        .
  4    000R201503      0.14        .
  5    000R201509      0.02        .
  6    000R201512      0.01        .
  7    000R201603      0.07        .
  8    000R201606      0.15        .
  9    000R201609      0.13        .
 10    000R201612      0.10       0.12
 11    000R201703      0.20        .
 12    000R201706      0.08        .
 13    000R201709      0.08        .
 14    000R201712      0.10        .
 15    000R201803      0.17        .

View solution in original post

7 REPLIES 7
mkeintz
PROC Star

The most intuitive solution is probably this double loop, the first with an embedded SET statement, the second with an embedded OUTPUT:

 

data want (drop=value i n);
  do n=1 by 1 until (last.string);
    set have;
    by string;
    array val {2} value1 value2;
    val{n}=value;
  end;
  do i=1 to n;
    output;
  end;
run;

This depends crucially on there never being more than two observations for a given string value.

 

More compact, but more dense, is this use of self-merge with offset:

 

data want (drop=nxt_:);
  merge have (rename=(value=value1))
        have (rename=(string=nxt_string value=value2)  firstobs=2);

  value1=ifn(string=lag(string),lag(value1),value1);
  value2=ifn(string=lag(string),lag(value2),ifn(string^=nxt_string,.,value2));
run; 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
lioradam
Obsidian | Level 7

Hi,

Thank you very much for your answer.

When I run your code on the "example figures", it works perfectly, however, in my real data I received the following error message in the log:


ERROR: Array subscript out of range at line 40 column 5.
n=3 last.ticker_period=1 CURR=USD MEASURE=EPS PDICITY=QTR UNITS=P/S value_forcast=0.17 ticker_period=02U7202109
FIRST.ticker_period=0 value1=0.26 value2=0.23 i=. _ERROR_=1 _N_=439

 

This is the code I run:

data number_value (drop=i n);
do n=1 by 1 until (last.ticker_period);
set FEW_GUIDANCE;
by ticker_period;
array val {2} value1 value2;
val{n}=value_forcast;
end;
do i=1 to n;
output;
end;
run;

 

I also export the SAS data file into excel and add it to this message.

 

I don't understand why I receive this error message..

 

Thank you,

Lior

 

 

 

Tom
Super User Tom
Super User

I don't understand why you would do that.

I could understand transposing the data so that groups with multiple observations get multiple values.

But then why would you write the same observation out twice? (or N times if the there are N observations in the group).

data have;
  input value string :$10.;
cards;
-0.04 000R201406
-0.03 000R201409
0.13 000R201412
0.14 000R201503
0.02 000R201509
0.01 000R201512
0.07 000R201603
0.15 000R201606
0.13 000R201609
0.1 000R201612
0.12 000R201612
0.2 000R201703
0.08 000R201706
0.08 000R201709
0.1 000R201712
0.17 000R201803
;

proc transpose data=have out=want(drop=_name_) prefix=value_ ;
  by string;
  var value;
run;

proc print data=want;
run;

Result:

Obs      string      value_1    value_2

  1    000R201406     -0.04        .
  2    000R201409     -0.03        .
  3    000R201412      0.13        .
  4    000R201503      0.14        .
  5    000R201509      0.02        .
  6    000R201512      0.01        .
  7    000R201603      0.07        .
  8    000R201606      0.15        .
  9    000R201609      0.13        .
 10    000R201612      0.10       0.12
 11    000R201703      0.20        .
 12    000R201706      0.08        .
 13    000R201709      0.08        .
 14    000R201712      0.10        .
 15    000R201803      0.17        .
lioradam
Obsidian | Level 7

Hi,

You are right.

The new code that you wrote is making more sense. I will use this code instead of the first one.

 

Thank a lot!

Lior

lioradam
Obsidian | Level 7

Hi,

Could you please assist in adjusting the code if I want also a variable for value_last?

Thanks,

Lior

Tom
Super User Tom
Super User

@lioradam wrote:

Hi,

Could you please assist in adjusting the code if I want also a variable for value_last?

Thanks,

Lior


If you have the output of the proc transpose and you know how many VALUE_xx variables were created then coalesce() will do it.  For example if you ended up creating 13 VALUE_xx variables then the code might look like:

data new;
  set want;
  value_last = coalesce(of value_13-value_1);
run;

If you don't know how many there were you might try just re-merging with the original dataset.  In fact you might want to use the UPDATE statement instead of the MERGE statement as that will automatically collapse to only one observation per group and will also handle ignoring missing values.

proc transpose data=have out=want(drop=_name_) prefix=value_ ;
  by string;
  var value;
run;

data want;
  update want have(keep=string value rename=(value=value_last));
  by string;
run;

 

lioradam
Obsidian | Level 7

Hi Tom,

I used the second code you wrote and it worked perfectly.

Thank you!

 

Lior

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 970 views
  • 1 like
  • 3 in conversation