BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
asasha
Obsidian | Level 7

I have a data like this (22x101):

     col1 col2 col3 ... col100
x    1.12 0.15 0.99 ... 0.38
t1   1          
t2             1        
t3   1                
...
t20            1

Each t row will have exactly one indicator=1. Each column corresponding to an x value may have none, one, or multiple indicator=1.

 

For example, x=1.12 occurs at t1 and t3. And x=0.99 occurs at t2 and t20.

 

I need to collect the corresponding x where t has a 1 value and save them to a new data:

t1  1.12
t2  0.99
t3  1.12
...
t20 0.99

Please advise. Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

UNTESTED CODE

Since you haven't given the column which contains x t1 t2 a variable name, I refer to it as the variable named firstcolumn.

If you want tested code, you must provide data according to these instructions and not in any other form.

 

data x;
    set have(where=(upcase(firstcolumn)='X'));
    rename col1-col100=x1-x100;
run;

data want;
    if _n_=1 then set x;
    set have(where=(upcase(firstcolumn)=:'T'));
    array x x1-x100;
    first_occurence=whichn(1,of col:);
    wanted_value=x(first_occurence);
    keep firstcolumn wanted_value;
run;
    
    
--
Paige Miller

View solution in original post

11 REPLIES 11
LinusH
Tourmaline | Level 20

Is this a one time job?

If yes, do it manually (e.g. in Excel).

If not, ask for source fil(s) with proper structure.

Last resort, I would try the data step.

Put the first row in a retained array.

Then use array processing for the t-rows tro find which value in the x-array to use.

Untested though.

Data never sleeps
asasha
Obsidian | Level 7

This is not a one time job. I have been doing it manually in excel, but it would really help to automate it. The first data is available as shown, and I just need to extract the necessary information, so I can for example take the median of the 20 x values.

Kurt_Bremser
Super User

In which form is it available?

If SAS dataset: post a data step with datalines that creates this dataset in our SAS session(s).

If text file: copy/paste the text into a window opened with this button:

Bildschirmfoto 2020-04-07 um 08.32.59.jpg

If Excel: attach the Excel file.

PaigeMiller
Diamond | Level 26

UNTESTED CODE

Since you haven't given the column which contains x t1 t2 a variable name, I refer to it as the variable named firstcolumn.

If you want tested code, you must provide data according to these instructions and not in any other form.

 

data x;
    set have(where=(upcase(firstcolumn)='X'));
    rename col1-col100=x1-x100;
run;

data want;
    if _n_=1 then set x;
    set have(where=(upcase(firstcolumn)=:'T'));
    array x x1-x100;
    first_occurence=whichn(1,of col:);
    wanted_value=x(first_occurence);
    keep firstcolumn wanted_value;
run;
    
    
--
Paige Miller
FreelanceReinh
Jade | Level 19

Hi @asasha,


@asasha wrote:

I have a data like this (22x101): ...

This sounds like you haven't read the raw data yet? So you may want to create the desired dataset in one step (using LinusH's plan):

%let n=4; /* n=100 with your real data */

data want(keep=id v);
array col[&n];
retain col:;
input id $ @;
if _n_=1 then input col[*];
else do;
  input z1-z&n;
  v=col[whichn(1, of z:)];
  output;
end;
cards;
x    1.12 0.15 0.99 0.38
t1   1     .    .    .
t2    .    .   1     .
t3   1     .    .    .
t20   .    .   1     .
;
asasha
Obsidian | Level 7

@LinusH @Kurt_Bremser @FreelanceReinh @PaigeMiller 

Here's an exact subset example in cvs (22 rows, 17 columns). Alternatively, I just need to extract all 20 of the x's, regardless of which t they correspond to. Like I said, I have 100 of them (not just 15) and they are spread around. Just need the ones that were selected for t.

 

Parameter,t,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15
x,,0.98,0.882,0.7938,0.71442,0.642978,0.5786802,0.52081218,0.468730962,0.421857866,0.379672079,0.341704871,0.307534384,0.276780946,0.249102851,0.224192566
Opt,t1,,,,,,,1,,,,,,,,
Opt,t2,,,,,,,,,,1,,,,,
Opt,t3,,,,,,,,,,1,,,,,
Opt,t4,,,,,1,,,,,,,,,,
Opt,t5,,,,,,,,,,1,,,,,
Opt,t6,,,,,,,,1,,,,,,,
Opt,t7,,,,,,,,1,,,,,,,
Opt,t8,,,,,,,,,,,,1,,,
Opt,t9,,,,,,,,,,,1,,,,
Opt,t10,,,,,,,,1,,,,,,,
Opt,t11,,,,,,,,,1,,,,,,
Opt,t12,,,,,,,,,1,,,,,,
Opt,t13,,,,,,,,,,,1,,,,
Opt,t14,,,,1,,,,,,,,,,,
Opt,t15,,,,,,,1,,,,,,,,
Opt,t16,,1,,,,,,,,,,,,,
Opt,t17,,,,,,1,,,,,,,,,
Opt,t18,,,,,,,,,,1,,,,,
Opt,t19,,,,,,,,,,,,,1,,
Opt,t20,,,,,,1,,,,,,,,,

 

 

Kurt_Bremser
Super User

See this:

data want;
infile datalines dlm="," dsd truncover;
input
  Parameter $
  t $
  col1-col15
;
retain t1-t15;
array col {*} col1-col15;
array ta {*} t1-t15;
if parameter = "x"
then do;
  do i = 1 to dim(col);
    ta{i} = col{i};
  end;
end;
else do;
  do i = 1 to dim(col);
    if col{i}
    then do;
      x = ta{i};
      leave;
    end;
  end;
  output;
end;
keep t x;
datalines;
x,,0.98,0.882,0.7938,0.71442,0.642978,0.5786802,0.52081218,0.468730962,0.421857866,0.379672079,0.341704871,0.307534384,0.276780946,0.249102851,0.224192566
Opt,t1,,,,,,,1,,,,,,,,
Opt,t2,,,,,,,,,,1,,,,,
Opt,t3,,,,,,,,,,1,,,,,
Opt,t4,,,,,1,,,,,,,,,,
Opt,t5,,,,,,,,,,1,,,,,
Opt,t6,,,,,,,,1,,,,,,,
Opt,t7,,,,,,,,1,,,,,,,
Opt,t8,,,,,,,,,,,,1,,,
Opt,t9,,,,,,,,,,,1,,,,
Opt,t10,,,,,,,,1,,,,,,,
Opt,t11,,,,,,,,,1,,,,,,
Opt,t12,,,,,,,,,1,,,,,,
Opt,t13,,,,,,,,,,,1,,,,
Opt,t14,,,,1,,,,,,,,,,,
Opt,t15,,,,,,,1,,,,,,,,
Opt,t16,,1,,,,,,,,,,,,,
Opt,t17,,,,,,1,,,,,,,,,
Opt,t18,,,,,,,,,,1,,,,,
Opt,t19,,,,,,,,,,,,,1,,
Opt,t20,,,,,,1,,,,,,,,,
;
asasha
Obsidian | Level 7

this didn't work as my t is up to 20, and my col is up to 15 (or 100).

PaigeMiller
Diamond | Level 26

@asasha wrote:

this didn't work ...


For your future benefit, if you say "didn't work" and do not provide the LOG or the DATA to show what you did, and why it didn't work, then you don't get an answer. (I don't have an answer for you right now.)

 

So, show us the LOG (and if there is incorrect output, show us the incorrect output as well)

--
Paige Miller
Kurt_Bremser
Super User

The number of t observations is irrelevant with my code, and it should be obvious where to implement a larger number of columns.

Spoiler
Simply replace the 15 in the array definitions with 100

@asasha wrote:

this didn't work as my t is up to 20, and my col is up to 15 (or 100).


 

PaigeMiller
Diamond | Level 26

This data does not match the originally provided data, as the first column originally provided has a value of x in the first row, but not t1 in the second row.

 

Nevertheless, with minor changes, the code I provided should still work.

--
Paige Miller

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 1925 views
  • 1 like
  • 5 in conversation