I have a data like this (22x101):
col1 col2 col3 ... col100 x 1.12 0.15 0.99 ... 0.38 t1 1 t2 1 t3 1 ... t20 1
Each t row will have exactly one indicator=1. Each column corresponding to an x value may have none, one, or multiple indicator=1.
For example, x=1.12 occurs at t1 and t3. And x=0.99 occurs at t2 and t20.
I need to collect the corresponding x where t has a 1 value and save them to a new data:
t1 1.12 t2 0.99 t3 1.12 ... t20 0.99
Please advise. Thank you.
UNTESTED CODE
Since you haven't given the column which contains x t1 t2 a variable name, I refer to it as the variable named firstcolumn.
If you want tested code, you must provide data according to these instructions and not in any other form.
data x;
set have(where=(upcase(firstcolumn)='X'));
rename col1-col100=x1-x100;
run;
data want;
if _n_=1 then set x;
set have(where=(upcase(firstcolumn)=:'T'));
array x x1-x100;
first_occurence=whichn(1,of col:);
wanted_value=x(first_occurence);
keep firstcolumn wanted_value;
run;
Is this a one time job?
If yes, do it manually (e.g. in Excel).
If not, ask for source fil(s) with proper structure.
Last resort, I would try the data step.
Put the first row in a retained array.
Then use array processing for the t-rows tro find which value in the x-array to use.
Untested though.
This is not a one time job. I have been doing it manually in excel, but it would really help to automate it. The first data is available as shown, and I just need to extract the necessary information, so I can for example take the median of the 20 x values.
In which form is it available?
If SAS dataset: post a data step with datalines that creates this dataset in our SAS session(s).
If text file: copy/paste the text into a window opened with this button:
If Excel: attach the Excel file.
UNTESTED CODE
Since you haven't given the column which contains x t1 t2 a variable name, I refer to it as the variable named firstcolumn.
If you want tested code, you must provide data according to these instructions and not in any other form.
data x;
set have(where=(upcase(firstcolumn)='X'));
rename col1-col100=x1-x100;
run;
data want;
if _n_=1 then set x;
set have(where=(upcase(firstcolumn)=:'T'));
array x x1-x100;
first_occurence=whichn(1,of col:);
wanted_value=x(first_occurence);
keep firstcolumn wanted_value;
run;
Hi @asasha,
@asasha wrote:
I have a data like this (22x101): ...
This sounds like you haven't read the raw data yet? So you may want to create the desired dataset in one step (using LinusH's plan):
%let n=4; /* n=100 with your real data */
data want(keep=id v);
array col[&n];
retain col:;
input id $ @;
if _n_=1 then input col[*];
else do;
input z1-z&n;
v=col[whichn(1, of z:)];
output;
end;
cards;
x 1.12 0.15 0.99 0.38
t1 1 . . .
t2 . . 1 .
t3 1 . . .
t20 . . 1 .
;
@LinusH @Kurt_Bremser @FreelanceReinh @PaigeMiller
Here's an exact subset example in cvs (22 rows, 17 columns). Alternatively, I just need to extract all 20 of the x's, regardless of which t they correspond to. Like I said, I have 100 of them (not just 15) and they are spread around. Just need the ones that were selected for t.
Parameter,t,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15
x,,0.98,0.882,0.7938,0.71442,0.642978,0.5786802,0.52081218,0.468730962,0.421857866,0.379672079,0.341704871,0.307534384,0.276780946,0.249102851,0.224192566
Opt,t1,,,,,,,1,,,,,,,,
Opt,t2,,,,,,,,,,1,,,,,
Opt,t3,,,,,,,,,,1,,,,,
Opt,t4,,,,,1,,,,,,,,,,
Opt,t5,,,,,,,,,,1,,,,,
Opt,t6,,,,,,,,1,,,,,,,
Opt,t7,,,,,,,,1,,,,,,,
Opt,t8,,,,,,,,,,,,1,,,
Opt,t9,,,,,,,,,,,1,,,,
Opt,t10,,,,,,,,1,,,,,,,
Opt,t11,,,,,,,,,1,,,,,,
Opt,t12,,,,,,,,,1,,,,,,
Opt,t13,,,,,,,,,,,1,,,,
Opt,t14,,,,1,,,,,,,,,,,
Opt,t15,,,,,,,1,,,,,,,,
Opt,t16,,1,,,,,,,,,,,,,
Opt,t17,,,,,,1,,,,,,,,,
Opt,t18,,,,,,,,,,1,,,,,
Opt,t19,,,,,,,,,,,,,1,,
Opt,t20,,,,,,1,,,,,,,,,
See this:
data want;
infile datalines dlm="," dsd truncover;
input
Parameter $
t $
col1-col15
;
retain t1-t15;
array col {*} col1-col15;
array ta {*} t1-t15;
if parameter = "x"
then do;
do i = 1 to dim(col);
ta{i} = col{i};
end;
end;
else do;
do i = 1 to dim(col);
if col{i}
then do;
x = ta{i};
leave;
end;
end;
output;
end;
keep t x;
datalines;
x,,0.98,0.882,0.7938,0.71442,0.642978,0.5786802,0.52081218,0.468730962,0.421857866,0.379672079,0.341704871,0.307534384,0.276780946,0.249102851,0.224192566
Opt,t1,,,,,,,1,,,,,,,,
Opt,t2,,,,,,,,,,1,,,,,
Opt,t3,,,,,,,,,,1,,,,,
Opt,t4,,,,,1,,,,,,,,,,
Opt,t5,,,,,,,,,,1,,,,,
Opt,t6,,,,,,,,1,,,,,,,
Opt,t7,,,,,,,,1,,,,,,,
Opt,t8,,,,,,,,,,,,1,,,
Opt,t9,,,,,,,,,,,1,,,,
Opt,t10,,,,,,,,1,,,,,,,
Opt,t11,,,,,,,,,1,,,,,,
Opt,t12,,,,,,,,,1,,,,,,
Opt,t13,,,,,,,,,,,1,,,,
Opt,t14,,,,1,,,,,,,,,,,
Opt,t15,,,,,,,1,,,,,,,,
Opt,t16,,1,,,,,,,,,,,,,
Opt,t17,,,,,,1,,,,,,,,,
Opt,t18,,,,,,,,,,1,,,,,
Opt,t19,,,,,,,,,,,,,1,,
Opt,t20,,,,,,1,,,,,,,,,
;
this didn't work as my t is up to 20, and my col is up to 15 (or 100).
@asasha wrote:
this didn't work ...
For your future benefit, if you say "didn't work" and do not provide the LOG or the DATA to show what you did, and why it didn't work, then you don't get an answer. (I don't have an answer for you right now.)
So, show us the LOG (and if there is incorrect output, show us the incorrect output as well)
This data does not match the originally provided data, as the first column originally provided has a value of x in the first row, but not t1 in the second row.
Nevertheless, with minor changes, the code I provided should still work.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.