Hi,
I received a data table that looks like this:
Code | VAR2 | VAR3 | VAR4 | VAR5 | VAR6 |
C000501 | C000873:3 | C000501:3 | | | |
C000873 | C003330:1 | C000873:39 | C003402:1 | C000501:3 | C001758:6 |
C001758 | C001758:12 | C003330:4 | C000873:6 | | |
C003330 | C001758:4 | C000873:1 | C003330:12 | | |
C003402 | C000873:1 | C003402:4 | | | |
This is how I wish to transform it:
Code | C000501 | C000873 | C001758 | C003330 | C003402 |
C000501 | 3 | 3 | | | |
C000873 | 3 | 39 | 6 | 1 | 1 |
C001758 | | 6 | 12 | 4 | |
C003330 | | 1 | 4 | 12 | |
C003402 | | 1 | | | 4 |
This is just a sample. The original dataset has about 5000 rows and a few thousand columns.
Thanks for the help. Let me know if anything is needed.
Just to "spell out" in code the approach @Reeza suggested.
/* Sample data: one row per code, followed by up to five "key:value" pairs */
data have;
  infile cards truncover;
  input (code var2 var3 var4 var5 var6) (:$15.);
cards;
C000501 C000873:3 C000501:3
C000873 C003330:1 C000873:39 C003402:1 C000501:3 C001758:6
C001758 C001758:12 C003330:4 C000873:6
C003330 C001758:4 C000873:1 C003330:12
C003402 C000873:1 C003402:4
;
run;

/* Long format: one observation per code/key pair */
data long(keep=code key value);
  set have;
  array vars {*} var2 - var6;
  length key $32 value 8;
  do _i=1 to dim(vars);
    if not(missing(vars[_i])) then
    do;
      key=scan(vars[_i],1,':');                    /* part before the colon */
      value=input(scan(vars[_i],2,':'),best32.);   /* part after the colon, as a number */
      output;
    end;
  end;
run;

/* Wide format: one column per distinct key */
proc transpose data=long out=wide(drop=_:);
  by code notsorted;
  id key;
  var value;
run;
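One thing to be aware of: PROC TRANSPOSE creates the new columns in the order in which the ID values are first encountered, so the columns in WIDE will not come out in the sorted order shown in the question. Below is a minimal sketch of one way to reorder them. It assumes the LONG and WIDE datasets from the code above, that every key is a valid SAS variable name, and that the full list of keys fits into a single macro variable. The names VARLIST and WIDE_ORDERED are made up for this example.

```sas
/* Collect the distinct keys, in sorted order, into a macro variable. */
/* VARLIST is a hypothetical name used only in this sketch.          */
proc sql noprint;
  select distinct key into :varlist separated by ' '
  from long
  order by key;
quit;

/* A RETAIN statement placed before SET fixes the variable order.    */
/* Types and lengths still come from WIDE.                           */
data wide_ordered;
  retain code &varlist;
  set wide;
run;
```

With a few thousand keys of this length the list should stay well under the macro variable length limit, but that is worth checking against the real data.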
Read the data and create two datasets. For the first, pull the column headings out of the VAR values, one name per observation, then sort and eliminate duplicates. In the second dataset, each observation has the code, the column name, and the count (the part after the colon). Then use FILE PRINT: read the COLS dataset and print the column names across the top line, then read the CODE dataset and move across the page, putting the counts under the matching column names.
I don't know if you want to use FILE PRINT, but here is a start on some code. It is untested and may have errors, but it may give you some ideas.
data code(keep=code col count) cols(keep=col);
  infile cards truncover;
  input (code var2 var3 var4 var5 var6) (:$15.);
  array vars {*} var2-var6;
  length col $15 count $8;
  /* one output observation per non-blank "name:count" pair */
  do i = 1 to dim(vars);
    if vars[i] ne " " then do;
      col = scan(vars[i], 1, ':');
      count = scan(vars[i], 2, ':');
      output cols;
      output code;
    end;
  end;
cards;
C000501 C000873:3 C000501:3
C000873 C003330:1 C000873:39 C003402:1 C000501:3 C001758:6
C001758 C001758:12 C003330:4 C000873:6
C003330 C001758:4 C000873:1 C003330:12
C003402 C000873:1 C003402:4
;
run;

/* distinct column headings in sorted order */
proc sort data=cols nodupkey; by col; run;

/* give each heading a print position, 10 columns apart */
data cols; set cols; pos = 5 + 10*_n_; run;

/* attach the print position to every code/col/count row */
proc sort data=code; by col; run;
data code; merge code(in=incode) cols; by col; if incode; run;
proc sort data=code; by code pos; run;

data _null_;
  file print;
  /* first pass only: write the headings across the top line */
  if _n_ = 1 then do;
    put @1 'Code' @;
    do until (done);
      set cols end=done;
      put @pos col @;
    end;
    put;
  end;
  /* one line per code, each count under its heading */
  set code;
  by code;
  if first.code then put @1 code @@;
  put @pos count @@;
  if last.code then put;
run;
Yes, indeed, a very good solution using arrays. Alternatively, we could get the same result with PROC TRANSPOSE steps, as below:
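proc sort data=have;
  by code;
run;

/* First transpose: one row per code/VARn, with the "key:value" string in COL1 */
proc transpose data=have out=new;
  by code;
  var var2-var6;
run;

data new2;
  set new;
  col2=input(scan(col1,2,':'),best32.);  /* count after the colon, stored as a number */
  col1=scan(col1,1,':');                 /* key before the colon */
  if col1 ne '';                         /* drop the empty VARn cells */
  drop _name_;
run;

/* Second transpose: one column per key */
proc transpose data=new2 out=trans(drop=_name_);
  by code;
  var col2;
  id col1;
run;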
Please try and check.
Thanks,
Jag