bhupesh102
Calcite | Level 5

 

 

 

 

Hi,

 

I received a data table that looks like this:

 

Code     VAR2        VAR3        VAR4        VAR5        VAR6
C000501  C000873:3   C000501:3
C000873  C003330:1   C000873:39  C003402:1   C000501:3   C001758:6
C001758  C001758:12  C003330:4   C000873:6
C003330  C001758:4   C000873:1   C003330:12
C003402  C000873:1   C003402:4

 

This is how I wish to transform it:

 

Code     C000501  C000873  C001758  C003330  C003402
C000501        3        3
C000873        3       39        6        1        1
C001758                 6       12        4
C003330                 1        4       12
C003402                 1                          4

 

 

This is just a sample. The original dataset has about 5,000 rows and a few thousand columns.

 

Thanks for the help. Let me know if anything is needed.

 

 

Accepted Solution
Patrick
Opal | Level 21

Just to "spell out" in code the approach @Reeza suggested.


data have;
  infile cards truncover;
  input (code var2 var3 var4 var5 var6) (:$15.);
  cards;
C000501 C000873:3 C000501:3 
C000873 C003330:1 C000873:39 C003402:1 C000501:3 C001758:6
C001758 C001758:12 C003330:4 C000873:6 
C003330 C001758:4 C000873:1 C003330:12 
C003402 C000873:1 C003402:4 
;
run;

data long(keep=code key value);
  set have;
  array vars {*} var2 - var6;
  length key $32 value 8;
  do _i=1 to dim(vars);
    if not(missing(vars[_i])) then
      do;
        key=scan(vars[_i],1,':');
        value=input(scan(vars[_i],2,':'),best32.);
        output;
      end;
  end;
run;

proc transpose data=long out=wide(drop=_:);
  by code notsorted;
  id key;
  var value;
run;
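
PROC TRANSPOSE creates the new columns in the order in which the ID values first appear, so the result above will not necessarily list the codes in the sorted order shown in the question. The following is a minimal sketch of one way to reorder them, assuming the `long` and `wide` datasets from the steps above exist. The names `keylist` and `wide_sorted` are made up for illustration, and the approach assumes the full list of keys fits into a single macro variable (at most 65,534 characters).

```sas
/* Sketch only: keylist and wide_sorted are hypothetical names. */
/* Collect the distinct keys in sorted order into one macro variable. */
proc sql noprint;
  select distinct key
    into :keylist separated by ' '
    from long
    order by key;
quit;

/* A RETAIN statement placed before SET fixes the variable order */
/* in the output dataset; the values still come from WIDE.       */
data wide_sorted;
  retain code &keylist;
  set wide;
run;
```

For the real data, the array range `var2 - var6` would also need to cover every value column. If all of them share the `var` prefix, `array vars {*} var:;` is one possible way to pick them up without listing them.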


Replies
Reeza
Super User
Step 1 - transform your data into the following form using the SCAN function and the OUTPUT statement.

Code1 Code2 Num
C000501 C000873 3
C000501 C000501 3
...

etc.

Step 2 - then try PROC TRANSPOSE.
Jim_G
Pyrite | Level 9

Read the data and create two datasets. The first holds the column headings pulled from the variables, one name per observation; sort it and eliminate duplicates. In the second, each observation has the code, the column, and the count (the value after the colon). Then use FILE PRINT: read the cols dataset and print the codes across the top line, then read the code dataset and move across the page, putting the counts under the code column names.

 

I don't know if you want to use FILE PRINT, but here is a start of some code. The code has errors, but it may give you some ideas.

 

data code cols(keep=col);
infile cards truncover;
input (code var2 var3 var4 var5 var6) (:$15.);
if var2 ne " " then do;
col=scan(var2,1,':'); output cols;
count=scan(var2,2,':'); output code; end;
if var3 ne " " then do;
col=scan(var3,1,':'); output cols;
count=scan(var3,2,':'); output code; end;
if var4 ne " " then do;
col=scan(var4,1,':'); output cols;
count=scan(var4,2,':'); output code; end;
if var5 ne " " then do;
col=scan(var5,1,':'); output cols;
count=scan(var5,2,':'); output code; end;
if var6 ne " " then do;
col=scan(var6,1,':'); output cols;
count=scan(var6,2,':'); output code; end;
cards;
C000501 C000873:3 C000501:3
C000873 C003330:1 C000873:39 C003402:1 C000501:3 C001758:6
C001758 C001758:12 C003330:4 C000873:6
C003330 C001758:4 C000873:1 C003330:12
C003402 C000873:1 C003402:4
; proc print data=code; id code col count; run;
proc print data=cols; id col; run;
data code; set code; if count gt " "; keep code col count;
proc print ; run;
proc sort data=cols; by col;
proc sort data=code; by code col;

data cols; set cols; by col; if last.col; proc print; run;
data; set cols; file print;
x+10;
put @x col@;

data; set code; by code; file print;
if first.code then do;
x=0; put @5 code @; end;
X+10; put @x count @;
if last.code then put '. ';
run;


bhupesh102
Calcite | Level 5
Thank you Reeza and especially patrick for making it very clear. It Worked!!!
Jagadishkatam
Amethyst | Level 16

Yes, indeed a very good solution using arrays. Alternatively, we could get the same result with PROC TRANSPOSE steps, as below:

 

proc sort data=have;
by code;
run;

proc transpose data=have out=new;
by code;
var var2-var6;
run;

data new2;
set new;
col2=scan(col1,2,':');
col1=scan(col1,1,':');
if col1 ne '';
drop _name_;
run;

proc transpose data=new2 out=trans(drop=_name_);
by code;
var col2;
id col1;
run; 

Please try and check.
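
One thing to check with this variant: `col2` is produced by SCAN, so it is a character variable, and the counts in the final table come out as character columns. If numeric counts are wanted, the middle step could convert them with INPUT, as in the sketch below. It assumes the `new` dataset from the first PROC TRANSPOSE above; `key` and `count` are illustrative names, not part of the original code.

```sas
/* Sketch only: key and count are hypothetical variable names. */
data new2;
  set new;
  if missing(col1) then delete;                 /* skip empty cells  */
  length key $32;
  key   = scan(col1, 1, ':');                   /* code before colon */
  count = input(scan(col1, 2, ':'), best32.);   /* numeric count     */
  keep code key count;
run;

proc transpose data=new2 out=trans(drop=_name_);
  by code;
  var count;
  id key;
run;
```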

 

Thanks,

Jag

