split char into multiple columns and then append the values into one c...

Ronein · Posted 01-13-2022 02:02 AM

Hello

I am doing the following actions on the raw data set:

1- Split variable transfer_description into multiple variables

2-vertically combine the values from step 1

3- frequency distribution of the resulted step 2

My question:

How can I improve step 2 in order that the code will do it automatically for any number of fields that are created in step 1?


/***Raw data***/
data have;
infile datalines delimiter='|';
informat transfer_description $200.;
input ID  transfer_description  amount;
datalines;
111|Transfer to my friend Joe Kaplan for his help to build my home|1000 
111|Salary IBM|2000
111|Salary WIZZ|3980
111|Transfer to my father|500
333|Transfer to my gf|4000
333|Son help|3000
222|Salary IBM|1500
222|Charity|2500
222|Charity|3000
222|transfer to my friend Jula|8000
;
Run;


/****split string into multiple new varaibles****/
%let max_wrd=;
proc sql noprint;
  select max(countw(transfer_description,', ')) into :max_wrd trimmed
  from have
  ;
quit;
%put &max_wrd;

data wanted(drop=_:);
  set have;
  array AAA[&max_wrd] $15 ;
  do _i = 1 to dim(AAA);
    AAA[_i] = scan(transfer_description,_i,', ');
    if missing(AAA[_i]) then leave;
  end;
run;
 
/****appending  the values **/
/***Question: Hoe can I improve this step to automatic code that union by the number of fields that are created in previous step??****/
proc sql;
create table Wanted2(Where=(A ne '')) as
select AAA1 as A
from wanted
UNION  ALL
select AAA2 as A
from wanted
UNION  ALL
select AAA3 as A
from wanted
UNION  ALL
select AAA4 as A
from wanted
UNION  ALL
select AAA5 as A
from wanted
UNION  ALL
select AAA6 as A
from wanted
UNION  ALL
select AAA7 as A
from wanted
UNION  ALL
select AAA8 as A
from wanted
UNION  ALL
select AAA9 as A
from wanted
UNION  ALL
select AAA10 as A
from wanted
UNION  ALL
select AAA11 as A
from wanted
UNION  ALL
select AAA12 as A
from wanted
UNION  ALL
select AAA13 as A
from wanted
;
quit;

/***Frequency distribution***/
proc sql;
select A,
      count(*) as nr
from Wanted2
group by A
order by nr desc
;
quit;

japelin · Posted 01-13-2022 02:17 AM

how about this code?

data wanted;
  set have;
  length new_desc $200;
  j=count(trim(transfer_description),' ');
  do i=0 to j;
    new_desc=scan(transfer_description,i+1);
    output;
  end;
  keep new_desc;
run;

proc sql;
select new_desc,
      count(*) as nr
from Wanted
group by new_desc
order by nr desc
;
quit;

andreas_lds · Posted 01-13-2022 02:23 AM

@japelin

A beautiful demonstration underlining that long beats wide in most cases.

Ksharp · Posted 01-13-2022 06:41 AM

/***Raw data***/
data have;
infile datalines delimiter='|';
informat transfer_description $200.;
input ID  transfer_description  amount;
datalines;
111|Transfer to my friend Joe Kaplan for his help to build my home|1000 
111|Salary IBM|2000
111|Salary WIZZ|3980
111|Transfer to my father|500
333|Transfer to my gf|4000
333|Son help|3000
222|Salary IBM|1500
222|Charity|2500
222|Charity|3000
222|transfer to my friend Jula|8000
;
Run;


/****split string into multiple new varaibles****/
%let max_wrd=;
proc sql noprint;
  select max(countw(transfer_description,', ')) into :max_wrd trimmed
  from have
  ;
quit;
%put &max_wrd;

data wanted(drop=_:);
  set have;
  array AAA[&max_wrd] $15 ;
  do _i = 1 to dim(AAA);
    AAA[_i] = scan(transfer_description,_i,', ');
    if missing(AAA[_i]) then leave;
  end;

  n+1;  /*<------*/
run;


proc transpose data=wanted out=wanted2(rename=(col1=A) where=(A is not missing)) ;
by n;
var AAA:;
run;

/***Frequency distribution***/
proc sql;
select A,
      count(*) as nr
from Wanted2
group by A
order by nr desc
;
quit;

split char into multiple columns and then append the values into one column

Re: split char into multiple columns and then append the values into one column

Re: split char into multiple columns and then append the values into one column

Re: split char into multiple columns and then append the values into one column

split char into multiple columns and then append the values into one column

Re: split char into multiple columns and then append the values into one column

Re: split char into multiple columns and then append the values into one column

Re: split char into multiple columns and then append the values into one column

Registration is open

SAS Training: Just a Click Away