BookmarkSubscribeRSS Feed
sahaji
Calcite | Level 5

Hi
I have a dataset with the following info

1234 30004 USD

1234 30012 USD

1289 30056 USD

1389 45009 INR

1456 56789 INR

1456 30067 USD

1567 30099 USD

1567 67890 GBP

1567 30088 INR

1567 30092 INR

I need ouput like this

1234   USD

1289   USD

1389   INR

1456   INR,USD

1567   USD,GBP,INR

how to do it,please help me

9 REPLIES 9
Haikuo
Onyx | Level 15

data have;

input id1 $ id2 $ currency $;

cards;

1234 30004 USD

1234 30012 USD

1289 30056 USD

1389 45009 INR

1456 56789 INR

1456 30067 USD

1567 30099 USD

1567 67890 GBP

1567 30088 INR

1567 30092 INR

;

proc sql;

  select max (ct) into :ct from (select count(*) as ct from have group by id1) ;quit;

data want;

array t(&ct) $3 _temporary_;

call missing(of t(*));

  do _n_=1 by 1 until (last.id1);

  set have;

by id1 notsorted;

if currency not in t then

t(_n_)=currency;

  end;

  cat_currency=catx(',', of t(*));

  keep id1 cat_currency;

run;

 

As you can see, this code is not as efficient as I want it to be, an additional pre-process has to be done to get the range of array().

Haikuo

sahaji
Calcite | Level 5

Thanks for the reply and can you please provide the code using base sas

sascom10
Calcite | Level 5

Alterrnate solution -

proc sql;
  create table want1 as
    select id
   , currency
   from have
group by 1,2
    order by 1,2;
quit;

proc transpose data=want1 out=want_wide;
  by id;
  var currency;
run;

data want_final;
  set want_wide;
  currency1=trim(col1)||' '||trim(col2)||' '||trim(col3);
run;

Haikuo
Onyx | Level 15

I have over-complicated your problem, just simply do:

data want1;

  do until (last.id1);

  set have;

by id1 notsorted;

length cat_currency $50;

if findw(cat_currency,currency)=0 then cat_currency=catx(',',cat_currency,currency);

  end;

  keep id1 cat_currency;

run;

No array or macro variable is needed.

Haikuo

sahaji
Calcite | Level 5

I tried like this, but  i am not getting accurate answer

The actual data is in dataset  smple

DATA SAMPLE1;

LENGTH CURRENCY CURRENCY1 $50;

SET SAMPLE;

BY CUST_ID;

IF FIRST.CUST_ID THEN CURRENCY1='';

PUT _ALL_;

CURRENCY1=CATX(',',trim(CURRENCY1),trim(CURRENCY));

IF LAST.CUST_ID;

RUN;

PGStats
Opal | Level 21

You need to check for currency to already be in currency1. Also, trim is not necessary in in catx arguments, but it is in findw word argument. Most importantly, you must use a retain statement on currency1 because otherwise it is reset to missing at every iteration. Try it this way :

DATA SAMPLE1;

LENGTH CURRENCY CURRENCY1 $50;

retain currency1;

SET SAMPLE;

BY CUST_ID;

IF FIRST.CUST_ID THEN call missing(CURRENCY1);

if findw(currency1,trim(currency))=0 then CURRENCY1=CATX(',', CURRENCY1, CURRENCY);

IF LAST.CUST_ID;

RUN;

PG

PG
avishek2411
Calcite | Level 5

data temp;
input id zip cntr $;
cards;
1234 30004 USD
1234 30012 USD
1289 30056 USD
1389 45009 INR
1456 56789 INR
1456 30067 USD
1567 30099 USD
1567 67890 GBP
1567 30088 INR
1567 30092 INR
;

proc sort data=temp(drop=zip) out=temp1 nodupkey;
by id cntr;
run;

data temp2(drop=cntr);
set temp1;
retain cntry;
length cntry $20.;
by id cntr;
if first.id and last.id then do;
cntry=cntr;
output temp2;
end;
else if first.id and last.id = 0 then do;
cntry=cntr;
end;
else if first.id = 0 and last.id = 0 then do;
cntry = catx(',',cntry,cntr);
end;
else if first.id = 0 and last.id = 1 then do;
cntry = catx(',',cntry,cntr);
output temp2;
end;
run;

mkeintz
PROC Star

@avishek2411: I think you have the right idea.  But the data step after proc sort can be simplified greatly by have a SET and BY statement inside a DO UNTIL (LAST.ID) loop:

 

 

data temp;
input id zip cntr $;
cards;
1234 30004 USD
1234 30012 USD
1289 30056 USD
1389 45009 INR
1456 56789 INR
1456 30067 USD
1567 30099 USD
1567 67890 GBP
1567 30088 INR
1567 30092 INR
;

proc sort data=temp(drop=zip) out=temp1 nodupkey;
by id cntr;
run;
data want;
  do until (last.id);
	set temp1;
	by id;
    length cntr_list $20;
	cntr_list=catx(',',cntr_list,cntr);
  end;
  drop cntr;
run;

 

 The only downside is that the currency list for each id is alphabetic.  It is no longer in the order of appearance.  Have to ask @sahaji whether that is meaningful deficiency.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mk0459
Calcite | Level 5

data d;
input ID SAL CU$;
datalines;
1234 30004 USD

1234 30012 USD

1289 30056 USD

1389 45009 INR

1456 56789 INR

1456 30067 USD

1567 30099 USD

1567 67890 GBP

1567 30088 INR

1567 30092 INR
;
run;



data s;
length curr_joined $ 50;
retain curr_joined ;
set d;

by id;
if first.ID then curr_joined=CU;


else if not exist(curr_joined) && findw(curr_joined,CU) =0
THEN curr_joined = catx(', ', curr_joined, CU);

if last.ID;

run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 6684 views
  • 3 likes
  • 7 in conversation