I have a real time scenario

sahaji · Posted 05-20-2013 02:05 PM

Hi
I have a dataset with the following info

1234 30004 USD

1234 30012 USD

1289 30056 USD

1389 45009 INR

1456 56789 INR

1456 30067 USD

1567 30099 USD

1567 67890 GBP

1567 30088 INR

1567 30092 INR

I need ouput like this

1234 USD

1289 USD

1389 INR

1456 INR,USD

1567 USD,GBP,INR

how to do it,please help me

Haikuo · Posted 05-20-2013 02:29 PM

data have;

input id1 $ id2 $ currency $;

cards;

1234 30004 USD

1234 30012 USD

1289 30056 USD

1389 45009 INR

1456 56789 INR

1456 30067 USD

1567 30099 USD

1567 67890 GBP

1567 30088 INR

1567 30092 INR

;

proc sql;

select max (ct) into :ct from (select count(*) as ct from have group by id1) ;quit;

data want;

array t(&ct) $3 _temporary_;

call missing(of t(*));

do _n_=1 by 1 until (last.id1);

set have;

by id1 notsorted;

if currency not in t then

t(_n_)=currency;

end;

cat_currency=catx(',', of t(*));

keep id1 cat_currency;

run;

As you can see, this code is not as efficient as I want it to be, an additional pre-process has to be done to get the range of array().

Haikuo

sahaji · Posted 05-20-2013 02:39 PM

Thanks for the reply and can you please provide the code using base sas

sascom10 · Posted 05-20-2013 02:34 PM

Alterrnate solution -

proc sql;
create table want1 as
    select id
   , currency
   from have
group by 1,2
    order by 1,2;
quit;

proc transpose data=want1 out=want_wide;
by id;
var currency;
run;

data want_final;
set want_wide;
currency1=trim(col1)||' '||trim(col2)||' '||trim(col3);
run;

Haikuo · Posted 05-20-2013 02:39 PM

I have over-complicated your problem, just simply do:

data want1;

do until (last.id1);

set have;

by id1 notsorted;

length cat_currency $50;

if findw(cat_currency,currency)=0 then cat_currency=catx(',',cat_currency,currency);

end;

keep id1 cat_currency;

run;

No array or macro variable is needed.

Haikuo

sahaji · Posted 05-20-2013 03:12 PM

I tried like this, but i am not getting accurate answer

The actual data is in dataset smple

DATA SAMPLE1;

LENGTH CURRENCY CURRENCY1 $50;

SET SAMPLE;

BY CUST_ID;

IF FIRST.CUST_ID THEN CURRENCY1='';

PUT _ALL_;

CURRENCY1=CATX(',',trim(CURRENCY1),trim(CURRENCY));

IF LAST.CUST_ID;

RUN;

PGStats · Posted 05-20-2013 04:02 PM

You need to check for currency to already be in currency1. Also, trim is not necessary in in catx arguments, but it is in findw word argument. Most importantly, you must use a retain statement on currency1 because otherwise it is reset to missing at every iteration. Try it this way :

DATA SAMPLE1;

LENGTH CURRENCY CURRENCY1 $50;

retain currency1;

SET SAMPLE;

BY CUST_ID;

IF FIRST.CUST_ID THEN call missing(CURRENCY1);

if findw(currency1,trim(currency))=0 then CURRENCY1=CATX(',', CURRENCY1, CURRENCY);

IF LAST.CUST_ID;

RUN;

PG

avishek2411 · Posted 05-27-2017 04:37 PM

data temp;
input id zip cntr $;
cards;
1234 30004 USD
1234 30012 USD
1289 30056 USD
1389 45009 INR
1456 56789 INR
1456 30067 USD
1567 30099 USD
1567 67890 GBP
1567 30088 INR
1567 30092 INR
;

proc sort data=temp(drop=zip) out=temp1 nodupkey;
by id cntr;
run;

data temp2(drop=cntr);
set temp1;
retain cntry;
length cntry $20.;
by id cntr;
if first.id and last.id then do;
cntry=cntr;
output temp2;
end;
else if first.id and last.id = 0 then do;
cntry=cntr;
end;
else if first.id = 0 and last.id = 0 then do;
cntry = catx(',',cntry,cntr);
end;
else if first.id = 0 and last.id = 1 then do;
cntry = catx(',',cntry,cntr);
output temp2;
end;
run;

mkeintz · Posted 10-14-2018 05:47 PM

@avishek2411: I think you have the right idea. But the data step after proc sort can be simplified greatly by have a SET and BY statement inside a DO UNTIL (LAST.ID) loop:

data temp;
input id zip cntr $;
cards;
1234 30004 USD
1234 30012 USD
1289 30056 USD
1389 45009 INR
1456 56789 INR
1456 30067 USD
1567 30099 USD
1567 67890 GBP
1567 30088 INR
1567 30092 INR
;

proc sort data=temp(drop=zip) out=temp1 nodupkey;
by id cntr;
run;
data want;
  do until (last.id);
	set temp1;
	by id;
    length cntr_list $20;
	cntr_list=catx(',',cntr_list,cntr);
  end;
  drop cntr;
run;

The only downside is that the currency list for each id is alphabetic. It is no longer in the order of appearance. Have to ask @sahaji whether that is meaningful deficiency.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

mk0459 · Posted 10-11-2018 04:02 PM

data d;
input ID SAL CU$;
datalines;
1234 30004 USD

1234 30012 USD

1289 30056 USD

1389 45009 INR

1456 56789 INR

1456 30067 USD

1567 30099 USD

1567 67890 GBP

1567 30088 INR

1567 30092 INR
;
run;

data s;
length curr_joined $ 50;
retain curr_joined ;
set d;

by id;
if first.ID then curr_joined=CU;

else if not exist(curr_joined) && findw(curr_joined,CU) =0
THEN curr_joined = catx(', ', curr_joined, CU);

if last.ID;

run;

I have a real time scenario

Re: I have a real time scenario

Re: I have a real time scenario

Re: I have a real time scenario

Re: I have a real time scenario

Re: I have a real time scenario

Re: I have a real time scenario

Re: I have a real time scenario

Re: I have a real time scenario

Re: I have a real time scenario

The 2025 SAS Hackathon has begun!

SAS Training: Just a Click Away