Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Summing several records into one record.

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

11-26-2010 04:23 AM

Each ID have several records. I want one record for each ID.

Each ID have 1 to 2 PROD:s. Each record can have CAT in {'A','B'}. If an ID have CAT='A' in**any** of it's records then CAT should become 'A', otherwise 'B'.

The first PROD in time should become PR1, the second PR2(if any).

The corresponding sums of AMOUNT should be assigned to AM1 and AM2 resp.

Corresponding days D1 and D2 should be calculated by taking the last DATE2 - the first DATE1 for each PROD. First DATE1 for the first PROD should be saved as STRT1 and last DATE2 for the first PROD should be saved as STP1.

If a second PROD exists then STRT2 and STP2 should be saved in the same manner.

Example:

ID***PROD**DATE1*****DATE2*****AMOUNT**CAT

10***111****20050108**20050110****100*******'B'

10***111****20050112**20050117****500*******'A'

10***111****20050121**20050124****200*******'B'

10***222****20050110**20050112****150*******'B'

10***222****20050115**20050120****250*******'B'

20***222****20010601**20010616****700*******'B'

should become:

ID**PR1*STRT1****STP1**D1**AM1*PR2*STRT2**STP2****D2**AM2*CAT

10**111*050108**050124*16**800**222**050110**050120*10**400***'A'

20**222*010601**010616*10**700***(.)***(BLANK)*(BLANK)*(.)**(.)****'B'

Note that the first PROD in time becomes PR1 regardless of what PROD.

This must be possible to do in a single data step. Any help is much appreciated.

Each ID have 1 to 2 PROD:s. Each record can have CAT in {'A','B'}. If an ID have CAT='A' in

The first PROD in time should become PR1, the second PR2(if any).

The corresponding sums of AMOUNT should be assigned to AM1 and AM2 resp.

Corresponding days D1 and D2 should be calculated by taking the last DATE2 - the first DATE1 for each PROD. First DATE1 for the first PROD should be saved as STRT1 and last DATE2 for the first PROD should be saved as STP1.

If a second PROD exists then STRT2 and STP2 should be saved in the same manner.

Example:

ID***PROD**DATE1*****DATE2*****AMOUNT**CAT

10***111****20050108**20050110****100*******'B'

10***111****20050112**20050117****500*******'A'

10***111****20050121**20050124****200*******'B'

10***222****20050110**20050112****150*******'B'

10***222****20050115**20050120****250*******'B'

20***222****20010601**20010616****700*******'B'

should become:

ID**PR1*STRT1****STP1**D1**AM1*PR2*STRT2**STP2****D2**AM2*CAT

10**111*050108**050124*16**800**222**050110**050120*10**400***'A'

20**222*010601**010616*10**700***(.)***(BLANK)*(BLANK)*(.)**(.)****'B'

Note that the first PROD in time becomes PR1 regardless of what PROD.

This must be possible to do in a single data step. Any help is much appreciated.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

11-26-2010 10:00 AM

You have/had another recent thread where BY GROUP PROCESSING was demonstrated as a way to address the need to group data-rows based on defined conditions (generated using IF / THEN statements).

Have you attempted any coding at all, given that this data-sample appears to be similar to the "selecting records" post/thread you had going previously? Suggest you reveal SAS programming code that you have attempted to use to solve this query, then obtain useful and most suitable feedback to make your SAS program most effective.

Of course, unless you're just looking to have someone else write the SAS program for you? And, so, there are individuals on these forum channels willing to do so, possibly.

Scott Barry

SBBWorks, Inc.

Have you attempted any coding at all, given that this data-sample appears to be similar to the "selecting records" post/thread you had going previously? Suggest you reveal SAS programming code that you have attempted to use to solve this query, then obtain useful and most suitable feedback to make your SAS program most effective.

Of course, unless you're just looking to have someone else write the SAS program for you? And, so, there are individuals on these forum channels willing to do so, possibly.

Scott Barry

SBBWorks, Inc.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

11-26-2010 10:39 AM

Scott Barry,

I was expecting a reply from you and with the kind of comments that it actually contains.

Wouldn't it be nice if you could become the SAS Institute official censor of this forum?

I was expecting a reply from you and with the kind of comments that it actually contains.

Wouldn't it be nice if you could become the SAS Institute official censor of this forum?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

11-26-2010 03:03 PM

Hello EmestoC,

Thsi is a possible solution, may be not an optimal one...

[pre]

data i;

infile datalines delimiter="*'";

input ID PROD D1 $ D2 $ AMOUNT CAT $;

date1=MDY(SUBSTR(d1,5,2),SUBSTR(d1,7,2),SUBSTR(d1,1,4));

date2=MDY(SUBSTR(d2,5,2),SUBSTR(d2,7,2),SUBSTR(d2,1,4));

format date1 date2 date7.;

drop d1 d2;

datalines;

10***111****20050108**20050110****100*******'B'

10***111****20050112**20050117****500*******'A'

10***111****20050121**20050124****200*******'B'

10***222****20050110**20050112****150*******'B'

10***222****20050115**20050120****250*******'B'

20***222****20010601**20010616****700*******'B'

;

run;

/* CAT */;

proc sort data=i(keep=id cat where=(Cat='A')) out=c;

by id;

run;

/* AMT */;

proc SQL;

create table a as

select ID, Prod, SUM(AMOUNT) as AM, count(*) as N

from i

group by ID, Prod

;quit;

/* DAYS */;

proc sort data=i out=i;

by id prod date1;

run;

data d;

retain dt1;

set i;

if first.prod then dt1=date1;

if last.prod then do; d=date2-dt1; output; end;

by ID prod;

keep ID prod date1 date2 d;

run;

/* Prods into macro variables */;

proc SQL;

select distinct prod into 1-2 from i

%let p1=%trim(&p1);

%let p2=%trim(&p2);

;quit;

/* MERGE */;

data da;

merge d a;

by id prod;

run;

/* TRANSPOSE date1 date2 d and am */;

proc transpose data=da out=d1 prefix=d1_;

var date1;

id prod;

by id;

run;

proc transpose data=da out=d2 prefix=d2_;

var date1;

id prod;

by id;

run;

proc transpose data=da out=days prefix=d_;

var d;

id prod;

by id;

run;

proc transpose data=da out=am prefix=am_;

var am;

id prod;

by id;

run;

data r0;

retain ID STRT1 STP1 D1 AM1 STRT2 STP2 D2 AM2;

merge d1 (rename=(d1_&p1=STRT1 d1_&p2=STRT2))

d2 (rename=(d2_&p1=STP1 d2_&p2=STP2 ))

days(rename=(d_&p1 =D1 d_&p2 =D2 ))

am (rename=(am_&p1=AM1 am_&p2=AM2 ));

if STRT1=. and STP1=. and D1=. and AM1=. then do;

STRT1=STRT2; STP1=STP2; D1=D2; AM1=AM2;

STRT2=. ; STP2= .; D2= .; AM2= .;

end;

by id;

drop _name_;

run;

/* Add CAT */;

proc SQL;

create table r as

select a.*,

case

when b.cat="A" then "A"

else "B"

end as CAT

from r0 as a left join c as b

on a.ID=b.ID

;quit;

[/pre]

Sincerely,

SPR

Thsi is a possible solution, may be not an optimal one...

[pre]

data i;

infile datalines delimiter="*'";

input ID PROD D1 $ D2 $ AMOUNT CAT $;

date1=MDY(SUBSTR(d1,5,2),SUBSTR(d1,7,2),SUBSTR(d1,1,4));

date2=MDY(SUBSTR(d2,5,2),SUBSTR(d2,7,2),SUBSTR(d2,1,4));

format date1 date2 date7.;

drop d1 d2;

datalines;

10***111****20050108**20050110****100*******'B'

10***111****20050112**20050117****500*******'A'

10***111****20050121**20050124****200*******'B'

10***222****20050110**20050112****150*******'B'

10***222****20050115**20050120****250*******'B'

20***222****20010601**20010616****700*******'B'

;

run;

/* CAT */;

proc sort data=i(keep=id cat where=(Cat='A')) out=c;

by id;

run;

/* AMT */;

proc SQL;

create table a as

select ID, Prod, SUM(AMOUNT) as AM, count(*) as N

from i

group by ID, Prod

;quit;

/* DAYS */;

proc sort data=i out=i;

by id prod date1;

run;

data d;

retain dt1;

set i;

if first.prod then dt1=date1;

if last.prod then do; d=date2-dt1; output; end;

by ID prod;

keep ID prod date1 date2 d;

run;

/* Prods into macro variables */;

proc SQL;

select distinct prod into 1-2 from i

%let p1=%trim(&p1);

%let p2=%trim(&p2);

;quit;

/* MERGE */;

data da;

merge d a;

by id prod;

run;

/* TRANSPOSE date1 date2 d and am */;

proc transpose data=da out=d1 prefix=d1_;

var date1;

id prod;

by id;

run;

proc transpose data=da out=d2 prefix=d2_;

var date1;

id prod;

by id;

run;

proc transpose data=da out=days prefix=d_;

var d;

id prod;

by id;

run;

proc transpose data=da out=am prefix=am_;

var am;

id prod;

by id;

run;

data r0;

retain ID STRT1 STP1 D1 AM1 STRT2 STP2 D2 AM2;

merge d1 (rename=(d1_&p1=STRT1 d1_&p2=STRT2))

d2 (rename=(d2_&p1=STP1 d2_&p2=STP2 ))

days(rename=(d_&p1 =D1 d_&p2 =D2 ))

am (rename=(am_&p1=AM1 am_&p2=AM2 ));

if STRT1=. and STP1=. and D1=. and AM1=. then do;

STRT1=STRT2; STP1=STP2; D1=D2; AM1=AM2;

STRT2=. ; STP2= .; D2= .; AM2= .;

end;

by id;

drop _name_;

run;

/* Add CAT */;

proc SQL;

create table r as

select a.*,

case

when b.cat="A" then "A"

else "B"

end as CAT

from r0 as a left join c as b

on a.ID=b.ID

;quit;

[/pre]

Sincerely,

SPR

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

11-30-2010 01:44 AM

Hi.

Why are your logic so complex always.

I totally be messed up by it.

[pre]

options missing=.;

data temp;

input ID $ PROD $ DATE1 : yymmdd8. DATE2 : yymmdd8. AMOUNT CAT$;

datalines;

10 111 20050108 20050110 100 B

10 111 20050112 20050117 500 A

10 111 20050121 20050124 200 B

10 222 20050110 20050112 150 B

10 222 20050115 20050120 250 B

20 222 20010601 20010616 700 B

;run;

proc format ;

value st

.= ' '

low-high=[yymmddn8.]

;

run;

proc sort data=temp;

by id prod;

run;

data op;

set temp;

by id prod;

retain pr1 pr2 strt1 strt2 stp1 stp2 am1 am2 _cat d1 d2;

keep id pr1 pr2 strt1 strt2 stp1 stp2 am1 am2 _cat d1 d2;

format strt1 stp1 strt2 stp2 st.;

if first.prod and last.prod then do;

pr1=prod;

strt1=date1;

stp1=date2;

d1=date2-date1;

am1=amount;

_cat=cat;

call missing(pr2,strt2,stp2,d2,am2);

output;

end;

else do;

if first.prod then am=0;

am+amount;

if first.prod and first.id then do;

strt1=date1;

pr1=prod;

_cat=cat;

end;

if cat eq 'A' then _cat='A' ;

if last.prod and not last.id then do;

am1=am;

stp1=date2;

end;

if first.prod and not first.id then do;

strt2=date1;

pr2=prod;

end;

if last.prod and last.id then do;

am2=am;

stp2=date2;

d1=stp1-strt1;

d2=stp2-strt2;

output;

end;

end;

run;

proc report data=op nowd out=result(drop=_break_);

column id pr1 strt1 stp1 d1 am1 pr2 strt2 stp2 d2 am2 _cat ;

run;

proc print noobs;run;

[/pre]

Good Luck.

Ksharp

Why are your logic so complex always.

I totally be messed up by it.

[pre]

options missing=.;

data temp;

input ID $ PROD $ DATE1 : yymmdd8. DATE2 : yymmdd8. AMOUNT CAT$;

datalines;

10 111 20050108 20050110 100 B

10 111 20050112 20050117 500 A

10 111 20050121 20050124 200 B

10 222 20050110 20050112 150 B

10 222 20050115 20050120 250 B

20 222 20010601 20010616 700 B

;run;

proc format ;

value st

.= ' '

low-high=[yymmddn8.]

;

run;

proc sort data=temp;

by id prod;

run;

data op;

set temp;

by id prod;

retain pr1 pr2 strt1 strt2 stp1 stp2 am1 am2 _cat d1 d2;

keep id pr1 pr2 strt1 strt2 stp1 stp2 am1 am2 _cat d1 d2;

format strt1 stp1 strt2 stp2 st.;

if first.prod and last.prod then do;

pr1=prod;

strt1=date1;

stp1=date2;

d1=date2-date1;

am1=amount;

_cat=cat;

call missing(pr2,strt2,stp2,d2,am2);

output;

end;

else do;

if first.prod then am=0;

am+amount;

if first.prod and first.id then do;

strt1=date1;

pr1=prod;

_cat=cat;

end;

if cat eq 'A' then _cat='A' ;

if last.prod and not last.id then do;

am1=am;

stp1=date2;

end;

if first.prod and not first.id then do;

strt2=date1;

pr2=prod;

end;

if last.prod and last.id then do;

am2=am;

stp2=date2;

d1=stp1-strt1;

d2=stp2-strt2;

output;

end;

end;

run;

proc report data=op nowd out=result(drop=_break_);

column id pr1 strt1 stp1 d1 am1 pr2 strt2 stp2 d2 am2 _cat ;

run;

proc print noobs;run;

[/pre]

Good Luck.

Ksharp