BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sagulolo
Quartz | Level 8

Hi all,

 

I have a dataset and SQL below, this is the best i can do, I know it can be done in one step of Datastep or SQL. Need advice how to do it in one step(Datastep or SQL) instead of mine by 2 step.

 

data have;
infile cards dlm='|';
input ACC_NO SUB_ACCNO PRODUCT$ ACC_PLUS_PRODUCT :$20. CATEGORY $ AMT;
cards;
11111 | 12345 | GOLD | 11111GOLD | B | 50
11111 | 24895 | BLACK | 11111BLACK | B | 25
11111 | 24445 | BLACK | 11111BLACK | B | 75
11111 | 24445 | SILVER | 11111SILVER | B | 140
11111 | 23456 | SILVER | 11111SILVER | S | 70
11111 | 86475 | GOLD | 11111GOLD | S | 130
22222 | 34567 | SILVER | 22222SILVER | B | 100
22222 | 12458 | GOLD | 22222GOLD | B | 50
22222 | 12557 | GOLD | 22222GOLD | S | 100
33333 | 23785 | GOLD | 33333GOLD | B | 75
33333 | 12968 | SILVER | 33333SILVER | B | 110
33333 | 12968 | SILVER | 33333SILVER | S | 110
33333 | 54368 | SILVER | 33333SILVER | S | 50
33333 | 94256 | SILVER | 33333SILVER | S | 175
;

proc sql;
	create table flow_list_remove_duplicate(drop=AMT) as
	select *, 
	SUM(AMT) as count_AMT
	from have
	group by ACC_NO ;
quit;

proc sort data=work.flow_list_remove_duplicate;
	by ACC_NO;
run;

DATA FINAL;
	SET WORK.flow_list_remove_duplicate;
	by ACC_NO;
	if first.ACC_NO then output;
run;
	

Thank in advance

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

In the context of your query the other variables like subaccno don't make much sense, if you take a distinct acc_no, and sum amount then that is the data row, the other variables could be any of the records as they don't get summed.  If you just want the last row in the group and a sum, then:

data want (drop=amt);
  set have;
  retain count_amt;
  by acc_no;
  if first.acc_no then count_amt=amt;
else count_amt=sum(count_amt,amt);
if last.acc_no;
run;

View solution in original post

5 REPLIES 5
sagulolo
Quartz | Level 8

Expected result as below:-

 

ObsACC_NOSUB_ACCNOPRODUCTACC_PLUS_PRODUCTCATEGORYcount_AMT
11111186475GOLD11111GOLDS490
22222234567SILVER22222SILVERB250
33333394256SILVER33333SILVERS520
RW9
Diamond | Level 26 RW9
Diamond | Level 26

In the context of your query the other variables like subaccno don't make much sense, if you take a distinct acc_no, and sum amount then that is the data row, the other variables could be any of the records as they don't get summed.  If you just want the last row in the group and a sum, then:

data want (drop=amt);
  set have;
  retain count_amt;
  by acc_no;
  if first.acc_no then count_amt=amt;
else count_amt=sum(count_amt,amt);
if last.acc_no;
run;
sagulolo
Quartz | Level 8

Hi RW9,

 

Thank for the code. Would appreciated if you could give guidance on the Proc SQL as well(if you're convenience).

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

In what sense sorry?  For the summation, SQL looks at the group by columns, so only takes ACC_NO and AMT, as these are the only items which relate to the summary statistic.  As for trying to get the last record out within SQL, this tends to be trickier.  SQL does not have the concept of a series of observations, it only looks at logical blocks of observations.  So you would need to have a logical way of finding the last record, maybe looking at your test data: where SUB_ACCNO=max(SUB_ACCNO).  Anyways, for this it is far simpler to do in datastep.

sagulolo
Quartz | Level 8

Hi RW9,

 

You're absolute right, this is actually a test data(apologize for the poor test data). At the moment, there is no any criteria need to select which data after the summation done. I believe next will be asking criteria like last record, latest date(which i didn't put in), max/min or etc.

 

Thank again for your guidance, with sample code of this "where SUB_ACCNO=max(SUB_ACCNO)", at least I have idea how to do it in  Datastep and SQL as well.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 750 views
  • 0 likes
  • 2 in conversation