Do you have an eloquent solution to an already working solution?

bfrye · Posted 11-21-2014 08:31 AM

This is a problem that I have solved but wonder if someone has an elegant solution? I suspect there may be a better solution.

I have digits associated with M and B.

DATA Numbers;

input MB $ 1 Digit1 $ 3 Digit2 $ 4 Digit3 $ 5 ;

Datalines;

M 733

M 234

M 736

M 627

M 740

B 414

B 802

B 631

B 495

B 588

;

Run;

I am only interested in the number of ‘M’ digits.

I want to know what ‘M’ digit values are greater than ‘B’ or what ‘M’ digit values that do not appear with ‘B’.

I also want to know the count or occurrence. (At the end, I multiplied by 2 because I wanted to double the count)

After I separated the digits associated with M and B and got counts using Proc Freq, I spent much time looking at ‘holding’ the current record and ‘looking’ forward at the next record to do the math using set point=, etc. I also looked at Retain and Lag. Then I looked at Proc Transpose. Here are my results.

Proc Print Data = Numbers;

Title "Numbers";

Run;

Numbers 15:53 Thursday, November 20, 2014 1

Obs MB Digit1 Digit2 Digit3

1 M 7 3 3

2 M 2 3 4

3 M 7 3 6

4 M 6 2 7

5 M 7 4 0

6 B 4 1 4

7 B 8 0 2

8 B 6 3 1

9 B 4 9 5

10 B 5 8 8

Data M (KEEP= MB Number ) B (KEEP= MB Number );

Set Numbers;

Number = Digit1;

If MB = "M" Then

Output M; Else Output B;

Number = Digit2;

If MB = "M" Then

Output M; Else Output B;

Number= Digit3;

If MB = "M" Then

Output M; Else Output B;

run;

Data M;

Set M;

If Number > "0"; /*I might have a 1 or 2 digit number in the source data*/

NM=Number;

Run;

Proc sort data=M;

By NM MB Number;

Run;

Data B;

Set B;

NB=Number;

Run;

Proc sort data=B;

By NB MB Number;

Run;

Data SetMB;

Set M B;

By Number DESCENDING MB ;

Run;

Proc Print Data=SetMB;

Title "Initial Data";

Run;

Initial Data 15:53 Thursday, November 20, 2014 2

Obs MB Number NM NB

1 B 0 0

2 B 1 1

3 B 1 1

4 M 2 2

5 M 2 2

6 B 2 2

7 M 3 3

8 M 3 3

9 M 3 3

10 M 3 3

11 B 3 3

12 M 4 4

13 M 4 4

14 B 4 4

15 B 4 4

16 B 4 4

17 B 5 5

18 B 5 5

19 M 6 6

20 M 6 6

21 B 6 6

22 M 7 7

23 M 7 7

24 M 7 7

25 M 7 7

26 B 8 8

27 B 8 8

28 B 8 8

29 B 9 9

Proc Freq Data=SetMB Noprint;

Tables MB*Number/ Out=Summary (Drop = Percent);

Run;

Proc Sort Data=Summary;

By Number Descending MB ;

Run;

Proc Print Data=Summary;

Title "Summary";

Run;

Summary 15:53 Thursday, November 20, 2014 3

Obs MB Number COUNT

1 B 0 1

2 B 1 2

3 M 2 2

4 B 2 1

5 M 3 4

6 B 3 1

7 M 4 2

8 B 4 3

9 B 5 2

10 M 6 2

11 B 6 1

12 M 7 4

13 B 8 3

14 B 9 1

Proc Transpose Data=Summary out=wide prefix=NBR;

By Number;

ID MB;

Var Count;

Run;

Data Wide;

Set Wide;

If NBRM > NBRB;

If NBRB < 0 Then NBRB = 0;

Reqd = 2 * (NBRM - NBRB);

Run;

/* From Summary for the Number 2, I have COUNT=2 ‘M’s and COUNT=1 ‘B’ with a difference of 1 but is 2 when multiplied by 2 */

Proc Print Data=Wide;

Var Number Reqd;

Title "Extra Numbers Required";

Run;

Extra Numbers Required 4

15:53 Thursday, November 20, 2014

Obs Number Reqd

1 2 2

2 3 6

3 6 2

4 7 8

Ksharp · Posted 11-21-2014 09:02 AM

How to compare these digits between ‘M’ and ‘B’ ?

art297 · Posted 11-21-2014 11:21 AM

The following could be simplified even further, but it achieves the same result as you showed in your example:

DATA M (keep=number) B (keep=number);

input MB $ 1 Digit1 $ 3 Digit2 $ 4 Digit3 $ 5 ;

array digits(*) dig:;

do i=1 to dim(digits);

if digits(i) gt 0 then do;

number=digits(i);

if MB='M' then output M;

else output B;

end;

Datalines;

M 733

M 234

M 736

M 627

M 740

B 414

B 802

B 631

B 495

B 588

;

proc sql;

create table msum as

select number,count(number) as mcount

from M

group by number

;

create table rsum as

select number,max(0,count(number)) as rcount

from B

group by number

;

create table want as

select l.number,(mcount-max(0,rcount))*2 as Reqd

from msum l

left join rsum r

on l.number=r.number

having mcount>rcount

;

quit;

rivieralad · Posted 11-21-2014 11:29 AM

Hiya

This is quite easy to do in a single data step by using arrays...see below.

Hope this helps, as it is quite a bit simpler than multiple data steps and procedures.

If you have any questions feel free to send me a message or e-mail.

Cheers

Chris

data want;

* define variables to hold data

* not interested in B numbers if no matching M numbers and don't need M number if 0 so don't need to catch any B number if 0;

* therefore set up variables to hold count of numbers 1 to 9 for B and M;

length

code

$1.

m_numb1 - m_numb9

b_numb1 - b_numb9

number

numb

reqd

mnumb

bnumb

8.

;

* specify which variables to keep;

keep

numb

reqd

mnumb

bnumb

;

* label the output variables;

label

numb = 'Number'

reqd = 'Required'

mnumb = 'M Count'

bnumb = 'B Count'

;

* retain code across source data reads until new value found;

retain

code

;

* use infile statement so we specify what happens at end of file;

infile datalines eof = gotit;

* define arrays to hold count of numbers for each code;

array m_numb(9) m_numb1 - m_numb9 (9*0);

array b_numb(9) b_numb1 - b_numb9 (9*0);

* read in code and first digit then output code B or code M and number greater than 0;

input code $1. +1 number 1. @;

if code eq 'B' and number <> 0 then

b_numb(number) = b_numb(number) + 1;

else if number <> 0 then

m_numb(number) = m_numb(number) + 1;

* read in second digit then output code B or code M and number greater than 0;

input number 1. @;

if code eq 'B' and number <> 0 then

b_numb(number) = b_numb(number) + 1;

else if number <> 0 then

m_numb(number) = m_numb(number) + 1;

* read in second digit then output code B or code M and number greater than 0;

input number 1.;

if code eq 'B' and number <> 0 then

b_numb(number) = b_numb(number) + 1;

else if number <> 0 then

m_numb(number) = m_numb(number) + 1;

* make sure we don't process the eof label until required;

return;

gotit:

do i = 1 to 9;

if b_numb(i) < m_numb(i) then do;

numb = i;

reqd = (m_numb(i) - b_numb(i)) * 2;

mnumb = m_numb(i);

bnumb = b_numb(i);

output;

end;

Datalines;

M 733

M 234

M 736

M 627

M 740

B 414

B 802

B 631

B 495

B 588

;

Run;

Number	Required	M Count	B Count
2	2	2	1
3	6	4	1
6	2	2	1
7	8	4	0

Haikuo · Posted 11-21-2014 11:38 AM

The more, the merrier, Here is another solution based on Array:

DATA Numbers;

input MB $ 1 Digit1 $ 3 Digit2 $ 4 Digit3 $ 5;

Datalines;

M 733

M 234

M 736

M 627

M 740

B 414

B 802

B 631

B 495

B 588

;

Run;

data want;

set numbers end=last;

array m(0:9) _temporary_;

array b(0:9) _temporary_;

array d digit:;

do over d;

if mb='M' then

m(input(d,1.))+1;

else b(input(d,1.))+1;

end;

if last then

do _n_=0 to 9;

reqd=sum(m(_n_),-b(_n_));

if reqd>0 then

do;

reqd=reqd*2;

number=_n_;

output;

end;

keep number reqd;

run;

bfrye · Posted 11-26-2014 01:53 PM

Eloquent is the word.

Thanks

MarkWik · Posted 11-26-2014 02:46 PM

Dear Hai kuo,

What does it take for an aspiring SAS wannabe programmer to become somebody like you. I have noticed a lot of amazing solutions of yours and many of my friends deem you as the superstar of SAS. Perhaps, are you Mr.Ian Whitlock no.2? or you have overtaken him?. You seem to be gifted. Can you offer any piece of guidance or advice for all those Hai kuo wannabe's to help learn faster such as pointing to materials,docs and beyond?

Dear Art and others, how long has it been since the forum existed, i.e when did it start? It seems easy trace back in SAS L, but here is it possible to search threads by date starting from the very first like 1996 sas-l digest?

Many Thanks indeed,

Mark

art297 · Posted 11-26-2014 03:10 PM

Mark: You are off by around 10 years concerning SAS-L. Yes, the current log only goes back to Jan 1996, but the original logs exist on the Marist University listserv. You can find the first log, November 1986, at: http://vm.marist.edu/htbin/wlvgl?L=SAS-L&LOG=LOG8611

The SAS Discussion forum history is also available on-line at: Browse - Content | Communities on SAS

As you'll see, the first forum post shown comes from March 2006

Haikuo · Posted 11-26-2014 06:56 PM

Dear Mark,

What a deja vu ! I said something similar to less than 2 years ago, right here on this forum:

Paul dorfman, Ksharp, Data _null_, Art, Tom, FriedEgg, PG, Joe Matise, Net Wooding and many many others on this forum and SAS-L are still way ahead of me in term of SAS skills, but as you can see, I made my progress in a period time that is not too long (<5 years of SAS on hand), and you can do it as well.

Well, now you mentioned Ian, I have to say, if Ian was a SAS professor, I am in 1st grade at best, it still takes me to read 2-3 times before I could comfortably understand his papers, especially those on Macros.

The best way to learn SAS is hanging out here at the forum or SAS-L.

Anyway, I want to thank you for your attention and your kind words, even though I know there is still a long way ahead of me. And thank you for your SAS endeavors.

Haikuo

Ksharp · Posted 11-22-2014 03:38 AM

OK. If I understood what you mean.


DATA Numbers;
     input  MB $ 1 Digit1 $ 3 Digit2 $ 4 Digit3 $ 5;
     Datalines;
M 733
M 234
M 736
M 627
M 740
B 414
B 802
B 631
B 495
B 588
;
Run;

data _null_;
 if _n_ eq 1 then do;
  declare hash h();
  h.definekey('key');
  h.definedata('key','count');
  h.definedone();
end;
set Numbers end=last;
array x{*} Digit: ;
do i=1 to dim(x);
 key=x{i};
 if MB='M' then do;
   if h.find()=0 then do;count+2;h.replace();end;
    else do;count=2;h.replace();end;
 end;
 else if MB='B' then do;
   if h.find()=0 then do;count+(-2);h.replace();end;
 end;
end;
 if last then h.output(dataset:'want(where=(count gt 0))');
run;

Xia Keshan

Do you have an eloquent solution to an already working solution?

Re: Do you have an eloquent solution to an already working solution?

Re: Do you have an eloquent solution to an already working solution?

Re: Do you have an eloquent solution to an already working solution?

Re: Do you have an eloquent solution to an already working solution?

Re: Do you have an eloquent solution to an already working solution?

Re: Do you have an eloquent solution to an already working solution?

Re: Do you have an eloquent solution to an already working solution?

Re: Do you have an eloquent solution to an already working solution?

Re: Do you have an eloquent solution to an already working solution?

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away