BookmarkSubscribeRSS Feed
avnsas
Fluorite | Level 6

Hi, I am trying to extract top 3 minimum and maximum actual sales of each country (by group)  from the dataset sashelp.prdsal2. My code is not dynamic, If any one could suggest how to incorporate changes within a single datastep, that would be nice. The program should be dynamic (if instead of top-3 we need top-n then it should work by rplacing 3 by n). the output should look like the attached screenshot. My code is -

 

 

 

proc sort data= sashelp.prdsal2 out=orig;
by country actual;
run;

Data stage1;
set orig;
by country;
retain min ;
if first.country=1 then min=actual ;
if last.country =1 then max=actual ;
if last.country=1;
keep country min max;
run;

Data orig_6;
set orig;
by country;
if first.country=1 then delete;
if last.country =1 then delete; 
run;

data stage2;
set orig_6;
by country;
retain min ;
if first.country=1 then min=actual ;
if last.country =1 then max=actual ;
if last.country=1;
keep country min max;
run;

Data orig_12;
set orig_6;
by country;
if first.country=1 then delete;
if last.country =1 then delete; 
run;

data stage3;
set orig_12;
by country;
retain min ;
if first.country=1 then min=actual ;
if last.country =1 then max=actual ;
if last.country=1;
keep country min max;
run;

Data finalstage;
set stage1 stage2 stage3;
proc sort data=finalstage ;
by country min max;
run;

output.png

9 REPLIES 9
Reeza
Super User

Look at PROC UNIVARIATE. The second example on the documentation details how to extract these numbers and the NEXTRVAL & NEXTROBS control how many are displayed. 

 

Heres a link to the documentation:

http://support.sas.com/documentation/cdl/en/procstat/66703/HTML/default/viewer.htm#procstat_univaria...

 


@avnsas wrote:

Hi, I am trying to extract top 3 minimum and maximum actual sales of each country (by group)  from the dataset sashelp.prdsal2. My code is not dynamic, If any one could suggest how to incorporate changes within a single datastep, that would be nice. The program should be dynamic (if instead of top-3 we need top-n then it should work by rplacing 3 by n). the output should look like the attached screenshot. My code is -

 

 

 

proc sort data= sashelp.prdsal2 out=orig;
by country actual;
run;

Data stage1;
set orig;
by country;
retain min ;
if first.country=1 then min=actual ;
if last.country =1 then max=actual ;
if last.country=1;
keep country min max;
run;

Data orig_6;
set orig;
by country;
if first.country=1 then delete;
if last.country =1 then delete; 
run;

data stage2;
set orig_6;
by country;
retain min ;
if first.country=1 then min=actual ;
if last.country =1 then max=actual ;
if last.country=1;
keep country min max;
run;

Data orig_12;
set orig_6;
by country;
if first.country=1 then delete;
if last.country =1 then delete; 
run;

data stage3;
set orig_12;
by country;
retain min ;
if first.country=1 then min=actual ;
if last.country =1 then max=actual ;
if last.country=1;
keep country min max;
run;

Data finalstage;
set stage1 stage2 stage3;
proc sort data=finalstage ;
by country min max;
run;

output.png


 

avnsas
Fluorite | Level 6

Hi, thanks for the reply but I am looking for a solution via datastep only, no PROC FREQ / PROC SUMMARY / PROC MEANS etc. Can something be done using DO-Loop  or point=  ? 

ballardw
Super User

Why do you want to repeatedly use a hammer (datastep) when Proc Means will do this with one pass through the data (the tool that fits the job)?

Reeza
Super User

@ballardw wrote:

Why do you want to repeatedly use a hammer (datastep) when Proc Means will do this with one pass through the data (the tool that fits the job)?


Because it's homework. 

avnsas
Fluorite | Level 6

I agree it is much easier to do via any of the proc steps mentioned, but I was curious to know how it can be done via datastep only for the purpose of learning [and yes it is homework, you caught me].

PGStats
Opal | Level 21

Note: This task can be done in two steps: a sort and a data step. For extra bonus points.

 

Hint: you will need arrays.

Hint: consider using the remainder (MOD) operation.

PG
KachiM
Rhodochrosite | Level 12

Here is a data step solution - a proc and two data steps. No other Procs. I feel that the solution is useful for learning to code. The steps are:

[1] Sort the data set by Country. This step can be avoided if we are prepared to store the ACTUAL into a _temporary_ array for each

country and then use SORTN().

 

[2] Pre-process the sorted data set, to get the number of Countries and the number of observations per Country.

 

[3] In the last data step, find the starting number of _N_ for each group, save them in an array (LOW[*]) and the ending _N_ s

in another array (K[*]).

 

[4] Then get the 3 (dynamic number -- &topbot) values from the bottom side and top side.

 

[5] You get the output as you have wanted.

 

 

Better way is to say the Community what you found instead of keeping silent.

proc sort data= sashelp.prdsal2(keep = country actual) out=prdsal;
by country actual;
run;


data _null_;
   length group $100;
   retain group;
   do num = 1 by 1 until(last.country);
      set prdsal end = eof;
      by country;
   end;
   if last.country then  call catx(', ', group, num);
   if eof then do;
      call symputx('group', group);
      ngroup = countw(group, ',');
      call symputx('ngroup', ngroup);
   end;
run;

%put &group;

%put &ngroup;

%let topbot = 3;

data bottop;
   array k[&ngroup] _temporary_ (&group);
   array low[&ngroup] _temporary_ (&group);
   low[1] = 1;
   do i = 2 to &ngroup;
      low[i] = k[i - 1] + 1;
      k[i] = k[i - 1] + k[i];
   end;

   do g = 1 to &ngroup ;
      do i = 1 to ⌶
         p = low[g] + i - 1;
         set prdsal point = p;
         Bot = actual;
         p = k[g] - &topbot + i;
         set prdsal point = p;
         Top = actual;
         output ;
      end;
   end;
stop;
drop i g actual;
run;      
      
proc print data = Bottop;
run;
avnsas
Fluorite | Level 6

thanks for the great code, really helpful explaination !!

PGStats
Opal | Level 21

If you insist on doing your own programming, one simple way is like this:

 

%let n = 3;

proc sort data= sashelp.prdsal2 out=orig;
by country actual;
run;

data mins;
set orig; by country;
if first.country then order = 0;
order + 1;
if order <= &n;
actualMin = actual;
keep country actualMin;
run;

proc sort data=orig;
by country descending actual;
run;

data maxs;
set orig; by country;
if first.country then order = 0;
order + 1;
if order <= &n;
actualMax = actual;
keep country actualMax;
run;

data want;
merge mins maxs; by country;
run;

proc print noobs; run;

PG

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1639 views
  • 4 likes
  • 5 in conversation