Hi all,
data is available in pdf file. how to group the variables depend on one value. Thanks in advance all.
pdf file all the information is available and desired result also available
so if you know the what is sufficient process for this result please post your valid comments.
Thanks in advance.
Your constraints seems to be not enough. I am getting a solution that is fluke.
data want;
retain Tot 0 numemp 0 ;
set have end = eof;
flag = 0;
Tot + salaries;
numemp + 1;
if Tot > 20000 then do;
if Tot > 24000 and ^eof then do;
temp = tot - salaries;
Tot = temp;
flag = 1;
put Tot = salaries = temp =;
end;
output;
Tot = ifN(flag, Salaries, 0);
numemp = 0;
end;
keep Tot numemp;
run;
I missed to revise the number of employees when adjustment is made.
The revised program takes care of this.
data want;
retain Tot 0 numemp 0 ;
set have end = eof;
flag = 0;
Tot + salaries;
numemp + 1;
if Tot > 20000 then do;
if Tot > 24000 and ^eof then do;
temp = tot - salaries;
Tot = temp;
numemp +- 1;
flag = 1;
put Tot = salaries = temp =;
end;
output;
Tot = ifN(flag, Salaries, 0);
numemp = ifN(flag, 1, 0);
end;
keep Tot numemp;
run;
It was some different from my output. first sort the data in ascending order in this case.
i need only 10 groups. i am not understanding how to divide the groups. in this case only 10 groups. But some cases its only five depending on value in this case 50000. So if you know this process please post. how to divide the groups based on one value.
sir your code is working but last two records are missing in this case. its skipping last 2 records
Your posts have been mainly telling to find 10 Groups by a kind of plus or minus approach.
Here is an approach which may meet your needs.
The Algorithm used:
[1] Use the Proc univariate and get the Percentiles for Deciles(10 Groups.
[2] Load the Deciles in an Array, and determine the i-th group where Salaries falls.
[3] Have an another Array to accumulate the number of records that falls in the group.
[4] At the end reading the data set, Use the arrays to output Group, End_Salary, and Count.
The following PROC finds the ten Percentils(P_10, P_20, ....P_100).
proc univariate data = have noprint;
var salaries;
output pctlpre = P_ pctlpts = 10 to 100 by 10;
run;
The Percentiles are:
P_10 P_20 P_30 P_40 P_50 P_60 P_70 P_80 P_90 P_100
1800 2950 3600 4850 6000 6500 7300 8050 9050 9600
This helps to determine the Group.
For instance, Salaries <= 1800 belongs to First Group. Salaries <= 6000 belongs to 5-th Group.
The program below with explanatory comments will get the output.
data want;
if _n_ = 1 then do;
set data1;
end;
/* Load P_10 to P_100 in array k[ ] */
array k
/* Have array m[ ] to hold the counts of obsevations falling in each Group */
array m[10] _temporary_;
do until(eof);
set have end = eof;
/* Check the correct i and increment the counter(in m). Then leave the i-loop
to process the next record */
do i = 1 to 9;
if salaries <= k then do;
m + 1;
leave;
end;
/* If Salaries > P_90 then increment the counter for 10-th Group */
else if salaries > k[9] then do;
m[10] + 1;
leave;
end;
end;
end;
/* Use the Array m[ ] for COUNT and k[ ] for END_SALARY */
do group = 1 to 10;
end_salary = k[group];
count = m[group];
output;
end;
stop;
keep group end_salary count;
run;
The output is:
end_
group salary count
1 1800 4
2 2950 4
3 3600 4
4 4850 4
5 6000 5
6 6500 4
7 7300 3
8 8050 4
9 9050 4
10 9600 4
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.