DATA Step, Macro, Functions and more

Data step

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 17
Accepted Solution

Data step

Hi All,

 

This time I am more descriptive for what I need. I created some of the numbers using the data step and numbers are as follows

data test;

input a b c d;

cards;

10 9 13 10

7 7 9 8

6 7 6 8

2 3 3 4

3 4 6 6

6 7 6 7

8 7 7 7

9 8 11 10

9 9 9 9

;

run;

 

I would like to acheive below information

CaseabcdFlag_aFlag_bFlag_c Flag dCountFinal_number
1109131010913.310
277987.9838
3676867.837
4676767..47
598111098111049
6233423.433
73466346.44
8877787..48
999999...49

 

Case:1   Any number repeat assgin missing value to the next repeated variable like Flag_d is missing and count column should be 3 because only one missing and in the final_number column I need 10 because minimum of middle two number . e.g 9,10,10,13 middle two number 10 is the minimum numer  
Case2: Similar as above example
Case 3:repeated number assign missing value in respective flag variable following the flag_a and count 3 as there are only one missing values in the flag variables and final number is 7
Case 4: there are two missing values in the flag variables such as flag_c and flag_d therefore count is set to 4 and final_number is 7
Case5: there are no repeated numbers means no missing flag count 4 and get the final_number is 3

Accepted Solutions
Solution
‎04-04-2017 10:38 AM
Super User
Posts: 9,671

Re: Data step

data test;
infile cards truncover expandtabs;
input case a b c d;
cards;
1	10	9	13	10	10	9	13	.	3	10
2	7	7	9	8	7	.	9	8	3	8
3	6	7	6	8	6	7	.	8	3	7
4	6	7	6	7	6	7	.	.	4	7
5	9	8	11	10	9	8	11	10	4	9
6	2	3	3	4	2	3	.	4	3	3
7	3	4	6	6	3	4	6	.	4	4
8	8	7	7	7	8	7	.	.	4	8
9	9	9	9	9	9
;
run;
data have;
 set test;
 array x{4} _temporary_;
 array y{4} a b c d;
 array z{4} flag_a flag_b flag_c flag_d;

 call missing(of x{*});
 do i=1 to dim(y);
  if y{i} not in x then do;
   x{i}=y{i};
   z{i}=y{i};
  end;
 end;
 count=ifn(nmiss(of z{*})=1,3,4);
 idx=whichn(min(of x{*}),of x{*});
 if n(of x{*}) ne 1 then x{idx}=.;
 final_number=min(of x{*});
 drop i idx;
run;
proc print;run;

View solution in original post


All Replies
Super User
Posts: 10,466

Re: Data step

You appear to be missing part of the requirement.

I think this will assign the flag values though it is a tad difficult to be sure since you appear to have changed the order of the data from the starting data.

 

data test;
   input a b c d;
   array v a b c d;
   array f flag_a flag_b flag_c flag_d;
   do i= 1 to dim(v);
      if whichn(v[i], of v(*))=i then f[i]=v[i];
   end;
   drop i;
cards;
10 9 13 10
7 7 9 8
6 7 6 8
2 3 3 4
3 4 6 6
6 7 6 7
8 7 7 7
9 8 11 10
9 9 9 9
;
run;

 

You should not show your results in a different order that the starting example data unless you specifically provide a sort rule.

I cannot tell if the requirement includes adding a case variable or not. Or what the rule would be from the data given.

 

Your 'count' appears to be inconsistent. What are you counting?

And your "final number" does not make any sense to me at all? You need to provide a much more detailed description.

Occasional Contributor
Posts: 17

Re: Data step

Hi ballardw

Thank you so much for your generous help. My point is how many missing values in the flag variables (from flag_a to flag_d). If there is one missing value than count should set to 3 otherwise 4. For the Final rating I need to get the middle number of the flag variables such as middle of (flag_a - flag_d)

I hope this will explain to you my problem
Super User
Posts: 10,466

Re: Data step

Continuation of https://communities.sas.com/t5/Base-SAS-Programming/Data-step/m-p/346267

 


umesh1 wrote:
Hi ballardw

Thank you so much for your generous help. My point is how many missing values in the flag variables (from flag_a to flag_d). If there is one missing value than count should set to 3 otherwise 4. For the Final rating I need to get the middle number of the flag variables such as middle of (flag_a - flag_d)

I hope this will explain to you my problem
6 2 3 3 4 2 3 . 4 3 3
7 3 4 6 6 3 4 6 . 4

4

In the above two rowns you have a "count" of 3 for the first row and a count of 4 for the second. But both are missing exactly one flag value.

 

And your "middle" definition is not consistent either. On your row 8 with flag_a=8 and flag_b=7 you set 8 as the final but for row 5 you use 9 for the final when there are two values greater. So is your rule a round up or round down or something else.

Solution
‎04-04-2017 10:38 AM
Super User
Posts: 9,671

Re: Data step

data test;
infile cards truncover expandtabs;
input case a b c d;
cards;
1	10	9	13	10	10	9	13	.	3	10
2	7	7	9	8	7	.	9	8	3	8
3	6	7	6	8	6	7	.	8	3	7
4	6	7	6	7	6	7	.	.	4	7
5	9	8	11	10	9	8	11	10	4	9
6	2	3	3	4	2	3	.	4	3	3
7	3	4	6	6	3	4	6	.	4	4
8	8	7	7	7	8	7	.	.	4	8
9	9	9	9	9	9
;
run;
data have;
 set test;
 array x{4} _temporary_;
 array y{4} a b c d;
 array z{4} flag_a flag_b flag_c flag_d;

 call missing(of x{*});
 do i=1 to dim(y);
  if y{i} not in x then do;
   x{i}=y{i};
   z{i}=y{i};
  end;
 end;
 count=ifn(nmiss(of z{*})=1,3,4);
 idx=whichn(min(of x{*}),of x{*});
 if n(of x{*}) ne 1 then x{idx}=.;
 final_number=min(of x{*});
 drop i idx;
run;
proc print;run;
Occasional Contributor
Posts: 17

Re: Data step

Hi All,

 

I have below data set and I need to create the extra columns based to the data set.data set is as follows

 

data test;

input id a b c d;

cards;

1 . 1 1 2

2 . 1 2 2

3 . 1 5 1

4 . 1 2 4

5 . 1 2 4

6 . 1 1 2

7 4 3 4 2

;

run;

 

I need to add column count and Want. In the case 1 there are two 1's then want column should have 1

in Case 4 and 5 there is no repeated number that time I need maximum number which is 4. IN case 7 4 is repeated number and maximum number then I need 4 in want column

 

idabcdcountWant
1.11231
2.12232
3.15131
4.12434
5.12434
6.11231
7434244

 

Let me know if some one needs more clarifcation, please.

 

Thanks in advance

Super User
Posts: 9,671

Re: Data step

Please bring a brand new session to let more sas user to see it.

I think the most easy way is using IML code.

 

data test;
input id a b c d;
cards;
1 . 1 1 2
2 . 1 2 2
3 . 1 5 1
4 . 1 2 4
5 . 1 2 4
6 . 1 1 2
7 4 3 4 2
;
run;
proc iml;
use test nobs nobs;
read all var{a b c d} into x[c=vname];
close;

want=j(nobs,1,.);
do i=1 to nobs;
call tabulate(level,freq,x[i,]);
idx=loc(freq=max(freq));
want[i]=max(level[idx]);
end;

create w var{want};
append ;
close;
quit;

data want;
 merge test w;
run;

 

Occasional Contributor
Posts: 17

Re: Data step

Hi,

Somehow I do not have the proc IML access due to the limited access. Could you help me to write the code in the data step or other way, please?

Thanks,
Super User
Posts: 9,671

Re: Data step

You can download the SAS University Edition from sas website. It is totally free and you can use IML.

Super User
Posts: 9,671

Re: Data step

OK. Here is .

 

data test;
input id a b c d;
count=n(a,b,c,d);
cards;
1 . 1 1 2
2 . 1 2 2
3 . 1 5 1
4 . 1 2 4
5 . 1 2 4
6 . 1 1 2
7 4 3 4 2
;
run;

data temp;
 set test;
 array x{*} a b c d;
 do i=1  to dim(x);
   if not missing(x{i}) then do;value=x{i};output;end;
 end;
 keep id value;
run;
proc freq data=temp noprint ;
table id*value/out=temp1 list;
run;
proc sort data=temp1;
by id count value;
run;
data temp2;
 set temp1;
 by id;
 if last.id;
run;

data want;
 merge test temp2(rename=(value=want) keep=id value);
 by id;
run;
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 201 views
  • 2 likes
  • 3 in conversation