BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nihv
Calcite | Level 5

Hello fellow SAS users!

 

I have a rather large dataset (several million), with 52*3 variables (one for each week of the year, for the years 2016-2018), named y_1601-y_1652, y_1701-y_1752, and y_1801-y_1852.

What I am trying to do, is to create a dataset consisting of observations where the code 111 occurs in any of the weeks.

In other words, if 111 is not observed a single time in any week over the three years, I want to delete those observations.

 

For example:

Data have

id  y_1601  y_1602  y_1603 ...

1                                           ...

2       23                      111    ...

3       52         111       111    ...

4                                 102    ...

5       111       111       382    ...

6                    162       102    ...

7                    111        12     ...

8                      1         953    ...

9       111        64          2      ...

10                  111                 ...

...       ...           ...         ...      ...

 

Data want

id  y_1601  y_1602  y_1603 ...

2       23                      111    ...

3       52         111       111    ...

5       111       111       382    ...

7                    111        12     ...

9       111        64          2      ...

10                  111                 ...

...       ...           ...         ...      ...

 

Any help is greatly appreciated, and I hope I have not somehow missed a solution to this problem somewhere on the forum.

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

I believe the WHICHN function (if the values are numeric) or WHICHC function (if the values are character) will perform this task for you.

--
Paige Miller

View solution in original post

7 REPLIES 7
PaigeMiller
Diamond | Level 26

I believe the WHICHN function (if the values are numeric) or WHICHC function (if the values are character) will perform this task for you.

--
Paige Miller
nihv
Calcite | Level 5
Thank you so much for your help and quick response 🙂
novinosrin
Tourmaline | Level 20
data want;
set have;
array t(*) y_1601  y_1602  y_1603;/*and the rest*/
if 111 in t;
run;
/*or*/
data want;
set have;
array t(*) y_1601  y_1602  y_1603;/*and the rest*/
if whichn(111, of t[*]);
run;

Untested coz your sample isn't readable

data_null__
Jade | Level 19

IN ARRAY.

 

Data have;
   infile cards missover;
   input id  y_1601  y_1602  y_1603;
   cards;
1   
2 23 . 111 
3 52 111 111 
4 . . 102 
5 111 111 382 
6 . 162 102 
7 . 111 12 
8 . 1 953 
9 111 64 2
10 . 111 .
;;;;
   run;
proc print;
   run; 

data want;
   set have;
   array _y[*] y_:;
   if 111 in _y;
   run;
proc print;
   run; 
PaigeMiller
Diamond | Level 26

Thanks, @data_null__. That works!

 

However, I have never seen or used "IN ARRAY" syntax before, I did not know it existed, so I went to search for documentation, and ... well ... searching the docs for "IN" yields a bazillion un-related hits. Anybody have a link to the documentation where this syntax is explained?

 

Nevermind, I found it: https://documentation.sas.com/?docsetId=lrcon&docsetTarget=p00iah2thp63bmn1lt20esag14lh.htm&docsetVe...

--
Paige Miller
novinosrin
Tourmaline | Level 20

 

 

http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000780367.htm

 

The IN Operator in Numeric Comparisons

 

You can use a shorthand notation to specify a range of sequential integers to search. The range is specified by using the syntax M:N as a value in the list to search, where M is the lower bound and N is the upper bound. M and N must be integers, and M, N, and all the integers between M and N are included in the range. For example, the following statements are equivalent.

  • y = x in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

  • y = x in (1:10);

You can use multiple ranges in the same IN list, and you can use ranges with other constants in an IN list. The following example shows a range that is used with other constants to test if X is 0, 1, 2, 3, 4, 5, or 9.

if x in (0,9,1:5);

 

You can also use the IN operator to search an array of numeric values. For example, the following code creates an array a, defines a constant x, and then uses the IN operator to search for x in array a Note that the array initialization syntax of array a{10} (2*1:5) creates an array that contains the initial values of 1, 2, 3, 4, 5, 1, 2, 3, 4, 5.

data _null_;
   array a{10} (2*1:5);
   x=99;
   y = x in a;
   put y=;
   a{5} = 99;
   y = x in a;
   put y=;
run;

 

 

Results From Using the IN Operator to Search an Array of Numeric Values (partial output)

1    data _null_;
2       array a[10] (2*1:5);
3       x=99;
4       y = x in a;
5       put y=;
6       a[5] = 99;
7       y = x in a;
8       put y=;
9    run;
y=0
y=1

Note:   PROC SQL does not support this syntax.  [cautionend]

Reeza
Super User
if whichn(111, of y_: );

 

You don't need an array declaration if your data is as shown and the only variables starting with Y_ are the variables you're interested in. 

 

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 922 views
  • 6 likes
  • 5 in conversation