BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nihv
Calcite | Level 5

Hello fellow SAS users!

 

I have a rather large dataset (several million), with 52*3 variables (one for each week of the year, for the years 2016-2018), named y_1601-y_1652, y_1701-y_1752, and y_1801-y_1852.

What I am trying to do, is to create a dataset consisting of observations where the code 111 occurs in any of the weeks.

In other words, if 111 is not observed a single time in any week over the three years, I want to delete those observations.

 

For example:

Data have

id  y_1601  y_1602  y_1603 ...

1                                           ...

2       23                      111    ...

3       52         111       111    ...

4                                 102    ...

5       111       111       382    ...

6                    162       102    ...

7                    111        12     ...

8                      1         953    ...

9       111        64          2      ...

10                  111                 ...

...       ...           ...         ...      ...

 

Data want

id  y_1601  y_1602  y_1603 ...

2       23                      111    ...

3       52         111       111    ...

5       111       111       382    ...

7                    111        12     ...

9       111        64          2      ...

10                  111                 ...

...       ...           ...         ...      ...

 

Any help is greatly appreciated, and I hope I have not somehow missed a solution to this problem somewhere on the forum.

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

I believe the WHICHN function (if the values are numeric) or WHICHC function (if the values are character) will perform this task for you.

--
Paige Miller

View solution in original post

7 REPLIES 7
PaigeMiller
Diamond | Level 26

I believe the WHICHN function (if the values are numeric) or WHICHC function (if the values are character) will perform this task for you.

--
Paige Miller
nihv
Calcite | Level 5
Thank you so much for your help and quick response 🙂
novinosrin
Tourmaline | Level 20
data want;
set have;
array t(*) y_1601  y_1602  y_1603;/*and the rest*/
if 111 in t;
run;
/*or*/
data want;
set have;
array t(*) y_1601  y_1602  y_1603;/*and the rest*/
if whichn(111, of t[*]);
run;

Untested coz your sample isn't readable

data_null__
Jade | Level 19

IN ARRAY.

 

Data have;
   infile cards missover;
   input id  y_1601  y_1602  y_1603;
   cards;
1   
2 23 . 111 
3 52 111 111 
4 . . 102 
5 111 111 382 
6 . 162 102 
7 . 111 12 
8 . 1 953 
9 111 64 2
10 . 111 .
;;;;
   run;
proc print;
   run; 

data want;
   set have;
   array _y[*] y_:;
   if 111 in _y;
   run;
proc print;
   run; 
PaigeMiller
Diamond | Level 26

Thanks, @data_null__. That works!

 

However, I have never seen or used "IN ARRAY" syntax before, I did not know it existed, so I went to search for documentation, and ... well ... searching the docs for "IN" yields a bazillion un-related hits. Anybody have a link to the documentation where this syntax is explained?

 

Nevermind, I found it: https://documentation.sas.com/?docsetId=lrcon&docsetTarget=p00iah2thp63bmn1lt20esag14lh.htm&docsetVe...

--
Paige Miller
novinosrin
Tourmaline | Level 20

 

 

http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000780367.htm

 

The IN Operator in Numeric Comparisons

 

You can use a shorthand notation to specify a range of sequential integers to search. The range is specified by using the syntax M:N as a value in the list to search, where M is the lower bound and N is the upper bound. M and N must be integers, and M, N, and all the integers between M and N are included in the range. For example, the following statements are equivalent.

  • y = x in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

  • y = x in (1:10);

You can use multiple ranges in the same IN list, and you can use ranges with other constants in an IN list. The following example shows a range that is used with other constants to test if X is 0, 1, 2, 3, 4, 5, or 9.

if x in (0,9,1:5);

 

You can also use the IN operator to search an array of numeric values. For example, the following code creates an array a, defines a constant x, and then uses the IN operator to search for x in array a Note that the array initialization syntax of array a{10} (2*1:5) creates an array that contains the initial values of 1, 2, 3, 4, 5, 1, 2, 3, 4, 5.

data _null_;
   array a{10} (2*1:5);
   x=99;
   y = x in a;
   put y=;
   a{5} = 99;
   y = x in a;
   put y=;
run;

 

 

Results From Using the IN Operator to Search an Array of Numeric Values (partial output)

1    data _null_;
2       array a[10] (2*1:5);
3       x=99;
4       y = x in a;
5       put y=;
6       a[5] = 99;
7       y = x in a;
8       put y=;
9    run;
y=0
y=1

Note:   PROC SQL does not support this syntax.  [cautionend]

Reeza
Super User
if whichn(111, of y_: );

 

You don't need an array declaration if your data is as shown and the only variables starting with Y_ are the variables you're interested in. 

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1466 views
  • 6 likes
  • 5 in conversation