BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Alireza_Boloori
Fluorite | Level 6

Thank you very much for your thoughts in advance!

 

I use SAS 9.4. I have a data set with 17170 rows, and I want to create an array based on one of variables in this data set. Then, I want to use that array in another data set to filter its observations (i.e., more than 1 million observations). So, if any observation in the new data set is not among those 17170 different values, then that specific row has to be removed. 

 

Here is the code I used (where libsas refers to my local directory):

 

data libsas.new_data;
     set libsas.data1; /* data1 has 17170 rows */
     array schedule(*) X1; /* I want to form array "schedule" based on variable X1 in data1 */

     set libsas.data2; /* data2 is the second data set in which I want to filter observations */
     if X2 in schedule; /* X2 is the variable in data2 where I want to filter observations */
run;

 

However, what I get is a data with 17170 rows (i.e., the same size as my first data), while it should be much more that. Would you please help me to understand what's wrong? Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

You got that result because that is the way SAS works unless told otherwise. If you had read both datasets within separate DO UNTIL loops, you'd have gotten what you wanted. e.g., 

data data1;
  input x1;
  cards;
3
4
5
7
;

data data2;
  input x2 z;
  cards;
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
3 3
4 4
5 5
6 6
7 7
3 3
4 4
5 5
6 6
7 7
;

data new_data (drop=i x1);
  array schedule(20000) _temporary_; 
  do until (eof1);
    set data1 end=eof1;
    i+1;
    schedule(i)=x1;
  end;
  do until (eof2);
    set data2 end=eof2;
    if X2 in schedule then output;
  end;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post

7 REPLIES 7
Astounding
PROC Star

Your array contains one element named X1.  It does not contain thousands of elements, just one.

 

SAS contains many ways to select matches.  Here is one:

 

proc sql;

create table libsas.new_data as select * from libsas.data2

where x2 in (select distinct x1 from libsas.data1);

quit;

art297
Opal | Level 21

You got that result because that is the way SAS works unless told otherwise. If you had read both datasets within separate DO UNTIL loops, you'd have gotten what you wanted. e.g., 

data data1;
  input x1;
  cards;
3
4
5
7
;

data data2;
  input x2 z;
  cards;
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
3 3
4 4
5 5
6 6
7 7
3 3
4 4
5 5
6 6
7 7
;

data new_data (drop=i x1);
  array schedule(20000) _temporary_; 
  do until (eof1);
    set data1 end=eof1;
    i+1;
    schedule(i)=x1;
  end;
  do until (eof2);
    set data2 end=eof2;
    if X2 in schedule then output;
  end;
run;

Art, CEO, AnalystFinder.com

 

Reeza
Super User
SQL is the easiest way to solve this. See @Astounding solution. A SAS data step doesn't process data all at once, it reads a line at a time.But this means it doesn't have information on the records either ahead or below, though there are ways around that. But in general, in SAS, an array is simply a way to reference variables in the same row.
art297
Opal | Level 21

@Reeza@Alireza_Boloori's subject line indicated NO COMBINE of MERGE.

 

And I don't know if I'd say the SQL solution is easier but, in this case, should run about 100 times faster than the datastep approach.

 

Art, CEO, AnalystFinder.com

 

Alireza_Boloori
Fluorite | Level 6

@Reeza Thank you for your feedback! I was thinking the same about using SQL. I guess I have to start learning it.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1052 views
  • 1 like
  • 4 in conversation