turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Proc Print a Subset of Data

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 12:03 PM

Hello all, I was hoping someone could help me with a formula that uses Proc Print on one data set for only certain values which are populated in a separate data set. For example if I have the following two tables:

Unique_Identifier |
---|

1 |

2 |

3 |

Unique_Identifier | Name | Sales |
---|---|---|

1 | Joe | 175 |

2 | Bill | 165 |

3 | John | 150 |

4 | Mark | 200 |

5 | Barkevious | 100 |

How can I write a procedure so that I print of only the values that populate in table 1 under unique identifier?

I have tried just using

proc print data=Table.1; Where Unique_Identifier EQ "1"; run;

proc print data=Table.1; Where Unique_Identifier EQ "1"; run;

proc print data=Table.1; Where Unique_Identifier EQ "1"; run;

And while this works when running a smaller set of data if you try and do this for thousands of variables (which I am hoping to do) it takes to long. Any help I could get would be greatly appreciated, thank you in advance!

Accepted Solutions

Solution

10-03-2013
06:21 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 06:21 PM

Yes, as ballardw said you can use an IN operator for selecting set of values. But you mentioned that to print the observations where uniqueidentiier in first table(lookup) are 35k and sec table is 5million. In that case an efficient method is to use PROC SQL. One method is explained above in my first reply and more efficient will be:

So you have in table1 your set of uniq values which you want to print from table2 which contains a larger set of values.

ODS CSV File="C:\Temp.csv";

proc sql;

select a.*

from table2 as a, table1 as b

where a.uniqident = b.uniqident;

quit;

ODS CSV File="C:\Temp.csv";

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 12:10 PM

What you are attempting is not clear. Why would you run the exact same code 3 times? Are wanting to create an output table for each value of unique_identifier? That would be BY group processing which would require sorting if not already done.

proc print data= datasetname;

by Unique_identifer;

run;

If this isn't what your looking for provide some examples of desired output.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 12:42 PM

Thank you for the quick reply sorry for not being more clear, I made an error in the example code, it was supposed to look like this:

proc print data=Table.1; Where Unique_Identifier EQ "1"; run;

proc print data=Table.1; Where Unique_Identifier EQ "2"; run;

proc print data=Table.1; Where Unique_Identifier EQ "3"; run;

The idea is that the second table contains many unique identifiers and I only want to print a subset of the data. My goal would be for the output table to simply print the contents of the rows in the second table with the Unique_Identifier value of 1,2 and 3. I am hoping to do this on a large data set where I know the values of ~35k unique identifiers and need to extract them from a second table that contains ~5million rows of unique identifiers.

Hope that clarifies things

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 12:31 PM

Hi,

As ballardw said your query is not clear but what I learned from the query is that you want to compare the values of Unique Identifier in the two tables and print out the selected obs.

For that you can use Proc SQL

Ex: proc sql;

select *

from table2

where select uniqueidentifier in (select uniqueidentifier from table1 where ........);

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 01:20 PM

%macro print_out();

proc sql;

select distinct count(uniqueidentifier)

from table1

into: distinct_vars;

quit;

proc sort data = table1 nodupkey out = uniq;

by uniqueidentifier;

run;

data _null_;

set uniq;

call symput('varb'||LEFT(PUT(_N_, 4.)), TRIM(uniqueidentifier));

run;

do i = 1 to distinct_vars;

ods html file = "Print_&distinct_vars..xls";

proc print data = table2;

where uniqueidentifier = "&&varb&i";

run;

ods html close;

%mend print_out;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 01:20 PM

I think this should solve the issue....

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 05:16 PM

I'm not nearly advanced as you guys and just need to extract the data so here is what I ended up doing, let me know if you have any thoughts on this

ODS CSV File="C:\Temp.csv";

Proc Print data=Table2;

where

Unique_Identifier EQ 1 or

Unique_Identifier EQ 2 or

Unique Identifier EQ 3;

run;

ODS CSV CLOSE;

This worked perfectly for a smaller set of data but I ask for replies because like I said I need to run this on a much larger dataset and am worried about speed/performance. Any help would be much appreciated, thanks!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 05:44 PM

Shortcut in syntax for a list of variables is to use the IN operator

where Unique_identifier in (1,2,3);

Put the values in quotes if they are character values.

Solution

10-03-2013
06:21 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-03-2013 06:21 PM

Yes, as ballardw said you can use an IN operator for selecting set of values. But you mentioned that to print the observations where uniqueidentiier in first table(lookup) are 35k and sec table is 5million. In that case an efficient method is to use PROC SQL. One method is explained above in my first reply and more efficient will be:

So you have in table1 your set of uniq values which you want to print from table2 which contains a larger set of values.

ODS CSV File="C:\Temp.csv";

proc sql;

select a.*

from table2 as a, table1 as b

where a.uniqident = b.uniqident;

quit;

ODS CSV File="C:\Temp.csv";

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-04-2013 08:41 AM

Awesome, thank you for your help!