SAS Procedures

mwilso8 · Posted 10-03-2013 12:03 PM

Hello all, I was hoping someone could help me with a formula that uses Proc Print on one data set for only certain values which are populated in a separate data set. For example if I have the following two tables:

Unique_Identifier
1
2
3

Unique_Identifier	Name	Sales
1	Joe	175
2	Bill	165
3	John	150
4	Mark	200
5	Barkevious	100

How can I write a procedure so that I print of only the values that populate in table 1 under unique identifier?

I have tried just using

proc print data=Table.1; Where Unique_Identifier EQ "1"; run;

And while this works when running a smaller set of data if you try and do this for thousands of variables (which I am hoping to do) it takes to long. Any help I could get would be greatly appreciated, thank you in advance!

hsharmas · Posted 10-03-2013 06:21 PM

Yes, as ballardw said you can use an IN operator for selecting set of values. But you mentioned that to print the observations where uniqueidentiier in first table(lookup) are 35k and sec table is 5million. In that case an efficient method is to use PROC SQL. One method is explained above in my first reply and more efficient will be:

So you have in table1 your set of uniq values which you want to print from table2 which contains a larger set of values.

ODS CSV File="C:\Temp.csv";

proc sql;

select a.*

from table2 as a, table1 as b

where a.uniqident = b.uniqident;

quit;

ODS CSV File="C:\Temp.csv";

View solution in original post

ballardw · Posted 10-03-2013 12:10 PM

What you are attempting is not clear. Why would you run the exact same code 3 times? Are wanting to create an output table for each value of unique_identifier? That would be BY group processing which would require sorting if not already done.

proc print data= datasetname;

by Unique_identifer;

run;

If this isn't what your looking for provide some examples of desired output.

mwilso8 · Posted 10-03-2013 12:42 PM

Thank you for the quick reply sorry for not being more clear, I made an error in the example code, it was supposed to look like this:

proc print data=Table.1; Where Unique_Identifier EQ "1"; run;

proc print data=Table.1; Where Unique_Identifier EQ "2"; run;

proc print data=Table.1; Where Unique_Identifier EQ "3"; run;

The idea is that the second table contains many unique identifiers and I only want to print a subset of the data. My goal would be for the output table to simply print the contents of the rows in the second table with the Unique_Identifier value of 1,2 and 3. I am hoping to do this on a large data set where I know the values of ~35k unique identifiers and need to extract them from a second table that contains ~5million rows of unique identifiers.

Hope that clarifies things

hsharmas · Posted 10-03-2013 12:31 PM

Hi,

As ballardw said your query is not clear but what I learned from the query is that you want to compare the values of Unique Identifier in the two tables and print out the selected obs.

For that you can use Proc SQL

Ex: proc sql;

select *

from table2

where select uniqueidentifier in (select uniqueidentifier from table1 where ........);

quit;

hsharmas · Posted 10-03-2013 01:20 PM

%macro print_out();

proc sql;

select distinct count(uniqueidentifier)

from table1

into: distinct_vars;

quit;

proc sort data = table1 nodupkey out = uniq;

by uniqueidentifier;

run;

data _null_;

set uniq;

call symput('varb'||LEFT(PUT(_N_, 4.)), TRIM(uniqueidentifier));

run;

do i = 1 to distinct_vars;

ods html file = "Print_&distinct_vars..xls";

proc print data = table2;

where uniqueidentifier = "&&varb&i";

run;

ods html close;

%mend print_out;

hsharmas · Posted 10-03-2013 01:20 PM

I think this should solve the issue....

mwilso8 · Posted 10-03-2013 05:16 PM

I'm not nearly advanced as you guys and just need to extract the data so here is what I ended up doing, let me know if you have any thoughts on this

ODS CSV File="C:\Temp.csv";

Proc Print data=Table2;

where

Unique_Identifier EQ 1 or

Unique_Identifier EQ 2 or

Unique Identifier EQ 3;

run;

ODS CSV CLOSE;

This worked perfectly for a smaller set of data but I ask for replies because like I said I need to run this on a much larger dataset and am worried about speed/performance. Any help would be much appreciated, thanks!

ballardw · Posted 10-03-2013 05:44 PM

Shortcut in syntax for a list of variables is to use the IN operator

where Unique_identifier in (1,2,3);

Put the values in quotes if they are character values.

hsharmas · Posted 10-03-2013 06:21 PM

Yes, as ballardw said you can use an IN operator for selecting set of values. But you mentioned that to print the observations where uniqueidentiier in first table(lookup) are 35k and sec table is 5million. In that case an efficient method is to use PROC SQL. One method is explained above in my first reply and more efficient will be:

So you have in table1 your set of uniq values which you want to print from table2 which contains a larger set of values.

ODS CSV File="C:\Temp.csv";

proc sql;

select a.*

from table2 as a, table1 as b

where a.uniqident = b.uniqident;

quit;

ODS CSV File="C:\Temp.csv";

mwilso8 · Posted 10-04-2013 08:41 AM

Awesome, thank you for your help!

SAS Procedures

Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Re: Proc Print a Subset of Data

Pattern matching and subset

proc contents -prevent print

Create a subset data after removing outliers

Subsetting data

Referencing SAS Macro Variable in Data Subsetting

Follow Us

What is...

SAS Procedures

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...