turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Retrive and populate data

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-06-2008 02:08 AM

Hi all,

I have a question regarding retrieving data from a large database file. I have data in this format:

num1 num2 num3 A B Unique_obs

1, 1, 1, 100.0, 70.0, -10.5

1, 6, 30, 100.0, 70.0, 23.0

1, 8, 31, 100.0, 70.0, 32.0

1, 1, 1, 105.0, 70.0, -7.5

1, 6, 30, 105.0, 70.0, 22.0

1, 8, 31, 105.0, 70.0, 16.0

1, 1, 1, 110.0, 70.0, -2.0

1, 6, 30, 110.0, 70.0, 12.0

1, 8, 31, 110.0, 70.0, 21.0

and so forth...

Now, I would to retrieve the maximum value of Unique_obs per A and B combination. For example..for the first three rows of the above hypothetical dataset (A=100.0, B=70.0), the max value of the Unique_obs is 32.0; for the rows 4-6 (A=105.0; B=70.0), the max value of Unique_obs is 22.0 and for rows 7-9 (A=110.0; B=70.0), max value is 21.0.

I would like to retrieve the max. value of Unique_obs for each combination of A and B such that SAS selects and outputs the whole row containing the max. value observation (ex: 1, 8, 31, 100.0, 70.0, 32.0). Can anybody offer an advice? Do I use PROC SQL? Thank you.

I have a question regarding retrieving data from a large database file. I have data in this format:

num1 num2 num3 A B Unique_obs

1, 1, 1, 100.0, 70.0, -10.5

1, 6, 30, 100.0, 70.0, 23.0

1, 8, 31, 100.0, 70.0, 32.0

1, 1, 1, 105.0, 70.0, -7.5

1, 6, 30, 105.0, 70.0, 22.0

1, 8, 31, 105.0, 70.0, 16.0

1, 1, 1, 110.0, 70.0, -2.0

1, 6, 30, 110.0, 70.0, 12.0

1, 8, 31, 110.0, 70.0, 21.0

and so forth...

Now, I would to retrieve the maximum value of Unique_obs per A and B combination. For example..for the first three rows of the above hypothetical dataset (A=100.0, B=70.0), the max value of the Unique_obs is 32.0; for the rows 4-6 (A=105.0; B=70.0), the max value of Unique_obs is 22.0 and for rows 7-9 (A=110.0; B=70.0), max value is 21.0.

I would like to retrieve the max. value of Unique_obs for each combination of A and B such that SAS selects and outputs the whole row containing the max. value observation (ex: 1, 8, 31, 100.0, 70.0, 32.0). Can anybody offer an advice? Do I use PROC SQL? Thank you.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-06-2008 03:14 AM

Hi,

proc sql;

select * from table_name

group by a,b having unique_obs=max(unique_obs);

quit;

You will get the required output.

proc sql;

select * from table_name

group by a,b having unique_obs=max(unique_obs);

quit;

You will get the required output.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-07-2008 12:08 AM

Thank you for the response. I have another question related to my previous query. I have raw data in large text files in the data format as described in my previous posting. I was wondering what would be the syntax in PROC SQL to be used for retrieving data from text files (similar to an 'infile' statement in the data step that can be used to read data from text files into SAS). Can anybody advise? Thanks in advance.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-07-2008 12:18 PM

There is no equivalent to a DATA step and INFILE / INPUT statement processing with PROC SQL. The SAS support website has technical papers on this topic - one is listed below (as a link). In this paper, there is an illustration of how you must code a PROC SQL invocation to load a table, but it would be from instream textual data content, not a "flat file" type of input.

Scott Barry

SBBWorks, Inc.

http://www2.sas.com/proceedings/sugi29/269-29.pdf

Scott Barry

SBBWorks, Inc.

http://www2.sas.com/proceedings/sugi29/269-29.pdf

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-09-2008 04:47 PM

Hi,

Thanks for the information. My knowledge of SQL in SAS is minimal, so I am little unwilling to further proceed in analyzing my data using SQL. However, I am very comfortable using other procedures in SAS and have adequate knowledge to enable me to understand the logic behind the SAS codes.

I would like to briefly explain the data structure that I am working with and the end result I would like to achieve using these data sets.

I have a single text file containing temperature observations from five thousand locations, for one year, in this format: year, month, day, latitude, longitide, temperature. Therefore, each location, that has a unique combination of latitude and longitude, has daily temperature observations for one year. Now, I would like to select a row (year, month, day, latitude, longitude, temperature) that has the maximum value of temperature for that location for the entire year. Like that, I need the maximum value of temperature, for each of the 5000 locations, to be saved in a single text file.

The SAS code I used:

data abc;

infile 'C:\work\xyz.txt' dlm=',' firstobs=2;

input year month day lat long maxtemp;

proc sort data=abc out=tmp1;

by lat long maxtemp;

run;

data abc1; set tmp1;

by lat; if last.lat = 1;

run;

proc print data=abc1;

run;

Although, the SAS log doesn't show any problem, I am unable to get the max. temperature values for all the locations. Any offer of advice will be greatly appreciated.

Thanks for the information. My knowledge of SQL in SAS is minimal, so I am little unwilling to further proceed in analyzing my data using SQL. However, I am very comfortable using other procedures in SAS and have adequate knowledge to enable me to understand the logic behind the SAS codes.

I would like to briefly explain the data structure that I am working with and the end result I would like to achieve using these data sets.

I have a single text file containing temperature observations from five thousand locations, for one year, in this format: year, month, day, latitude, longitide, temperature. Therefore, each location, that has a unique combination of latitude and longitude, has daily temperature observations for one year. Now, I would like to select a row (year, month, day, latitude, longitude, temperature) that has the maximum value of temperature for that location for the entire year. Like that, I need the maximum value of temperature, for each of the 5000 locations, to be saved in a single text file.

The SAS code I used:

data abc;

infile 'C:\work\xyz.txt' dlm=',' firstobs=2;

input year month day lat long maxtemp;

proc sort data=abc out=tmp1;

by lat long maxtemp;

run;

data abc1; set tmp1;

by lat; if last.lat = 1;

run;

proc print data=abc1;

run;

Although, the SAS log doesn't show any problem, I am unable to get the max. temperature values for all the locations. Any offer of advice will be greatly appreciated.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-10-2008 10:51 AM

I think the neatest solution is to combine your first data step import the data into sas, can be done as a view if wish not to store your data twice. Then use the above suggested SQL on that table/view.

/Linus

/Linus

Data never sleeps

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-12-2008 05:42 PM

Thanks Binod and Linus. Although I was a little reluctant to use Proc SQL earlier, eventually I did, and it was far faster than the Data step of SAS. I added a few other things such as creating a table in SQL and exporting the resultant output as an MS Access file using the Export Wizard in SAS. It worked great and the SQL procedure took less than a minute to output the results. Way to go!!

Kris

Kris