12-09-2017 08:14 AM

Dear all,

I'm writing code in SAS University Edition (9.4) to produce a dataset with columns i, j, X, Y. The code is below:

```
data work.Rand;
%macro Normal_Simulation;
call streaminit(567);
do i=1 to 50;
X=Rand("normal",0.02*j-1,1.0);
IF X>3 OR X<-3 THEN DO
Y=X;
X=0;
i=i-1;
END;
ELSE Y=0;
/*u=Rand("uniform");*/
output;
end;
%mend;
do j=0 to 99;
%Normal_Simulation;
end;
run;
PROC SQL;
CREATE TABLE work.query AS
SELECT j , i , X , Y FROM work.rand;
/*WHERE J=&j AND Y=0;*/
RUN;
QUIT;
```

Can I use an SQL expression in the WHERE clause here to return only the rows in, say, the top 10 percent of X for each j in this table? Thank you.

MKW

Accepted Solutions

Posted in reply to Michaelcwang2

12-10-2017 07:59 PM

Dear Tom,

For some reason, it still doesn't work this way.

```
proc sql ;
create table want as
select X.*
from rand1 X , percentile1 P_n
where X.j= percentile1.j
and X.x > P_n.p95
```
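One possible reason the query above fails: the FROM clause gives `percentile1` the alias `P_n`, but the WHERE clause still qualifies the column as `percentile1.j`, which PROC SQL may reject as an unresolved reference. The pasted statement also has no terminating semicolon or `quit;`. A minimal sketch with consistent aliases, assuming `rand1` and `percentile1` exist and that `percentile1` really contains a column named `p95`:

```sas
proc sql;
  create table want as
  select X.*
  from rand1 X, percentile1 P_n
  where X.j = P_n.j        /* use the alias P_n, not the table name */
    and X.x > P_n.p95      /* assumes the cutoff column is called p95 */
  ;
quit;
```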

But I did solve my problem by using PROC MEANS to create another dataset holding the P90 of X for each j, then merging it with the original dataset "Rand" by j and deleting observations where X < P90. Anyway, thank you for all your support!

MKW
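The approach described in this post could look roughly like the sketch below. It is not the poster's actual code: the dataset names `work.p90_by_j`, `work.rand_sorted`, and `work.top10` are hypothetical, and the `Y = 0` filter is an assumption borrowed from the PROC MEANS step shown later in the thread.

```sas
/* Step 1: one row per j holding the 90th percentile of X. */
proc means data=work.Rand noprint nway;
  where Y = 0;                       /* assumption: ignore placeholder rows */
  class j;
  var X;
  output out=work.p90_by_j (drop=_type_ _freq_) p90=P90;
run;

/* Step 2: MERGE with BY needs both inputs in j order. */
proc sort data=work.Rand out=work.rand_sorted;
  by j;
run;

/* Step 3: attach the cutoff to every row and drop rows below it. */
data work.top10;
  merge work.rand_sorted work.p90_by_j;
  by j;
  if Y ne 0 or X < P90 then delete;
run;
```

The result keeps, for each j, only the observations at or above that j's 90th percentile.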

All Replies

Posted in reply to Michaelcwang2

12-09-2017 11:50 AM

1. Never define a macro inside a data step

2. Show what you have (Data) and what you want

3. Explain the actual problem you’re trying to solve

4. Don’t be loopy.

Before making a macro, you should also start with working code. Can you show what your solution looks like before it’s a macro?
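As a rough illustration of point 1, a macro definition can sit before the step that calls it rather than inside it. This is only a sketch: the macro body is simplified (the out-of-range handling from the original post is left out), and the macro is not actually needed for this task.

```sas
/* Define the macro first, outside any step. */
%macro normal_simulation;
  do i = 1 to 50;
    X = rand("normal", 0.02*j - 1, 1.0);
    output;
  end;
%mend normal_simulation;

/* Then call it inside the data step; it expands to the DO loop above. */
data work.Rand;
  call streaminit(567);
  do j = 0 to 99;
    %normal_simulation
  end;
run;
```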

Posted in reply to Reeza

12-09-2017 04:02 PM

Hi Reeza,

By running the macro, I'll have 200 normal distributions with a moving mean, each populated by 50 (or a few more) values of X, with an indicator Y that is 0 unless X falls beyond a limit. I can collect these X values (50*200) and then analyze their descriptive statistics with PROC MEANS by each j.

Now I have a special interest in the largest few values, such as those above the n-th percentile of each distribution, so I would like to slice them out of this dataset and run the same analysis with a similar PROC MEANS.

```
proc means
  data=work.query
  chartype NWAY
  mean std min max n vardef=df skew SKEWNESS KURT KURTOSIS median;
  var X;
  output
    out=work.skewtemp
    skew=Distskew KURT=DISKURT max=DISmax median=DISmedian min=DISmin;
  where (j between 0 and 199) and Y=0;
  class J;
run;
```

Hopefully it helps to clarify my problem. Thank you!

MKW

Posted in reply to Michaelcwang2

12-09-2017 12:57 PM

After removing the unnecessary macro and correcting errors (like the missing semicolon after then do), this is your code.

```
data work.Rand;
  do j = 0 to 99;
    call streaminit(567);
    do i = 1 to 50;
      X = Rand("normal",0.02*j-1,1.0);
      if X > 3 or X < -3
      then do;
        Y = X;
        X = 0;
        i = i - 1;
      end;
      else Y = 0;
      /*u=Rand("uniform");*/
      output;
    end;
  end;
run;

proc sql;
  create table work.query as
  select j, i, X, Y
  from work.rand
  /*where J = &j and Y = 0*/
  ;
quit;
```

From where would you get &j?

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

How to convert datasets to data steps

How to post code

Posted in reply to Michaelcwang2

12-09-2017 08:06 PM

If I use PROC UNIVARIATE to produce a dataset holding the n-th percentile of X by j, is there a way in SQL to select the rows with X above these values for each j from the original dataset? Thank you.

Posted in reply to Michaelcwang2

12-09-2017 09:27 PM

If you have one dataset, HAVE, with J and many X values and another dataset, MEANS, with J and a cutoff value, say P95, then just join them.

```
proc sql ;
create table want as
select a.*
from have a , means b
where a.j= b.j
and a.x > b.p95
;
quit;
```
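For completeness, here is one way the MEANS dataset used in the join above might be built. This is a sketch under assumptions: `have` holds the variables `j` and `x`, and PROC UNIVARIATE is used only because it was mentioned earlier in the thread. The `pctlpts=95 pctlpre=p` options name the output variable `p95`, which matches the column the join expects.

```sas
/* One row per j with the 95th percentile of x stored in variable p95. */
proc univariate data=have noprint;
  class j;
  var x;
  output out=means pctlpts=95 pctlpre=p;
run;
```

Changing `pctlpts=95` to another value (for example 90) changes both the cutoff and the variable name (`p90`), so the join condition would need the same update.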

Posted in reply to Michaelcwang2

12-09-2017 09:55 PM

Thank you, Tom. I will check whether it solves the problem.

Posted in reply to Michaelcwang2

12-10-2017 11:13 PM

If your goal is to find the values above the 95th percentile, I would use PROC RANK instead and then filter on the rank directly.
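A minimal sketch of that idea, assuming `work.Rand` contains `j` and `X` as in the code earlier in the thread. The names `work.rand_sorted`, `work.ranked`, `work.top5`, and `X_pctl` are hypothetical. With `groups=100`, PROC RANK assigns each observation a group number from 0 to 99 within its BY group, so no separate cutoff dataset or join is needed.

```sas
/* BY-group processing needs the data in j order. */
proc sort data=work.Rand out=work.rand_sorted;
  by j;
run;

/* Assign each X a percentile group (0 = lowest, 99 = highest) within j. */
proc rank data=work.rand_sorted out=work.ranked groups=100;
  by j;
  var X;
  ranks X_pctl;
run;

/* Keep only the observations in the top five percentile groups. */
data work.top5;
  set work.ranked;
  where X_pctl >= 95;
run;
```

With only about 50 observations per j, the percentile groups are coarse, so the number of rows kept per j will be small and can vary with ties.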