Michaelcwang2
Obsidian | Level 7

Dear all,

I'm writing code in SAS University Edition (9.4) to produce a dataset with columns i, j, X, and Y. The code is below:

data work.Rand;

%macro Normal_Simulation;

call streaminit(567);
do i=1 to 50;
	X=Rand("normal",0.02*j-1,1.0);
	IF X>3 OR X<-3 THEN DO
	Y=X;
	X=0;
	i=i-1;
	END;
	
	ELSE Y=0;
	/*u=Rand("uniform");*/
output;
end;

%mend;

do j=0 to 99;
%Normal_Simulation;
end;
run;

PROC SQL;
   CREATE TABLE work.query AS
   SELECT j , i , X , Y FROM work.rand; 
   /*WHERE J=&j AND Y=0;*/
   RUN;
   QUIT;

Can I use a SQL expression in the WHERE clause here to keep only the top 10 percent of X values for each j in this table? Thank you.

 

MKW

 

ACCEPTED SOLUTION
Michaelcwang2
Obsidian | Level 7

Dear Tom,

For some reason, it still doesn't work this way.

proc sql ;
create table want as
select X.*
from rand1 X , percentile1 P_n
where X.j= percentile1.j
  and X.x > P_n.p95

But I did solve my problem by using PROC MEANS to create another dataset holding the P90 of X for each j, merging it with the original dataset "Rand" by j, and deleting observations where X < P90. Anyway, thank you for all your support!

 

MKW 
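The PROC MEANS and merge approach described in this post could look roughly like the sketch below. It assumes the dataset work.rand with columns j and X from the question; the dataset names p90_by_j and top10 are hypothetical and were not given in the thread.

```sas
/* Sketch only: p90_by_j and top10 are hypothetical names. */

/* One row per j holding the 90th percentile of X */
proc means data=work.rand noprint nway;
  class j;
  var X;
  output out=work.p90_by_j (drop=_type_ _freq_) p90=P90;
run;

/* A BY-merge needs both inputs in j order */
proc sort data=work.rand out=work.rand_sorted;
  by j;
run;

/* Attach the cutoff to every observation and drop those below it */
data work.top10;
  merge work.rand_sorted work.p90_by_j;
  by j;
  if X < P90 then delete;
run;
```

As for the join that failed, one possible cause is that the WHERE clause refers to percentile1.j although that table was given the alias P_n in the FROM clause; the log is not shown, so this is only a guess. The query also references p95, which exists only if the percentile dataset actually contains a column with that name.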


8 REPLIES 8
Reeza
Super User

1. Never define a macro inside a data step

2. Show what you have (Data) and what you want

3. Explain the actual problem you’re trying to solve

4. Don’t be loopy. 

 

Before making a macro, you should also start with working code. Can you show what your solution looks like before it’s a macro?

Michaelcwang2
Obsidian | Level 7

Hi Reeza,

By running the macro, I'll have 200 normal distributions with a moving mean, each populated by 50 (or a few more) values of X, with an indicator Y set to 0 or 1 depending on whether a value falls beyond a limit. I can collect these X values (50*200) and then analyze their descriptive statistics with PROC MEANS by each j.

Now I am especially interested in the largest values, such as those above the n-th percentile of each distribution, so I would like to slice them out of this dataset and run the same analysis with a similar PROC MEANS.

proc means 
   data=work.query 
     chartype NWAY 
     mean std min max n vardef=df skew SKEWNESS KURT KURTOSIS median;
   var X;
   output 
   out=work.skewtemp 
     skew=Distskew KURT=DISKURT max=DISmax median=DISmedian min=DISmin; 
     where (j between 0 and 199) and Y=0;       
     class J;	
   run;

Hopefully it helps to clarify my problem. Thank you!

 

MKW

Kurt_Bremser
Super User

After removing the unnecessary macro and correcting errors (like the missing semicolon after then do), this is your code.

data work.Rand;
do j = 0 to 99;
  call streaminit(567);
  do i = 1 to 50;
    X = Rand("normal",0.02*j-1,1.0);
    if X > 3 or X < -3
    then do;
      Y = X;
      X = 0;
      i = i - 1;
    end;
    else Y = 0;
    /*u=Rand("uniform");*/
    output;
  end;
end;
run;

proc sql;
create table work.query as
select j, i, X, Y
from work.rand 
/*where J = &j and Y = 0*/
;
quit;

From where would you get &j?

Michaelcwang2
Obsidian | Level 7
If I use PROC UNIVARIATE to produce a dataset with the n-th percentile of X for each j, is there a way in SQL to select the rows where X exceeds these values for each j from the original dataset? Thank you.
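For reference, a per-j percentile dataset of the kind asked about here might be produced as in the sketch below. It assumes work.rand from the code above; the output name work.percentile and the choice of the 90th and 95th percentiles are illustrative assumptions.

```sas
/* Sketch only: work.percentile is a hypothetical output name. */

/* BY-group processing needs the input in j order */
proc sort data=work.rand out=work.rand_sorted;
  by j;
run;

/* PCTLPTS= lists the percentiles; PCTLPRE= sets the variable prefix,
   so this creates the columns P90 and P95, one row per j */
proc univariate data=work.rand_sorted noprint;
  by j;
  var X;
  output out=work.percentile pctlpts=90 95 pctlpre=P;
run;
```

The result has one row per j with the cutoff columns, which is the shape needed for joining back to the original data on j.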
Tom
Super User

If you have one dataset, HAVE, with J and many X values and another dataset, MEANS, with J and a cutoff value, say P95, then just join them.

proc sql ;
create table want as
select a.*
from have a , means b
where a.j= b.j 
  and a.x > b.p95
;
quit;

 

Michaelcwang2
Obsidian | Level 7
Thank you, Tom. I will check whether it solves the problem.

Reeza
Super User

If your goal is to figure out what's higher than the 95th percentile, I would use the RANK proc instead and then filter that out directly.
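A minimal sketch of the PROC RANK idea, assuming work.rand from earlier in the thread; the names work.ranked, work.top5, and X_pctl are hypothetical. With GROUPS=100 the rank variable takes values from 0 to 99, so values of 95 and above approximate the top 5 percent within each j. The group boundaries will not necessarily match the P95 computed by PROC MEANS or PROC UNIVARIATE exactly, especially with only about 50 observations per j.

```sas
/* Sketch only: ranked, top5 and X_pctl are hypothetical names. */

/* BY-group processing needs the input in j order */
proc sort data=work.rand out=work.rand_sorted;
  by j;
run;

/* Assign each X to one of 100 rank groups (0 to 99) within its j */
proc rank data=work.rand_sorted out=work.ranked groups=100;
  by j;
  var X;
  ranks X_pctl;
run;

/* Keep only the observations in the highest rank groups */
data work.top5;
  set work.ranked;
  where X_pctl >= 95;
run;
```

This avoids the separate cutoff dataset and the join, at the cost of working with rank groups rather than an explicit percentile value.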

 

 

