Hi Team,
I was trying to understand this simple set statement and confused a little bit. Could someone help me understand the logic behind this???
I COULD SEE THAT THIS PROGRAM OUTPUTS THE FIRST 10 of THE 19 OBS FROM SASHELP>CLASS!!!!!!!!!!But why is it like that
I thought it gives observations greater than height 65 from the first set and all the obs from the second set!!!
data trial2;
set sashelp.class(keep=height);
if height > 62.5;
set sashelp.class;
run;
This is reproduced from SAS documentation :
data drugxyz;
set trial5(keep=sample);
if sample > 2;
set trial5;
run;
Regards
955 proc sql noprint ;
956 select count(*) into :nobs
957 from sashelp.class where height > 62.5
958 ;
959 %put nobs=&nobs ;
nobs= 10
960 quit;
Think about what the subsetting IF statement does.
How many times do you think the second SET statement is going to execute?
955 proc sql noprint ;
956 select count(*) into :nobs
957 from sashelp.class where height > 62.5
958 ;
959 %put nobs=&nobs ;
nobs= 10
960 quit;
Think about what the subsetting IF statement does.
How many times do you think the second SET statement is going to execute?
Hi,
I guess you showed me what happens with the first set stmnt.
i dint still get the usefulness of the second set stmnt
Regards
OK GOT U.
It is subsetted and then the same no of observations are taken when u put the second set stmnt.
If the second set statement has a diff dataset than the first then additive effect????right???
Thnks
I would be hard fetched to find a practical application for such a contrived program.
How about you are screening subjects to find those that meet the enrollment criteria. When you find a subject you need to assign them the next randomization number. So the first SET statement could be list of subjects in chronological order. The subsetting IF statement tests if they meet the entrance criteria. The second SET statement will be the list of randomizations. So only subjects that meet the criteria make it past the IF statement. Then the second SET statement adds in the values of the variables from the randomization file (randomization number, treatment, etc.). So the resulting dataset is the list of randomized subjects with their randomization information.
Thanks Tom for the detailed information.
BUT running these steps below:
data trial2;
set sashelp.class(keep=height);
if height > 62.5;
set sashelp.class;
run;
I am getting observations in which height is lessthan 62.5...
How can that be possible
Regards
Because HEIGHT is a variable in the dataset referenced in the second SET statement. So the value from that record overwrites any value that is already there for the current iteration.
Check the result when you run this data step.
data new ;
height = 100;
set sashelp.class;
run;
Karun,
Or, similar to Tom's suggestion,just look at your results and compare them with sashelp.class:
sashelp.class
obs name sex age height weight
1 | Alfred | M | 14 | 69.0 | 112.5 |
---|---|---|---|---|---|
2 | Alice | F | 13 | 56.5 | 84.0 |
3 | Barbara | F | 13 | 65.3 | 98.0 |
4 | Carol | F | 14 | 62.8 | 102.5 |
5 | Henry | M | 14 | 63.5 | 102.5 |
6 | James | M | 12 | 57.3 | 83.0 |
7 | Jane | F | 12 | 59.8 | 84.5 |
8 | Janet | F | 15 | 62.5 | 112.5 |
9 | Jeffrey | M | 13 | 62.5 | 84.0 |
10 | John | M | 12 | 59.0 | 99.5 |
11 | Joyce | F | 11 | 51.3 | 50.5 |
12 | Judy | F | 14 | 64.3 | 90.0 |
13 | Louise | F | 12 | 56.3 | 77.0 |
14 | Mary | F | 15 | 66.5 | 112.0 |
15 | Philip | M | 16 | 72.0 | 150.0 |
16 | Robert | M | 12 | 64.8 | 128.0 |
17 | Ronald | M | 15 | 67.0 | 133.0 |
18 | Thomas | M | 11 | 57.5 | 85.0 |
19 | William | M | 15 | 66.5 | 112.0 |
trial2
1 | 69.0 | Alfred | M | 14 | 112.5 |
---|---|---|---|---|---|
2 | 56.5 | Alice | F | 13 | 84.0 |
3 | 65.3 | Barbara | F | 13 | 98.0 |
4 | 62.8 | Carol | F | 14 | 102.5 |
5 | 63.5 | Henry | M | 14 | 102.5 |
6 | 57.3 | James | M | 12 | 83.0 |
7 | 59.8 | Jane | F | 12 | 84.5 |
8 | 62.5 | Janet | F | 15 | 112.5 |
9 | 62.5 | Jeffrey | M | 13 | 84.0 |
10 | 59.0 | John | M | 12 | 99.5 |
Notice that you are only getting 10 records and they match the first 10 records from sashelp.class. You are getting 10 records because there are 10 records that meet your condition. However, the 2nd set statement is only enacted if the first one's condition is met.
Thus, there are 10 records that meet your condition but, because of your 2nd set statement, you are actually outputting the first 10 records from sashelp.class.
There are several ways to end a data step besides clicking 'break' button. Stop, Abort or nature ending. Your question involves nature ending which is defined by the table finished first within the same data step. In your original scenario, the first table finished first:
data trial2;
set sashelp.class(keep=height);
if height > 62.5;
ct+1;
set sashelp.class;
run;
you can check how many times it actually executes passing through "if height > 62.5;" by adding a counter 'ct'. When it sequentially reaches the end marker of first table, data step stops regardless how many records left to be read in the second one.
As you may start wondering, another nature ending is to end the second one first, then you need to change access method of the first table from 'sequential' to 'random' by using point=, by doing that, you never get to reach the end marker of first table, even after the last record has been read. The following example is to show how the second table ends first, so you can have all 19 records from second table. however, "if height > 62.5;" is literally useless here, as the second table will always rewrite PDV before output.
data trial2;
ct+1;
set sashelp.class(keep=height) point=ct nobs=nobs;
if height > 62.5;
set sashelp.class;
if ct=nobs then ct=0;
run;
Haikuo
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.