Please feel free to tell me to RTFM (just give friendly location). I am using SAS v9.4. I have a reasonably large data set, that I need to select certain rows to output to a new data set.I'm attaching a couple of examples. The variables are: Obs Grpelem Grindex Index Pfqmn Ffqmn Minfqmn Maxfqmn Bwmn Pkfqme Ffqme Minfqme Maxfwme Bwme Fmcthresh Sub Index1
1 2 23 1 39000 39000 38000 41000 2900 38600 38600 38400 39800 2200 . 8073F -2
2 2 25 1 70300 70300 64400 72200 7800 68100 68100 67500 69700 2600 . 8073F -2
3 2 25 2 44900 44900 43900 46800 2900 44800 44800 44000 46400 2600 23300 8073F -1
.
.
.17 2 148 1 38000 38000 36100 41000 4800 37900 37900 37900 39000 2100 . 8073F -2
18 2 178 1 37100 37100 35100 39000 3900 36200 36200 36100 37400 2300 . 8073F -2
19 4 183 1 37100 37100 35100 39000 3900 36800 36900 36600 37900 2200 . 8073F -2
20 3 201 1 36100 36100 35100 39000 3900 35900 35900 36200 36900 2000 . 8073F -2
21 3 201 2 67300 33600 66400 69300 2900 67300 44900 66400 69300 2900 9000 8073F -1
22 3 201 3 33200 33200 32200 35100 2900 33200 33200 32500 35000 2500 11700 8073F -1
I need to select the rows where Grindex has the same value in multiple rows (e.g. rows 2,3 + rows 20, 21, 22 in the example above). The actual number in the Grindex column will vary, but some rows will have repeats for each subject. These are the rows I need to put into a new data set for further processing. I can't provide any code, as this one has me stumped. The attempts I've made have failed as I can't figure out how to select multiple rows on a variable that will change multiple times within a subject and between subjects.
It's difficult to see where SUBJECT comes from or how it might be used, but this should be a good set of tools to apply:
data want;
set have;
by grindex notsorted;
if first.grindex=0 or last.grindex=0;
run;
It's conceivable (but not clear) that you would need to add another variable to the BY statement.
It's 100% forgivable that you didn't find this in the documentation. It's a combination of tools that you wouldn't normally see in your first few months of programming.
Simplifying your problem, see if something like this works for your situation:
data have;
input obs grp num;
datalines;
1 1 25
2 1 25
3 1 44
4 2 3
5 3 64
6 3 64
;
run;
proc sql;
create table want(drop=_cnt) as
select *, sum(1) as _cnt from have
group by grp,num
having _cnt gt 1
order by obs
;quit;
-unison
It's difficult to see where SUBJECT comes from or how it might be used, but this should be a good set of tools to apply:
data want;
set have;
by grindex notsorted;
if first.grindex=0 or last.grindex=0;
run;
It's conceivable (but not clear) that you would need to add another variable to the BY statement.
It's 100% forgivable that you didn't find this in the documentation. It's a combination of tools that you wouldn't normally see in your first few months of programming.
This did exactly what I wanted. Thanks! Can see I need to play with this code to fully understand it, but at least I can move forward with the rest of the data cleanup/checking.
For learning purposes, it's a good idea to play with a small sample of your data so that you're not bogged down by the details. In your case, you can take a look at what "first.grindex" and "last.grindex" are for each observation by doing something like the following (using a simplified 'have' dataset):
data have;
input obs grp num;
datalines;
1 1 25
2 1 25
3 1 44
3 2 25
4 2 3
5 3 64
6 3 64
;
run;
data want;
set have;
by num notsorted;
first_val = first.num;
last_val = last.num;
/* if first.num=0 or last.num=0;*/
run;
Cheers!
-unison
Yup, this is what I had in mind also. Just have to get existing data pulled together and cleaned up, so analysis can begin. Will have to circle back to this next week. Thanks Again!!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.