Dear all,
If I have data that looks like this:
data myattrmap;
infile datalines;
input id :$6. research_type :$6.;
datalines;
100 RY2819
20 RC7856
1 RA5034
2 RF1044
3 RW3399
4 RV6060
4667 RR7034
;
run;
and I want to output the first, second, third, ... research groups, sorted by ID. How can I do that?
"I want to output the first, second, third, ... research groups, sorted by ID"
I'm not really sure what this means. By "output", do you mean you want this in a data set? Can you show us the desired output?
I don't see "research group" in the data you have posted.
Depending on the expected result, either proc sort + proc print or proc report may be enough, or more complex steps may be required. Because you have not told us exactly what is expected, I can't recommend anything.
Sorry for not explaining well. The variable is research type; "research group" was my mistake.
Actually, the data set has many variables, which I wish to reduce to only the id and the research type.
The research type has subgroups, e.g. RY2819 has subgroups RY2819_1, RY2819_2, RY2819_3, so that I then have something like this:
data myattrmap;
infile datalines;
input id :$6. research_type :$6. sub_research :$10.;
datalines;
100 RY2819 RY2819_1
100 RY2819 RY2819_2
100 RY2819 RY2819_3
20 RC7856 RC7856_1
20 RC7856 RC7856_2
1 RA5034 RA5034_1
1 RA5034 RA5034_2
1 RA5034 RA5034_3
2 RF1044 RF1044_2
2 RF1044 RF1044_6
3 RW3399 RW3399_3
3 RW3399 RW3399_4
3 RW3399 RW3399_5
4 RV6060 RV6060_1
4 RV6060 RV6060_2
4 RV6060 RV6060_3
4667 RR7034 RR7034_2
4667 RR7034 RR7034_3
;
run;
I need a sequential number in the output showing the 1st, 2nd, 3rd, ... research type within the id,
so that in the end my output looks like this:
ID seq_num research_type sub_research
100 1 RY2819 RY2819_1
100 2 RY2819 RY2819_2
100 3 RY2819 RY2819_3
20 1 RC7856 RC7856_1
20 2 RC7856 RC7856_2
1 1 RA5034 RA5034_1
1 2 RA5034 RA5034_2
1 3 RA5034 RA5034_3
2 1 RF1044 RF1044_2
2 2 RF1044 RF1044_6
3 1 RW3399 RW3399_3
3 2 RW3399 RW3399_4
3 3 RW3399 RW3399_5
4 1 RV6060 RV6060_1
4 2 RV6060 RV6060_2
4 3 RV6060 RV6060_3
4667 1 RR7034 RR7034_2
4667 2 RR7034 RR7034_3
I hope I have now explained it clearly.
Try this:
data myattrmap;
infile datalines;
input id research_type :$6. sub_research :$10.;
datalines;
100 RY2819 RY2819_1
100 RY2819 RY2819_2
100 RY2819 RY2819_3
20 RC7856 RC7856_1
20 RC7856 RC7856_2
1 RA5034 RA5034_1
1 RA5034 RA5034_2
1 RA5034 RA5034_3
2 RF1044 RF1044_2
2 RF1044 RF1044_6
3 RW3399 RW3399_3
3 RW3399 RW3399_4
3 RW3399 RW3399_5
4 RV6060 RV6060_1
4 RV6060 RV6060_2
4 RV6060 RV6060_3
4667 RR7034 RR7034_2
4667 RR7034 RR7034_3
;
run;
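proc sort data=myattrmap;
by id research_type sub_research;
run;
data myattrmap1;
set myattrmap;
by id research_type;
/* restart the counter at the first row of each research type within an id */
if first.research_type then seq_num=0;
/* sum statement: seq_num is retained across rows and incremented by 1 */
seq_num+1;
run;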
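To compare the result with the listing requested above, a quick check could look like the following. This is only a sketch: it assumes the data set myattrmap1 created by the step above and simply prints the four columns in the requested order.

```sas
/* print the numbered rows without the OBS column, in the requested column order */
proc print data=myattrmap1 noobs;
var id seq_num research_type sub_research;
run;
```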