Dear all,
I have a data set that looks like this:
data myattrmap;
infile datalines;
input id :$6. research_type :$6.;
datalines;
100 RY2819
20 RC7856
1 RA5034
2 RF1044
3 RW3399
4 RV6060
4667 RR7034
;
run;
I want to output the first, second, third, ... research groups, sorted by id. How can I do that?
"I want to output the first, second, third ..... research groups, sorted by the ids"
I'm not really sure what this means. By output, do you mean you want this in a data set? Can you show us the desired output?
I don't see "research group" in the data you have posted.
Depending on the expected result, PROC SORT plus PROC PRINT, or PROC REPORT, may be enough, or more complex steps may be required. Because you have not told us exactly what is expected, I can't recommend anything.
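As a rough sketch of the PROC SORT plus PROC PRINT route, assuming the goal is simply a listing of id and research_type in id order (the output data set name sorted is made up for illustration):

```sas
/* Sketch only: assumes the data set myattrmap from the question,
   containing the variables id and research_type. */
proc sort data=myattrmap out=sorted;   /* "sorted" is a hypothetical name */
  by id;
run;

proc print data=sorted noobs;          /* noobs suppresses the Obs column */
  var id research_type;
run;
```

Note that if id is stored as a character variable, as in the posted INPUT statement, the sort order is alphabetical (1, 100, 2, 20, ...) rather than numeric.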
Sorry for not explaining well. The variable is research type; "research group" was my mistake.
The actual data set has many variables, which I want to reduce to only the id and the research type.
Each research type has subgroups, e.g. RY2819 has the subgroups RY2819_1, RY2819_2 and RY2819_3, so the data looks something like this:
data myattrmap;
infile datalines;
input id :$6. research_type :$6. sub_research :$10.;
datalines;
100 RY2819 RY2819_1
100 RY2819 RY2819_2
100 RY2819 RY2819_3
20 RC7856 RC7856_1
20 RC7856 RC7856_2
1 RA5034 RA5034_1
1 RA5034 RA5034_2
1 RA5034 RA5034_3
2 RF1044 RF1044_2
2 RF1044 RF1044_6
3 RW3399 RW3399_3
3 RW3399 RW3399_4
3 RW3399 RW3399_5
4 RV6060 RV6060_1
4 RV6060 RV6060_2
4 RV6060 RV6060_3
4667 RR7034 RR7034_2
4667 RR7034 RR7034_3
;
run;
I need a sequence number that counts the 1st, 2nd, 3rd, ... entry of the research type within each id,
so that in the end my output looks like this:
ID    seq_num  research_type  sub_research
100   1        RY2819         RY2819_1
100   2        RY2819         RY2819_2
100   3        RY2819         RY2819_3
20    1        RC7856         RC7856_1
20    2        RC7856         RC7856_2
1     1        RA5034         RA5034_1
1     2        RA5034         RA5034_2
1     3        RA5034         RA5034_3
2     1        RF1044         RF1044_2
2     2        RF1044         RF1044_6
3     1        RW3399         RW3399_3
3     2        RW3399         RW3399_4
3     3        RW3399         RW3399_5
4     1        RV6060         RV6060_1
4     2        RV6060         RV6060_2
4     3        RV6060         RV6060_3
4667  1        RR7034         RR7034_2
4667  2        RR7034         RR7034_3
I hope I have now explained it clearly.
Try this:
data myattrmap;
infile datalines;
input id research_type :$6. sub_research :$10.;
datalines;
100 RY2819 RY2819_1
100 RY2819 RY2819_2
100 RY2819 RY2819_3
20 RC7856 RC7856_1
20 RC7856 RC7856_2
1 RA5034 RA5034_1
1 RA5034 RA5034_2
1 RA5034 RA5034_3
2 RF1044 RF1044_2
2 RF1044 RF1044_6
3 RW3399 RW3399_3
3 RW3399 RW3399_4
3 RW3399 RW3399_5
4 RV6060 RV6060_1
4 RV6060 RV6060_2
4 RV6060 RV6060_3
4667 RR7034 RR7034_2
4667 RR7034 RR7034_3
;
run;
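/* Sort so that all rows of a research type are adjacent */
proc sort data=myattrmap;
  by research_type sub_research;
run;
/* Number the rows within each research type */
data myattrmap1;
  set myattrmap;
  by research_type;
  if first.research_type then seq_num=0;  /* reset at the start of each group */
  seq_num+1;                              /* sum statement: value is retained across rows */
run;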
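If the counter should restart for each id rather than for each research_type, and the result should come out in id order, the same idea can be sketched with id as the BY variable. This assumes id is numeric, as in the step above. The data set names sorted_by_id and numbered are made up for illustration, and the KEEP= option is just one way to drop the other variables mentioned earlier.

```sas
/* Sketch, assuming myattrmap was created as above with a numeric id */
proc sort data=myattrmap
          out=sorted_by_id (keep=id research_type sub_research);
  by id research_type sub_research;
run;

data numbered;                   /* hypothetical output name */
  set sorted_by_id;
  by id;
  if first.id then seq_num = 0;  /* restart the counter for each id */
  seq_num + 1;                   /* sum statement: retained across rows */
run;

proc print data=numbered noobs;
  var id seq_num research_type sub_research;
run;
```

With a numeric id the rows come out in the order 1, 2, 3, 4, 20, 100, 4667, which differs from the order in the posted example. If the original row order has to be preserved, a different approach would be needed.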