- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I have a dataset with comments similar to the dataset below. I would like to output each comments and its duplicates into different sheet in excel. How do I find the duplicate, after sorting without deleting duplicate and outputting them to each excel sheet?
For example:
data have;
length comment $ 1000;
input comment$ 1 - 1000;
datalines;
Since it is impossible to completely and completely quit using dependent drugs, it is better to face drug abuse with a positive, rational and practical attitude and point of view
Substance abuse refers to the intermittent or persistent excessive use of psychoactive substances that is contrary to social norms or is irrelevant or inconsistent with accepted medical practice.Such abuse is far from tentative use, social entertainment or situational need, but gradually turns into a state of intensive use, which leads to the formation of dependence.
Substance abuse refers to the intermittent or persistent excessive use of psychoactive substances that is contrary to social norms or is irrelevant or inconsistent with accepted medical practice.Such abuse is far from tentative use, social entertainment or situational need, but gradually turns into a state of intensive use, which leads to the formation of dependence.
Need safe spaces and safe disposable sites
Need safe spaces and safe disposable sites
More attention has been paid to the concept of harm reduction.
NO
No opinion
Marijuana should be legal.
Need safe spaces and safe disposable sites
The use of drugs should be vigorously suppressed
The use of drugs should be vigorously suppressed
Very bad behavior
Need safe spaces and safe disposable sites
Increase control
nothing
nothing
I just hope there will be when I need it
I just hope there will be when I need it
Increase control
No opinion
;
run;
Expected results:
Sheet 1
Need safe spaces and safe disposable sites
Need safe spaces and safe disposable sites
Need safe spaces and safe disposable sites
Need safe spaces and safe disposable sites
Sheet 2:
nothing
nothing
sheet 3:
I just hope there will be when I need it
I just hope there will be when I need it
sheet 4:
Substance abuse refers to the intermittent or persistent excessive use of psychoactive substances that is contrary to social norms or is irrelevant or inconsistent with accepted medical practice.Such abuse is far from tentative use, social entertainment or situational need, but gradually turns into a state of intensive use, which leads to the formation of dependence.
Substance abuse refers to the intermittent or persistent excessive use of psychoactive substances that is contrary to social norms or is irrelevant or inconsistent with accepted medical practice.Such abuse is far from tentative use, social entertainment or situational need, but gradually turns into a state of intensive use, which leads to the formation of dependence.
and so on.
Thanks
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Does that data read in correctly for you? It doesn't for me.
If order of the results matter you need to add a variable to control the order or you may have one in your data set already so leaving that as an exercise for you to solve.
If you have other destinations open you may want to close them as otherwise you'll get a lot of printed output.
proc sort data=have out=duplicates nouniquekey;
by comment;
run;
ods excel file='/home/fkhurshed/Demo1/duplicates.xlsx' options(sheet_interval="bygroup" sheet_label = '' sheet_name = "Duplicate");
options nobyline;
proc print data=duplicates;
by comment;
var comment;
run;
ods excel close;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Does that data read in correctly for you? It doesn't for me.
If order of the results matter you need to add a variable to control the order or you may have one in your data set already so leaving that as an exercise for you to solve.
If you have other destinations open you may want to close them as otherwise you'll get a lot of printed output.
proc sort data=have out=duplicates nouniquekey;
by comment;
run;
ods excel file='/home/fkhurshed/Demo1/duplicates.xlsx' options(sheet_interval="bygroup" sheet_label = '' sheet_name = "Duplicate");
options nobyline;
proc print data=duplicates;
by comment;
var comment;
run;
ods excel close;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This code works but I wanted to see all other variables so excluded the "var" statement so that sas can include all variables in the dataset. However, sas included all the variable except "comment". Why is that? Is it because am using "by comment"? How could I have comment and all other variables in the dataset. I don't want to use keep statement because I have a lot of variables.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
VAR statement controls what is displayed. Remove the VAR statement and I think it prints all variables by default.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Reeza Yes i removed the var and all variables was displayed except comment. I want variable "comment" to be displayed too. How will I get it ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
var _all_;
Select all variables for printing then.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content