Hi,
I want distinct values to be shown in the report for each column based on grouped by C and D columns. Could anyone please help me in achieving the below mentioned output? I have attached Test.txt file which contains Input data.
Input data:-
A | B | C | D | E | F |
101 | 1 | aaaa | aaa | a | aa |
101 | 1 | aaaa | aaa | b | aa |
101 | 1 | aaaa | aaa | c | bb |
101 | 1 | aaaa | aaa | d | cc |
101 | 1 | aaaa | aaa | a | aa |
101 | 1 | aaaa | aaa | b | cc |
101 | 1 | aaaa | aaa | c | aa |
101 | 1 | aaaa | aaa | d | aa |
101 | 1 | aaaa | aaa | a | cc |
101 | 1 | aaaa | aaa | b | cc |
101 | 2 | bbbb | bbb | c | dd |
101 | 2 | bbbb | bbb | d | ee |
101 | 2 | bbbb | bbb | e | ff |
101 | 2 | bbbb | bbb | f | dd |
101 | 2 | bbbb | bbb | c | ee |
101 | 2 | bbbb | bbb | d | ff |
101 | 2 | bbbb | bbb | e | dd |
101 | 2 | bbbb | bbb | f | ee |
101 | 2 | bbbb | bbb | c | ff |
101 | 2 | bbbb | bbb | d | dd |
101 | 2 | bbbb | bbb | e | ee |
101 | 2 | bbbb | bbb | f | ff |
102 | 1 | cccc | ddd | a | gg |
102 | 1 | cccc | ddd | b | hh |
102 | 1 | cccc | ddd | c | gg |
102 | 1 | cccc | ddd | d | jj |
102 | 1 | cccc | ddd | e | kk |
102 | 1 | cccc | ddd | f | ll |
102 | 2 | dddd | eee | g | mm |
102 | 2 | dddd | eee | h | nn |
102 | 2 | dddd | eee | i | rr |
102 | 2 | dddd | eee | j | pp |
102 | 2 | dddd | eee | k | |
102 | 2 | dddd | eee | l | rr |
102 | 2 | dddd | eee | m | ss |
102 | 2 | dddd | eee | n | tt |
Desired Output:-
A | B | C | D | E | F |
101 | 1 | aaaa | aaa | a | aa |
101 | 1 | b | aa | ||
101 | 1 | c | bb | ||
101 | 1 | d | cc | ||
101 | 2 | bbbb | bbb | c | dd |
101 | 2 | d | ee | ||
101 | 2 | e | ff | ||
101 | 2 | f | |||
102 | 1 | cccc | ddd | a | gg |
102 | 1 | b | hh | ||
102 | 1 | c | jj | ||
102 | 1 | d | kk | ||
102 | 1 | e | ll | ||
102 | 1 | f | |||
102 | 2 | dddd | eee | g | mm |
102 | 2 | h | nn | ||
102 | 2 | i | rr | ||
102 | 2 | j | pp | ||
102 | 2 | k | |||
102 | 2 | l | ss | ||
102 | 2 | m | tt | ||
102 | 2 | n |
In your proc report (assuming sorted):
proc report data=...;
columns a b c d e f;
define a / 'ColA';
define b / 'ColB';
define c / group order=internal; /* this is the real part - use group or order */
define d / group order=internal
For more guidance: http://support.sas.com/resources/papers/proceedings11/090-2011.pdf
Thanks for the code. However, it's not working as per the requirement. If you notice the desired output, C, D, E and F columns values are unique based on C and D grouping.
When you do grouping the result will be distinct values. This is what you mentioned.
As you are saying this in not your requirement what is your requirement?
Distinct Values should be pulled for C, D, E and F columns as per the desired output requirement provided above.
You are still not clear in your requirement Distinct and Unique are difffernt words having same meaning within the SQL language.
The meaning is the same result as by grouping ignoring grouping results. There are a lot of dogs called Max but not every dog is names Max.
You have to be very precise in your description condition and wanted results.
As you do that you probably will already have your answer.
If Proc report or proc sql used, I am getting distinct values for all the rows which is correct according to SQL or SAS where in my requirement is to pull unique values for all the columns. Unique values should be in such a way that according A and B, rest of the columns values to be unique. Instead of rows as unique I need columns as unique which is linked to A and B columns.
Do you want a report (print layout) ... In a tabular form like report is often a readable approach the headers for grouping are not repeating
This is what you have done to columns C and D not A and B
Do you want to have an other dataset for further processing. All levels of information caused them by joining a indentification should be there this is what you have done with columns A, B not C, D?
What should it be one of the two: 1/ printable lay out or 2/ data processing ?
So if you eliminate the duplicates and then use proc report with specifying the first four columns as ORDER then the display is similar to what you wanted, but the first two columns will be blanked out when the values are the same just like the second two in your requested report.
data have ;
infile cards dsd dlm='|';
input a b (c d e f) ($) ;
cards;
101|1|aaaa|aaa|a|aa
101|1|aaaa|aaa|b|aa
101|1|aaaa|aaa|c|bb
101|1|aaaa|aaa|d|cc
101|1|aaaa|aaa|a|aa
101|1|aaaa|aaa|b|cc
101|1|aaaa|aaa|c|aa
101|1|aaaa|aaa|d|aa
101|1|aaaa|aaa|a|cc
101|1|aaaa|aaa|b|cc
101|2|bbbb|bbb|c|dd
101|2|bbbb|bbb|d|ee
101|2|bbbb|bbb|e|ff
101|2|bbbb|bbb|f|dd
101|2|bbbb|bbb|c|ee
101|2|bbbb|bbb|d|ff
101|2|bbbb|bbb|e|dd
101|2|bbbb|bbb|f|ee
101|2|bbbb|bbb|c|ff
101|2|bbbb|bbb|d|dd
101|2|bbbb|bbb|e|ee
101|2|bbbb|bbb|f|ff
102|1|cccc|ddd|a|gg
102|1|cccc|ddd|b|hh
102|1|cccc|ddd|c|gg
102|1|cccc|ddd|d|jj
102|1|cccc|ddd|e|kk
102|1|cccc|ddd|f|ll
102|2|dddd|eee|g|mm
102|2|dddd|eee|h|nn
102|2|dddd|eee|i|rr
102|2|dddd|eee|j|pp
102|2|dddd|eee|k|qq
102|2|dddd|eee|l|rr
102|2|dddd|eee|m|ss
102|2|dddd|eee|n|tt
;;;;
proc sort data=have out=want nodupkey ;
by a--f;
run;
proc report ;
columns a--f ;
define a--d / order ;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.