BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Yves_Boonen
Quartz | Level 8

I have an Excel sheet with a few dozen columns. One of these columns is called "distribution options". In this column you find general keywords to indicate how it will be distributed. Below you will find all the available "distribution options".

 

 

Fast Track Heir

Fast Track Living Expenses

Fast Track Balance

Fast Track Surety

Manual Distribution Office

Distribution Notery

Distribution Standard

 

I wanted to read out the entire column and receive a table how many times each of these distribution options appear in the column.

 

The result in the end should look similar to this.

 

Distribution Options

Count

Fast Track Heir

102

Fast Track Living Expenses

230

Fast Track Balance

17

Fast Track Surety

43

Manual Distribution Office

2

Distribution Notery

18

Distribution Standard

43

 

I wrote a script, but for some reason it doesn't count the Fast Track Heir, Fast Track Surety and Manual Distribution Office. All the other values (the same as above) are correct.

 

I used this script, but perhaps there is an easier approach. I simply can't figure out why Heir, Surety and Manual return "0" values, whereas the rest returns the values shown above. There are no errors in the log either. This time I checked the data myself and used COUNTIF(range;"word") to count it in Excel. The values above are correct. The script somehow isn't.

  

PROC SORT IN=atropos.dossiers OUT=atropos.dos_opt;
BY dossier_manager;
RUN;

DATA atropos.dos_rap;
SET atropos.dos_opt;
BY dossier_manager;
RETAIN
	Heir
	Living_Expenses
	Balance
	Surety
	Manual
  	Notary
	Standard;

IF first.dossier_manager
THEN DO;
	Heir=0;
	Living_Expenses=0;
	Balance=0;
	Surety=0;
	Manual=0;
  	Notary=0;
	Standard=0;
END;

SELECT (distributionoption);

  	WHEN ('Fast Track 1 Heir') Heir +1;

  	WHEN ('Fast Track Living Expenses') Living_Expenses +1;

	WHEN ('Fast Track Balance') Balance +1;
  	
  	WHEN ('Fast Track Surety") Surety +1;

	WHEN ('Manual Distributionassignment Office') Manual +1;

	WHEN ('Distribution Notery') Notery +1;

	WHEN ('Distribution Standard') Standard +1;

  	OTHERWISE;
END;

IF last.dossier_manager THEN OUTPUT;
KEEP dossier_manager Heir Living_Expenses Balance Surety Manual Notary Standard;
RUN;

1 ACCEPTED SOLUTION

Accepted Solutions
Loko
Barite | Level 11

Hello,

 

Are you sure the values are written the way you have added in the select clause ?

 

By the way select clause is useful when:

- you have a long series of mutually exclusive numeric conditions
- data values are uniformly distributed

 

In your case I believe an aprroach using if then else condition would be more appropriate.

 

View solution in original post

9 REPLIES 9
Loko
Barite | Level 11

Hello,

 

Are you sure the values are written the way you have added in the select clause ?

 

By the way select clause is useful when:

- you have a long series of mutually exclusive numeric conditions
- data values are uniformly distributed

 

In your case I believe an aprroach using if then else condition would be more appropriate.

 

Yves_Boonen
Quartz | Level 8

Yes, the values are written correctly. I have copy pasted the values from the Excel sheet. I even tried to change

 

WHEN ('Fast Track 1 Heir') Heir +1;

 

to

 

WHEN ('%Heir%') Heir +1;

 

Even then it didn't work.

 

So you're suggestion would look like this?

 

DATA atropos.dos_rap;
SET atropos.dos_opt;
BY dossier_manager;
RETAIN
	Heir
	Living_Expenses
	Balance
	Surety
	Manual
  	Notary
	Standard;

IF first.dossier_manager
THEN DO;
	Heir=0;
	Living_Expenses=0;
	Balance=0;
	Surety=0;
	Manual=0;
  	Notary=0;
	Standard=0;
END;

SELECT (distributionoption);

  	IF distributionoption = ('Fast Track 1 Heir') THEN Heir +1;

  	ELSE IF distributionoption = ('Fast Track Living Expenses') THEN Living_Expenses +1;

	ELSE IF distributionoption = ('Fast Track Balance') THEN Balance +1;
  	
  	ELSE IF distributionoption = ('Fast Track Surety") THEN Surety +1;

	ELSE IF distributionoption = 'Manual Distributionassignment Office') THEN Manual +1;

	ELSE IF distributionoption = ('Distribution Notery') THEN Notery +1;

	ELSE IF distributionoption = ('Distribution Standard') THEN Standard +1;
END;

IF last.dossier_manager THEN OUTPUT;
KEEP dossier_manager Heir Living_Expenses Balance Surety Manual Notary Standard;
RUN;
Yves_Boonen
Quartz | Level 8

I tried a different approach.

 

I imported the file. Then I took the imported data and only selected the distribution options, while storing this new selection as a new table. I noticed that some of the distribution options were incomplete and thus the query didn't work.

 

For example: "Fast Track 1 Heir" became "Fast Track 1 He" as well as with the other two.

 

I did copy and paste the exact values from the Excel file, but that wasn't the table my query tried to work with. In fact it was an import in my SAS library. Instead of checking the raw data, I should also check my imported data more. Consider this a lesson learned, because after I adjusted the names in the first script, everything ran smoothly!

 

Thanks for helping me get there.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

Firstly, please code in consistent casing/indentaiton, makes it much easier to read, and posting example test data in the form of a datastep will help you get tested code.  

 

Now your problem.  I thinkk you are looking at it the wrong way round, you have list of what values you want out, and then trying to count from that.  The other way of looking at it is to proc means the data you have, and then filtering the results for those you want to keep.  This has the benefit of being able to see what counted data is present.  For example (untested);

proc means data=have;
  by distribution_options;
  output out=results;
run;
/* Filter out the ones you want to keep */
proc sql;
  create table WANT as
  select  *
  from    DISTRIBUTION_OPTIONS in (select OPTIONS from LIST_OF_VALUES);
quit;

Nice and simple way, though to be honest I don't think you need the second step, just drop records you don't want in your further processing, then its just one step.

 

Just to add, on your second post, a select clause can look like:

select(<test>);

  when(<result>) <code>;

Or ir can look like:

select;

  when (index(strip(upcase(<text>)),"HIKE")>0) <code>;

 

Yves_Boonen
Quartz | Level 8

No it isn't.

 

The original code (in my SAS client) has variable names in Dutch. To make it more understandable, I manually changed all the names in here. Kinda screwed me over, but rest assured: no syntax errors. 🙂

 

Thanks!

LinusH
Tourmaline | Level 20

Not sure I under the purpose of this. Seems to me as a hard coded recode of column values, count, and then a transpose...?

Similar to @RW9, I think there are smarter ways of accomplish this.

A format/lookup table can recode/rename your distribution options.

PROC TABULATE does cross tabulation out of the box - no need to transpose your data set.

Data never sleeps
stat_sas
Ammonite | Level 13

Another way to do this is using proc sql:

 

data have;
input distribution_options $40.;
datalines;
Fast Track Heir
Fast Track Living Expenses
Fast Track Balance
Fast Track Surety
Manual Distribution Office
Distribution Notery
Distribution Standard
Fast Track Heir
Fast Track Living Expenses
Fast Track Balance
;

proc sql;
create table want as
select trim(distribution_options) as distribution_options , count(*) as Count
from have
group by trim(distribution_options);
quit;

 

 

Haikuo
Onyx | Level 15

Maybe I am missing something, but Proc Freq seems to be designed for this kind of job.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1324 views
  • 2 likes
  • 7 in conversation