- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Team,
In attached file I want to categorize the respondent into three category based on their spending(below 100,100-200 and above 200
) and how we count the number of unique customer in each group.
however I have tried below codes but not successful to finding the result-
proc sql;
create table new as
select
case
when amtspent > 100 then "below 100"
when amtspent 100-200 then "between100 to 200"
when amtspent < 200 then "above 200"
else
value "nothing";
end;
run;
Kindly check and advise.
Thanks,
Manoj Ahuja
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Using case for such tasks is possible, but not recommended. Using a self-defined format is the way to go. The following code uses sashelp.class and the age-variable.
proc format;
value Age
LOW -< 12 = 'under 12'
12-14 = '12-14'
14 <- HIGH = 'above 14'
;
run;
proc summary data=sashelp.class nway;
class Age;
format Age Age.;
output out=countAgeGroup(drop=_type_ rename=(_freq_=count));
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Heartily agree with @andreas_lds
One of the strong features of SAS is the concept of a display format. You can do multiple types of analysis on groups based on formats without adding lots of extra variables.
Using a format in the SASHELP.CLASS data set we could also provide groups of over/under a single age (creating 2 groups), or 4.
With the MULTILABEL format option, though only useable directly my a few procedures such as Proc means, Tabulate and Report, you can even create nested groups.
Also you can apply the same format to multiple variables. So suppose you had other variables that you wanted to analyze with the same boundary values such as "planned spending" or "last year amount". Apply the same format to those variables. Otherwise you have to rewrite the same code for each variable.
With numeric values you can have the numeric value order control appearance in procedures. With your example the default sort order and the appearance order in many procedures would be "above 200" , "below 100" then "between 100 to 200". Which is somewhat odd in most senses.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
REMOVE the VALUE keyword from the ELSE clause of your CASE.
Use valid boolean logic in the WHEN clauses
case
when amtspent > 100 then "below 100"
when 100<= amtspent <=200 then "between100 to 200"
when amtspent > 200 then "above 200"
else "nothing"
end
REMOVE the extra semi-colons in the middle of your CREATE statement.
Give your new variable that the CASE is generating a name.
ADD a FROM clause so SQL knows what you are selecting from.
SQL does not need RUN statement. Each statement runs immediately. Use QUIT statement to end the PROC SQL step.
proc sql;
create table new as
select
*
,case
when amtspent > 100 then "below 100"
when 100<= amtspent <=200 then "between100 to 200"
when amtspent > 200 then "above 200"
else "nothing"
end as new_var
from have
;
quit;