DATA Step, Macro, Functions and more

create a new Dataset for every unique Country and it should have data of that specific country

Reply
Occasional Contributor
Posts: 9

create a new Dataset for every unique Country and it should have data of that specific country

can u help me trough this

 

I wrote code like this but its not the efficient code

 

data AU CANADA France Germany NZ UK US;
set myhome.product;
if country = 'Australia' then output AU;
if country = 'Canada' then output Canada;
if country = 'France' then output France;
if country = 'Germany' then output Germany;
if country = 'New Zealand' then output NZ;
if country = 'United Kingdom' then output UK;
if country = 'United States' then output US;

Attachment
Super User
Posts: 5,509

Re: create a new Dataset for every unique Country and it should have data of that specific country

[ Edited ]
Posted in reply to Karan_Dumbre

IF / THEN statements are very quick.  It's surprising that you find the program slow.  At any rate, adding ELSE would speed things up:

 

if country = 'Australia' then output AU;
ELSE if country = 'Canada' then output Canada;
ELSE if country = 'France' then output France;
ELSE if country = 'Germany' then output Germany;
ELSE if country = 'New Zealand' then output NZ;
ELSE if country = 'United Kingdom' then output UK;
ELSE if country = 'United States' then output US;

 

In addition, you could change the order of the statements if you know something about your data.  Right now, the possibilities get evaluated in alphabetical order ("Australia" first, "United States" last).  You could improve that by listing the possibilities in order from most frequently occurring to least frequently occurring.

Occasional Contributor
Posts: 9

Re: create a new Dataset for every unique Country and it should have data of that specific country

Posted in reply to Astounding

can you help me to write more efficient code , what if number of countries are more than 100

here there are only few of them

so i can directly use the if else statement

Super User
Posts: 5,509

Re: create a new Dataset for every unique Country and it should have data of that specific country

Posted in reply to Karan_Dumbre

It sounds like your definition of "efficient" means easy for you to write, not code that runs faster.

 

We can always find a way to make the programming easier, but the program will become more complex and might take longer to run.  If that's your objective, start by creating a SAS data set that contains two variables:  the country (such as "United States") and the data set name you want to use for that country (such as "US").  Then we can worry about how to automate creation of the IF/THEN statements.

Occasional Contributor
Posts: 9

Re: create a new Dataset for every unique Country and it should have data of that specific country

Posted in reply to Astounding

actually u didnt get my point, what i m trying to tell u that in this case i know the name of countries in advance that my data set have this this countries. what if i dont know the countries and a that time i have to create separate data set for each country

Trusted Advisor
Posts: 1,918

Re: create a new Dataset for every unique Country and it should have data of that specific country

Posted in reply to Karan_Dumbre

Karan_Dumbre wrote:

actually u didnt get my point, what i m trying to tell u that in this case i know the name of countries in advance that my data set have this this countries. what if i dont know the countries and a that time i have to create separate data set for each country


I agree with what others have said, and I repeat their advice much more loudly: USE ONE DATASET, PROCESS USING BY GROUPS

PROC Star
Posts: 746

Re: create a new Dataset for every unique Country and it should have data of that specific country

Posted in reply to Karan_Dumbre

Do you know all the possible countries in your myhome.product dataset in advance?

Occasional Contributor
Posts: 9

Re: create a new Dataset for every unique Country and it should have data of that specific country

yes i know

but what if i dont know the countries . in that case what would be the solution

Super Contributor
Posts: 440

Re: create a new Dataset for every unique Country and it should have data of that specific country

Posted in reply to Karan_Dumbre

I agree with @RW9 - having a seperate file for each country isn't a good strategy and in the long run will lead to more complications than it resolves. However if you don't know which countries are in your file you find out as follows

 

proc sql;
	create table countries
	as select distinct country
	from product;
quit;

Once you have that file you COULD, if you absolutely need to, write a macro to generate one file for each country using that as a control file but I'd strongly advise against it.

Regular Contributor
Posts: 227

Re: create a new Dataset for every unique Country and it should have data of that specific country

Posted in reply to ChrisBrooks

Macro_CallText has code for this in section 3.3 Splitting_a_Data_Set

 

 

Super User
Posts: 11,343

Re: create a new Dataset for every unique Country and it should have data of that specific country

Posted in reply to Karan_Dumbre

Karan_Dumbre wrote:

yes i know

but what if i dont know the countries . in that case what would be the solution


Do you imply that you want to do the same thing for every country in the data? The typical approach for that is to use BY group processing.

Something like:

 

Proc sort data=product;

   by country;

run;

 

Proc Print data=product;

   by country;

run;

 

Almost every report or analysis procedure supports BY groups and it is much easier (and "efficient") in general to use By groups for analysis in one data set and then split the results to different pages of a report or similar based on the by groups as the last step.

Super User
Super User
Posts: 7,955

Re: create a new Dataset for every unique Country and it should have data of that specific country

Posted in reply to Karan_Dumbre

My first question is going to be why.  Its rarely, if ever a good idea to split data into separate datasets - consider if you then want to print them, you have to write code to proc print each dataset - whereas with one dataset you can use by groups.  Just one example of why one dataset is better than many.

 

The second quesiton is going to be, how do you know that Canada should go out to Canada and no CA, as some have two letter codes others do not - no consitency.

 

Third question, do you know all the combinations, what about countries you haven't though of - again this ties in with splitting data, by grouping you don't need to know this up front.

 

Finally for multiple ifs, select may be simpler:

data au canada...;
  set myhome.product;
  select (country);
    when ("Australia") output au;
    when ("Canada") output canada;
    ...
    otherwise;
  end;
run;
Community Manager
Posts: 2,954

Re: create a new Dataset for every unique Country and it should have data of that specific country

[ Edited ]
Posted in reply to Karan_Dumbre

I have to agree with lots of others that splitting a data set for perceived efficiency is usually not needed.  But, sometimes we don't get to choose our tasks.

 

This blog post presents a few lines of code that uses PROC SQL to generate the DATA steps that will split your data.  It's not the most efficient in performance, but it requires almost no additional planning or data prep.  Other experts chimed in on the comments with their own ideas.

 

Code:

 

/* define which libname.member table, and by which column */
%let TABLE=sashelp.cars;
%let COLUMN=origin;
 
proc sql noprint;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct 
   cat("DATA out_",compress(&COLUMN.,,'kad'),
   "; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
   "')); run;") length=500 into :allsteps separated by ';' 
  from &TABLE.;
quit;
 
/* macro that includes the program we just generated */
%macro runSteps;
 &allsteps.;
%mend;
 
/* and...run the macro when ready */
%runSteps;

 

PROC Star
Posts: 283

Re: create a new Dataset for every unique Country and it should have data of that specific country

[ Edited ]
Posted in reply to ChrisHemedinger

@Karan_Dumbre   To your question- "what if i dont know the countries . in that case what would be the solution?".

Are you comfortable with Hashes? If yes, here you go-

 

/*This is super simple and fully dynamic*/

 

proc sort data=product;

by country;

run;

 

data _null_;

if _n_=1 then do;

if 0 then set product; 

      declare hash h(dataset:'product(obs=0)',multidata:'yes' );   

      h.defineKey('country');

      h.defineData(all:'yes');

      h.defineDone();

       end;

set product;

by country;

if first.country then h.clear();

h.add();

if last.country then h.output(dataset:'country'||compress(country));

run;

 

 

Regular Contributor
Posts: 227

Re: create a new Dataset for every unique Country and it should have data of that specific country

[ Edited ]
Posted in reply to ChrisHemedinger

warning: fragile code!

 

select cat(...)  %*fragile!;
into :mvar

 

@ChrisHemedinger, remember that the cat* functions create text whose default length is $200

and when truncation occurs --- you stuffed tooooo much text into the function call ---

no warning is given

 

Ron Fehd  been burned by this very trick maven

Ask a Question
Discussion stats
  • 16 replies
  • 240 views
  • 8 likes
  • 10 in conversation