BookmarkSubscribeRSS Feed

How to create a data step version of your data AKA generate sample data for forums

Started ‎03-24-2016 by
Modified ‎08-04-2021 by
Views 47,811

When asking question on the forum, it's incredibly helpful to provide sample data. Sometimes you can get away with referencing SASHELP datasets, but it's also nice to be able to use your own data, especially if you have a complex structure or just code that will work on your computer. 

 

Mark Jordan has written a SAS macro that will generate a data step version of your data. This post goes through how to use it in SAS. It's untested, but should work with any version of SAS. Note that this currently generates a warning, that I'll look into later, but does not change the results. 

 

This post has been simplified from previous runs. Now, you would modify the first two lines below. 

Replace sashelp.class with your data set name. 

Replace 5 with the number of observations you want in the demo data set. 

 

Then run the whole program posted below. The data step will be in the log. 

 

 

*data set you want to create demo data for;
%let dataSetName = sashelp.Class;
*number of observations you want to keep;
%let obsKeep = 5;


******************************************************
DO NOT CHANGE ANYTHING BELOW THIS LINE
******************************************************;

%let source_path = https://gist.githubusercontent.com/statgeek/bcc55940dd825a13b9c8ca40a904cba9/raw/865d2cf18f5150b8e887218dde0fc3951d0ff15b/data2datastep.sas;

filename reprex url "&source_path";
%include reprex;
filename reprex;

option linesize=max;
%data2datastep(dsn=&dataSetName, obs=&obsKeep);

 

In the log this would generate the following code that you can easily include in your posts. Make sure to use the {i} or running-man icon to conserve formatting. This is IMPORTANT to prevent the browser from modifying your code and turning colon parenthesis ( : ) into smiley faces :). 

 

 

 data WORK.CLASS(label='Student Data');
   infile datalines dsd truncover;
   input Name:$8. Sex:$1. Age:32. Height:32. Weight:32.;
 datalines;
 Alfred M 14 69 112.5
 Alice F 13 56.5 84
 Barbara F 13 65.3 98
 Carol F 14 62.8 102.5
 Henry M 14 63.5 102.5
 ;;;;

 

I would highly recommend you run this once first to test if it works as shown. 

If it does, then try and do it again with your own data. 

 

EDIT:

 

It's been shown that this doesn't always produce the correct results, unfortunately. 

Here's another method from the comments (thanks to @rogerjdeangelis, with the fix indicated from @ChrisNZ 

 

 

filename tagset url "https://gist.githubusercontent.com/statgeek/7be124b98b37d51e7c002b85b6e9cf72/raw/ec0bc5cf8e1a3386c5cf71d2280b242b7f87dadd/SAS_tagsets_sql_reprex.sas";

%include tagset;

   ods tagsets.sql file="/folders/myfolders/class.txt";
   proc print data=sashelp.class ;
   run;
   ods tagsets.sql close;

 

Note that I've modified this slightly to create a txt file, instead of a .sql file, so the user can easily open it. You can open it and verify and then paste it into your question. 

Comments
This tagset will create SQL insert code to recreate the SAS dataset.

However a bug was introduced in 9.4 where the length of character variables is missing ie varchar(). It used to work.

filename tagset http "http://support.sas.com/rnd/base/ods/odsmarkup/sql.sas";
   %include tagset;
   ods tagsets.sql file="d:/sd1/class.sql";
   proc print data=sashelp.class ;
   run;
   ods _all_ close;
   ods listing;

@rogerjdeangelis

I think you need to replace

set $col_def $col_def "(" width ")" /if $types_with_widths[type];

with

set $col_def $col_def "(" colwidth ")" /if $types_with_widths[type];

I asked SAS to correct.

Hi,

I run the original code and the log says 

 

ERROR: Cannot load SSL support.
ERROR: Cannot open %INCLUDE file REPREX.
ERROR: %INCLUDE has encountered an I/O/ error. Canceling submitted statements.

 

Anyone suffer from the same? How could we repair? Thank you.

If you have trouble accessing the file

https://gist.githubusercontent.com/statgeek/bcc55940dd825a13b9c8ca40a904cba9/raw/865d2cf18f5150b8e887218dde0fc3951d0ff15b/data2datastep.sas 

just download it and run it.

Don't run the lines

%let source_path = https://gist.githubusercontent.com/statgeek/bcc55940dd825a13b9c8ca40a904cba9/raw/865d2cf18f5150b8e887218dde0fc3951d0ff15b/data2datastep.sas;

filename reprex url "&source_path";
%include reprex;
filename reprex;

 


 

Yes, I got it. The codes run beautifully. Thanks a lot @ChrisNZ.

@wagdy I'm guessing you were planning on posting this somewhere else, rather than here?

GFW

Hi, Reeza, this is so cool and easy to get a sample data for asking questions.

 

Thanks for sharing!

I think you need to replace

set $col_def $col_def "(" width ")" /if $types_with_widths[type];

with

set $col_def $col_def "(" colwidth ")" /if $types_with_widths[type];

 

I asked SAS to correct.

 

18 months later the file is still not corrected. 😞

@Reeza  

Can you replace

In the log this would generate the following that you can easily include in your posts. 

with something like

In the log this would generate the following that you can easily include in your posts. Make sue to use the {i} or running-man icon to conserve formatting. 

@ChrisNZ  done, thanks for the suggestion.

Tom

The sample code that is being generated does not work.  It is using space as delimiter in generating the text, but using comma as the delimiter in code that is reading the text.

@Tom was it you who had a better version? I can link to that one if you remind me of the link? Or I can delete this, not sure I have time to investigate it, was just trying to make it more accessible to everyone.

Tom

Link:  https://github.com/sasutils/macros/blob/master/ds2post.sas

Example:

filename ds2post url 'http://tiny.cc/ds2post' ;
%include ds2post ;
%ds2post(sashelp.class)

Result:

data work.class (label='Student Data');
  infile datalines dsd dlm='|' truncover;
  input Name :$8. Sex :$1. Age Height Weight ;
datalines4;
Alfred|M|14|69|112.5
Alice|F|13|56.5|84
Barbara|F|13|65.3|98
Carol|F|14|62.8|102.5
Henry|M|14|63.5|102.5
James|M|12|57.3|83
Jane|F|12|59.8|84.5
Janet|F|15|62.5|112.5
Jeffrey|M|13|62.5|84
John|M|12|59|99.5
Joyce|F|11|51.3|50.5
Judy|F|14|64.3|90
Louise|F|12|56.3|77
Mary|F|15|66.5|112
Philip|M|16|72|150
Robert|M|12|64.8|128
Ronald|M|15|67|133
Thomas|M|11|57.5|85
William|M|15|66.5|112
;;;;

 

GFW

@Tom Hi Tom, I ran the macro in my SAS 9.2, there is some ERRORs, but still get the correct result from the LOG window.

MPRINT(DS2POST):   * Get member label in format of dataset option. ;
MPRINT(DS2POST):   * Get dataset contents information in a format to facilitate code
generation.;
MPRINT(DS2POST):   * Column names reflect data statement that uses the value. ;
MPRINT(DS2POST):
*----------------------------------------------------------------------------;
MPRINT(DS2POST):   proc sql noprint;
NOTE: Line generated by the invoked macro "DS2POST".
2         select cats('(label=',quote(trim(memlabel),"'"),')')     into :memlabel trimmed

-------

22

76
2    ! from dictionary.tables     where libname="&libname"
SYMBOLGEN:  Macro variable LIBNAME resolves to WORK
SYMBOLGEN:  Macro variable MEMNAME resolves to A3
MPRINT(DS2POST):   select cats('(label=',quote(trim(memlabel),"'"),')') into :memlabel trimmed
from where libname="WORK" and memname="A3" and not missing(memlabel) ;
ERROR 22-322: Syntax error, expecting one of the following: ',', -, FROM, SEPARATED, THROUGH,
              THRU.

ERROR 76-322: Syntax error, statement will be ignored.

 

Tom

The TRIMMED keyword is probably not supported in such an old version of SAS.
The old work around was use the SEPARATED BY keyword instead, doesn't matter what delimiter you specify if you make sure the query only returns one observation.

That query is just to get the member label attached to the dataset into a macro variable.  So if it doesn't run the result is just that the generated code will not attach any label to the generated dataset.  Does the CATS() function or the second option to the QUOTE() function cause any trouble in SAS 9.2?  Those are also newer SAS features used in that statement.

GFW

@Tom I will check it later, thanks!

@AshPatel please post this as a new topic in the Programming community.

Use the code from this article (on which you accidentally commented) to post your datasets as data step code.

good hints

Version history
Last update:
‎08-04-2021 01:28 PM
Updated by:
Contributors

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags