Hello,
I am trying to convert a SAS dataset with nucleotide sequence to a fasta format. See sample FASTA formated file here:https://github.com/veg/hivtrace/blob/master/test/rsrc/TEST.FASTA
What I have is a table with two columns:
ID Seq
What I need is to create a new line for the corresponding Seq value as :
>ID1
Seq1
>ID2
Seq2
Any advise is much appreciated.
You don't list the individual values, you list the variable names.
Run this, changing the filepath to be relevant as a starting point and see what is generated.
data _null_;
set sashelp.class;
file '/folders/myfolders/demo.txt' linesize=80 flowover;
put Name;
put Age;
run;
@HabAM wrote:
Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested.
samle set
ID Seq >12121 aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag >343434 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >4545e cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >1111 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >2321 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
expected text file:
>12121 aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag >343434 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >4545e cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >1111 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >2321 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
Thank you
Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested.
samle set
ID | Seq |
>12121 | aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag |
>343434 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>4545e | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>1111 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>2321 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
expected text file:
>12121 |
aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag |
>343434 |
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>4545e |
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>1111 |
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>2321 |
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
Thank you
Add the necessary set statement to @Reeza 's code in order to read your SAS dataset.
You don't list the individual values, you list the variable names.
Run this, changing the filepath to be relevant as a starting point and see what is generated.
data _null_;
set sashelp.class;
file '/folders/myfolders/demo.txt' linesize=80 flowover;
put Name;
put Age;
run;
@HabAM wrote:
Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested.
samle set
ID Seq >12121 aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag >343434 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >4545e cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >1111 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >2321 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
expected text file:
>12121 aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag >343434 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >4545e cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >1111 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg >2321 cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
Thank you
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.