DaveLunt.net - Dr Dave Lunt | FastTree 1: Compiling and testing
16393
post-template-default,single,single-post,postid-16393,single-format-standard,ajax_fade,page_not_loaded,,side_area_uncovered_from_content,qode-theme-ver-9.1.3,hide_inital_sticky,wpb-js-composer js-comp-ver-4.5.3,vc_responsive
 

FastTree 1: Compiling and testing

26 Sep FastTree 1: Compiling and testing

This is how I downloaded, compiled and got FastTree working. Its a bit obvious in places but I think detailed instructions are a good thing to have out there and Google findable. I am using a multicore MacPro 2.8GHz with 4GB RAM and OSX 10.5.4 (I’m not sure the 8 cores make any difference whatsoever if the code isn’t written to take account of them).

  • I downloaded FastTree from www.microbesonline.org/fasttree.
  • I had to use Safari to do this as Firefox wouldn’t let me right click and download.
  • FastTree 1.0.0 is available as binaries for Windows and Linux. Unfortunately there are no Mac binaries
  • I downloaded the C code file (156kb)
  • I installed the developer tools “XcodeTools.mpkg” from system software install DVD number 2. It took about 15 minutes. This allowed me to use the gcc compiler to actually make the application.
  • I opened terminal, moved to the location of the c file and issued the command: gcc -lm -O2 -Wall -o FastTree FastTree.c
  • It didn’t like that much and gave the error: FastTree.c:212:20: error: malloc.h: No such file or directory

After some Google searching I came across a couple of indications that malloc.h (I had no idea what this was) was outdated.

malloc.h not supported, use stdlib.h (http://developer.apple.com/technotes/tn2002/tn2071.html)

Most helpful was this:

Mac OS X for unix geeks
http://unix.compufutura.com/mac/ch05_01.htm
5.1.2. malloc.h

make may fail in compiling some types of Unix software if it cannot find malloc.h. Software designed for older Unix systems may expect to find this header file in /usr/include; however, malloc.h is not present in this directory. The set of malloc( ) function prototypes is actually found in stdlib.h. For portability, your programs should include stdlib.h instead of malloc.h. (This is the norm; systems that require you to use malloc.h are the rare exception these days.) GNU autoconf will detect systems that require malloc.h and define the HAVE_MALLOC_H macro. If you do not use GNU autoconf, you will need to detect this case on your own and set the macro accordingly. You can handle such cases with this code:

#include
#ifdef HAVE_MALLOC_H
#include
#endif

  • So I opened the C file in a text editor and searched for malloc.h, I found it on line 212. I then deleted that line (#include {malloc.h}) and inserted the four lines from above.
  • I repeated the gcc command to build the application from above and it worked. No errors and it produced the application in 1 second.
  • I tried to launch it and display the help file using the terminal command: ./FastTree -h
  • It didn’t work at all, so for no logical reason I just dumped the application and rebuilt it with the same gcc command. This time ./FastTree -h did launch the application and it displayed the help file. Success!
  • Using the simple instructions from the FastTree page I tried to run it on some lizard DNA sequences I had lying around. These were fasta sequences, although it claims to work on phylip also. The command was: ./FastTree -nt lizards.fasta > lizards.tre
  • It produced a tree with only the first taxon name and nothing else. The input file had mac line endings though and when I corrected that to unix it was fine.
  • I noticed that some of the names were truncated and wondered if they had been chopped at spaces. I replaced with underscores and got better results. [In the spirit of full disclosure these last two points were with the previous version of FastTree and I didn’t try to replicate these errors with version 1.0.0]
  • Another problem I had with the previous version was when I accidentally had a repeated taxon in the matrix. It complained about a “non unique name X in the alignment” and wrote an empty treefile.
  • Having got around these teething problems it ran perfectly. Almost instantly writing a treefile (input fasta N=112, 923bp). The tree looked quite sensible and it had support values (local bootstraps). These are on a 0 to 1 scale, so 0.95 is 95%

Fairly straight forward (except for the malloc error) on the whole. Next I’m going to report on some bigger runs and start timing them properly. My immediate goal is to get a tree of all ~20k metazoan 18s rDNA sequences. Of course a tree of that size will bring its own problems, how to visualize it.



davelunt.net