Friday, 18 December 2015

Lessons learned in BEAST

At the moment, I am correcting an introduction to one of my chapters, analysing some ecology data, and running BEAST analyses in the hope that I will get nice, convergent trees. If you do lots of things at once, you do them slowly, especially if you are staying with your parents for a bit. But I have finally managed to sort out my CO1 tree so that the analysis converges and I don't get negative branch lengths (which look like trapdoor spiders are evolving backwards - try removing partitioning or using a random clock to remedy it). Now onto CytB.

I have kept a diary of hints and tricks about BEAST that I have learned along the way. It is now illustrated with many and varied trace plots.

Running BEAST takes a long time, and you don't want a power cut during the process (though sometimes putting your computer to sleep pauses the run). To start with, I run only 10,000,000 generations and try to make that work. You can tell if an analysis has worked because your trace plot for the posterior looks like a fuzzy caterpillar that has been straightened out (i.e. it isn't moving in any general direction other than left to right). The tighter the fuzziness, the better. The ESS values (effective sample size) are >200 for every parameter. And the tree looks like a phylogenetic tree, rather than a straight line, an invisible tree, or a weeping willow.
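For anyone wondering what an ESS actually measures, here is a rough sketch in Python. It is not the calculation that Tracer performs (I am assuming a simple textbook estimator that stops summing at the first non-positive autocorrelation), and the two toy chains are made up for illustration. The point is only that a chain whose samples depend heavily on the previous sample is worth far fewer independent samples than its length suggests.

```python
import random


def effective_sample_size(trace):
    """Rough ESS: N / (1 + 2 * sum of leading positive autocorrelations).

    Assumption: a simple textbook estimator, not Tracer's own algorithm.
    """
    n = len(trace)
    mean = sum(trace) / n
    centred = [x - mean for x in trace]
    var = sum(c * c for c in centred) / n
    if var == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n // 2):
        acov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0:  # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


random.seed(1)

# Toy chain 1: every sample is independent of the last one.
free = [random.gauss(0, 1) for _ in range(1000)]

# Toy chain 2: every sample is mostly a copy of the previous one.
sticky = [0.0]
for _ in range(999):
    sticky.append(0.9 * sticky[-1] + random.gauss(0, 1))

ess_free = effective_sample_size(free)
ess_sticky = effective_sample_size(sticky)
print(round(ess_free), round(ess_sticky))
```

Both chains have 1,000 samples, but the sticky one falls well short of the 200 mark, which is roughly what a wandering, non-caterpillar trace is telling you.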

The first thing to do to a set of sequences that have been aligned (and put in the correct reading frame, if needed) is to test for appropriate substitution models using JModeltest. There are many different model testers, but JModeltest is very good, versatile and thorough. I have investigated Beast 2 (which can model test using Bayesian algorithms rather than maximum likelihood) but ultimately came back to JModeltest because of its simplicity and suitability for answering my particular questions. Then, one takes one's sequence alignment and tests for partitioning (if the alignment has no gaps) using PartitionFinder (which is dead easy to use as long as you can work Java, which in my experience likes to play up a lot). If an alignment requires partitioning, it means that one or more bases in each codon evolve under different rules and at different rates than the other bases. You can edit your alignment in a NEXUS file using the charset command to specify partitioning for Beauti.
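As a sketch of what the charset lines look like for a codon partition, here is a little Python helper that writes a NEXUS sets block with one charset per codon position. The alignment length of 657 and the names pos1 to pos3 are made up for illustration, and I am assuming the alignment starts at the first codon position. The `\3` notation means "every third site".

```python
def codon_charsets(alignment_length, prefix="pos"):
    """Build a NEXUS sets block with one charset per codon position.

    Assumes the alignment starts at codon position 1 and has no gaps
    that shift the reading frame.
    """
    if alignment_length < 3:
        raise ValueError("alignment must contain at least one codon")
    lines = ["begin sets;"]
    for position in (1, 2, 3):
        # e.g. "charset pos1 = 1-657\3;" = sites 1, 4, 7, ... up to 657
        lines.append(
            f"    charset {prefix}{position} = {position}-{alignment_length}\\3;"
        )
    lines.append("end;")
    return "\n".join(lines)


# Hypothetical alignment of 657 base pairs.
block = codon_charsets(657)
print(block)
```

Paste the printed block at the end of the NEXUS file, after the data block, and check that the partitions appear when you load the file.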

The program BEAST is only for analysis. You set up the XML file in Beauti to analyse in BEAST. Of course, PartitionFinder will tell you that your dataset evolves under different models than what JModeltest says, because it is testing each of the three bases in the codons separately. I have racked my brains over how to deal with this, because JModeltest acts as though all the base pairs evolve under the same model, so it averages the models out. There are lots of solutions to try: select the model output by JModeltest for all partitions; select the models output by PartitionFinder for their respective partitions, or select one of each. You can always try all of these options, but the difference is usually slight because the models are usually very similar and BEAST is very clever. I find that selecting the models output by PartitionFinder works best first, but if there are important parameters estimated by JModeltest, you can put those into Beauti under the Priors tab (but don't at first; you may not need to).

When first running the analysis, set only the model that you are using (under "sites"). Do not input any of the values for parameters estimated by JModeltest. You can limit the obscurity of the model by restricting the number of models that JModeltest tests for to 40 (select 5 under "number of substitution schemes"). Then you will only get models that can be implemented easily in BEAST (this is lazy though - you should really test more thoroughly and edit the BEAST XML file, but it doesn't make a huge difference and it takes people like me a very very long time to work out what needs changing, though it's easy enough when you work out how, but then BEAST runs for a few generations and crashes and you can't work out where you made a mistake).
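The number 40 is just arithmetic, which the sketch below spells out in Python. I am assuming that each of the 5 substitution schemes is crossed with three on/off options (unequal base frequencies, invariant sites, and gamma rate variation); the scheme labels are placeholders, not real model names.

```python
from itertools import product

# Placeholder labels for the 5 substitution schemes (not real model names).
schemes = [f"scheme{i}" for i in range(1, 6)]

# Assumed on/off options: +F (base frequencies), +I (invariant sites), +G (gamma).
options = list(product([False, True], repeat=3))

# Every scheme combined with every combination of options.
candidates = [
    (scheme, plus_f, plus_i, plus_g)
    for scheme in schemes
    for (plus_f, plus_i, plus_g) in options
]

print(len(candidates))  # 5 schemes x 2 x 2 x 2 = 40
```

So unticking any one of those options would halve the number of candidate models again.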

When first running the analysis, set the clock to lognormal relaxed.

Set the MCMC chain to run for 10,000,000 generations.

If you have an outgroup, enforce it under the Taxa tab.

Then run it in BEAST!

When the run is complete, check the trace file. First check the ucld.stdev prior - if the mean of the estimates is more than one, then a strict clock is not appropriate (in other words, continue with the relaxed clock for now, or try a random clock). If the mean is <1, try a strict clock. If you need to try another clock, do it now and don't change anything else. See what difference (if any) it makes to your ESS.
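If you would rather check the ucld.stdev mean without opening a trace viewer, something like the Python sketch below would do it. I am assuming the log is tab-separated with comment lines starting with "#" and a header row containing a column called ucld.stdev, and that the first 10% of samples are discarded as burn-in; the tiny log here is invented so that the example runs on its own.

```python
import csv


def mean_after_burnin(log_text, column="ucld.stdev", burnin_fraction=0.1):
    """Mean of one column of a tab-separated log after dropping the burn-in."""
    rows = [
        line
        for line in log_text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    reader = csv.DictReader(rows, delimiter="\t")
    values = [float(row[column]) for row in reader]
    start = int(len(values) * burnin_fraction)  # samples to throw away
    kept = values[start:]
    return sum(kept) / len(kept)


def suggest_clock(mean_stdev):
    """Rule of thumb from this post: below 1, a strict clock is worth a try."""
    return "try a strict clock" if mean_stdev < 1 else "keep the relaxed clock"


# Invented log: ten samples, the first one is burn-in junk.
stdevs = [2.0, 0.4, 0.5, 0.6, 0.5, 0.4, 0.5, 0.6, 0.5, 0.5]
log_lines = ["# made-up example log", "\t".join(["state", "posterior", "ucld.stdev"])]
for i, value in enumerate(stdevs):
    log_lines.append("\t".join([str(i * 1000), str(-5000.0 - i), str(value)]))
example_log = "\n".join(log_lines)

mean_stdev = mean_after_burnin(example_log)
print(round(mean_stdev, 3), suggest_clock(mean_stdev))
```

For a real run you would read the log file into a string first and pass that in instead of the made-up example.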

The ESSs are good and the posterior caterpillar is heading in the right direction, but it appears to be melting in parts which isn't ideal.
This ucld.stdev mean is <1, so I tried random and strict clocks (neither of which improved the ESS, so I stuck with a lognormal relaxed clock - for now).

If you still have low ESS, change the offending priors to normal distributions around the mean, and the mean value should be the one estimated by JModeltest. Run it again. Change one thing at a time only, each time you run the analysis. Checking or unchecking the "Estimate" box next to where you set the clock also makes a huge difference to ESS.

This is a summary of what I have learned so far, in the last few weeks, about BEAST. My learning is a work in progress, so the above is hardly a substitute for advice from people who know what they are talking about. Try it at your own risk! It is a tricky program and help is not easy to find unless you can interpret computer speak (a useful skill in evolutionary biology). But it is very clever and very useful. Also, while an analysis is running, it's like waiting for Father Christmas to come.

If you're reading this because you need some tips with BEAST, message me if you want me to clarify anything (I tried to keep the post short!). If you're reading this for fun...well, surely not.

Good luck, merry Christmas, and I will see you in the new year!

Wednesday, 18 November 2015

The Watch

I thought it said in every tick:
I am so sick, so sick, so sick;
O Death, come quick, come quick, come quick.
-The Watch, by Frances Cornford.

Time is ticking. By October 2016, my three years will be up. By February 2017, my visa and stipend will run out. I can sustain myself with part-time work if my stipend runs out, but to renew my visa I need to prove that I have money in the bank. Which I will not.

The last months of a PhD student's study can be hell, and it's usually harder for international students than it is for locals. Firstly, you have to take what data and results you have (usually a lot less than you were expecting), analyse it and write it up before going through cycle after cycle of edits and criticism. Secondly, your stipend runs out and you are expected to live off feral pigeons and non-toxic packaging material. You also have to pay university fees. Thirdly, you have to contemplate your extremely uncertain future (being qualified no longer means you will get a good job). Fourthly, you will be leaving your life and friends behind soon. Fifthly, you face deportation if your visa runs out before you're finished.

Nobody really tells you this when you're starting your PhD, but it doesn't take long to work it out.

I have every intention of finishing my thesis by October 2016. I'm also applying to become a resident, because I like New Zealand a lot and I'd love to secure a postdoc here (an unrealistic goal, but one worth shooting at). I love doing my PhD and haven't yet run into the exhausting stress that I am told will come. If this project was set to run for twenty years, I'd be so happy, and I'd probably want it to run for another twenty years after that because there is so much left to find out. But I only have till next year. I'm going to re-evaluate my timetable for the next 11 months so that I can fit everything in. This is what I have left to do:

  • Data collection
    • I recently got more funding (thanks to the Brian Mason Trust), so can do one more massive pitfall trapping effort in the winter to collect more males. They will be useful for my DNA versus morphology in species delimitation research. I also have four males from Quail Island to sequence.
    • I must visit Otago Museum to look at their extensive Cantuaria collection, and compare morphology with the specimens that I have collected.
    • Investigating ArcGIS layers to add more data about where Cantuaria populations are found.
  • Data analysis
    • I'm still fiddling with BEAST to get decent trees, but it's getting much easier.
    • I've started using SPLITSTREE to make phylogenetic networks and see whether there's any hybridisation going on.
    • Applying a molecular clock.
    • I haven't yet looked at GenGIS to start phylogeography research.
    • Generalised linear mixed model in R to analyse habitat selection data.
    • Delimit and describe species.
  • Writing
    • I need to get up to date with all my chapters!
    • When I have done all of the above, I can finish writing it up.
    • Then I need to send it to my friends to edit out stupid mistakes.
    • Then I need to send it to my supervisors to edit out mistakes.
    • Then it gets read by my assessors.
There are other important things that I need to be getting on with, too; I am also looking for (and nurturing) postdoc opportunities, and training falconers and falcons so that I can make some money and New Zealanders can learn the difference between a falcon (endangered) and a harrier hawk (not endangered in the slightest). I want people to care a bit more about the wildlife in this country, rather than just appear to care about it.

At some point, I also want to write a book on BEAST for beginners, written in English rather than computer language, but if you're reading this and want to steal my idea, please do - just make sure it's written for beginners and not people with a degree in advanced computer physics or whatever. I believe in a future world where first-year PhD students can understand what they are doing when they plug numbers into BEAST, and not be told "Of course you are getting a misconbobulated flatuole constant. You forgot to destatify the gayn trigger" or other such stuff which makes no sense to those of us with a biology background.

So I have a fair bit to do, which won't realistically get started in earnest before January. But I'll do what I can until then!

Tuesday, 6 October 2015

Two years in: funding and data analysis

I'm two years into my PhD, which started officially on October the 1st 2013. This time last year, I had completed the following:

Completed my proposal and seminar
Collected female specimens from throughout NZ
Been handed some males from the public, and pinpointed good places to set pitfall traps
2 conferences with presentations
Begun sequencing for phylogeny
Found someone to help me with genetic and ecology fieldwork early next year.

I was a bit sad that I hadn't completed my objectives, and that real life got in the way of me devising my perfect routine. Over the past year, I got used to this. Everything takes longer than I think, and my initial objectives were rather unrealistic. Money, friends, and hobbies are all things that get in the way of doing my PhD, but they are also things that keep me sane, which is quite important. But instead of trying to cut them out of my life, which I have been trying to do but failing miserably, I have decided to live with them and take them into consideration when planning things. That has been much kinder to my blood pressure. Every day, I prioritise my PhD above everything else, but after I have done a bit of work I can do the other things that are screaming for my attention. I'm on track, I think, so it seems to be working. I have, however, become really bad at time management and answering my emails, because I have periods during the day when I want to work on my PhD and the rest of the world be damned. Plus I have discovered that one can "flag" one's emails to prioritise dealing with them. I can do the flagging part fine; it's the dealing with them that I usually forget.

My project went over its funding allocation for this year, which has meant that I had to stop lab work. It was good in a way, because I needed to stop anyway. It was getting to the point where I was trying crazy and superstitious ways to squeeze sequences out of extracts that probably can be sequenced using some method somewhere, but weren't worth trying every possible combination of every parameter and ingredient. After completing my last 96 sequencing attempts (as usual, most didn't work), I helped out on a field trip with some undergrads. While out there, we caught some male trapdoor spiders which would be really good to sequence. I still have them. I really hope I get enough money to sequence them before October next year. They are from an island and would be a really interesting piece to add to the puzzle. I completed my environmental data collection too (I think). So the only data I have left to collect is morphological.

Now I am doing DNA data analysis, which involves downloading programs which don't work, and trying to get them to work. I just cracked one yesterday, and went to use one that I have used since my honours project and know really well, but it needs downloading again, and it won't install, and it requires Java, and Java won't install. This used to really stress me out but now I feel weirdly zen about it all because it's familiar. The feeling when something finally works is incredible. I think when this PhD is over I'd like to write a book on basic molecular techniques and analysis for people like me. I get this feeling that everyone instinctively knows how to work these programs except for me, and then someone comes up to me and asks how to do something basic and I realise it isn't just me.

All I have to do is phylogenetic and niche modelling data analysis, then finish my thesis. A year's work, easily, hopefully?

Anyway, here's to the next and final year. Cheers!


I'm visiting the UK soon; one of my oldest friends is getting married, and another friend from China wants me to show her around Britain. I said I would if she would show me around Shanghai. So on the way back, we'll be stopping off for Chinese New Year.

So people have started to ask me "When are you going home?" "You're going home soon, aren't you?", and so on, as if I'm just here for a holiday and will finally be returning to the UK. That's not it at all. The UK isn't my home. It's a nice place, but we never really fit into each other. I loved living in Scotland, and to start with that felt like home, but the underlying Anglophobia from ancient resentment wore through over the years.

I think people assume the UK is my home because I was born there and I have an English accent. I guess if that's what makes a place your home then it is, but then "homely" would also have a different meaning if it wasn't to be disembodied from its mother word. Since leaving home in 2007, I haven't lived in England and I haven't wanted to. I lived in Wales, then Scotland, then New Zealand, then Scotland again, then back to New Zealand. Each of those places was home, and now I can make myself at home pretty much wherever I am. New Zealand is now most definitely my home. I'm comfortable here, most of my friends, responsibilities, and possessions are here, I have a way of life and I feel like myself. I can talk with people who are like me, and have the freedom to direct my life. If I'm having a bad day I can hop in the car and drive somewhere remote. New Zealand has its problems, of course, but it feels like I belong. I'm really happy here!

Not to say that I don't miss things about the UK. I come from Rodborough Common, an ancient tract of wind-blown meadow land where farmers have grazed their cattle since before anyone can remember. Rodborough Fort was built in 1764, 76 years before the Treaty of Waitangi was signed in New Zealand. It's still a private dwelling. I miss the oldness of buildings and traditions, the feeling of gravity that surrounds them. I miss seasonal festivals being in the right season. I really miss the seasons. Weather was always quite important, but I only really recognised how important it was when I moved here. There's no autumn, and the winters are dull and wet rather than crisp and snowy (although really it is Scottish weather that I miss).

I suppose these are first-world international student problems. They aren't problems really. But I get a bit tired of people assuming I'm going to go back to the UK one day and stay there, because that would really be quite bad for me. I'd much rather stay wherever I can do science, live relatively peacefully, make friends and control the direction of my own life. Isn't that what everyone in my position wants?

Sunday, 30 August 2015

I know I'm not the only one.

How I always hope my day will go:

7.30am: Get up, get ready.
8.00: Drive to house of person who I car pool with, train hawk while he gets ready.
8.30: Leave for uni.
9.00: Arrive at uni. Work on PhD all day with an hour for lunch and 10 min breaks between hours.
5pm: Leave uni, tired and satisfied.
5.30: Arrive back at home, work on lectures, collaboration or read something about science. Alternatively do something hobbyish.
10.00: Go to bed.
10.30: Be asleep.

How my day usually goes:

7.30am: Get up, get ready.
8.00: Drive to car pooling person's house, drag out of bed and convince to go to uni (he lost enthusiasm long ago), train hawk, lose track of time.
9.00: Leave for uni.
9.30: Arrive at uni; make cup of tea.
9.45: Check emails, respond to them, check facebook, learn about something that is as interesting as it is irrelevant to my study.
10.00: Beat self up for not working. Make another cup of tea.
10.15: Meet someone in the hallway and chat. Remember I have to talk to someone else as well.
10.45: Realise time and beat self up for not working. Make another cup of tea.
11.00: Do a bit of work.
12.00: Lunch starts.
1pm: Friend wants to catch up, has crisis/exciting news/is visiting town and will be gone tomorrow.
2.00: Check email and facebook, realise the time and beat self up for not working.
2.15: Do a bit of work. Panic about how much I have to do and the prospect of falling behind. Welcome interruptions from people. Complain to them that I'm not working enough and they tell me they are also unproductive. Must be weather/ time of day/ day of week/ time of year.
4.00: Depressed at lack of progress. Procrastinate with anything.
5.00: Decide to leave uni and work at home.
5.30: Make dinner.
6.00: Flatmate gets home. Chat, eat, watch TV series.
10.00: Go to bed.
10.30: Realise how much of today was wasted. Beat self up for not working. Panic about falling behind. Try to work out where today went wrong. Recite priorities. Resolve to do better tomorrow. Work out what I will do tomorrow, exactly.
12.00: Fall asleep.

NB: There are always a few of these days, then one super productive day. Apparently this is normal. Why? I like my work. Why do I avoid it so much? I think because it is labelled as work, and everyone hates work.

Tuesday, 18 August 2015

Fear and phobia

A phobia is an anxiety disorder characterised by intense, irrational fear. A very common one is arachnophobia: fear of arachnids (usually spiders). Some self-proclaimed arachnophobes are just silly. A fear of spiders can be used to get attention. Young children are common examples of this, and you can just tell them that they're not scared or say that only silly people are scared of spiders, and their fear is cured. Others are much worse, and the fear can take over their lives and change their personalities.

Phobias, and other anxiety disorders, are medical conditions and can be treated. But for some reason, people with arachnophobia tend not to think there is anything wrong with them, and they just live with their fear. Possibly the high prevalence of arachnophobia and general acceptance of it within society is to blame, but that doesn't stop people going to the doctor for antibiotics every time they get a cold or sore throat.

But if you have a phobia, you need to get it treated. Fear is controlling, and phobias can make you avoid situations that you would otherwise enjoy or benefit from. As a sufferer, you don't always realise what you are missing, but when the phobia has been treated a whole world of possibilities suddenly becomes realised.

Fear is such an intense reaction. Telling someone to stop being silly, or that there's nothing to worry about, doesn't work because fear doesn't respond to logic (otherwise it wouldn't be irrational). There are different ways of treating phobias, and I'm not a psychologist, but I have experienced some of them. If you have a phobia, go to your GP and say that it's getting in the way and you want it gone. Hopefully they will refer you to a counsellor or psychologist.

For about ten years, I had two phobias: vomiting and crowds. The crowd phobia was cured entirely by exposure: I got a job fund raising in shopping centres and hid behind the display table until I managed to brave the crowds for longer and longer periods of time. The vomiting phobia came about because I had underlying anxieties making me feel sick all the time, and I was always worried that one day it would come to fruition. I took anti-emetics, avoided children like the plague, never drank alcohol and stayed away from people who did. It got so bad that when my friends started drinking, I would run away to a dangerous place where they would not follow (even drunk people can have a sense of self-preservation). I got some treatment, mostly cognitive behavioural therapy (CBT) (break the cycle of avoidance behaviour you got yourself into), but also trying out "tapping into the brain" which is based on neurolinguistic programming. It works for people who get travel sick and respond well to placebos (which is a lot of people who get travel sick). It didn't work for me.

CBT worked pretty well for me, but the psychologist was very wise and very Glaswegian, and he would say things that stuck with me. Quotes from Mark Twain about worrying thousands of times during your life but hardly any of the things you worry about come to pass. Reminders that, no matter how intense the nausea, it will pass. I began repeating these mantras every time I felt sick rather than taking pills. I began looking forward to the time in the future when I wouldn't feel sick. Eventually, I stopped feeling sick. For some reason, the travel sickness that I developed (after 15 years of life with the strongest stomach imaginable) stuck, but I can now be around people who are drinking and eating dodgy food without worrying too much. Which means I can attend parties (though I still don't understand them). I can share a car with someone who gets carsick without leaving and deciding to walk the remaining 250 km but don't worry, I don't mind, I like walking...

With the loss of emetophobia came the rise of another phobia: claustrophobia. Any small, enclosed space, particularly if it was warm, would make me panic. It developed one night when I was trapped inside a tent in a hot, humid place. I got flashbacks of a time when I was trapped inside a car on a hot day when I was too young to understand car locking mechanisms, and the two knotted and formed a phobia. I came to NZ with this phobia, and some other ones as well (some people get moles, I get phobias, ok?). My GP referred me to a psychologist, and she treated all the underlying things that people accumulate from when they're young due to the fact that this world is hard to live in, and people can make it even harder. I understood where the fears came from, and how they worked, and that helped me deal with them. I made an anxiety ladder, each rung slightly more frightening, and exposed myself more and more to the situations which made me nervous (ranging from telephone conversations to eating in front of people). It worked really well.

So, if you are frightened of spiders or children or revolving doors or music or art, you can probably cure it very simply by exposing yourself to it (read about spiders! They are really fascinating, like music and art!). Or, if that doesn't work, get some treatment! You may be able to get it free or subsidised, and it makes life really wonderful to have a phobia lifted.

Tuesday, 28 July 2015

All hail the God of PCR

At a particularly low moment in my PCR optimisation process, I created a shrine on top of a cupboard. It was dedicated to a god of PCR. I'm against the non-critical thinking that leads to religion and faith, so you can probably imagine how low I had to be to create a shrine. But I thought that if, despite all evidence and logic, there was a god of PCR...I had better start worshipping it.

Since then, everything has pretty much gone to plan. I have had more samples work than not work, and I've begun to assemble a pretty comprehensive tree of Cantuaria. It has mostly shown me that the current taxonomy of Cantuaria places them into different taxonomic groups (species, genus) than their genetics do. But there is still a bit of work to do with BEAST, the program I am using to build the trees.

I've decided that optimising the stubborn samples that I have left (which all seem to require completely different PCR conditions) would be a waste of time and money. I have enough for my tree, and it would be nice to get them all to work, but I was meant to have completed this six months ago. I've got three new genes to work with as well (thanks to a friend) which I'm going to use to create a different phylogeny with fewer samples, but more sequences per sample (an idea which I picked up from the Evolution conference). That is not going to involve optimisation, though. If stuff doesn't work, screw it, I'm out of there. I have more ecological fish to fry.

So, all that I have left to do in the lab is...
-extract DNA from the 43 remaining spiders that I have collected
-PCR for all three genes for these 43 spiders
-CO1 PCR a few more samples
-PCR the new 3 genes (I'm aiming for 14 sequences - two from each major clade).

If the PCRs for the 43 remaining spiders don't work, I'll try adding BSA (a magic liquid extracted by squeezing cows), doing a gradient PCR, and using MyTaq, which was recommended to me by another person who studies the same family of spiders (but seems a little sensitive for most of my samples). But that is the limit to which I will go to make them work.

There is a light at the end of the tunnel. Wish me luck!

Saturday, 4 July 2015

Evolution Conference, Brazil 2015

The perching birds (passerines) decided to have a conference to display their various adaptations. Nightingale went along to see what the other birds had been up to. However, when he got to the conference, he could not see any other nightingales; the only member of his family was a blue flycatcher, with his striking cobalt and rust plumage. In fact, none of the other passerines at the conference looked much like a nightingale at all. They all seemed to be brightly coloured.

Nightingale went to some of the displays the birds were giving. Most of them comprised dancing, or showing off their bright plumage; manakins were moonwalking and jumping up and down, birds of paradise made their usual elaborate displays, and other colourful species simply talked about how their attractive feathers helped them to find quality mates. Nightingale soon grew tired of their splendour, and wished his plumage was comparable.

At one of the mixers, Nightingale met a lyrebird. He was just as brown as Nightingale, but with long, elegant tailfeathers. "He's not so gaudy as the other lot," Nightingale said to himself. "He looks better than me, but he's still not quite there yet!" It made him feel a little better to see a brown bird. He wandered over to the lyrebird and bowed.
"Hello there," said the lyrebird. "I am Menura. What genus are you?"
"I'm Luscinia," said Nightingale. "I specialise in song," he added.
"Oh, a songster!" said Lyrebird, obviously elated. "Fantastic! There's only you, me, and Songthrush over there."
Nightingale went to Songthrush's display, and was comforted: the thrush's song was not nearly as beautiful as his own. But when he went to see Lyrebird's session, Nightingale's heart sank: the lyrebird began with a lilting melody, then incorporated beautiful and exotic sounds from the human world, and finished by imitating every bird that had come to watch him.

Nightingale was so flustered that he wanted to leave the conference. He felt so inadequate that he didn't look where he was going, and bumped into a sparrow on the way out. 
"What's wrong?" asked the sparrow, her face full of concern.
"Oh, I'm supposed to be a passerine, but I can't possibly compare to the others with their bright feathers and wonderful dances - even the dull ones can sing better than I can!" Nightingale sobbed.
"Ha! Those posers," said the sparrow. "Don't believe for a minute that you need bright colours and gaudy displays to succeed as a passerine. Sure, they work for those who use them, but you won't catch us growing sparkly feathers - we'd just get eaten!"
More and more sparrows began to crowd around the poor, crying Nightingale. They offered him tissues, and one put his wing around Nightingale's shoulders. "There are so many of you," Nightingale gasped.
"Yes! We're one of the world's most successful families of bird. We have colonised every continent except for Antarctica, although we've had a bit of help from the humans," said a sparrow.
"But you're so dull coloured!" Nightingale said.
"Success isn't much to do with how bright your plumage is," said another sparrow. "It's more to do with your biology as a whole. If you've got a good well-rounded ability to adapt and survive, you'll do well, young nightingale, and have a brilliant future!"
One by one, the sparrows went back into the conference, focusing on displays of behaviour and foraging rather than the showy displays that Nightingale had been going to. Nightingale followed the sparrows, learning about their natural history, and took home plenty of ideas about how his lineage might adapt to the challenges of the future.

Bird phylogeny by Jetz et al. (2012, Nature). Species within the black line are passerines.

Monday, 18 May 2015

18-month(ish) report

I submitted my 18-month report a while ago, and the other day I had my 18-month review (although it has actually been 19 months). The review is there for a few reasons, but it particularly serves to identify major problems in time to solve them, and to scare the student into working if they haven't already started. You give a brief presentation, and then your supervisors and an assessor discuss your findings so far, and your assessor asks lots of questions, and they identify concerns. If you're in a really bad state in your project, they can supposedly kick you off it, but I think this is a story told by supervisors to young students to scare them into being good. I've only heard of people who have heard of people who have been kicked out. That said, as a graduate student you shouldn't need to rely on your supervisor to kick your arse into gear. You should be disciplining and organising yourself by now. That is what I tell myself all the time, anyway.

My report went OK, but the assessor had a lot of questions. That's fine really, because they don't know my project that well, but she raised some concerns that I thought were just interesting things. Like the fact that Cantuaria molecular data and morphological data say completely different things. I have the equivalent of two identical-looking lions, but their DNA says one is a tiger, and I also have a tiger and a lion that have DNA so similar that they might as well both be tigers. I thought that was quite interesting, and a nice illustration of how morphology and genetics don't always agree on how a phylogeny should look. It also means lots to work on with regards to taxonomy, which is great because I have special funding for taxonomy alone. But my assessor said it is a massive problem, and I need to find males to back it up (which I am looking for but cannot find!), and it is going to get in the way of landscape genetics. I'm not really sure that it is though, because I am not basing any of my inferences on morphology - just genetics - and I have always been working under the impression that the morphology will be misleading. I just hadn't reckoned with how misleading it would really be.

Overall though, my main supervisor says it went well. I need to get more sequences and more funding for tuition fees, but I have enough work for a PhD and it is meant to have challenges and be hard. I can keep going. They all gave me some useful pointers about landscape genetics to look into. More on that when I get to that part of my PhD. For now, I must continue to try and get rid of this inhibitor problem that I am having!

Thursday, 7 May 2015


"There comes a point in your PhD when you've hit rock bottom. And you've either gotta kill yourself, or do some work. I recommend the latter."
-Hamish Patrick
I love my PhD. I looked forward to doing it throughout secondary school and undergrad, and was even more excited during my masters when I got a taste of what full-time research was like. I knew that it would be a hard slog, but I also knew that it would be the only time in my entire career when I get to focus 100% on my own single research project. Unlike some poor schmucks, I even got to design my own research project based on what I'm interested in. I have an awesome advisory team, a great place to live, my own office space and a project that I love.

Yet even I have succumbed to what they call the "two-year slump".

Here's what I was told about the two-year slump: the honeymoon period of your PhD is over. You have finished planning everything and coming up with new ideas, and are fast discovering that things take a lot more time to do than you realised. You feel as though you haven't accomplished enough, and aren't living up to expectations. You have also, by now, read enough around your subject to understand that what you're doing isn't really that important or revolutionary.

This pretty much describes what I'm going through. A year and a half isn't enough time to do a venom phylogeny, a boosted regression analysis on habitat selection, develop microsatellites for Cantuaria and do ground-breaking landscape genetics with them, describe a bunch of new species, AND finish my phylogeny AND write my thesis. So far I have only managed to collect samples and habitat data, and mostly optimise PCR protocols. That took a year and a half. Seriously! A year and a half isn't as long as I thought it was.

It's not like I've been slacking off, either: I kept work as work and play as play, prioritised my work over everything else and did everything I thought good students do. I have a timeline, a diary and a list of goals broken down into weeks and months. For the last two weeks, I haven't met them, because I have been doing an average of about 3 hours of work per day and procrastinating for the rest of the time. It's really hard to motivate myself, because what's the point? So far I have done my best and got a tiny fraction of my PhD finished. It's unlikely I'll even have a third year, because I don't have the funding for my third year of tuition (research funding, tuition, and salary/stipend are all obtained separately. Research funding is relatively easy to get, stipend a lot harder, and tuition fees nearly impossible). My work uses techniques that have been around for a while to add a tiny piece of evidence to an already enormous body of evidence (much of which has been gained using modern techniques that are more exciting than mine). It will add a sizeable chunk of knowledge to what little we know about a genus of trapdoor spiders that nobody really cares about.

But that's what the PhD is: someone once told me that, as a postgrad, you're not the one at the front of the line, slashing through the vegetation of unexplored jungle: you're at the back of the line, with a magnifying glass, trying to see if we missed anything. I don't mind that my research isn't profound. I'm happy that, so far, I've found out a lot about trapdoor spiders, including that they are a lot more interesting and complex than I ever thought. I'm not really furthering the progress of science, but I'm educating and improving myself, so that I can learn to be a researcher.

I'm not going to do everything that I set out to do. I'm going to do a phylogeny, calibrate it with geological dates, and see if the date of divergence from the outgroup is before or after the Oligocene Drowning. Then I'm going to see if that answer makes sense, given what I have found out about the ecology and genetics of Cantuaria. The other stuff? Maybe I'll do that at some point. It's not essential.

I've just got to work around this slump. It helps if I do something I like doing, like sorting through things or writing my thesis (I'm writing a pretty cool chapter at the moment). The worst thing to do is lab work: it takes me all day to do one tiny thing, because I know the end result will be a gel with no bands. That's (I hope entirely) because of inhibitors in the spider DNA, which I have ordered a few things to help with. I need to go and collect some more habitat data from the West Coast, too: now that I've got rid of a lot of stuff from my PhD, I have to rely a bit more on the ecology results, so I really have to get that done.

Wish me luck!

"Luck? I don't wish you luck! I wish you sense!" - Boris the goose (Balto)

Tuesday, 28 April 2015

"Pleasant to read"

I really have to get over this somehow. Other people manage to read papers without falling asleep. Why can't I?

I have just returned from a very useful couple of talks about writing - how to discipline yourself, and how to make a good job of your thesis. Major take-home messages:

- Write little and often (I am going to write first thing in the morning, which will also enable me to car pool with late risers)
- Be regimented in your writing if it works for you (which it probably does)
- Make sure you have a sound structure
- You are the expert in your small field, teaching others about it, but...
- You're not trying to get a Nobel prize.

As useful as they were, the talks snared me with a line which I have heard many times: that your thesis (or manuscript, or proposal) should be "pleasant to read".

I have read a few theses. One was pleasant to read (mostly because it was really cool at a basic level - radio-tracking tarantulas!). The others were interesting, intellectual, and soporific. Scientific writing isn't really designed to be pleasant to read. You are trying to convey a tiny discovery to your reader, and everything about it - what we knew before your study, why you did your study, how you did your study, what your results were, how you interpret them, and where you think future research should be headed. This is not rippingly good stuff. It is functional. Other people can read it, interpret your study, replicate it or use your findings to support their own research. Science scrutinises the smallest crumbs of reality, which themselves seem meaningless, but together can be built into a bigger picture of the truth. The crumbs themselves are often rather dull. The bigger picture, reported in a scientific review, or on a blog, or in a piece of science journalism, sounds much more exciting.

I mentioned, at the end of the talks today, that I find scientific papers (and theses) dull to read (actually I find them extremely dull). People gasped, and disagreed. Surely I can't be alone in my lack of eagerness to read thick paragraphs of jargon, wandering around a single (often uninteresting but necessary) point. But maybe I am, or I don't have the intelligence to process sentences that are heavy with pointlessly pretentious words such as "utilise". One person suggested reading only the abstract, but I was taught as an undergrad that that is a heinous crime. Reading only the abstract does not enable you to judge the validity of the claims that the paper is making.

As a PhD student, I am not at the cutting edge of some profound, worldwide research initiative. I have a tiny project that involves measuring the diameters of trapdoor spider burrows, and hours of pipetting in a lab. My research will hopefully produce some evidence that supports either vicariance or dispersal as the major reason why Cantuaria show their current distribution. It will not prove or disprove the amount of land present during the Oligocene drowning period, but it will provide some evidence which may be combined with the current growing body of evidence to indicate how much land was there (and a few other things about NZ biota, spiders, and dispersal versus vicariance). There will be about seven different messages from my thesis which could be presented as bullet points, but I must also review current knowledge, and write about how and why I did things the way that I did, and how they can be interpreted and what people should be researching in the future.

I hope it will be pleasant to read!

Tuesday, 24 March 2015

BEAUti and the BEAST

After putting all my sequences into a nice order, I filled in missing sequences or parts of sequences with question marks (less dynamic and thrilling than it sounds).

What is SD2? We just don't know.
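
For anyone who would rather script the question-mark step than do it by hand, here is a minimal Python sketch. It is only an illustration: the function name and the example sequence names are made up, and it assumes the sequences are already aligned and that any missing part is at the end of the sequence (missing bases elsewhere would need to be handled separately).

```python
def pad_with_missing(sequences, all_names, missing_char="?"):
    """Give every sample a sequence of the same length.

    sequences: dict mapping sample name -> aligned sequence string
    all_names: every sample that should appear in the output
    Samples with no sequence at all become a run of missing_char.
    Assumption: short sequences are only missing bases at their end.
    """
    length = max((len(seq) for seq in sequences.values()), default=0)
    padded = {}
    for name in all_names:
        seq = sequences.get(name, "")
        padded[name] = seq + missing_char * (length - len(seq))
    return padded


# Hypothetical example: "b" is short and "c" was never sequenced for this gene.
example = pad_with_missing({"a": "ACGT", "b": "AC"}, ["a", "b", "c"])
for name, seq in example.items():
    print(name, seq)
```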

Next, I imported the sequences into the program BEAUti (which takes sequence files and information about how you want the analysis to be run, and turns it into a manageable format for BEAST), and specified the right(ish) substitution models (as predicted by JModeltest, but BEAUti doesn't have a huge selection of preset models so I just did my best with the options it gives). Also, I told BEAUti to make BEAST run as STARBEAST, a modification which enables several sequences to be grouped together (in my case, sequences from the same species can be grouped together as that species).

Coded sequence names (left panel) can be grouped together into nice, friendly species names (right panel) in STARBEAST.

STARBEAST also allows you to use multiple genes and specify whether or not the genes are independently evolving. I set it to run 100,000,000 generations, because I have a supercomputer and my supercomputer can do anything. Then I ran the analysis using BEAST. After 20,000,000 generations and a fitful sleep, I stopped the analysis and decided 10,000,000 would be enough next time. It was plenty long enough anyway, especially considering this first tree was but a preliminary one.

This is what BEAST looks like when it starts running. It is sampling lots of phylogenies and seeing which are the best supported, given my prior assumptions and the posterior probabilities estimated from the sequence data by BEAST.

Anyway, the output tree looked quite cool:

It looks like the new species I found in Central Otago may well be a new species, whereas the new species found in Canterbury all look to be the same as C. dendyi (despite looking very different from it). Also, C. delli, C. stewarti, and C. isolata are all open-burrow spiders, but they appear to be polyphyletic - that is, they are not all grouped together, suggesting that lid-building traits have been lost several times in the evolutionary history of Cantuaria.

Spurred on by this in"tree"guing outcome, I sequenced some more spiders to make a bigger tree...but computer said no. Firstly, finding the correct substitution model using JModeltest didn't seem to work - the DNA alignment (collection of sequences lined up so that the different parts of the sequence match parts of other sequences) took too long to analyse. It seemed to be a bug with JModeltest, so I used an older version, but that couldn't read the sequences at all. I've given up on that temporarily, and used the same models as I did with the first tree. But I will have to get it working for the tree that I put in my thesis.

After selecting all the right(ish) settings in BEAUti (which took a few goes, because some of the question marks were in the wrong place), I tried to run the analysis in BEAST, but it had trouble reading in the data or it just crashed. Most of the problems were down to the plugin BEAGLE, which I use because the power of my computer is mostly in its graphics card, and BEAGLE allows you to harness the almighty power of the graphics card (which does lots of little things very quickly, as opposed to the computer's processor which does few big things slowly). I finally managed to get it to run, after I had remedied the following things:

- When creating mapping files for BEAUti, make sure sequence names differ from the labels used to group them.
- Make sure every set of sequences is in the same order as all the other sets of sequences.
- When running BEAGLE, give up and don't use BEAGLE unless you want to spend half an hour playing with its settings for something that takes the same amount of time as not using BEAGLE (why the hell did I spend so much money getting a good graphics card when it doesn't seem to make any difference at all?).
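
The first two checks in that list are easy to automate before opening BEAUti. The Python sketch below is a rough illustration under assumptions: the function name is invented, and it expects you to have already read the sequence names (in file order) and the name-to-species mapping into plain Python objects, since the exact file formats are not covered here.

```python
def check_beauti_inputs(alignments, species_map):
    """Return a list of human-readable problems (empty if none found).

    alignments: dict mapping gene name -> list of sequence names in file order
    species_map: dict mapping sequence name -> species label used for grouping
    """
    problems = []

    # Check 1: a sequence name must not be identical to any grouping label.
    labels = set(species_map.values())
    for name in sorted(species_map):
        if name in labels:
            problems.append(f"sequence name '{name}' is also used as a group label")

    # Check 2: every gene's sequences must be in the same order as the first gene's.
    genes = list(alignments.items())
    if genes:
        ref_gene, ref_order = genes[0]
        for gene, order in genes[1:]:
            if order != ref_order:
                problems.append(f"'{gene}' is not in the same order as '{ref_gene}'")

    return problems


# Hypothetical example with two genes and two made-up sequence names.
issues = check_beauti_inputs(
    {"CO1": ["s1", "s2"], "CytB": ["s2", "s1"]},
    {"s1": "C_dendyi", "s2": "s2"},
)
for issue in issues:
    print(issue)
```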

The first run did not give a large enough effective sample size (ESS, shown by the program Tracer), so I ran it for 40,000,000 generations (which thankfully only took six hours). This was the resulting tree:

This tree should be more representative of Cantuaria evolution than the first tree I made: there are more sequences (though some are a bit short), and I trimmed the ends of the sequences a lot less. Distances might be a bit wrong because the substitution models I used might not be right. Interestingly, all the lidless species are now monophyletic (grouped together), although C. huttoni is also lidless and seems to be quite distantly related to the rest. Again, Canterbury new species seem the same as C. dendyi. Interestingly, C. johnsi, C. magna and C. prina all seem to be pretty much the same species genetically. Species designations will have to be looked into a bit later using the Spider R package. Also, I have to do a bit less of a rushed job next time, and resequence some of the samples that came out too short, and get the right models, in the hope that one of these things will make the branch lengths of the tree a bit more sensible (the tips should all line up). I will by then hopefully have some Misgolas samples to use as outgroups instead of Segregara.
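
As an aside on the effective sample size that Tracer reports: the rough idea is that samples from a chain are correlated with their neighbours, so a long run contains fewer independent samples than the number of rows in the log file. The Python sketch below uses a simple textbook estimator (chain length divided by one plus twice the sum of the positive autocorrelations). It is an assumption-laden illustration, not necessarily how Tracer calculates its value, and the function name is made up.

```python
def effective_sample_size(trace):
    """Estimate ESS as n / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first lag where the autocorrelation
    is zero or negative, which is a common simple cut-off.
    """
    n = len(trace)
    if n < 2:
        return float(n)
    mean = sum(trace) / n
    centred = [x - mean for x in trace]
    variance = sum(c * c for c in centred) / n
    if variance == 0:
        return float(n)

    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# A trace that drifts steadily in one direction is highly autocorrelated,
# so its ESS comes out far smaller than the number of samples.
print(effective_sample_size(list(range(100))))
```

This loop is slow for long chains, so for a real BEAST log it makes more sense to rely on Tracer itself.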

Another interesting thing that has come out of building these trees is that if I try to group the species into separate populations, the posterior probabilities invent a new kind of distribution:

This should be a bell-shaped curve...

I spoke to another BEAST user about this and he said it means the populations are not separate - they are all interbreeding with each other. So, while Cantuaria spp. can be divided into different species, they are not easily divided into populations - perhaps these spiders move around a bit more than I thought.

Playing with trees has been fun, but now I think I need to focus on writing my 18-month report which is due in April. I also need to think about:
- Identifying samples using morphology
- Analysing some ecological data I have collected (I started that but I should probably finish it at some point)
- Visiting Otago Museum to have a look at their holotypes
- Working out how to use Spider (R package)
- Describing species
- Working out how to use GenGIS (a landscape genetics program)
- Pitfall trapping for males (before their season finishes!)
- Evolution conference in São Paulo which I really want to go to
- Writing my thesis!