Antibody Expert, Structural bioinformatician specializing in antibodies and mutations, Experienced expert witness, Antibody consultant.
Check him out on LinkedIn as well.
1) “Effects of mutations on protein structure and predicting whether a mutation will be pathogenic using machine learning. We have developed specialized predictors for specific phenotype prediction in individual proteins as well as a general predictor.” Question: Can you unpack this interest and what about it fascinates you?
My work on mutations really is something completely different from the antibody work, but it constitutes the other main area that we work on. I am interested in understanding how a mutation affects the structure of a protein and then how that affects protein function. It’s very difficult to tell what will happen to the overall structure when a mutation occurs. However, we can categorise mutations into a number of groups: (i) First, a mutation might be ‘functional’ – in other words it is changing a residue that is intimately involved in the function of a protein, perhaps in an enzyme active site or in a site involved in interacting with another protein or something like a metal ion. (ii) Second, a mutation might prevent the protein from folding up – for example, if you had a small amino acid buried in the middle of the protein and you replaced it with a much larger amino acid, then it would be physically impossible for the protein to fold properly. (iii) Third, and perhaps most interestingly, a mutation may not prevent the protein from folding properly, but somehow destabilizes the correctly folded form compared with misfolded or unfolded forms. We know this is the case because there have been a number of experiments where people have taken proteins that have a mutation and reactivated them simply by lowering the temperature, which mitigates the destabilizing effects. This gives the potential for drug design to rescue mutated proteins.
What we try to do is look at the local structural effects of a mutation and, from those, identify which class the mutation will fall into. For example, a mutation might break a hydrogen bond, or a small-to-large mutation might mean that the new residue won’t fit properly. Conversely, a large-to-small mutation would leave a void in the structure if the protein maintained its normal conformation; in reality the protein will change conformation somehow, collapsing to fill the void. However, we don’t try to predict that – we simply say that *something* will happen to the overall structure.
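The small-to-large and large-to-small cases can be illustrated with a minimal Python sketch. It is not the group's actual method: the function name `classify_size_change`, the 30 cubic angstrom threshold and the `buried` flag are assumptions made for illustration, and the residue volumes are commonly quoted approximate values.

```python
# Approximate amino acid residue volumes in cubic angstroms.
# Commonly quoted figures; treat them as illustrative, not authoritative.
RESIDUE_VOLUME = {
    "G": 60.1, "A": 88.6, "S": 89.0, "C": 108.5, "D": 111.1,
    "P": 112.7, "N": 114.1, "T": 116.1, "E": 138.4, "V": 140.0,
    "Q": 143.8, "H": 153.2, "M": 162.9, "I": 166.7, "L": 166.7,
    "K": 168.6, "R": 173.4, "F": 189.9, "Y": 193.6, "W": 227.8,
}


def classify_size_change(native, mutant, buried, threshold=30.0):
    """Return a coarse label for the size effect of a substitution.

    native, mutant: one-letter amino acid codes.
    buried: True if the residue is buried in the protein core
            (hypothetical input; it would come from a structure analysis).
    threshold: volume change, in cubic angstroms, treated as significant
               (an arbitrary value chosen for this sketch).
    """
    delta = RESIDUE_VOLUME[mutant] - RESIDUE_VOLUME[native]
    # Surface residues have room to move, so size changes are ignored here.
    if not buried or abs(delta) < threshold:
        return "no-size-effect"
    # Larger replacement cannot fit; smaller replacement leaves a cavity.
    return "clash" if delta > 0 else "void"


print(classify_size_change("G", "W", buried=True))   # clash
print(classify_size_change("W", "G", buried=True))   # void
```

A real analysis would use the three-dimensional structure rather than a single volume figure; the sketch only shows how a local effect can be turned into a categorical feature.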
Once we have all this structural information, we can use it to train an artificial intelligence (machine learning) method to predict whether the mutation is going to be pathogenic. In other words, is the mutation going to affect the structure of the protein in a way that is detrimental to its function?
With projects such as the 1000 Genomes Project, the UK10K project and now the UK100k project, as well as similar projects in other countries (such as the Personal Genome Project of Australia), we have huge amounts of information about differences between people’s genomes and we have links between those differences and diseases. Using this sort of information, we can start to try to understand mutations that are likely to be linked to disease and, in my case, try to understand what they are actually doing to proteins. Also, in Mendelian inherited diseases, we can try to predict whether a novel mutation seen in the clinic is going to be linked to disease.
2) “I am also an adviser to the WHO-INN on the description and annotation of antibody-based drugs and naming of biologics.” Question: What does an adviser for the WHO do specifically, and how were you chosen?
As you probably know, antibody-based drugs make up half of the top 10 selling drugs in the world and about a third of drugs in development are based around antibodies. Commercially, and from the point of view of pharmaceuticals, that makes them incredibly important. The World Health Organisation (WHO) International Nonproprietary Names (INN) committee allocates generic names for all drugs. Naming may seem like a fairly arbitrary, and not particularly exciting, topic, but getting names right is actually very difficult. Many countries, including the UK, require that prescriptions for pharmaceuticals are on the basis of generic or non-proprietary names because often the generic drug is going to be cheaper than the brand-named drug. Thus getting the name right is very important because you want to avoid confusion in prescribing, and this applies to the names of drugs both when they’re spoken and when they’re written. There are a number of other considerations as well: we need to avoid names that give some sort of commercial advantage to a particular company and equally we need to avoid names that might give a disadvantage, perhaps because they are rude in some particular language.
There are now approximately 70 antibody-based drugs on the market, but we have to allocate names for perhaps 100 a year, of which only a small percentage actually make it to market. Consequently, we have to allocate a large number of names that are never used for commercial drugs. Our job, as a committee, is to look at names that have been proposed by the applicants, check that they don’t give an advantage or disadvantage, and look at potential clashes with trademarks. In particular, we try to check that the names are not likely to be confused with existing names.
Part of the problem with antibody names is that there is a particular scheme to which they have to conform. All antibodies used as drugs end in ‘-mab’ and, until recently, had a substem before that which indicated how the antibody was made (e.g. ‘-zu-’ for humanized antibodies, ‘-o-’ for mouse antibodies, ‘-u-’ for human antibodies and so on). Before that comes an infix that indicates the general target for the antibody; for example ‘-li-’ to indicate that the target is part of the immune system, ‘-vi-’ for viruses, ‘-tu-’ (or now, ‘-ta-’) for tumour targets, etc. Last year, the INN realized that there was a shortage of potential antibody names that fit the criteria, but are also possible to pronounce! Partly for this reason, we removed the source information from the name. The other reason that it was removed is that trying to describe the source in a single syllable is becoming impossible as people are producing antibodies by all sorts of bizarre and complex methods. In addition to the names, the INN produces annotations about the antibodies and one of my jobs is to generate those annotations that are published alongside the names.
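To make the older naming scheme concrete, here is a minimal Python sketch that splits a name into prefix, target infix and source substem. It covers only the substems and infixes mentioned in the answer above; the real INN scheme has more of them and has changed between revisions, so many genuine names will come back with `None` for one or both fields. The function name `parse_antibody_name` and the returned dictionary layout are assumptions made for illustration.

```python
# Only the pieces mentioned in the interview; the full INN scheme has more.
SOURCE_SUBSTEMS = {"zu": "humanized", "o": "mouse", "u": "human"}
TARGET_INFIXES = {
    "li": "immune system",
    "vi": "virus",
    "tu": "tumour",
    "ta": "tumour",
}


def parse_antibody_name(name):
    """Split an antibody INN into prefix, target and source.

    Returns None if the name does not end in 'mab'. Unknown substems or
    infixes are reported as None rather than guessed.
    """
    stem = name.lower()
    if not stem.endswith("mab"):
        return None
    stem = stem[:-3]

    # Try the longest substem first so that 'zu' is not mistaken for 'u'.
    source = None
    for substem in sorted(SOURCE_SUBSTEMS, key=len, reverse=True):
        if stem.endswith(substem):
            source = SOURCE_SUBSTEMS[substem]
            stem = stem[:-len(substem)]
            break

    target = None
    for infix, meaning in TARGET_INFIXES.items():
        if stem.endswith(infix):
            target = meaning
            stem = stem[:-len(infix)]
            break

    return {"prefix": stem, "target": target, "source": source}


print(parse_antibody_name("trastuzumab"))
# {'prefix': 'tras', 'target': 'tumour', 'source': 'humanized'}
```

Matching from the end of the name inwards mirrors the way the scheme is described: suffix first, then the source substem, then the target infix, with whatever remains being the free prefix.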
I ended up on this committee because two of the members come from a university in Italy. Another well-known person who works on antibodies is also based in Italy and they asked her for suggestions of somebody to help with the problems of naming and annotating antibodies – she proposed my name.
3) Question: How have you mentored people in the past? What type of advice, suggestions, thoughts, books, or online resources have you recommended to people at the different stages of their development (i.e. just starting out, having something built, career change, etc)?
I have mentored 15 PhD students in my group and have also acted as a second supervisor for another 20 or so. We also have a scheme whereby first-year PhD students rotate around different labs and I have had about 6 of them work with me. I have also had about 50 undergraduate project students either working in the lab or doing literature projects and about another 25 masters students working with me. It is difficult to generalise about specific advice that I might give since everybody’s project is different. The vast majority of people working in my lab have come from a biology/biochemistry background, but they all need to learn how to program a computer and I am keen to promote best practice in programming. Consequently, we make use of systems such as git and GitHub in order to maintain code and I encourage other forms of best practice such as test-driven design – in other words, making sure that code is thoroughly tested and that ideally the tests are written before the code. Personally I program in Perl and C, but people in my lab often choose to program in Python.
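As a small illustration of writing the tests before the code, the Python sketch below places the test function first and the implementation second. The task (parsing a mutation label such as `G12D`) and the names `parse_mutation` and `test_parse_mutation` are hypothetical examples, not code from the lab.

```python
import re


# Step 1: the test is written first and describes the behaviour we want.
def test_parse_mutation():
    assert parse_mutation("G12D") == ("G", 12, "D")
    assert parse_mutation("r175h") == ("R", 175, "H")
    # Malformed labels should be rejected rather than silently accepted.
    for bad in ("12GD", "G12", "GD", ""):
        try:
            parse_mutation(bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad!r}")


# Step 2: the implementation is written to make the test pass.
# The pattern accepts any letter; it does not check for valid amino acids.
_MUTATION = re.compile(r"([A-Z])([1-9][0-9]*)([A-Z])")


def parse_mutation(text):
    """Parse a label such as 'G12D' into (native, position, mutant)."""
    match = _MUTATION.fullmatch(text.strip().upper())
    if match is None:
        raise ValueError(f"not a valid mutation label: {text!r}")
    native, position, mutant = match.groups()
    return native, int(position), mutant


test_parse_mutation()
```

In practice the tests would normally live in their own file and be run by a test runner on every commit, which fits naturally with keeping the code in git.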
4) If you couldn’t work on Antibodies, what would you work on?
There are lots of areas of bioinformatics that are of interest to me. Obviously the simple answer to your question is that I would focus more on our work on the effects of mutations. However, we have also worked on identification of transcription factor binding sites and we have developed a lot of general purpose tools for manipulating protein structure. Developing tools to help other people do science is something that has always particularly interested me. A major area of research in the bioinformatics community at the moment is anything related to Next Generation Sequencing. With the advent of new, very fast, high-throughput sequencing machines, there are huge amounts of data being generated and the development of tools to help people analyze those data is a major area of research that I would probably move into if I weren’t working on antibodies and the effects of mutations.
5) What are you a nerd about (e.g. Star Trek or soccer)?
I suppose the answer to this is probably wine! I love good wine, particularly red wine from Bordeaux. I buy Bordeaux wine ‘en primeur’, the year after the harvest, but when it is still in barrel. It is then available the year after that, but generally needs to be kept for another five to ten years before it is ready to drink. I am also a bit of a food nerd and visiting Michelin starred restaurants is another hobby. My wife and I also run a supper club where people cook for each other once a month.
6) What makes you hopeful for the future?
I think the advances that we are making in the treatment of disease, particularly cancer, are very exciting and antibodies are playing a major role in this. One of the recent advances in cancer treatment is the development of immunomodulatory antibodies that bind to the ‘programmed death’ ligand or receptor, thus stopping cancer cells from being able to evade the immune system. In general, I think we are starting to move to an era of personalized medicine. All the data coming from the genome projects and, in particular, the vast amounts of mutation data, are allowing us to understand differences between people and their responses to drugs, thus allowing us to target medicines specifically to individuals.
7) Does antibiotic resistance concern you to a high level, or do you think we will innovate around that issue?
Yes, of course antibiotic resistance is a huge concern. Again, antibody-based drugs may offer solutions. About 10% of licensed antibody drugs are anti-infectives. A couple of these are antiviral (including one newly approved for HIV treatment which targets the receptor to which HIV binds). The others bind to bacteria or bacterial toxins and are thus working as antibiotics. However, antibody-based drugs are much more expensive than conventional small molecule drugs. Largely this means that their potential in this area has not been exploited. Clearly, while there are effective small molecule drugs costing a few dollars for a course of treatment, it makes little sense to use antibody-based drugs perhaps costing tens of thousands of dollars. As antibiotic resistance becomes more of a problem, I believe that situation will change and more antibodies will be developed and licensed for antibacterial treatments. However, this is going to lead to a huge financial burden on our health services.
8) What are the next big mountains you are set on climbing that you are working on and why have you chosen those issues as things to work on?
I guess the big issue for us at the moment is something that I touched on when I talked about other areas that I would be interested in working in. Next generation sequencing is also being applied to the analysis of antibodies – people are looking at antibody repertoires in healthy individuals and in patients who have certain infections. A number of drug companies exploit this to help them develop new therapeutics. Our abYsis system contains approximately two thirds of a million sequences, but we recently visited a company that wanted to know if it could accommodate a million sequences. We said yes, that should be no problem. They said “Oh good, we are generating a million sequences a week – how would your system cope with that?” We had to admit we didn’t know! It turns out that the system as it was then needed more than a week to load a million sequences, but we have now speeded that up considerably. However, at larger scales, loading the data into the underlying relational database becomes prohibitively slow. Consequently, we are interested in developing a new big-data orientated database in order to be able to load millions of sequences and provide brief annotations of the types of features pharmaceutical companies are interested in – things like post-translational modification sites – which companies can use to triage their data before feeding it into abYsis. However, obtaining funding for these types of projects is a huge problem.
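As an illustration of the kind of brief annotation that could be used for triage, the Python sketch below flags potential N-linked glycosylation sites using the well-known sequon N-X-S/T, where X is any residue except proline. The choice of this particular motif, and the names `find_nglyc_sites` and `triage`, are assumptions made for the example; nothing here describes how abYsis or the planned database actually works.

```python
import re

# N, then any residue except P, then S or T. The lookahead lets
# overlapping sequons (e.g. 'NNTS') both be reported.
_NGLYC_SEQUON = re.compile(r"N(?=[^P][ST])")


def find_nglyc_sites(sequence):
    """Return 1-based positions of the N in each N-X-S/T sequon."""
    return [m.start() + 1 for m in _NGLYC_SEQUON.finditer(sequence.upper())]


def triage(sequences):
    """Split sequences into those with and without a potential site.

    sequences: mapping of identifier -> amino acid sequence.
    Returns (flagged, clean), where flagged maps identifier -> positions.
    """
    flagged, clean = {}, []
    for identifier, sequence in sequences.items():
        sites = find_nglyc_sites(sequence)
        if sites:
            flagged[identifier] = sites
        else:
            clean.append(identifier)
    return flagged, clean


# Toy sequences invented for the example, not real antibody data.
flagged, clean = triage({"seq1": "ANGSNPTNAT", "seq2": "QVQLVESGG"})
print(flagged)  # {'seq1': [2, 8]}
print(clean)    # ['seq2']
```

Because a scan like this is a single pass over each sequence with no database round trip, it can in principle be run on millions of sequences before anything is loaded into a relational store.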
9) What advice would you give someone starting out?
In terms of general advice for someone wishing to get into bioinformatics, I would thoroughly recommend that they learn to program, ideally in both a scripting language such as Perl or Python and a faster compiled language like C. They should also learn the best practice programming methods that I mentioned earlier. Obviously they also need to have a good understanding of the biology relevant to what they wish to do. In the past we have had people from a computer science background doing bioinformatics and, though technically they can be very good, often they don’t have sufficient understanding of the biology, or the needs of biologists, to sustain research in that area. Of course there are always exceptions and there are some excellent people in bioinformatics who come from a computer science background. However, I would say this is the exception rather than the rule.
10) What has been the most rewarding?
The most rewarding thing about working in a university is the students – both the undergraduates and having your PhD students complete their PhD and move on with their careers in science or elsewhere. Of the science that I have done, I guess getting emails from people who use our software and find it helps them is one of the most rewarding aspects. The general process of turning an idea into a piece of software and making that available for people to use either as open source code that they can download, or as a web server they can access over the internet, or indeed as a commercial product, is hugely rewarding.