A. Here is my plan for my project. My primary tool will of course be Biology Workbench. The objective is to see how conserved the RAI1 gene is across a large number of species: phylogenetic trees will be constructed and comparisons will be made. The greatest challenge will be finding all of the sequences that need to be input into Biology Workbench, but I have a few plans. My initial search will involve BLAST, but not just the NCBI BLAST. I will run BLAST through Biology Workbench because it provides a much wider variety of options, including letting me search for sequences within the protein database, something I was having a lot of trouble with before. There are also other ways to find a sequence. I can use WormBase to find the sequence in the nematode C. elegans. From Workbench I tried the Ndjinn multiple database search and searched for RAI1, but this didn’t seem to work. I think you’re supposed to input a sequence and then select all the organisms you want to search for that sequence; this seems like the quickest, most effective way to find all the sequences for the RAI1 gene.
http://www.ncbi.nlm.nih.gov/nuccore/NG_007101.2?&from=5000&to=134980&report=fasta
This is the link to the FASTA sequence of the RAI1 gene, and this is the sequence that I input into Ndjinn! Exciting! I’m not really sure it is ever going to load on my computer: the RAI1 gene sequence is really, really long, and Ndjinn searches through multiple genomes of multiple organisms to try to find a match! Instead of using the whole sequence, I think I will try just a portion of it to see if any matches come up. I think this Ndjinn tool has a lot of potential for finding particular sequences within a great number of organisms. I don’t think my computer can actually do it, because it is still not working, so I will try again tomorrow when I have access to a school computer.
Oh, and of course, to create the phylogenetic tree, I will use the same techniques as before!
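Searching with only a portion of a long sequence could be sketched in Python like this. It is a minimal sketch, not a Workbench feature: the function names, the toy record, and the window coordinates are all made up for illustration, and the real RAI1 FASTA text would have to be pasted in by hand.

```python
def read_fasta(text):
    """Parse a single-record FASTA string into (header, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA record starting with '>'")
    header = lines[0][1:]
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def take_window(sequence, start, length):
    """Return `length` bases beginning at 0-based position `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    return sequence[start:start + length]


def to_fasta(header, sequence, width=60):
    """Format a record as FASTA text wrapped at `width` characters per line."""
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(rows) + "\n"


if __name__ == "__main__":
    # Toy record (made-up sequence, not RAI1), just to show the steps.
    toy = ">toy_record\nACGTACGTAC\nGTTTGACCAA\n"
    header, seq = read_fasta(toy)
    piece = take_window(seq, start=4, length=8)
    print(to_fasta(header + " | window", piece))
```

The shorter FASTA text that comes out can then be pasted into a search form in place of the full-length sequence.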
So following the journal club, there are many things that I would like to comment on. I feel as though the paper’s specific justification was not very clear. Multiple CNVs were found in the SMS patients that had not previously been reported in any other SMS studies; however, the majority of them had already been identified as associated with other disorders that are very similar to SMS. The end of the paper suggests that aCGH should be used for a proper diagnosis of SMS, but it also suggests that the molecular findings should be used to aid in the treatment of patients with SMS. This whole idea seems rather unnecessary to me: for proper treatment of SMS, all that is needed is a diagnosis. I liked how the paper suggested other genes that may be involved in SMS. I don’t really think that Williams’s study helped with understanding how SMS works at the molecular level. Another thing I’m a little concerned about is the definition of a copy number variation. I previously thought that a CNV was just a sequence that was repeated over and over again within the human genome, but now I am starting to see that the definition goes deeper than that, because within the paper chromosomal deletions are also considered copy number variations. I decided to look up the definition and figured out exactly what it means. A CNV implies a copy number difference, which could be the result of a duplication or a deletion.
Further experiments could have been included in the study. There were a total of 52 “SMS-like” patients, and CNVs were only found in a small portion of them. Further studies should be done to determine the molecular diagnosis of the remaining patients. aCGH has a great number of limitations. Since it is only able to find duplications and deletions, I suggest that the symptoms could be the result of translocations or inversions. Karyotyping could be used to find large translocations, but inversions could be a much greater challenge. I think that quantitative PCR may have applications when it comes to finding inversions and translocations; however, it is definitely a study that would be a significant challenge.
Abstract
Describes all the different methods that are used to determine whether or not someone has SMS. These methods include G-banding, fluorescence in situ hybridization (FISH), real-time quantitative PCR, and multiplex ligation-dependent probe amplification (MLPA). Fluorescence in situ hybridization is specifically the one that Stephen Williams mentioned in his presentation. The article explains that the functional role of the RAI1 gene is not fully understood, except that it appears to play an important role in transcription. The abstract also explains the symptoms of SMS, which include sleep disturbances.
Introduction
SMS is classified as a very complex genetic disorder and is often underdiagnosed.
Clinical Overview
This section of the article goes over all of the symptoms of Smith-Magenis syndrome. There are so many terms within this section that I just do not understand, starting from the beginning: 1. Brachycephaly 2. Hypertelorism 3. Synophrys 4. Micrognathia 5. Taurodontism 6. Otolaryngological 7. Anthropometry.
There is a nice-looking table that has a summary of all the symptoms associated with Smith-Magenis syndrome, as well as the prevalence of each symptom among patients who have the syndrome.
Diagnostic Approaches
FISH and G-banding are the two classical methods for SMS determination. Newer methods allow for a higher resolution in detecting DNA deletions. These methods include MLPA and real-time quantitative PCR. The molecular basis for detection is a deletion in the 17p11.2 region, that is, on the short arm (p) of chromosome 17 at region 1, band 1, sub-band 2. This deletion removes one copy of the RAI1 gene, and the loss of RAI1 causes Smith-Magenis syndrome. There is a table that goes through all of the different amino acid changes within RAI1 that cause SMS.
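In standard cytogenetic notation, a name like 17p11.2 reads as chromosome 17, short arm (p), region 1, band 1, sub-band 2. A small Python sketch of how that name breaks apart is below. The function name and the returned dictionary keys are my own choices for illustration, and the pattern only covers simple names of this shape.

```python
import re

# chromosome, arm (p or q), one region digit, one band digit, optional ".sub-band"
BAND_PATTERN = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_band(name):
    """Split a simple cytogenetic band name such as '17p11.2' into its parts."""
    match = BAND_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized band name: {name!r}")
    chromosome, arm, region, band, sub_band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": "short (p)" if arm == "p" else "long (q)",
        "region": int(region),
        "band": int(band),
        "sub_band": sub_band,  # None when the name has no ".x" part
    }


if __name__ == "__main__":
    print(parse_band("17p11.2"))
```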
Genotype Phenotype Correlation
All SMS patients with a 17p11.2 deletion are deleted for RAI1, and mutations in RAI1 likely result in a non-functional protein, thus resulting in haploinsufficiency. Haploinsufficiency means that the gene doesn’t function properly unless both copies are working; a single functional copy is not enough. There are other genes that contribute to the severity and variability of the phenotype.
Conclusion
The key to proper management of SMS is a very precise clinical diagnosis. This is because of the tremendous variability of the disorder and the many, many symptoms that can manifest. From the paper, it is understood that SMS has a wide range of symptoms. I have learned about the many symptoms that are associated with it, although I really don’t understand much of the medical terminology. I have also learned some specific methods that are used to identify people who have SMS.
So, to start, I would like to say that I was very impressed with Stephen’s presentation; he was clear and concise. He did an especially good job clarifying why his experiment was important and how it would help to identify and diagnose people who have SMS. Just from the presentation I learned a lot about SMS. I learned that it is caused by a deletion in the short arm of chromosome 17. It occurs in 1 of every 20,000 live births, which contributes to its significance. The diagnosis of SMS is challenging: the list of symptoms is very long, and many other chromosomal disorders cause very similar symptoms. SMS can be identified by looking at a karyotype and finding the deletion in chromosome 17. The deletion removes one copy of the RAI1 gene, and the people who have that deletion are the ones considered to have SMS. People who don’t have the RAI1 deletion are considered to be “SMS-like.” They took the people who were SMS-like and did whole-genome array comparative genomic hybridization (aCGH). This is the standard technique for identifying genomic changes in individuals with developmental and intellectual delay. After this part of the presentation things started to get a little fuzzy. I think that they used bacterial artificial chromosomes (BACs) that are about 19k base pairs long for each array. They then investigated copy number variations, which are duplications or deletions that are found to be associated with the specific disease. NAHR, which is non-allelic homologous recombination, is somehow involved, although I wasn’t entirely sure how. They then used scattergrams to determine which BACs had copy number variations because they went too high or too low off of the middle ground. Reading the scattergrams is a challenge, and I am still not entirely sure how I would read one.
This experiment was interesting because, as Stephen said, it is one of the first to study subjects who do not actually have SMS on a molecular basis.
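The idea of flagging BACs that sit too high or too low off the middle of a scattergram can be sketched in Python. This is only my guess at the logic, not the method from the presentation: I am assuming each BAC has a log2 ratio of patient signal to reference signal (so 0 is the middle ground), and the cutoff of 0.3 and the BAC names are made-up values for illustration.

```python
def call_cnv(log2_ratios, threshold=0.3):
    """Label each BAC as 'gain', 'loss', or 'normal' from its log2 ratio.

    log2_ratios: dict mapping a BAC name to log2(patient / reference).
    threshold:   hypothetical cutoff; real analyses choose this carefully.
    """
    calls = {}
    for bac, ratio in log2_ratios.items():
        if ratio >= threshold:
            calls[bac] = "gain"      # possible duplication
        elif ratio <= -threshold:
            calls[bac] = "loss"      # possible deletion
        else:
            calls[bac] = "normal"    # close to the middle ground
    return calls


if __name__ == "__main__":
    # Made-up ratios for three hypothetical BAC clones.
    example = {"BAC_A": 0.52, "BAC_B": -0.61, "BAC_C": 0.04}
    for bac, call in call_cnv(example).items():
        print(bac, call)
```

Read this way, each dot on the scattergram is one BAC, and the dots that land beyond the cutoff on either side are the candidate duplications or deletions.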
A. I have made tremendous strides using Biology Workbench. Not only have I learned how to align protein sequences and create phylogenetic trees, I have also successfully used many of the protein tools to predict protein structure and to find a protein that closely matches my sequence within the Protein Data Bank. At the Biology Workbench, under the protein tools bar, I selected my sequence and used the BLASTP tool, which gives you the option of checking whether your sequence shows up in the protein database. You can then find the link to the protein of your choice there.
Here is the env protein found within the PDB:
http://www.rcsb.org/pdb/explore/explore.do?structureId=2B4C
I have also done many other things. I used a program called GOR4, which lets you predict the secondary structure of a given sequence. I used it in my project to determine whether or not there were certain amino acids within the env protein that were required for it to function. I did not get the results that I expected, but it was still a new program. Here is the sequence I used, as well as a key to go along with it.
>S1V2-10
EVVIRSENFTNNAKIIIVQLNESVEINCTRPNNNTRKSIHIGPGRAFYTT
GDIIGDIRQAYCNISRAEWNNTLKQIVIKLREHFGNKTIVFNHSS
LEGEND:
Alpha Helix = H, Beta Sheet = E, Random Coil = C
The one online is color coded, but you can’t tell when it’s copied into a blog post.
I have also learned how to make phylogenetic trees for both my amino acid sequences and my nucleotide sequences. Here they are!


SO EXCITING!!
I also tried to use a program called MView. MView lets you choose a reference sequence, aligns all of your sequences against it, and tells you what percentage of the sequence is conserved between them. I feel it was a very useful tool because it lets you see how much HIV really mutates.
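The percent-conserved idea can be sketched in a few lines of Python. This is not MView’s actual formula, just one simple way to define it: I assume the sequences are already aligned to the same length with "-" for gaps, and the function names and toy sequences are made up for illustration.

```python
def percent_identity(reference, other):
    """Percent of non-gap reference positions where `other` has the same residue."""
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for ref_char, other_char in zip(reference, other):
        if ref_char == "-":
            continue  # skip columns where the reference has a gap
        compared += 1
        if ref_char == other_char:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def percent_conserved_columns(alignment):
    """Percent of alignment columns where every sequence has the same non-gap residue."""
    if not alignment:
        return 0.0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have the same length")
    if length == 0:
        return 0.0
    conserved = 0
    for column in zip(*alignment):
        if column[0] != "-" and all(residue == column[0] for residue in column):
            conserved += 1
    return 100.0 * conserved / length


if __name__ == "__main__":
    toy = ["EVVIRS", "EVVIKS", "EAVIRS"]  # made-up aligned fragments
    print(percent_identity(toy[0], toy[1]))
    print(percent_conserved_columns(toy))
```

The first function compares one sequence against the chosen reference, and the second asks how many columns are identical across every sequence at once.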
B. I feel like I am really starting to get the hang of Biology Workbench, and I am confident that I can learn and understand nearly all of the tools found within it. I have even learned how to change sequence formats; there are programs on Workbench that allow it. There are also options for finding sequences to download: you can use Workbench to search for a protein and it will bring up sequences in a way that is clear and easy to understand.
I have finally achieved success using Biology Workbench: I managed to make a phylogenetic tree of Subject 1’s amino acid sequences! Here is a picture!
This is a picture of an unrooted tree, and I’m so proud of myself. It actually involved a very long and convoluted process. First you have to input all of the amino acid sequences into the sequence alignment program, then you have to find some way to turn them into a tree using the Clustal program. There are other programs that work equally well, such as Drawtree. That’s at least what it said there.
This is a picture of the unrooted tree. The next steps in my project involve making another phylogenetic tree using the same program and comparing it to the amino acid trees. I think it’s better if I use the Biology Workbench so that all the data is uniform. Plus, I wasn’t really a fan of the trees that the HIV people made; they’re too confusing.
1. Pertaining to the recent HIV project, there are some things that I learned after the presentation which I feel would have greatly helped it. There was one big thing in particular that I feel I confused the class about. The strains of HIV from the first visit are the reference point. All of the HIV strains from later visits that show up within the first column are exactly the same as that reference. This would explain why there are never two strains in the first column from any one visit. When two strains are parallel to one another and further into the phylogenetic tree, they are not identical; they simply have the same number of nucleotide differences from the parent strain.
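The point about two strains sitting at the same depth in the tree without being identical can be shown with a tiny Python sketch. The sequences here are made up, and counting mismatches position by position (a Hamming distance) is my simplification of how distance from the parent strain is measured.

```python
def count_differences(seq_a, seq_b):
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


if __name__ == "__main__":
    parent = "ACGTACGT"    # made-up parent strain
    strain_a = "TCGTACGT"  # one change, at the first position
    strain_b = "ACGTACGA"  # one change, at the last position

    # Both strains are the same distance from the parent...
    print(count_differences(parent, strain_a))
    print(count_differences(parent, strain_b))
    # ...but they are not identical to each other.
    print(count_differences(strain_a, strain_b))
```

Both toy strains are one change away from the parent, yet they differ from each other at two positions, which is why they can line up in the same column of the tree without being the same virus.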
2. A. Swami is an incredibly hard program to use. I was having a lot of trouble getting anything on it to work, but I will run through what I did. I first visited the BioQUEST website, copied 5 of the HIV sequences from it, and input them into Swami as data.
Data-http://www.bioquest.org/bedrock/problem_spaces/hiv/hiv_data/aa/Subject1_aa.txt
Swami- I first selected the toolkit and clicked on ClustalW. This apparently lets you perform a sequence alignment, although I don’t understand how it works. The next thing I did was select the toolkit again, go to the phylogenetic tree section, and choose the tool that lets you infer a phylogenetic tree. I had uploaded the data into a file and put it into the data section. That still didn’t work. I have the data all ready, I think, but I’m completely uncertain as to how to get the tree to finally show up.
3. HIV ideas!!!
Idea 1-Reconstruct all of the phylogenetic trees that had been made, using the amino acid sequences as opposed to the nucleotide sequences, and see how different they are.
Idea 2-Compare the HIV amino acid sequences with those of other viruses and determine which amino acids are unique to HIV and make it a deadly virus.
Idea 3-Using the amino acid sequences, determine every protein found within HIV and the structures of those proteins using the database.
Idea 4-Using NEBcutter and other programs, design an experiment that lets you isolate the fragments of HIV that are essential to the immune dysfunction it causes.
A. Today in Molecular Biology lab, we had to use BLAST to find a primer sequence that matched up with the bitter-tasting gene. The allele for tasting bitterness is dominant. The lab involved collecting our own cheek cells and determining our exact genotype using PCR amplification and gel electrophoresis. As it turns out, according to the gel, I am a non-taster, but according to my phenotype, I am a taster. On the NCBI website, we used the primer sequence that was used for PCR to find the gene of interest. We then aligned the sequence of the tasting gene with the sequence of the non-tasting gene and used that to determine the exact difference between the two. After that we compared the sequence of the human tasting gene to the sequences of similar genes from a variety of monkeys and determined their differences.
B. The one big thing I learned that I didn’t know before was how to use primer sequences to find and isolate specific fragments of DNA. This was also the first time I aligned more than one gene in BLAST. I also have a better understanding of all the numbers at the bottom of BLAST searches.
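Since the tasting allele is dominant, the phenotype expected from a gel genotype can be written out as a short Python sketch. The allele symbols "T" (taster) and "t" (non-taster) and the function name are made up for illustration, and the sketch assumes simple dominance at a single gene.

```python
def expected_phenotype(genotype, dominant="T"):
    """Predict taster status from a two-allele genotype under simple dominance."""
    if len(genotype) != 2:
        raise ValueError("genotype must have exactly two alleles, e.g. 'Tt'")
    # One copy of the dominant allele is enough to produce the taster phenotype.
    return "taster" if dominant in genotype else "non-taster"


if __name__ == "__main__":
    for genotype in ("TT", "Tt", "tt"):
        print(genotype, expected_phenotype(genotype))
```

Under this model only the "tt" genotype predicts a non-taster, so a gel that reads as non-taster alongside a taster phenotype is a mismatch worth rechecking.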
Patterns of HIV1 evolution based on how rapidly T cell levels decline
Published October 12th, 2009 in informatics
This figure represents the phylogenetic tree of subject nine from the experiment.
“subject 9 (Fig. 3) all viral isolates through visit 4 arise from a point close to the viruses in visit 1. The viruses at visit 5 are spread throughout the evolutionary tree. Viruses at visits 6 and 7 extend from the branch carrying clones 1 and 9 from visit 5. However, the viruses at visit 8 emerge from visit 6 clones, not from a visit 7 clone. This pattern of limited progression along a single branch followed by return to strains closely related to those present at an earlier visit is observed in nine other subjects (subjects 1, 3, 5–8, 10, 13, and 14). Phylogenetic trees randomly selected from the remaining 14 subjects show this pattern (Fig. 4). In these 10 subjects host factors that influence the evolution of early viruses appear potent enough to select against the few clones that predominate at any given visit.”
The tree shows that all the mutations arose from those strains found from the first visit. Mutations along a single branch seem to be a common pattern, indicating that the results from the experiment are very accurate.
B. I found the Craig Venter videos to be incredibly impressive. The bacteria that could take 3 million rads of radiation were amazing, especially in light of all the DNA repair mechanisms I am learning about in molecular biology. The way Craig Venter described it, almost all of the chromosomes were ripped into pieces. The process to repair just one of those pieces, particularly a double-stranded break, is a complicated one. It is hard to imagine how fast the processes in this bacterium must work to let it repair its DNA so quickly and efficiently. I also found the variety of bacteria and viruses that his team found in the ocean water to be incredibly impressive. It was something like 50,000 new species every 200 miles? And that’s not even taking into account the temperature changes between the different depths of the water, or the areas further north or south.
