A CGIAR Generation Challenge Programme project: Cultivating Genetic Diversity for the Resource Poor
QGene
Example, T. Fulton, Cornell University
QGene is a program designed by Clare Nelson. It is a very user-friendly program for QTL analysis and NIL selection, and it has many other functions. It is especially geared toward Advanced Backcross populations, for which most current software is either not applicable or not user-friendly for plant breeders. QGene requires a data file containing all the marker data and scored trait data, and a map file listing all the markers scored. In this exercise we will go through an example of QTL detection and NIL selection. This population was derived from a cross between S. lycopersicum (formerly Lycopersicon esculentum) TA209, a commercial variety of tomato, and S. pennellii LA1657, a wild relative. The marker analysis was done on the BC2 population, and BC3 plots of 30 plants each were used in field trials in Israel and at 2 locations in California. The results of this study were published in:
Frary A, Fulton TM, Tanksley SD (2004) Advanced backcross QTL analysis of a Lycopersicon esculentum x L. pennellii cross and possible orthologs in the Solanaceae. Theor Appl Genet 108(3):485-96

QGene requires 2 data files for QTL analysis. One is a file containing all the marker and trait data for this population. Download here (as an Excel file; save as text to open in QGene) or see a smaller example here. Down the left column are all the plants in this population; across the top are the marker and trait names. The marker data is the mapping data for this population, with the same scoring conventions as in our mapping file last week. The trait abbreviations contain the trait measured, such as yield, fruit weight, fruit color, etc., and the location where it was measured. These plants were scored at several locations: Israel, and both Heinz and Sunseeds in California. Therefore "isfw" is fruit weight scored in Israel.

An important thing to note about the marker scores vs. the trait scores is the meaning of "0". In marker scoring, a 0 means missing data. However, in phenotypic scoring, there can be an actual score of 0, as opposed to missing data. For example, a yield of 0 is different from a yield score that's missing (it means that the plant didn't yield anything, which is bad!). Thus in the trait data you will see that missing data is scored as "-" to differentiate it from scores of 0.

The second file QGene requires is a file with mapping information. Download the Word file here or see an example here. This file tells QGene which markers were mapped, on which chromosome they are located, and how far apart they are. This is mainly for getting a more accurate picture of where your putative QTL are located.

Open QGene by double-clicking on it.
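The two missing-data conventions described above are easy to mix up when the data file is prepared outside QGene. The following Python sketch shows one way to keep them apart. It is not QGene's own parser: the tab-delimited layout, the plant IDs, the marker codes 1 and 2, and all the values are assumptions made up for illustration.

```python
import csv
import io

# Hypothetical tab-delimited data in the layout described above:
# plants down the left column, marker and trait names across the top.
# Plant IDs, marker codes (1, 2), and trait values are invented.
SAMPLE = (
    "plant\tCT79\tCT99\tis.brix\tis.totyld\n"
    "P1\t1\t2\t5.1\t0\n"
    "P2\t0\t1\t-\t3.2\n"
)

# Assumption: we know in advance which columns hold marker scores.
MARKER_COLUMNS = {"CT79", "CT99"}


def parse_cell(column, text):
    """Return None for missing data, using the convention for that column type."""
    text = text.strip()
    if column in MARKER_COLUMNS:
        # Marker scores: 0 means missing data.
        return None if text == "0" else int(text)
    # Trait scores: "-" means missing data; 0 is a real measurement.
    return None if text == "-" else float(text)


def read_population(handle):
    """Read the table into {plant: {column: value or None}}."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = {}
    for record in reader:
        plant = record.pop("plant")
        rows[plant] = {col: parse_cell(col, val) for col, val in record.items()}
    return rows


population = read_population(io.StringIO(SAMPLE))
print(population["P1"]["is.totyld"])  # a true zero yield, kept as a number
print(population["P2"]["CT79"])       # a missing marker score
print(population["P2"]["is.brix"])    # a missing trait score
```

Keeping missing values as `None` rather than 0 means a later average or regression cannot silently treat a missing score as a real zero.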
Under the "File" menu, choose Open Population, open LA1657.bc2.qdata.txt, then qpenn.map.txt. On the right side of your screen is a list of all the traits scored. Under the View menu is "help", which you can access at any time for descriptions of functions. Also under the View menu is "Map", which will show the map of all the markers scored (close this by clicking in the small box at the top left corner).

You can select any trait (or many at once) and look at their scoring patterns by going to the View menu and selecting Trait Histogram. Note that some traits are scored as continuous data and some are scored as ordinal data (for example, color is scored as 1-5, where 5 is the most red). This is a good way to check your data for outliers that might be mis-scored data.

We can also check for skewed segregation by selecting Chi-square segregation under the Analyses menu.
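To make the idea of a chi-square segregation test concrete, here is a minimal Python sketch for a single marker. It is not QGene's implementation. It assumes an unselected BC2 population, in which a quarter of the plants are expected to be heterozygous at any locus and the rest homozygous for the recurrent parent allele; the plant counts in the example are invented.

```python
import math


def chi_square_segregation(n_homozygous, n_heterozygous, expected_het=0.25):
    """Goodness-of-fit test (1 degree of freedom) for one marker.

    expected_het=0.25 assumes an unselected BC2 population; change it
    for other generations or if the population was selected.
    """
    total = n_homozygous + n_heterozygous
    exp_het = total * expected_het
    exp_hom = total - exp_het
    chi2 = ((n_heterozygous - exp_het) ** 2 / exp_het
            + (n_homozygous - exp_hom) ** 2 / exp_hom)
    # Upper-tail probability of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts: 60 of 120 plants heterozygous, where about 30 are expected.
chi2, p_value = chi_square_segregation(n_homozygous=60, n_heterozygous=60)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3g}")
```

A large chi-square value with an excess of heterozygotes corresponds to a marker skewed toward the wild parent allele.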
In the chi-square results we can see that the end of chromosome 12 has a region highly skewed toward the wild parent allele, something important to keep in mind for future work such as breaking linkage drag and developing Near-Isogenic Lines.

Let's try some of the other functions. In the trait list at the right, select is.brix (soluble solids scored in Israel) by double-clicking on it; it should now have a • in front of it (or be red, in the newer version). (Note: soluble solids are a very important characteristic to the tomato processing industry; more solids means a thicker tomato paste or sauce. Brix is the unit used to measure the solids.)

Under the "Analyses" menu, choose Single Point regression. The graph depicts the chromosomes as if laid end to end, showing the effects of the 2 parental alleles on the selected trait, here brix (recall that TA209 is the esculentum recurrent parent, while LA1657 is the wild species donor parent). Here's the graph you should see:
In this graph, anytime the line goes up it shows a positive effect from the wild parent allele; down is a negative effect from the wild allele. So the spike on chromosome 12 is where the allele from LA1657, the wild parent, is associated with an increase in brix. If we click on the spike, it highlights CT79 as the marker associated with this QTL.

We can also choose Save stats, which saves the data for us in a table-format file. Choosing the option 'sort by statistic' is a good idea because the table will contain the regression data for this trait using every single marker in your file, but you really only want to look at the ones that are the most significant. Now you can open the table with Excel.

An example of this table is at the end of your handout, showing the marker, chromosome, source of the increase (an increase in soluble solids in this case), R-squared (the proportion of the trait variation explained by the marker), P value, AA = the average value for plants homozygous for the esculentum alleles, Aa = the average value for plants heterozygous for the esculentum and wild alleles, etc.

You can see that there are quite a few markers where the wild allele is associated with an increase in brix, with CT79 being the most significant, but they are all located together on chromosome 12. In this case we wouldn't know whether there are actually several linked QTLs or whether it's one big QTL unless we did some further work, such as fine mapping.
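The columns of the saved statistics table can be illustrated with a small Python sketch of a single-marker regression. This is a simplified illustration under stated assumptions, not QGene's code: genotypes are coded 0 for AA and 1 for Aa (a hypothetical coding), missing values are `None`, and the brix values are invented. It computes the AA mean, the Aa mean, and R-squared, but no P value.

```python
def single_point_regression(genotypes, phenotypes):
    """Regress a trait on one marker coded 0 (AA) or 1 (Aa).

    Plants with a missing genotype or phenotype (None) are skipped.
    Returns (mean of AA plants, mean of Aa plants, R-squared).
    Assumes both genotype classes are present and the trait varies.
    """
    pairs = [(g, y) for g, y in zip(genotypes, phenotypes)
             if g is not None and y is not None]
    aa = [y for g, y in pairs if g == 0]
    het = [y for g, y in pairs if g == 1]
    mean_aa = sum(aa) / len(aa)
    mean_het = sum(het) / len(het)
    grand_mean = sum(y for _, y in pairs) / len(pairs)
    ss_total = sum((y - grand_mean) ** 2 for _, y in pairs)
    # With a two-class predictor, the fitted values are the class means,
    # so the model sum of squares is the between-class sum of squares.
    ss_model = (len(aa) * (mean_aa - grand_mean) ** 2
                + len(het) * (mean_het - grand_mean) ** 2)
    return mean_aa, mean_het, ss_model / ss_total


# Invented scores for six plants at one marker; the last genotype is missing.
marker = [0, 0, 0, 1, 1, None]
brix = [4.0, 5.0, 4.5, 6.0, 6.5, 5.0]
mean_aa, mean_het, r_squared = single_point_regression(marker, brix)
source = "wild allele" if mean_het > mean_aa else "esculentum allele"
print(f"AA = {mean_aa:.2f}, Aa = {mean_het:.2f}, "
      f"R-squared = {r_squared:.2f}, increase from {source}")
```

Running the same function over every marker and sorting by R-squared gives a table of the kind described above, with closely linked markers tending to show similar values.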
In the Frary et al. paper, you can see that we created a table and a map of all significant QTLs. This was done simply by doing a regression for every trait, just as we did for brix, and compiling all the significant QTLs into a table and placing them at their map locations. In this case we also compared the QTLs found in this population with those found in other tomato populations derived from other wild species, so that we could see which were in common and which might be new (previously undiscovered) QTLs.
Now we might be interested in creating a Near-Isogenic Line (NIL): a plant completely identical to TA209 except for this small piece of chromosome 12, which would contain DNA introgressed from the wild parent. First we would want to make sure that this same region of chromosome 12 is not also associated with any negative effects.
For example, an increase in brix is often associated with a decrease in fruit size or yield, which could be very damaging commercially. Looking back at the trait list, we can select several traits at once by holding down the Command key and clicking on is.brix, is.firm, is.totyld, and HW.brix. All the selected traits should be highlighted (not marked with •). Under the View menu, select Multiplot. (The thresholds etc. can be changed if desired, but it's not necessary today.)
Select chromosome 12, then OK. You should see a picture of chromosome 12 as shown below, with the effects of each of the traits you selected. Red shows the most significant effects and gray shows no significant effect, and a yellow line next to the chromosome shows where the LA1657 allele is associated with an increased effect (increased yield, increased firmness, etc.). Therefore an area of the chromosome showing significant effects but WITHOUT a yellow line next to it signifies DECREASED effects.
Here we can see that while CT79 has good effects on brix in both locations, it is also associated with a decrease in total yield. There was no effect on firmness associated with this region. We would have to decide whether to go ahead and create a NIL with the CT79 region, hoping to get rid of the associated decrease in yield by creating additional recombinations or by combining it with a QTL for increased yield that might overcome this negative effect, or whether to select a different region for making a new NIL. For example, it looks like a NIL selected for CT99 might still have good brix, with a less significant effect but also no other negative effects.

Suppose we decide to create a NIL that has only the CT99 region and not the rest of the wild parent chromosome. We can use the "NIL extraction" function to help us choose the best plant in the population from which to make a NIL. Close the Multiplot window. Under the View menu, choose Map. Click on CT99 on chromosome 12, finding the spot that highlights that marker. Hold down the SHIFT key and click on CT99 again. A colored scale will pop up with LA1657 at one end (red) and TA209 at the other. Point the cursor at the LA1657 end of the scale and let go; CT99 should now be red on the map. We have just selected this as the region we want to be introgressed from LA1657.

Under the Analyses menu, select NIL extraction. Select the "backcross then self" option, as we're getting our seed directly from the selfed fruit from the BC plots in the field. Click OK, and a table of NIL candidates should appear, like this:
This table shows the number of progeny we need to grow from each line to have a 95% or 90% chance of being able to select a "clean" NIL (containing the region we want, and no other regions). So from plant 96T844-5, the best candidate, we should grow 742 plants to have a 95% chance of getting a clean NIL.
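The logic behind a "number of progeny needed" figure can be sketched in Python. If each progeny plant independently has probability p of being a clean NIL, the chance of finding at least one among n plants is 1 - (1 - p)^n, so n must be at least ln(1 - confidence) / ln(1 - p). The sketch below is not how QGene computes its table and is not expected to reproduce the 742 above: the model for p (selfing a plant that is heterozygous at the target and at some number of unlinked unwanted segments) is a deliberate simplification that ignores linkage and recombination within segments.

```python
import math


def progeny_needed(p_clean, confidence=0.95):
    """Smallest n with P(at least one clean NIL among n plants) >= confidence."""
    if not 0.0 < p_clean < 1.0:
        raise ValueError("p_clean must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_clean))


def p_clean_after_selfing(n_unwanted_segments):
    """Simplified, assumed model of one selfed progeny being a clean NIL.

    After selfing a heterozygote, 1/4 of progeny are homozygous wild at the
    target and 1/4 are homozygous recurrent at each unwanted segment.
    Segments are treated as unlinked and inherited as single units.
    """
    return 0.25 ** (n_unwanted_segments + 1)


# Hypothetical candidate plant carrying two unwanted heterozygous segments.
p = p_clean_after_selfing(2)
print(progeny_needed(p, confidence=0.95))
print(progeny_needed(p, confidence=0.90))
```

The required number grows quickly with each extra unwanted segment, which is why the candidate with the fewest extra wild regions is ranked best.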
We can also look at the graphical genotype of this plant to see its exact genetic makeup. Under the View menu, select Marker Genotypes. Under Choose line, select Show which line. Click in the box at the right corner of this window and type 96T844-5, then OK. (Under the File menu, you can choose Deselect all to get rid of the red color if you want.)

The shaded-in parts of the chromosomes are the regions homozygous for esculentum alleles, the hatched parts are heterozygous regions, and the white parts mean missing data. You can see that this line contains CT99 as we wanted, but also a big region that we don't want; we will have to select against that region using marker-assisted selection.
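Selecting against the unwanted region amounts to a filter over the marker genotypes of the progeny. The Python sketch below is only an illustration: the genotype codes ("W" for homozygous wild, "H" for heterozygous, "E" for homozygous esculentum, `None` for missing), the plant names, and the marker MARKER_X are all hypothetical, and treating a missing score as a failure is a conservative choice made here, not something taken from QGene.

```python
def is_clean_nil(genotype, target_markers):
    """True if the plant is homozygous wild at every target marker
    and homozygous esculentum at every other scored marker."""
    for marker, call in genotype.items():
        if call is None:
            # Conservative choice: a missing score cannot confirm the plant.
            return False
        wanted = "W" if marker in target_markers else "E"
        if call != wanted:
            return False
    return True


def select_clean_nils(progeny, target_markers):
    """Return the names of progeny that pass the clean-NIL filter."""
    return [name for name, genotype in progeny.items()
            if is_clean_nil(genotype, target_markers)]


# Invented progeny genotypes; MARKER_X stands for any marker elsewhere.
progeny = {
    "plant_a": {"CT99": "W", "CT79": "E", "MARKER_X": "E"},
    "plant_b": {"CT99": "W", "CT79": "H", "MARKER_X": "E"},
    "plant_c": {"CT99": "H", "CT79": "E", "MARKER_X": "E"},
    "plant_d": {"CT99": "W", "CT79": "E", "MARKER_X": None},
}
print(select_clean_nils(progeny, target_markers={"CT99"}))
```

In practice only markers in and around the unwanted regions need to be scored on the progeny, since the rest of the genome is already fixed for esculentum alleles in the parent plant.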