Still, better than NIMHD!
Still, better than NIMHD!
Minimally this is a straightforward data pre-processing step that would probably improve the performance of all methods.As it is, I think the results are misleading because you suggest that bmws and other methods cannot estimate large selection coefficients which is not true.
I agree you shouldn't condition but I think if you see a trajectory like this you should remove the region in red from the inference. As you say, it's uninformative – the selection coefficient could be anything – and you do exclude the equally uninformative blue region.
Are the selected variants fixing in the longer time series? I think if you include long stretches where the allele frequency is 1 then bmws will underestimate the selection coefficient due to the shrinkage on s. But I think in practice you would exclude those observations.
Converse is people using the term "experiment" to describe some kind of data analysis.
And Mary-Claire
Well I suppose specifically for a developmental trait like this I might think that the environmental assumptions of the twin study design would be violated - i.e. twins share more of the relevant (in utero) environment than sibs, leading to an overestimate of h2.
I don't think this really makes sense though; it's like how anthropologists sometimes talk about things like "potential height" which is the height you would reach if you had perfect nutrition. But it's not a phenotype.
Indeed, rates of right-handedness were increased by [an environment] in which people were forced to write right-handed.
I mean if someone is forced to the point where they use their right hand all the time, are they not right-handed?
Yeah, those are always inflated
Also I don't believe the heritability is 25%. UKB SNP heritability is less than 2%.
It's still environment, even if it's not practically measurable! Also, handedness can modified [by changing the environment].
Not sure the individual variation is the issue; it looks like they see it in most samples but I wonder whether there could be an off-target effect in the PCR step. Would want to check whether the putative mutant reads are enriched for mismatches.
The only bit of the ARG you own is your terminal branch.
We'll have to fix that!
I don't know if it's better. They can certainly phase better, and this suggests they have access to UKB data: www.ukbiobank.ac.uk/projects/hig.... Plus they have in-house non-EUR panels (e.g. www.nature.com/articles/s42...)
Yes seems so, at least a couple of months ago I was able to download imputed data.
True, I mean I guess that [people imputing their own genomes] is exactly what they don't want people doing. Also 23andMe gives you imputed data, possibly better than TopMed imputation anyway?
Maybe because they don't want people trying to split up their samples over many small jobs. If you just have one, I guess just add in 19 samples from 1000 Genomes?
He chose.... poorly
I find Figure 7 more informative about the route than Figure 1
The Mathieson Lab at the University of Pennsylvania is hiring a computational postdoc in machine learning and evolutionary biology. Apply with a CV and references: smathi@sas.upenn.edu. More info: https://saramathieson.github.io/lab #postdoc
I think it makes very little difference - the high LD regions are small (and don't obviously have different patterns of cross-individual differences), so won't have much effect on the genome-wide differences.
In practice the main use of LD pruning seems to be removing high ("long-range") LD regions that otherwise show up as individual PCs (e.g. see Fig1 etc here: academic.oup.com/bioinformati...). In some sense those are still "real" structure in the ARG, but it's kind of undesirable for some analyses.
I think you could probably bias it one way or another depending on how you do the LD pruning (LD in which population?). I think of it as like you're changing the relative weighting of different parts of the ARG.
So |EUR-YRI| can be < |YRI-X| for some other African population X, due to structure within Africa. |EUR-YRI| can be < |YRI-YRI| for some PCs but if you sum the distances across enough PCs, weighting them appropriately, it will always be larger.
I see, but to the original point that's not how PCA works. I don't think |EUR-YRI| is ever going to be less than |YRI-YRI| in PC space – again YRI-YRI can coalesce before the split. Of course if you only look at the first 2 (or first N) PCs, it can be, but that just depends on sampling.
They are not measuring pairwise differences. If I understand their figure 4A, if one individual is 0/0 and another is 1/0 then they count that as S=1. But If the second individual is 1/1 then they also count that as S=1 (not S=2). So they undercount differences in individuals with low heterozygosity
Another way to think about it: the variants that fix in A during the bottleneck are equally likely to fix in the matching or non-matching state as a randomly chosen individual from B, so the expected number of differences does not change.