Path-properties of the tree-valued Fleming–Viot process

Andrej Depperschmidt, Andreas Greven, Peter Pfaffelhuber
2013 Electronic Journal of Probability  
Abstract

We consider the tree-valued Fleming-Viot process, $(X_t)_{t \ge 0}$, with mutation and selection as studied in Depperschmidt, Greven and Pfaffelhuber (2012). This process models the stochastic evolution of the genealogies and (allelic) types under resampling, mutation and selection in the population currently alive in the limit of infinitely large populations. Genealogies and types are described by (isometry classes of)
marked metric measure spaces. The long-time limit of the neutral tree-valued Fleming-Viot dynamics is an equilibrium given via the marked metric measure space associated with the Kingman coalescent. In the present paper we pursue two closely linked goals. First, we show that two well-known properties of the neutral Fleming-Viot genealogies at fixed time $t$ arising from the properties of the dual, namely the Kingman coalescent, hold for the whole path. These properties are related to the geometry of the family tree close to its leaves. In particular we consider the number and the size of subfamilies whose individuals are not further than $\varepsilon$ apart in the limit $\varepsilon \to 0$. Second, we answer two open questions about the sample paths of the tree-valued Fleming-Viot process. We show that for all $t > 0$ almost surely the marked metric measure space $X_t$ has no atoms and admits a mark function. The latter property means that all individuals in the tree-valued Fleming-Viot process can uniquely be assigned a type. All main results are proven for the neutral case and then carried over to selective cases via Girsanov's formula giving absolute continuity.

[...] will play an important role.

Remark 2.2 (Interpretation of equivalent marked metric measure spaces).
1. In our presentation, only ultra-metric spaces $(U, r)$ will appear. The reason is that we only consider stochastic processes whose state at time $t$ describes the genealogy of the population alive at time $t$, which makes $r$ an ultra-metric.
2. There are several reasons why we consider equivalence classes of marked metric spaces instead of the marked metric spaces themselves. The most important is that we view a genealogical tree as a metric space on its set of leaves.
Since in population genetic models the individuals are regarded as exchangeable (at least among individuals carrying the same allelic type), reordering of leaves does not change (in this view) the tree.

In order to construct a stochastic process with càdlàg paths and state space $\mathbb{U}_A$, we have to introduce a topology. To this end, we need to introduce test functions with domain $\mathbb{U}_A$.

Definition 2.3 (Polynomials). [...] where $\mu^{\otimes\mathbb{N}}$ is the infinite product measure, i.e. the law of a sequence sampled independently with sampling measure $\mu$.

Let us remark that functions of the form (2.4) are actually monomials. However, products and sums of such monomials are again monomials, and hence we may in fact speak of polynomials; cf. the example below.

Remark 2.4 (Interpretation of polynomials). Assume that $\phi$ only depends on the first $\binom{n}{2}$ coordinates of $(r(u_i, u_j))_{1 \le i < j}$ and the first $n$ coordinates of $(a_i)_{i \ge 1}$. Then we view a function of the form (2.4) as taking a sample of size $n$ according to $\mu$ from the population, evaluating $\phi$ on this sample, and then averaging over samples drawn according to $\mu$.

Example 2.5 (Some functions of the form (2.4)). Some functions of the form (2.4) will appear frequently in this paper, for example

$$ r \mapsto \phi(r) := \psi^{12}_\lambda(r) := e^{-\lambda r_{12}}, \qquad \Psi^{12}_\lambda(U, r, \mu) := \langle \mu^{\otimes\mathbb{N}}, \psi^{12}_\lambda \rangle = \int ((\pi_1)_*\mu)^{\otimes 2}(du_1, du_2)\, e^{-\lambda r(u_1, u_2)}. \tag{2.5} $$

This function arises from sampling two leaves, $u_1$ and $u_2$, from the genealogy $(U, r)$ according to $(\pi_1)_*\mu$ and averaging the test function $e^{-\lambda r(u_1, u_2)}$ over this sample. Then $(\Psi^{12}_\lambda)^2$ is again of the form (2.4) and

$$ (\Psi^{12}_\lambda)^2(U, r, \mu) = \int ((\pi_1)_*\mu)^{\otimes 4}(du_1, \dots, du_4)\, e^{-\lambda (r(u_1, u_2) + r(u_3, u_4))}. \tag{2.6} $$

Another function that will be used and which also depends on types is given by

$$ \Psi^{12}_\lambda(U, r, \mu) := \int \mu^{\otimes 2}(du_1, du_2, da_1, da_2)\, 1_{\{a_1 = a_2\}}\, e^{-\lambda r(u_1, u_2)}. \tag{2.7} $$

In this function $u_1$ and $u_2$ contribute to the integral only if their types, $a_1$ and $a_2$, agree.
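To make Example 2.5 concrete, the following minimal sketch evaluates the three functions on a finite marked ultra-metric space, where the integrals reduce to finite sums. The population (four leaves with weights `weights`, types `types` and distance matrix `dist`) is a hypothetical toy example, not taken from the paper, and the sketch assumes that every leaf carries exactly one type, i.e. that a mark function exists.

```python
import itertools
import math

# Hypothetical toy population (an assumption for illustration only):
# four leaves, their masses under the sampling measure, their types,
# and an ultra-metric distance matrix.
weights = [0.4, 0.3, 0.2, 0.1]
types = ["A", "A", "B", "B"]
dist = [
    [0.0, 0.5, 2.0, 2.0],
    [0.5, 0.0, 2.0, 2.0],
    [2.0, 2.0, 0.0, 1.0],
    [2.0, 2.0, 1.0, 0.0],
]
n = len(weights)


def is_ultrametric(r):
    """Check r(i,k) <= max(r(i,j), r(j,k)) for all triples."""
    idx = range(len(r))
    return all(
        r[i][k] <= max(r[i][j], r[j][k]) + 1e-12
        for i, j, k in itertools.product(idx, repeat=3)
    )


def psi12(lam):
    """Two-sample average of exp(-lam * r(u1, u2)), as in (2.5)."""
    return sum(
        weights[i] * weights[j] * math.exp(-lam * dist[i][j])
        for i, j in itertools.product(range(n), repeat=2)
    )


def psi12_squared_as_four_sample(lam):
    """Four-sample average of exp(-lam * (r(u1,u2) + r(u3,u4))), as in (2.6)."""
    return sum(
        weights[i] * weights[j] * weights[k] * weights[l]
        * math.exp(-lam * (dist[i][j] + dist[k][l]))
        for i, j, k, l in itertools.product(range(n), repeat=4)
    )


def psi12_same_type(lam):
    """Two-sample average restricted to pairs of equal type, as in (2.7)."""
    return sum(
        weights[i] * weights[j] * math.exp(-lam * dist[i][j])
        for i, j in itertools.product(range(n), repeat=2)
        if types[i] == types[j]
    )
```

Comparing `psi12(lam) ** 2` with `psi12_squared_as_four_sample(lam)` illustrates why the square of a monomial is again of the form (2.4): the two pairs are drawn independently, so the four-fold sum factorizes.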
Since we use polynomials as the domain of the generator for the tree-valued Fleming-Viot process, we need to restrict this class to smooth functions.

Definition 2.6 (Smooth polynomials). We denote by

$$ \Pi^1 := \big\{ \Phi^\phi \text{ as in (2.4)} : \phi \text{ bounded, measurable and for all } a \in A^{\mathbb{N}},\ \phi(\cdot, a) \in C^1_b\big(\mathbb{R}^{\binom{\mathbb{N}}{2}}\big) \big\} \tag{2.8} $$

the set of smooth (in the first coordinate) polynomials. Furthermore we denote by $\Pi^1_n$ the subset of $\Pi^1$ consisting of all $\Phi^\phi$ for which $\phi(r, a)$ depends at most on the first $\binom{n}{2}$ coordinates of $r$ and the first $n$ coordinates of $a$, and which hence have degree at most $n$.

Definition 2.7 (Marked Gromov-weak topology). The marked Gromov-weak topology on $\mathbb{U}_A$ is the coarsest topology such that all $\Phi^\phi \in \Pi^1$ with (in both variables) continuous $\phi$ are continuous.

The following is from [4, Theorems 2 and 5].

Proposition 2.8 (Some topological facts about $\mathbb{U}_A$). The following properties hold:
1. The space $\mathbb{U}_A$ equipped with the marked Gromov-weak topology is Polish.
2. The set $\Pi^1$ is a convergence determining algebra of functions, i.e. for random $\mathbb{U}_A$-valued variables $X, X_1, X_2, \dots$ we have

$$ X_n \Rightarrow X \text{ as } n \to \infty \quad \Longleftrightarrow \quad \mathbf{E}[\Phi(X_n)] \to \mathbf{E}[\Phi(X)] \text{ as } n \to \infty \text{ for all } \Phi \in \Pi^1. \tag{2.9} $$

Construction of the tree-valued FV-process

The tree-valued Fleming-Viot process will be defined via a well-posed martingale problem. Let us briefly recall the concept of a martingale problem.

Definition 2.9 (Martingale problem). Let $E$ be a Polish space, $P_0 \in \mathcal{M}_1(E)$, $\mathcal{F} \subseteq \mathcal{B}(E)$ and $\Omega$ a linear operator on $\mathcal{B}(E)$ with domain $\mathcal{F}$. The law $P$ of an $E$-valued stochastic process $X = (X_t)_{t \ge 0}$ is a solution of the $(P_0, \Omega, \mathcal{F})$-martingale problem if $X_0$ has distribution $P_0$, $X$ has paths in the space $D_E([0, \infty))$, almost surely, and for all $F \in \mathcal{F}$, the process $\big(F(X_t) - \int_0^t \Omega F(X_s)\, ds\big)_{t \ge 0}$ is a $P$-martingale with respect to the canonical filtration.

Remark 2.10 (Interpretation of generator terms). The growth, resampling, mutation and selection generator terms are interpreted as follows:
1. Growth: The distance of any pair of individuals is given by the time to the most recent common ancestor (MRCA).
When time passes this distance grows at speed 1. Note that in [14] and [5] the corresponding distance was twice the time to the MRCA. The reason for this change is that it simplifies some terms in the computations that we will see later.
2. Resampling: The term $\langle \mu^{\otimes\mathbb{N}}, \phi \circ \theta_{k,\ell} - \phi \rangle$ describes the action of an event where an offspring of individual $k$ replaces individual $\ell$ in the sample corresponding to the polynomial $\Phi^\phi$. This term is analogous to the measure-valued case [see e.g. 9, eq. (3.21)], but acts on both the genealogy and the types.
3. Mutation: It is important to note that mutation only affects types, but not genealogical distances. Hence, the mutation operator agrees with the measure-valued case [see e.g. 9, eq. (3.16)]. Note that here we consider only jump operators $B$.
4. Selection: This term is best understood when considering a finite population. Consider for simplicity the case of additive selection (i.e. (2.22) holds), which in particular covers haploid models. Here, the offspring of an individual of type $a$ replaces some randomly chosen individual at rate $\alpha\chi(a)$ due to selection. In the large population limit, we only consider a sample of $n$ individuals, and this sample changes only if some offspring of an individual outside the sample, e.g. the $(n + 1)$st individual by exchangeability, replaces an individual within the sample, the $k$th say, due to selection. After this selection event, the fitness of the $k$th individual is $\chi(a)$, as is also seen from the generator term. In the case of selection acting on diploids, the situation is similar, but one has to build diploids from haploids first and then apply the fitness function.
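The growth and resampling mechanisms of Remark 2.10 can be illustrated in a finite population. The following is a minimal sketch of a neutral Moran-type model, not the construction used in the paper: it tracks the matrix of pairwise distances (time to the MRCA, growing at speed 1) and the types, and it omits mutation and selection. The function names, the initial condition (all distances zero, all types distinct) and the rate convention (total resampling rate `gamma * n * (n - 1) / 2`) are assumptions made for illustration.

```python
import random


def grow(dist, dt):
    """Growth: every distance between distinct individuals increases by dt."""
    n = len(dist)
    for i in range(n):
        for j in range(n):
            if i != j:
                dist[i][j] += dt


def resample(dist, types, k, l):
    """Resampling: an offspring of individual k replaces individual l."""
    n = len(dist)
    for j in range(n):
        if j != l:
            # The newborn inherits the distances of its parent k.
            dist[l][j] = dist[k][j]
            dist[j][l] = dist[j][k]
    # Parent and offspring are at distance zero at the birth time.
    dist[l][k] = dist[k][l] = 0.0
    types[l] = types[k]


def simulate(n, t_end, gamma=1.0, seed=0):
    """Run growth and resampling up to time t_end; return (dist, types)."""
    rng = random.Random(seed)
    dist = [[0.0] * n for _ in range(n)]
    types = list(range(n))  # assumed start: every individual has its own type
    total_rate = gamma * n * (n - 1) / 2  # assumed rate convention
    t = 0.0
    while True:
        dt = rng.expovariate(total_rate)
        if t + dt > t_end:
            grow(dist, t_end - t)
            return dist, types
        grow(dist, dt)
        k, l = rng.sample(range(n), 2)  # ordered pair of distinct individuals
        resample(dist, types, k, l)
        t += dt
```

Both operations preserve the ultra-metric property: growth adds the same constant to all off-diagonal entries, and resampling copies a row and column of the matrix. Because no mutation is modeled, two individuals share a type exactly when they descend from the same individual alive at time 0.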
doi:10.1214/ejp.v18-2514 fatcat:6hyub7mklfg6tjh7ufngkjwgea