Multivariate asymptotic analysis of set partitions: focus on blocks of fixed size

Using the saddle point method and multiseries expansions, we obtain, from the exponential formula and Cauchy's integral formula, asymptotic results for the number $T(n, m, k)$ of partitions of $n$ labeled objects with $m$ blocks of fixed size $k$. We analyze the central and non-central regions. In the region $m = n/k - n^{\alpha}$, $1 > \alpha > 1/2$, we analyze the dependence of $T(n, m, k)$ on $\alpha$. This paper fits within the framework of Analytic Combinatorics.


Introduction
Parameters of set partitions have long been a topic of investigation; see, for example, Graham et al. [7], Knuth [9], Mansour [13], and Stanley [15]. Set partitions continue to attract interest; see Chern et al. [1,2]. During a talk given by P. Diaconis at the Conference in honour of Svante Janson's 60th birthday, our attention was drawn to a classical parameter of set partitions: the number $T(n, m, k)$ of partitions of $n$ labeled objects with $m$ blocks of fixed size $k$. The value of $k$ is fixed throughout this paper, so we suppress it and simply write this number as $T(m, n)$.
Our goal is to analyze the asymptotic growth of $T(m, n)$ in several regimes. Let $\Pi(n)$ be the set of partitions of $n$ labeled objects, with $B_n := |\Pi(n)|$ denoting the $n$th Bell number.
We define the random variable $J_n$ as the number of blocks of size $k$ in a set partition chosen uniformly at random from the class of all set partitions of $n$ labeled objects. Then the distribution of $J_n$ is
$$P(J_n = m) = \frac{T(m, n)}{B_n}.$$
The celebrated exponential formula (or polymer expansion) is given as follows. Let
$$g(z) = \sum_{n \ge 1} a_n \frac{z^n}{n!}.$$
Then
$$\exp(g(z)) = \sum_{n=0}^{\infty} b_n(a_1, \ldots, a_n) \frac{z^n}{n!},$$
where the $n$th Bell polynomial reads as
$$b_n(a_1, \ldots, a_n) = \sum_{\lambda \in \Pi(n)} \prod_{k=1}^{n} a_k^{X_k(\lambda)},$$
and $X_k(\lambda)$ is the number of blocks of size $k$ in the set partition $\lambda$.
Hence, we use $f_2(z) := \sum_{n \ge 0} \sum_{m \ge 0} T(m, n)\, y^m \frac{z^n}{n!}$ to denote the analogous bivariate generating function (GF), which is exponential in $z$ and ordinary in $y$; the variable $y$ is used here to "mark" the blocks of size $k$. See [4, p. 156] for an introduction to the marking technique.
It follows immediately that
$$f_2(z) = \exp\left(e^z - 1 + (y - 1)\frac{z^k}{k!}\right).$$
Now we define
$$f_3(z) := \sum_{n \ge 0} T(m, n) \frac{z^n}{n!}.$$
To find $f_3(z)$, we take the $m$th derivative of $f_2(z)$ with respect to $y$, then divide by $m!$ and evaluate at $y = 0$, so that
$$f_3(z) = \frac{1}{m!}\left(\frac{z^k}{k!}\right)^m \exp\left(e^z - 1 - \frac{z^k}{k!}\right).$$
In this paper, we will use multiseries expansions: multiseries are in effect power series (in which the powers may be non-integral but must tend to infinity) and the variables are elements of a scale. The scale is a set of variables of increasing order. The series is computed in terms of the variable of maximum order, the coefficients of which are given in terms of the next-to-maximum order, etc. This is more precise than mixing different terms. This technique was used in the analysis of Stirling numbers (of the first and second kind) and Eulerian numbers in Louchard [10,11,12].
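As a small numerical companion to the marking technique, the following Python sketch expands the bivariate GF coefficient by coefficient. It does not use a formula quoted from the text: it assumes the standard recurrence obtained by conditioning on the size of the block containing the element $n$, with a block of size $k$ contributing a factor $y$. The name `block_count_polynomials` is hypothetical.

```python
from math import comb


def block_count_polynomials(n_max, k):
    """Return polys with polys[n][m] = T(m, n): the number of partitions of
    an n-set having exactly m blocks of size k (the coefficient of y^m)."""
    polys = [[1]]  # the empty set has one partition, with no block of size k
    for n in range(1, n_max + 1):
        coeffs = [0] * (n // k + 1)
        # j is the size of the block that contains the element n
        for j in range(1, n + 1):
            weight = comb(n - 1, j - 1)  # choice of the other j - 1 elements
            shift = 1 if j == k else 0   # a block of size k is marked by y
            for m, c in enumerate(polys[n - j]):
                coeffs[m + shift] += weight * c
        polys.append(coeffs)
    return polys
```

For instance, with $k = 2$ the row for $n = 4$ is `[6, 6, 3]`, and each row sums to the corresponding Bell number, as setting $y = 1$ requires.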
Our paper is organized as follows: in Section 2, we consider the central region, where we re-derive, with more precision, the asymptotic Gaussian property of $J_n$. In Section 3, we analyze the large deviation $m = n/k - n^{\alpha}$, with $1 > \alpha > 1/2$. The appendix provides a brief justification of some integration procedures.

Central region
We use several saddle points. To ease the reading of this paper, we summarize the different values we need. Throughout, "root" means "root of smallest modulus." In Section 2, $\rho$ is the root of $\rho e^{\rho} = n$ and $\tilde\rho_{\ell,k}$ is the root of $\tilde\rho_{\ell,k} \exp(\tilde\rho_{\ell,k}) = n - \ell k$. In Section 3, $\rho$ is the root of $\rho e^{\rho} = k n^{\alpha}$.

The moments
In this section, we first derive asymptotics for the Bell numbers and then proceed to the analysis of the moments of $J_n$. For the Bell numbers, we could use Salvy and Shackell [14] or Chern et al. [1]. But the first paper uses $n/\rho$ in the scale and the second paper uses the solution of $u e^{u} = n + 1$. We prefer to use $n$ and $\rho$, the root (of smallest modulus) of $\rho e^{\rho} = n$. The scale of the multiseries is $n \gg \rho^k \gg \rho$ (we assume here $k \ge 2$).
We define $m^{\underline{\ell}} := m(m-1)\cdots(m-\ell+1)$ as the $\ell$th falling power of $m$. Now we use this notation to study the $\ell$th falling power of $J_n$. We have [...] In particular, for $\ell = 1$ and $\ell = 2$, respectively, we have [...] Therefore, the first and second falling moments of $J_n$ are [...]
Now we need an asymptotic expansion of $B_n$. Set [...] Let the saddle point be given by $\rho$ and let $\Omega$ denote the circle $\rho e^{i\theta}$. We compute [...] By Cauchy's theorem, it follows that [...] where [...] See Good [6] for a neat description of this technique. We have $\rho = W(n)$, where $W$ is Lambert's W function (see Corless et al. [3]). We use the principal branch, which is analytic at $0$. Let us set $L := \ln(n)$. We have the well-known asymptotic expressions [...] All expressions involving $\rho$ in the sequel can of course be expanded into powers of $L$, but this would lead to huge formulae.
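To make the role of Lambert's W function concrete, here is a minimal Python sketch that solves $\rho e^{\rho} = n$ numerically and compares the result with the first three terms $L - \ln L + \ln L / L$ of the standard large-argument expansion of the principal branch (see Corless et al. [3]). Newton's method and the function names are choices made for this sketch only, and the three-term truncation is an illustration, not the expansion used in the computations of this paper.

```python
import math


def lambert_w(x, tol=1e-14, max_iter=100):
    """Principal branch W(x) for x > 0, by Newton's method on w * exp(w) = x."""
    if x <= 0.0:
        raise ValueError("this sketch only handles x > 0")
    # log(1 + x) lies to the right of the root, where w * exp(w) is convex,
    # so the Newton iterates decrease monotonically to W(x).
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * max(1.0, abs(w)):
            break
    return w


def rho_leading_terms(n):
    """First three terms L - ln L + ln L / L of the expansion of W(n)."""
    L = math.log(n)
    return L - math.log(L) + math.log(L) / L
```

For $n = 1000$ the two values agree to about two decimal places, which illustrates why the expansions are kept in terms of $\rho$ rather than $L$.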
Now we turn to the integral. We have for instance [...] We now proceed as in Flajolet and Sedgewick [4, ch. VIII]. Let us choose a splitting value $\theta_0$ such that $\kappa_2 \theta_0^2 \to \infty$ and $\kappa_3 \theta_0^3 \to 0$ as $n \to \infty$. For instance, we can use $\theta_0 = n^{-5/12}$. We must prove that the integral [...] is such that $|K_n|$ is exponentially small. This is done in the Appendix. Now we use the classical trick of setting [...] Computing $\theta$ as a series in $u$, this gives, by Lagrange's inversion, [...] with, for instance, [...] This expansion is valid in the dominant integration domain $|u| \le \sqrt{n}\,\theta_0 / a_1 = \sqrt{1 + \rho}\; n^{1/12}$.
[...] This extension of the range is justified as in Flajolet and Sedgewick [4, ch. VIII]. (From now on, we only provide a few terms in our expansions, but of course we use more terms in our computations. Also, all $O$ terms in the sequel may depend on $k$, $\rho$.) [...] where [...] Now we turn to $B_{n-\ell k}$. We compute [...] We will use $\tilde\rho_{\ell,k}$ as the root of $\tilde\rho_{\ell,k} \exp(\tilde\rho_{\ell,k}) = n - \ell k$, i.e., $\tilde\rho_{\ell,k} = W(n - \ell k)$. This gives [...] Now we specialize to the case $\ell = 1$ to get the mean.
[...] and $H_{2k} := 1 - \frac{1}{24}$ [...] We need $B_n$. This leads to [...] where [...] and [...] and where $H_5$ is computed as follows: [...] The mean $M$ of $J_n$ is given by [...] Similarly (we omit the details), [...] Hence the variance $\sigma^2$ of $J_n$ is given by [...] More generally, the $\ell$th falling moment is given by [...]
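The moments can be cross-checked exactly for moderate $n$. The sketch below assumes the standard identity $E\big[J_n^{\underline{\ell}}\big] = \frac{n!}{(k!)^{\ell}(n-\ell k)!}\,\frac{B_{n-\ell k}}{B_n}$, which follows from differentiating the bivariate GF $\ell$ times with respect to $y$ at $y = 1$. It then compares the exact mean and variance with the first-order value $\rho^k/k!$. All function names are hypothetical, and the comparison is only a sanity check, not the multiseries expansion itself.

```python
from fractions import Fraction
from math import exp, factorial, log1p


def bell_numbers(n_max):
    """Exact Bell numbers B_0..B_{n_max}, computed with the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
        bells.append(row[0])
    return bells


def falling_moment(n, k, ell, bells):
    """E[J_n (J_n - 1) ... (J_n - ell + 1)] as an exact fraction."""
    if ell * k > n:
        return Fraction(0)
    # ways to pick an ordered list of ell disjoint k-subsets, divided out below
    ways = factorial(n) // (factorial(k) ** ell * factorial(n - ell * k))
    return Fraction(ways * bells[n - ell * k], bells[n])


def mean_and_variance(n, k):
    """Exact mean and variance of J_n from the first two falling moments."""
    bells = bell_numbers(n)
    mean = falling_moment(n, k, 1, bells)
    second = falling_moment(n, k, 2, bells)
    return mean, second + mean - mean * mean


def saddle_point(n):
    """Root of rho * exp(rho) = n by Newton's method (fixed iteration count)."""
    w = log1p(n)
    for _ in range(60):
        ew = exp(w)
        w -= (w * ew - n) / (ew * (w + 1.0))
    return w
```

As a small exact case, `mean_and_variance(4, 2)` returns `(Fraction(4, 5), Fraction(14, 25))`, which matches a direct enumeration of the 15 partitions of a 4-set.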

Distribution of $J_n$
Fristedt [5] proved that the distribution of $J_n$ is asymptotically Gaussian. This can also be obtained with Hwang's techniques; see [8]. Here we re-derive this property with more precision. The corresponding GF $f_5(z)$ is derived from $f_3(z)$: [...] The saddle point equation is now [...] It is easily seen that we have [...] When we write $\rho_k \sim \rho + \delta$, this is simply a statement that $\rho_k - \rho$ can be written as a Taylor series in powers of $n^{-1}$, and then $\alpha_1$ is just the first coefficient in this Taylor series. Solving (7) gives, for instance, [...] Also, we have the classical result that $\ln(m!) = -m + m \ln(m) + \frac{1}{2}\ln(2\pi m) + \frac{1}{12m} + \cdots$. We must analyze [...] First of all, we must compute the dominant term of $\frac{n!\, f_5(\rho_k)}{e\, B_n\, \rho_k^{n}}$. We have [...] with $H_2$ given by (6) and, with (5), [...] We compute [...] The dominant term of $T_0$ is computed as [...] Set now the next term of $T_2$ as $T_3 := T_0 - T_2$. We have [...] and [...] Later on, we will need [...] We now turn to the coefficient of $1/n$ in $T_1$.
To compute the integral, we obtain, for instance, [...] and the integral leads to [...] We compute now [...], and $T_{92} := \frac{1}{2}$ [...] Combining the integral $I$ with (8) gives the local limit theorem: the asymptotic distribution of $J_n$ is given by [...] Of course, more terms can be mechanically computed, but the expressions become much more intricate.
To check the quality of our asymptotics, we have chosen $k = 2$ and $n = 1000$. The range of interest for $m$ is given by $m \in \left[\frac{\rho^k}{k!} - 2\sqrt{\frac{\rho^k}{k!}},\ \frac{\rho^k}{k!} + 2\sqrt{\frac{\rho^k}{k!}}\right] = (6, 21)$. To numerically compute $T(m, n)$, we use (3), which gives [...] Figure 1 shows $T(m, n)$ (circle), the asymptotics $e^{-x^2/2}$ (line; of course we use here $\rho^k/k!$ as mean and variance) and Eq. (9) (box).
Figure 2 gives the quotient of Eq. (9) and $T(m, n)$. Figure 3 gives the quotient of Eq. (9) and $T(m, n)$ (box) and the quotient of $e^{-x^2/2}$ and $T(m, n)$ (line). Note that the same technique would lead to the joint distribution of $J_n(k_1)$, $J_n(k_2)$ for two (or more) different values of $k$.
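A numerical check of this kind can be reproduced on a small scale. The following Python sketch computes the exact distribution of $J_n$ and a Gaussian density to compare it with. It assumes a direct combinatorial decomposition (choose the $m$ blocks of size $k$, then partition the remaining elements with no block of size $k$) rather than Eq. (3), and it takes the mean and variance as parameters, so either the exact moments or $\rho^k/k!$ can be supplied. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, exp, factorial, pi, sqrt


def partitions_avoiding_size(n_max, k):
    """counts[j] = number of partitions of a j-set with no block of size k."""
    counts = [1]
    for j in range(1, n_max + 1):
        total = 0
        for s in range(1, j + 1):  # s = size of the block containing element j
            if s != k:
                total += comb(j - 1, s - 1) * counts[j - s]
        counts.append(total)
    return counts


def block_count_distribution(n, k):
    """Exact probabilities P(J_n = m) for m = 0..n//k, as fractions."""
    avoid = partitions_avoiding_size(n, k)
    counts = []
    for m in range(n // k + 1):
        # unordered choice of m disjoint blocks of size k among n elements
        ways = factorial(n) // (
            factorial(k) ** m * factorial(m) * factorial(n - m * k)
        )
        counts.append(ways * avoid[n - m * k])
    total = sum(counts)  # this total is the Bell number B_n
    return [Fraction(c, total) for c in counts]


def gaussian_density(m, mean, variance):
    """Normal density with the given mean and variance, evaluated at m."""
    x = (m - mean) / sqrt(variance)
    return exp(-x * x / 2.0) / sqrt(2.0 * pi * variance)
```

For example, `block_count_distribution(4, 2)` gives the probabilities 2/5, 2/5, 1/5 for $m = 0, 1, 2$. For larger $n$ the exact probabilities can be plotted against `gaussian_density` in the same way as the circles and the line in Figure 1.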

Large deviation $m = n/k - n^{\alpha}$, $1 > \alpha > 1/2$
We have a maximum of $n/k$ blocks of size $k$ in a partition. So we use [...] This can be written as [...] In this section, we choose $\alpha \ge \frac{1}{2}$, but the other case is similarly analyzed. The saddle point equation, from (7), becomes [...] The solution of $\rho e^{\rho} = \tilde n$ is asymptotically given by [...] As previously, we have [...] with, here, [...] First of all, we have [...] We have, successively, [...] and [...] Computing the integral, we have, for instance, [...] and the integral leads to [...] and [...] Finally, we obtain the following asymptotic result: the asymptotic expression of $T(m, n)$ for large deviation is given by [...]
Let us analyze the importance of our terms. We have two sets: the set $A$ of dominant terms, which stay in the exponent, and the set $B$ of small terms, leading to a coefficient of type $(1 + \Delta)$, with $\Delta$ small. The property of each term may depend on $\alpha$. For instance, in $R_{10}$, the first term leads to an $O(nL)$ term, the $\varepsilon$ term leads to an $n^{\alpha}$ term, and the $\varepsilon^2$ term leads to an $n^{2\alpha - 1}$ term, which are all in $A$. The $\varepsilon^3$ term leads to an $n^{3\alpha - 2}$ term, which is in $A$ if $\alpha \ge 2/3$ and in $B$ otherwise. In $R_{11}$, the $\varepsilon$ term leads to a term in $A$. In $R_{12}$, all terms are in $B$. In $R_2$, the $\tilde n$ term is in $A$; in $R_{20}$, the first term is in $A$ and all other terms are in $B$; in $R_{21}$, all terms are in $B$.
We finally mention that our non-central range is not sacred: other types of ranges can be analyzed with similar methods.
To check the quality of our asymptotics, we first chose $k = 2$, $\alpha = 0.52$, a range $n \in [10000, 70000]$, and $m = n/k - n^{\alpha}$. Figure 4 shows the quotient Eq. (10)/$T(m, n)$. The fluctuations are due to the fact that $m$ is an integer, so the value of $\alpha$ we need is actually the root of $m - (n/k - n^{\alpha}) = 0$. For $\alpha = 0.65$, we choose $n \in [100, 2300]$; this leads to Figure 5. The quality of the asymptotics decreases for $\alpha \ge 0.7$, where more terms would be necessary.
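The adjustment of $\alpha$ to an integer $m$ can be made explicit. The short Python sketch below picks an integer $m$ for a nominal $\alpha$ and then solves $m - (n/k - n^{\alpha}) = 0$ for the effective value of $\alpha$. Rounding down is an assumption of this sketch (the rounding rule is not specified above), and both function names are hypothetical.

```python
import math


def largest_m(n, k, alpha):
    """Integer m just below n/k - n**alpha (rounding down is an assumption)."""
    return math.floor(n / k - n ** alpha)


def effective_alpha(n, k, m):
    """Solve m = n/k - n**alpha for alpha, given an integer m < n/k."""
    gap = n / k - m
    if gap <= 0:
        raise ValueError("need m < n/k")
    return math.log(gap) / math.log(n)
```

With $n = 10000$, $k = 2$ and nominal $\alpha = 0.52$, this gives $m = 4879$ and an effective $\alpha$ close to $0.5207$. The small shift varies with $n$, which is consistent with the fluctuations seen in Figure 4.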

Appendix
Let us first notice that $n i \theta$ does not contribute to the analysis. Next, we have $\Re\left(e^{\rho e^{i\theta}}\right) = e^{\rho \cos(\theta)} \cos(\rho \sin(\theta))$, which has a dominant peak at $0$. For the Gaussian case, we use $f_5(z)$. We must analyze $\Re\left(e^{\rho e^{i\theta}} - \frac{(\rho e^{i\theta})^k}{k!}\right) = e^{\rho \cos(\theta)} \cos(\rho \sin(\theta)) - \frac{\rho^k}{k!} \cos(k\theta)$, which has a dominant peak at $0$. The non-central region leads to the same analysis.
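The peak at $\theta = 0$ is easy to observe numerically. The Python sketch below evaluates $e^{\rho e^{i\theta}}$ on the circle of radius $\rho$, checks the closed form of its real part, and exposes its modulus $e^{\rho \cos(\theta)}$, which is largest at $\theta = 0$. The function names and the sample value of $\rho$ are choices made for this illustration, and the sketch is not a substitute for the bound on $|K_n|$.

```python
import cmath
import math


def real_part_closed_form(rho, theta):
    """exp(rho cos(theta)) * cos(rho sin(theta))."""
    return math.exp(rho * math.cos(theta)) * math.cos(rho * math.sin(theta))


def modulus_on_circle(rho, theta):
    """|exp(rho e^{i theta})|, which equals exp(rho cos(theta))."""
    return abs(cmath.exp(rho * cmath.exp(1j * theta)))


def real_part_direct(rho, theta):
    """Real part of exp(rho e^{i theta}), computed with complex arithmetic."""
    return cmath.exp(rho * cmath.exp(1j * theta)).real
```

For $\rho$ near $5.25$ (the saddle point for $n = 1000$), the modulus at $\theta = 0.5$ is already about half of its value at $\theta = 0$, and it keeps decreasing up to $\theta = \pi$.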