Identifying long cycles in finite alternating and symmetric groups acting on subsets

Let $H$ be a permutation group on a set $\Lambda$, which is permutationally isomorphic to a finite alternating or symmetric group $A_n$ or $S_n$ acting on the $k$-element subsets of points from $\{1,\ldots,n\}$, for some arbitrary but fixed $k$. Suppose moreover that no isomorphism with this action is known. We show that key elements of $H$ needed to construct such an isomorphism $\varphi$, such as those whose image under $\varphi$ is an $n$-cycle or $(n-1)$-cycle, can be recognised with high probability by the lengths of just four of their cycles in $\Lambda$.


Introduction
The third and fourth authors predicted in [10] that, for a permutation group H on a set Λ which is permutationally isomorphic to a symmetric group $S_n$ acting on the k-element subsets of points from {1, . . . , n} (that is, in its k-set action), for some arbitrary but fixed k, it should be possible to recognise an element of H corresponding to an n-cycle in $S_n$ by the lengths of just four of its cycles in Λ. The purpose of this paper is to prove this result.
Theorem 1.1. Let H be a permutation group on a set Λ, which is permutationally isomorphic, via an unknown isomorphism $\varphi$, to a finite symmetric group $S_n$ in its k-set action, for some k. Let h be a uniformly distributed random element of H and let $\lambda_1, \ldots, \lambda_4$ be independent, uniformly distributed random points of Λ. Then there exist positive constants $N_0$ and c such that, for $n \ge N_0$,
$$\mathrm{Prob}\bigl(\varphi(h) \text{ is an } n\text{-cycle} \;\big|\; \text{the } h\text{-cycle containing } \lambda_i \text{ has length } n, \text{ for } i = 1, \ldots, 4\bigr) > 1 - \frac{c}{n}.$$
Subset actions of $S_n$ and the alternating group $A_n$ play a crucial role in algorithms for permutation groups. They are examples of 'large-base' actions. Most primitive permutation groups are 'small-base' and very efficient algorithms are available to compute with them (for a detailed definition see [12, p. 51]). However these algorithms become prohibitively expensive when applied to large-base groups and, therefore, alternative means of handling large-base groups are essential (see [12, Chapter 10] for a discussion on currently available algorithms for this case). The large-base primitive permutation groups all contain in their socles alternating groups with associated subset actions. Hence finding efficient algorithms for these actions is important. This article provides key results which give a theoretical foundation for such an algorithmic application in a forthcoming paper.
An algorithm to recognise A n or S n in its k-set action finds certain key elements, namely elements containing an m-cycle for large m, as described in Table 1. Our Theorem 1.1 follows from a more general theorem, Theorem 3.1, which addresses the problem of finding all of these elements. Once these key elements have been found, a permutational isomorphism from H on Λ to A n or S n on k-sets can be constructed using 'Black-box' methods described in Sections 4 and 5, especially Lemmas 4.1, 4.3 and 5.5 of [2]. Alternative methods, focussing in particular on the k-set action, are developed in [6].
1.1. Context of our results. In a seminal collection of papers, Erdős and Turán initiated the study of the asymptotic behaviour of the proportions of various kinds of elements in permutation groups. For example, they showed [4,5] that, for n large enough, most elements of the symmetric group $S_n$ of degree n have order $n^{(1/2+o(1))\log n}$. In the same vein, Warlimont [13] proved that the conditional probability that a random element g of $S_n$ is an n-cycle, given that $g^n = 1$, is $1 - O(n^{-1})$.
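For very small n the conditional probability in Warlimont's result can be computed exactly by exhaustive enumeration. The following Python sketch is only an illustration and is not taken from [13]; the function names are hypothetical, and permutations are represented as tuples of images on the points 0, ..., n − 1.

```python
from fractions import Fraction
from itertools import permutations


def cycle_lengths(perm):
    """Return the sorted cycle lengths of a permutation given as a tuple of images."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if not seen[start]:
            length, point = 0, start
            while not seen[point]:
                seen[point] = True
                point = perm[point]
                length += 1
            lengths.append(length)
    return sorted(lengths)


def n_cycle_given_nth_power_trivial(n):
    """Exact Prob(g is an n-cycle | g^n = 1) for a uniform random g in S_n."""
    n_cycles = 0
    solutions = 0
    for perm in permutations(range(n)):
        lengths = cycle_lengths(perm)
        # g^n = 1 exactly when every cycle length divides n
        if all(n % length == 0 for length in lengths):
            solutions += 1
            if lengths == [n]:
                n_cycles += 1
    return Fraction(n_cycles, solutions)
```

For a prime n the only solutions of $g^n = 1$ are the identity and the (n − 1)! n-cycles, so the function returns (n − 1)!/((n − 1)! + 1), which is already close to 1 for n = 7.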
Applied algorithmically, Warlimont's result is used to conclude, from the fact that the nth power of a 'hidden' permutation $g \in S_n$ is the identity, that g is almost certainly an n-cycle. Finding an n-cycle is a key step in many algorithms that 'constructively recognise' $S_n$, so this is valuable. However, in some situations a group isomorphic to $S_n$ might be given in such a way that it is too expensive to apply Warlimont's result. The results of [10] provide a basis for extending this to a situation where we know only that g has an orbit of length n in some action. An extension of this nature to k-set actions of $S_n$ and $A_n$ is the subject of this paper. We refine and improve significantly the main result of [10]. For example, we employ a similar division of the elements of $S_n$ into several families according to properties of points which lie in cycles of lengths dividing n. However, examining this subdivision alone is not sufficient to achieve the results in this paper. We need to study the probability that several k-element subsets of {1, . . . , n} have exactly n distinct images under g for g an element in one of the families. Moreover, in our algorithmic applications we also required analogous results for elements of $S_n$ and $A_n$ containing m-cycles, for m ≥ n − 6.
Line   G      n (mod 6)   m       r   cycle type   ρ(G, n, m)
...
6      A_n    2 or 4      n − 3   3   3-cycle      1
7      A_n    3 or 5      n − 4   3   3-cycle      3/4
8      A_n    0           n − 5   3   3-cycle      7/20
9      A_n    1           n − 6   3   3-cycle      9/40
Table 1. Groups and types of elements
In Section 2 we briefly describe the algorithmic application, and in particular we explain the meaning of the parameters r and ρ in Table 1. In Section 3 we introduce the notation which we shall use throughout the paper and give the precise statement of the main result (Theorem 3.1). The proof of Theorem 3.1 (and hence of Theorem 1.1) is given in Section 4. In particular we exhibit an explicit value for the constant c of Theorem 3.1(a) and Theorem 1.1. We present some background material in Sections 5 and 6. Sections 7-11 contain the various parts which are pulled together in Section 4 for the proof of Theorem 3.1.

Algorithmic Application
The results in this paper are motivated by algorithmic applications in [6] and [7]. In these applications, H is a permutation group acting on a set Λ of $\binom{n}{k}$ points. We wish to test whether H is permutation isomorphic to $G = A_n$ or $G = S_n$ acting on the set $\Omega_k$ of k-element subsets of Ω = {1, . . . , n}, that is to say, whether there is a group isomorphism $\varphi : H \to G$ and a bijection $f : \Lambda \to \Omega_k$ such that, for each h ∈ H and λ ∈ Λ, $(\lambda^h)f = ((\lambda)f)^{h\varphi}$.
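To make the k-set action concrete, the following Python sketch builds the permutation induced on the k-element subsets by a permutation of the points. It is only an illustration under stated assumptions: the points are labelled 0, ..., n − 1 rather than 1, ..., n, permutations are tuples of images, and all function names are hypothetical.

```python
from itertools import combinations


def act_on_subset(perm, subset):
    """Image of a subset of points under a permutation given as a tuple of images."""
    return frozenset(perm[x] for x in subset)


def induced_permutation(perm, k):
    """Permutation induced by `perm` on the k-subsets, indexed in lexicographic order."""
    n = len(perm)
    subsets = [frozenset(c) for c in combinations(range(n), k)]
    index = {s: i for i, s in enumerate(subsets)}
    return tuple(index[act_on_subset(perm, s)] for s in subsets)


def compose(g, h):
    """Product gh in right-action convention: apply g first, then h."""
    return tuple(h[g[x]] for x in range(len(g)))
```

The map sending a permutation to `induced_permutation(perm, k)` respects products, which is the homomorphism property required of a permutation isomorphism; the induced permutation acts on $\binom{n}{k}$ points.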
We say that an element h ∈ H corresponds to an element g ∈ G if the permutation isomorphism ϕ maps h to g. The algorithms construct a 'nice generating' set for H of size 2. In the case where H is permutation isomorphic to S n in its action on Ω k , this generating set consists of elements that, in the natural representation of S n on n points, correspond to an n-cycle and a 2-cycle interchanging two consecutive points of the n-cycle. In the case where H is permutation isomorphic to A n in its action on Ω k the nice generating set consists of elements that in A n correspond to an n-cycle or (n − 1)-cycle, and to a 3-cycle.
We wish to find these elements by selecting independent, uniformly distributed random elements from the group H. However, the proportion of 2-cycles in S n or 3-cycles in A n is too small to allow us to find such elements directly by random selection. Therefore, we seek elements in H which correspond to permutations containing a 2-cycle or a 3-cycle together with one long cycle of length m, say, where m is at least n − 6 and m is coprime to 2 or 3, respectively. The algorithms in [6] and [7] seek elements h ∈ H which correspond to the kinds of elements g listed in Table 1, where H is permutation isomorphic to G = S n or G = A n , with G, n as in the second and third columns. The fourth column, labelled m, lists the length of the m-cycle which the element g contains. The fifth column, labelled r, lists an integer between 1 and 3. Ultimately we wish to find an element h in H which corresponds to an element in G with cycle type as recorded in the sixth column. This element is constructed as a power of the element h.
The first element in the nice generating set for H corresponds to an element satisfying the conditions of Line 1, 4, or 5, namely it corresponds to an n-cycle or an (n − 1)-cycle. The second nice generator corresponds to a 2-cycle if $G = S_n$ and is constructed from an element h ∈ H which corresponds to g as in Line 2 or 3. If $G = A_n$, the second nice generator corresponds to a 3-cycle and is constructed from h ∈ H corresponding to an element g as in Line 6, 7, 8 or 9. The last column, labelled ρ(G, n, m), records a rational number such that the proportion of elements h of H which correspond to elements of G containing an m-cycle and with order dividing rm is $\rho(G,n,m)/m$ (see (1)). The group H acts on a set Λ of size $|\Lambda| = \binom{n}{k}$, and in the context of the algorithm m, n and k are so large that it is 'too expensive' to compute the full cycle structure of elements of H in their action on Λ. Instead we compute the cycle lengths of elements h ∈ H on a handful of randomly chosen points of Λ, that is to say, we 'trace' these points under the action of h.
In computer experiments in GAP [3], we discovered that if H is permutation isomorphic to $G = S_n$ or $A_n$ on $\Omega_k$ then, for m, r as in one of the lines of Table 1, most elements of H which produced cycles of lengths a multiple of m and dividing rm, when we traced each of four or five independent random points of Λ, corresponded to elements of G containing an m-cycle. This computer experiment is formalised in procedures FindMCycle and TraceCycle. Our experimental observation turns out to be true in general, and is proved in Theorem 3.1. For clarity of exposition the proofs of Theorem 3.1 are written in terms of the action of G on $\Omega_k$.
For n, m and r as in one of the lines of Table 1, define $N(n, m)$ to be the set of all $g \in S_n$ that contain an m-cycle and $N_{\mathrm{good}}(G, n, m)$ to be the set of all $g \in N(n, m) \cap G$ for which o(g) divides rm. Note that, for given G, n, m, only one of the lines of Table 1 is satisfied, and hence r is determined by G, n, m. We define ρ(G, n, m) to be the rational number satisfying
$$\frac{|N_{\mathrm{good}}(G, n, m)|}{|G|} = \frac{\rho(G, n, m)}{m}. \qquad (1)$$
As an example of how to interpret this information, consider Line 3 of Table 1. The proportion of elements g of $S_n$ containing an (n − 3)-cycle is $\frac{1}{n-3}$, and 2/3 of these elements contain also a 2-cycle or three 1-cycles on the remaining 3 points. Thus the proportion of elements of $S_n$ containing an (n − 3)-cycle and having order dividing 2(n − 3) is $\frac{2}{3(n-3)}$. In order to construct a 2-cycle (the entry in column 6 for this line), we raise the element g to the (n − 3)rd power producing $x = g^{n-3}$. Since n − 3 is odd, the element x is the identity if g has three fixed points, a 2-cycle if g contains a 2-cycle, or possibly a 3-cycle if g contains a 3-cycle and 3 does not divide n. Thus three quarters of the elements of $N_{\mathrm{good}}(S_n, n, n − 3)$ yield a 2-cycle by powering.
The algorithm FindMCycle can therefore easily be incorporated into a Monte Carlo algorithm to construct a transposition in this case: by repeating FindMCycle a number of times we will with high probability construct a transposition by powering the output of FindMCycle. The other Lines have a similar interpretation for ρ(G, n, m).
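The proportions quoted in the Line 3 example can be checked by brute force in a small case. The Python sketch below is an illustration only: it takes n = 8 (so m = n − 3 = 5 is odd and exceeds n/2), enumerates all of $S_8$, and does not check whether n = 8 satisfies the congruence condition attached to Line 3 of Table 1. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def cycle_lengths(perm):
    """Return the sorted cycle lengths of a permutation given as a tuple of images."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if not seen[start]:
            length, point = 0, start
            while not seen[point]:
                seen[point] = True
                point = perm[point]
                length += 1
            lengths.append(length)
    return sorted(lengths)


def power(perm, exponent):
    """Return perm raised to a non-negative integer power."""
    result = tuple(range(len(perm)))
    for _ in range(exponent):
        result = tuple(perm[x] for x in result)
    return result


def line3_statistics(n):
    """Proportions for m = n - 3, r = 2, obtained by enumerating S_n."""
    m = n - 3
    total = contains_m = good = transpositions = 0
    transposition_type = [1] * (n - 2) + [2]
    for perm in permutations(range(n)):
        total += 1
        lengths = cycle_lengths(perm)
        if m not in lengths:
            continue
        contains_m += 1
        if any((2 * m) % length != 0 for length in lengths):
            continue  # the order of perm does not divide 2m
        good += 1
        if cycle_lengths(power(perm, m)) == transposition_type:
            transpositions += 1
    return (Fraction(contains_m, total),      # proportion containing an m-cycle
            Fraction(good, total),            # proportion lying in N_good
            Fraction(transpositions, good))   # share of N_good powering to a 2-cycle
```

For n = 8 the three values are 1/5, 2/15 = 2/(3 · 5) and 3/4, in agreement with the discussion of Line 3.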
We now describe the two algorithms. Algorithm 1 assumes that we have a function RandomGrpElt which takes as input a generating set Y for a group H and returns independent, uniformly distributed random elements of H. Algorithm 2 assumes that we have a function RandomPoint which takes as input a finite set Λ and returns independent, uniformly distributed random points of Λ. Note that Algorithm 1 calls Algorithm 2 and that we assume that Algorithm 2 has access to the variables of Algorithm 1.
Algorithm 1: FindMCycle. Input: G, n, m and r as in one of the lines of Table 1; a permutation group H with a generating set Y acting on a finite set Λ; a real number ε with 0 < ε < 1; an integer M with M ≥ 4. Result: an element h ∈ H or fail. This algorithm inspects up to $O(n \log(\varepsilon^{-1}))$ uniformly distributed independent random elements of H to find one which has orbits of length a multiple of m and dividing rm on each of M randomly selected points of Λ. If such an h ∈ H is found it returns h, otherwise it returns fail.
Algorithm 2: TraceCycle. This algorithm tests whether the input permutation g ∈ H has orbits of length a multiple of m and dividing rm on M randomly selected points of Λ. If this is the case it returns true, otherwise it returns false. For each of the M random points λ it executes
if $|\lambda^{\langle g \rangle}| \ne r_0 m$ for every $r_0 \mid r$ then return false;
and, once all M points have passed this test,
return true;
Remark: (a) The number M of random points of Λ tested in the algorithm TraceCycle is often a bounded constant (as, for example, in Theorem 3.1), but in our analysis we allow it to be as large as O(n), see (2).
(b) The algorithm TraceCycle performs O(n) image computations to check whether $|\lambda^{\langle g \rangle}| = r_0 m$ for some $r_0 \mid r$, for each random point λ. Thus if $\xi_{rp}$, $\xi_{rge}$, $\nu_{im}$ are upper bounds for the costs of producing a random point using RandomPoint, producing a random group element using RandomGrpElt, and computing the image of a point of Λ under an element of H, respectively, then the cost of FindMCycle is $O\bigl(n \log(\varepsilon^{-1})\,(\xi_{rge} + M(\xi_{rp} + n\,\nu_{im}))\bigr)$. This cost is modest when compared with the cost $\binom{n}{k}\nu_{im}$ of computing the product of two permutations of Λ (especially when k = O(n)).
Our main result Theorem 3.1 shows that these simple and inexpensive procedures provide an effective way to find and identify elements of S n and A n containing m-cycles from their actions on k-element subsets.
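The two procedures can be sketched in Python as follows. This is a minimal illustration under several assumptions and not the implementation of [6] or [7]: the group H is replaced by $S_n$ itself acting on k-subsets of {0, ..., n − 1}, random group elements are drawn by shuffling (standing in for RandomGrpElt), the number of rounds is taken to be ⌈5n log(2/ε)⌉ as in Section 4, and the names `cycle_length_on_subset`, `trace_cycle` and `find_m_cycle` are hypothetical. To keep the test deterministic, `trace_cycle` receives the subsets to be traced as an argument.

```python
import math
import random


def cycle_length_on_subset(g, subset, limit):
    """Length of the g-cycle containing `subset`, or None if it exceeds `limit`."""
    start = frozenset(subset)
    current = frozenset(g[x] for x in start)
    length = 1
    while current != start:
        if length >= limit:
            return None  # stop tracing: the cycle is longer than rm
        current = frozenset(g[x] for x in current)
        length += 1
    return length


def trace_cycle(g, subsets, m, r):
    """True iff every traced subset lies in a cycle of length r0*m with r0 dividing r."""
    for subset in subsets:
        length = cycle_length_on_subset(g, subset, r * m)
        if length is None or length % m != 0 or (r * m) % length != 0:
            return False
    return True


def find_m_cycle(n, k, m, r, eps, M=4, rng=random):
    """Monte Carlo search; returns a permutation passing trace_cycle, or None for 'fail'."""
    rounds = math.ceil(5 * n * math.log(2 / eps))
    points = list(range(n))
    for _ in range(rounds):
        g = points[:]
        rng.shuffle(g)  # uniform random element of S_n (assumption: H = S_n)
        subsets = [rng.sample(points, k) for _ in range(M)]
        if trace_cycle(tuple(g), subsets, m, r):
            return tuple(g)
    return None
```

Each call of `cycle_length_on_subset` computes at most rm images of a k-subset, which mirrors the O(n) image computations per traced point noted in Remark (b). An element returned by `find_m_cycle` has only passed the tracing test; whether it really contains an m-cycle is exactly the question addressed by Theorem 3.1.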

Statement of the main theorem and notation
In order to state our main theorem we introduce several parameters that are used throughout the paper. Suppose that the triple (G, n, m) satisfies one of the lines of Table 1, and note that r is determined by G, n, m. The integer M used in the algorithm FindMCycle is assumed to satisfy
$$4 \le M \le \log\bigl(\tfrac{9}{8}\bigr)\,\frac{n-2}{2}. \qquad (2)$$
Let d(x) be the number of positive divisors of an integer x. By [11, pp. 395-396], $d(x) = x^{o(1)}$. In fact, for every δ > 0, there is a positive constant $c_\delta$ such that
$$d(x) \le c_\delta\, x^{\delta} \qquad (3)$$
for all x. Choose real numbers δ and s satisfying the conditions (4). Further let
$$\ell = \min\{M(1-s),\ 3-2s-2\delta,\ 1+s-3\delta,\ 2s-2\delta\}. \qquad (5)$$
By (4), all of $M(1 − s) > 1$, $3 − 2s − 2δ > 1$, $1 + s − 3δ > 1$ and $2s − 2δ > 1$ hold. Hence ℓ > 1. Next we define the constant $a_\delta$ by (6), with $c_\delta$ as in (3), and the constant $b_{M,\delta,s}$, which we usually abbreviate to $b_M$, by (7). The theorem involves an 'error probability' ε, that is, a real number satisfying 0 < ε < 1. We assume that the integer n satisfies the inequalities (8).
Theorem 3.1. Let (G, n, m) be as in one of the lines of Table 1, and let k be a positive integer satisfying 2 ≤ k ≤ n/2. Suppose that H is a permutation group permutation isomorphic to G acting on k-element subsets of {1, . . . , n} (via the unknown isomorphism $\varphi : H \to G$). Then the following hold.
(b) Let M be an integer satisfying (2), and let s, δ be real numbers satisfying (4), and ℓ as in (5). Then FindMCycle is a Monte Carlo Algorithm which, given as input the permutation group H, an error probability ε > 0 and the integer M, returns an output h such that, provided n satisfies (8), (i) the probability that h ∈ H and ϕ(h) contains an m-cycle is at least 1 − ε, (ii) the probability that h ∈ H and ϕ(h) does not contain an m-cycle is at most ε/2, and (iii) the probability that h = Fail is at most ε/2. Notation 3.2. For the rest of the paper we assume that n, m, r and G are as in one of the lines of Table 1, noting that r is determined by G, n, m. Let M be an integer satisfying (2), let s, δ be real numbers satisfying (4), and let ℓ, c δ , a δ and b M be as in (5), (3), (6) and (7) respectively.
We use the notation in Table 2 to describe an element $g \in S_n$, where $\gamma_0$ is a $k_0$-subset of Ω. Here we identify a cycle of g with the subset of Ω it permutes.
$c_{k_0}(\gamma_0, g)$    length of the g-cycle containing $\gamma_0$ on $k_0$-subsets
s-small g-cycle    g-cycle in Ω of length less than $(rn)^s$
s-large g-cycle    g-cycle in Ω of length at least $(rn)^s$
Δ(g)    union of g-cycles in Ω whose lengths divide rm
Σ(g)    union of g-cycles in Ω whose lengths do not divide rm
Table 2.
We define in Table 3 several classes of elements in G. We usually omit mentioning n and m in our notation. For example, we refer to $N(n, m)$ (defined in Section 2) simply as N and to $N_{\mathrm{good}}(G, n, m)$ simply as $N_{\mathrm{good}}$.
Remark: (a) Condition (8) limits the practical applicability of Theorem 3.1 severely, but we note that in our analysis we allow k to be as large as n/2. (b) The first two inequalities of (8) imposed on n are due to the subdivision of the set of permutations of order divisible by m into disjoint subsets which depend on s. We give a uniform proof that holds for all values of k in the range 2 ≤ k ≤ n/2. If, for example, k were bounded as n increases, then several of the arguments would be simpler and the constraints on n correspondingly less severe.

N    set of all $g \in S_n$ that contain an m-cycle
...
$S_{\ge 2}$    set of all g ∈ F such that $|\Delta(g)| > 4(rn)^s$ and at least two g-cycles in Δ(g) are s-large
Table 3. Families of Elements
(c) The main constraint forcing n to be very large is the third inequality in (8). For example, for our parameter choice in Theorem 3.1, namely M = 4, s = 17/24 and δ = 1/6, we have $c_\delta \le 138.32$ and, for n large enough, $a_\delta = 25/4$. In this case we find $b_M > 2 \cdot 10^{8}$ and the last inequality of (8) dictates $n > 3.3 \cdot 10^{112}/\varepsilon^{12}$. Moreover, even though a larger value of M allows us to choose a smaller value for $c_\delta$, the choice might result in a smaller value for ℓ, which in turn has undesired consequences, making $b_M$ larger, and hence requiring n to be larger.
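The parameter choice M = 4, s = 17/24, δ = 1/6 can be examined with exact rational arithmetic. The Python sketch below evaluates the four quantities that Section 3 requires to exceed 1; treating ℓ as the smallest of them is an assumption made here for illustration, and the function name is hypothetical.

```python
from fractions import Fraction


def exponent_candidates(M, s, delta):
    """The four quantities that are required to be greater than 1."""
    return {
        "M(1-s)": M * (1 - s),
        "3-2s-2delta": 3 - 2 * s - 2 * delta,
        "1+s-3delta": 1 + s - 3 * delta,
        "2s-2delta": 2 * s - 2 * delta,
    }


values = exponent_candidates(4, Fraction(17, 24), Fraction(1, 6))
ell = min(values.values())  # assumption: ell is the minimum of the four quantities
```

With these inputs the four values are 7/6, 5/4, 29/24 and 13/12, so under the stated assumption ℓ = 13/12 and ℓ − 1 = 1/12, which is at least consistent with the exponent 12 on ε in the bound $n > 3.3 \cdot 10^{112}/\varepsilon^{12}$.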

Proof of the Main Theorem
The proof of the main theorem, Theorem 3.1, relies on many supporting results. In this section we subdivide the proof into various parts and show how these parts are then brought together to give a complete proof. The individual parts of the proof are proved in later sections. The main idea of the proof is to divide the elements of S n that could possibly be returned by FindMCycle into disjoint families, and to compute the probability that TraceCycle returns true for an element of each of these families. The families of elements in this subdivision are defined in Table 3, namely N , R, S 0 , S + 1 , S − 1 , S ≥2 , and we use the notation introduced in this table throughout the paper.
Proof of Theorem 3.1(b). We prove this theorem by analysing the algorithm FindMCycle. Let $N = 5n \log(2/\varepsilon)$ denote the number of iterations of the for-loop. A call to algorithm FindMCycle can terminate in one of three possible ways:
(G) For some i with 1 ≤ i ≤ N the i-th iteration of the for-loop returns an element in the family N of Table 3. We call this a good outcome.
(B) For some i with 1 ≤ i ≤ N the i-th iteration of the for-loop returns an element which is not in the family N. We call this a bad outcome.
(U) The for-loop is executed N times and TraceCycle returns false for each of the selected random elements. In this case the algorithm returns Fail. We call this an ugly outcome.
Thus to prove the three parts of Theorem 3.1(b) we must prove Prob(G) ≥ 1 − ε, Prob(B) ≤ ε/2 and Prob(U) ≤ ε/2. Since exactly one of the three outcomes occurs, the last two inequalities imply the first. We shall therefore prove only Prob(B) ≤ ε/2 and Prob(U) ≤ ε/2. To study these outcomes more closely we define the following events.
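Proof that Prob(U) ≤ ε/2: For a uniformly distributed random element g ∈ G, let $p_1 = \mathrm{Prob}(\text{TraceCycle}(g) = \text{false} \mid g \in N_{\mathrm{good}})$ and $p_2 = \mathrm{Prob}(\text{TraceCycle}(g) = \text{false} \mid g \notin N_{\mathrm{good}})$, and let $p = \frac{\rho}{m} p_1 + \frac{m-\rho}{m} p_2$, where ρ := ρ(G, n, m) (see Table 1) is determined by the proportion of elements of G containing an m-cycle that have order dividing rm. Note that, since the proportion of elements containing an m-cycle in $S_n$ is 1/m, we have $\mathrm{Prob}(g \in N_{\mathrm{good}}) = \frac{\rho}{m}$. Given $E_i$, the event $U_i$ is the disjoint union of the events $U_{i1}$, that $g_i \in N_{\mathrm{good}}$ and TraceCycle($g_i$) = false, and $U_{i2}$, that $g_i \notin N_{\mathrm{good}}$ and TraceCycle($g_i$) = false. Thus $\mathrm{Prob}(U_i \mid E_i) = \frac{\rho}{m} p_1 + \frac{m-\rho}{m} p_2 = p$. Note, in particular, that this probability is independent of i. By (9) we have $E_i = U_{i-1}$, and hence $\mathrm{Prob}(U_i) = p \cdot \mathrm{Prob}(U_{i-1})$, and in particular $\mathrm{Prob}(U) = p^N$. The required inequality Prob(U) ≤ ε/2 holds whenever $p^N \le \varepsilon/2$. We now prove the latter inequality. By Proposition 7.5 and Lemma 5.2, together with the values of ρ in Table 1, it is sufficient to prove that $\left(\frac{n}{n-2}\right)^M \le \frac{9}{8}$. By our assumption, $M \le \log\bigl(\tfrac{9}{8}\bigr)\frac{n-2}{2}$, and hence $M \log\frac{n}{n-2} = M \log\bigl(1 + \frac{2}{n-2}\bigr) \le \frac{2M}{n-2} \le \log\bigl(\tfrac{9}{8}\bigr)$, and exponentiating both sides gives the required inequality. Thus $p^N \le \varepsilon/2$ and hence Prob(U) ≤ ε/2 is proved.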
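The two numerical facts used in this part of the proof can be explored with a short Python sketch. It rests on a labelled assumption: each iteration of the for-loop is taken to succeed with probability at least 1/(5n), a hypothetical figure chosen here only because it matches the choice N = 5n log(2/ε); the function names are likewise hypothetical.

```python
import math


def failure_bound(n, eps):
    """Return N = ceil(5 n log(2/eps)) and (1 - 1/(5n))^N, a bound on N straight failures.

    Assumption for illustration: every round succeeds with probability >= 1/(5n).
    """
    rounds = math.ceil(5 * n * math.log(2 / eps))
    return rounds, (1 - 1 / (5 * n)) ** rounds


def largest_admissible_m(n):
    """Largest integer M with M <= log(9/8)(n-2)/2, and whether (n/(n-2))^M <= 9/8."""
    M = math.floor(math.log(9 / 8) * (n - 2) / 2)
    return M, (n / (n - 2)) ** M <= 9 / 8
```

Since log(1 − x) < −x for 0 < x < 1, the bound returned by `failure_bound` never exceeds ε/2, and `largest_admissible_m` confirms numerically that the upper limit on M is what keeps $(n/(n-2))^M$ below 9/8.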
We estimate these proportions in Sections 8 -11. Recall the definition of ℓ in (5), and that ℓ > 1. Define b M (R) = 33 8 M and note that δ r 2s+2δ 72. Then Proposition 8.4 and (3) give 24. Then Proposition 9.1 and (3) give Thus by (7), We make a critical observation that the argument up to this point relies only on the first two inequalities of (8), and does not depend on the third inequality of (8).
By (15) and the inequalities (14) and (12), we have that We showed above that n n−2 Proof of Theorem 3.1(a). We use the algorithm TraceCycle with M = 4. Note first that the probability that a random element h ∈ H corresponds to an element g ∈ G containing an m-cycle, given that the h-cycles containing four random k-subsets λ 1 , . . . , λ 4 all have lengths of the form r i m with r i | r, is Prob(g ∈ N | TraceCycle(g) = true).
Recall the definition of q in (13). Then . (4) and (5) all hold. We choose N 0 to be the least natural number for which inequality (8) holds. Hence the inequality (2) holds and in particular also 12(rn) s + 6 ≤ n and (rn) s log(n) ≤ n.
Inequality . Hence, using n ≥ N 0 , and the displayed inequality above, we have ρ(G,n,m) .

Preliminaries
It is useful to collect together some of the arithmetic facts we use in the rather delicate estimations in the remaining sections.
Lemma 5.1. Let n, m, r be as in one of the lines of Table 1, and let d be a divisor of rm with d ≤ n. Then either d = m, or d ≤ 2m/7, or r, d are as in Table 5.  Table 5. possibilities for r and d In particular, either d ≤ 2m/7 or d is one of at most 3 different divisors of rm greater than 2m/7 and in the latter case d ≤ 2m/3 ≤ 2n/3.
The next result follows from the fact that log(1 − p) < −p for 0 < p < 1.
The next inequalities are easily verified.
x . For the estimates in our last arithmetic result Lemma 5.6, we first restate how to estimate sums via integrals.
Lemma 5.5. Let a, b ∈ Z with a < b, and let f (x) be a function defined on the interval [a−1, b+ 1], satisfying one of the lines of Table 6. Then  Table 6. Conditions on f Lemma 5.6. Let a, c ∈ R + and n ∈ Z + with n > a > c + 2 ≥ 3, and let t, ℓ ∈ Z + with t ≥ 2 and t ≥ ℓ. Then, summing over integers x in the interval (a, n], Proof. Note first that if t > ℓ the function f (x) = x t (x−c) ℓ is decreasing on (a, tc t−ℓ ] and increasing on [ tc t−ℓ , n], while if t = ℓ then f (x) is decreasing on (a, n]. In either case, by Lemma 5.5 we have a<x≤n f (

Binomial inequalities and partitions
In this section we prove a result about partitions that will be needed in Sections 11 and 7. As preparation, we prove an inequality about certain binomial coefficients. Lemma 6.1. Let a be an integer such that a > 1, and let c, ℓ be integers such that 1 ≤ ℓ < c. Then Proof. The proof is by induction on ℓ, for fixed c, a. Since c ℓ = c c−ℓ and ca ℓa = ca (c−ℓ)a , it is sufficient to prove this for 1 ≤ ℓ ≤ ⌊c/2⌋.
Suppose first that ℓ = 1. Here it is straightforward to check that Now suppose that 1 ≤ ℓ < ⌊c/2⌋ and that the inequality holds for ℓ. Then, using induction we have This latter quantity is at most ca (ℓ+1)a if and only if and this is equivalent to Now the first factor on the right hand side is equal to (c − ℓ)/(ℓ + 1), and each of the other factors is at least 1 since c ≥ 2ℓ + 1. Thus the inequality (16) holds, and so the induction proof is complete.
and moreover, if d ≤ αn for some α < 1 then Proof. Every part of the proof depends on the following observation: If d ≤ αn for some α < 1 then For (b) let n 0 = ⌊n/2⌋ and k 0 = ⌊k/2⌋. Note then that Now k + k 0 ≤ 2n/3 + n/3 ≤ n. Applying Fact 1 with t = n 0 to the first product and with t = k + k 0 to the second, we obtain . Note that the first inequality is strict if either k 0 ≥ 2 or k − 1 > k 0 , that is, if k ≥ 3. If k = 2 then n 0 k 0 = ⌊n/2⌋, while 2 n k 3k 4n ⌈k/2⌉ = Lemma 6.3. Let d, k, t be positive integers and a > 0 such that k ≤ d and t d−k+1 ≤ a. Then ).
Proof. Note first that Now we state and prove the result on partitions.
Proposition 6.4. Let U be a finite set of size u > 1, and let P be a partition of U in which all parts have size at least 2. For $2 \le k_0 \le u$, let $N_P(k_0)$ denote the number of $k_0$-subsets of U that are unions of parts of P. Then $N_P(k_0) \le \binom{\lfloor u/2 \rfloor}{\lfloor k_0/2 \rfloor}$, and moreover, if $k_0$ is odd and u is even, then u ≥ 4 and $N_P(k_0) \le \binom{(u-2)/2}{(k_0-1)/2}$. In particular, $N_P(k_0) = 1$ if $k_0 = u$ and $N_P(k_0) \le \frac{1}{u-1}\binom{u}{k_0}$ otherwise.
Proof. First we construct a partition P′ of U having at most two parts of size 1, and all parts of size at most 2. Start with P′ = ∅ and run through the parts of P. For each part P ∈ P of even size, choose any partition of P with all parts of size 2, and add the parts of this partition to P′. If all parts of P have even size, then the construction of P′ is completed in this way. So suppose that P has at least one part of odd size. In this case P′ will have 1 or 2 parts of size 1, and its construction is completed as follows. For each part P ∈ P of odd size p := |P|, add (p − 1)/2 parts of size 2 to P′ formed from p − 1 of the points of P. Let $P_1, \ldots, P_r$ be the odd length parts of P. Pair up the remaining r points into parts of size 2 and add them to P′, leaving exactly 1 or 2 of these points to form singleton parts of P′.
Next we define, for each k 0 -subset η of U that is a union of parts of P, a k 0 -subset η ′ that is a union of parts of P ′ . Note that if k 0 is odd then η must contain a part of P of odd size, and in this case P ′ has one or two singleton parts. If k 0 is odd and P ′ has two singleton parts, then we choose one of them, and we always place this chosen singleton part in η ′ . To define η ′ for a given η, we start with η ′ = ∅ and build it up by considering in turn each of the parts P of P contained in η. If |P | is even, then P is a union of parts of P ′ of size 2, and we add all of these parts to η ′ . If |P | is odd, then we add to η ′ all the parts of size 2 of P ′ contained in P . At this stage |η ′ | = k 0 − ℓ, where ℓ is the number of odd sized parts of P contained in η. Next we add to η ′ up to ⌊ℓ/2⌋ parts of P ′ of size 2 that contain points from two different parts of P. If η ′ cannot be completed in this way then either (i) ℓ is odd, or (ii) ℓ is even and is equal to the number of odd sized parts of P. Case (i) occurs if and only if k 0 is odd, and here we add to η ′ the designated singleton part of P ′ . In case (ii) there are two singleton parts of P ′ , and we add to η ′ these two singleton parts.
Note that, if ℓ ≥ 2, then we may have had some freedom in choosing the ⌊ℓ/2⌋ parts of P ′ of size 2 that contain points from two different parts of P, so η ′ may not be determined uniquely by η. On the other hand, η ′ always determines η uniquely, since η is the union of the parts of P that have at least two points in η ′ . Thus distinct sets η correspond to distinct sets η ′ .
It follows that N P (k 0 ) ≤ N ′ where N ′ is the number of k 0 -subsets γ ⊆ U such that γ is a union of parts of P ′ and in addition, if k 0 is odd and P ′ has two singleton parts, then γ contains a designated one of these singleton parts.
Suppose that γ is such a k 0 -subset. If P ′ has at most one part of size 1, then γ contains ⌊k 0 /2⌋ of the parts of P ′ of size 2 (and also a singleton part if k 0 is odd). Thus N ′ ≤ ⌊u/2⌋ ⌊k 0 /2⌋ . Note that in this case, if k 0 were odd, then P would have at least one odd part, and so P ′ would have exactly one odd part, whence u would be odd. Thus the first assertion is proved in this case. So suppose that P ′ has two singleton parts, in which case u is even. If k 0 is odd then k 0 ≥ 3 and γ consists of ⌊k 0 /2⌋ of the parts of P ′ of size 2 and the designated singleton part, whence u ≥ 4 and N ′ ≤ (u−2)/2 (k 0 −1)/2 < ⌊u/2⌋ ⌊k 0 /2⌋ . On the other hand, if k 0 is even then γ consists of k 0 /2 of the two-point parts (or k 0 /2 − 1 parts of size two and the two singleton parts). Again N ′ ≤ ⌊u/2⌋ ⌊k 0 /2⌋ . This proves the first assertion in all cases. Note that ⌊u/2⌋ = ⌊k 0 /2⌋ if and only if either k 0 = u, or k 0 = u − 1 with u odd. If k 0 = u obviously N P (k 0 ) = N ′ = 1. If k 0 = u − 1 with u odd then P ′ has a unique part of size 1 and its complement is the unique k 0 -subset of U that is a union of parts of P ′ -it may or may not be a union of parts of P.
So suppose from now on that ⌊k 0 /2⌋ < ⌊u/2⌋, and set u 1 = ⌊u/2⌋ and k 1 = ⌊k 0 /2⌋. Then ⌊u/2⌋ ⌊k 0 /2⌋ = u 1 k 1 , and by Lemma 6.1, this is at most 1 For a prime p and an integer n, let n p denote the p-part of n, that is the highest power of p dividing n. Recall that, for a positive integer k 0 ≤ n, a k 0 -subset γ ′ of Ω, and an element g ∈ S n , we denote by c k 0 (γ ′ , g) the length of the g-cycle containing γ ′ in the action of g on k 0 -sets. Lemma 6.5. Let g ∈ S n , let C be a g-cycle of length t, let k 0 be a positive integer such that k 0 ≤ t and let p be a prime dividing t.
(a) Suppose that γ ′ is a k 0 -subset of C such that the p-part t p does not divide c k 0 (γ ′ , g). Then γ ′ is a union of Z(C, p)-orbits, where Z(C, p) is the subgroup of order p of the cyclic group g C ∼ = Z t induced by g on C. In particular p divides gcd(k 0 , t). (b) The number σ(k 0 , C) of k 0 -subsets γ ′ of C such that t p does not divide c k 0 (γ ′ , g) is at most ⌊t/2⌋ ⌊k 0 /2⌋ , and in particular, is 1 if k 0 = t, and at most 1 Proof. (a) Since t p does not divide c k 0 (γ ′ , g) and g C ∼ = Z t , it follows that the setwise stabiliser H of γ ′ in g C contains the unique subgroup Z(C, p) of g C of order p. As γ ′ is H-invariant, γ ′ is a union of H-orbits in C, and hence γ ′ is a union of Z(C, p)-orbits in C. In particular, p divides k 0 as well as t.
(b) If k 0 = t then C is its unique k 0 -subset and σ(k 0 , C) = 1. If k 0 < t then, by Proposition 6.4, σ(k 0 , C) ≤ ⌊t/2⌋ ⌊k 0 /2⌋ and also σ(k 0 , C) ≤ 1 t−1 t k 0 . Corollary 6.6. Let G, n, m, r be as in one of the lines of Table 1, and let g ∈ G. Let Σ(g) be as in Table 2 with u = |Σ(g)|, and let k 0 be a positive integer such that k 0 ≤ u. Then the number σ(k 0 , Σ(g)) of k 0 -subsets γ ′ of Σ(g) such that c k 0 (γ ′ , g) divides rm satisfies Proof. For each g-cycle C in Σ(g), by the definition of Σ(g), |C| does not divide rm, and hence there exists a prime p(C) such that |C| p(C) does not divide rm. Let Z(C, p(C)) denote the subgroup of order p(C) of the cyclic group g C induced by g on C, let P(C) denote the set of Z(C, p(C))-orbits in C (all of length p(C)), and let P = ∪ C P(C) denote the corresponding partition of Σ(g).
Suppose now that $c_{k_0}(\gamma', g)$ divides rm. Then for each C such that $k(C) := |\gamma' \cap C| \ne 0$, also $c_{k(C)}(\gamma' \cap C, g)$ divides rm, and hence $|C|_{p(C)}$ does not divide $c_{k(C)}(\gamma' \cap C, g)$. By Lemma 6.5, γ′ ∩ C is a union of parts of P(C). Thus γ′ is a union of parts of P. Since all parts of P have size at least 2, this implies that $\sigma(k_0, \Sigma(g)) = 0$ if $k_0 = 1$, and the inequality for $1 < k_0 \le u$ follows from Proposition 6.4.
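The bounds of Proposition 6.4 can be confirmed exhaustively for small sets. The Python sketch below is only a sanity check, not part of the proof: it enumerates every partition of a u-element set into parts of size at least 2, counts the $k_0$-subsets that are unions of parts, and compares the count with the three bounds stated in the proposition. All function names are hypothetical.

```python
from itertools import combinations
from math import comb


def partitions_min2(elements):
    """Yield all partitions of the list `elements` into blocks of size at least 2."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    # choose the remaining members of the block containing `first`
    for size in range(1, len(rest) + 1):
        for others in combinations(rest, size):
            remaining = [x for x in rest if x not in others]
            for tail in partitions_min2(remaining):
                yield [(first,) + others] + tail


def count_unions(partition, k0):
    """Number of k0-subsets that are unions of blocks (subset-sum count over block sizes)."""
    ways = [1] + [0] * k0
    for block in partition:
        size = len(block)
        for total in range(k0, size - 1, -1):
            ways[total] += ways[total - size]
    return ways[k0]


def check_proposition(u):
    """Check the bounds of Proposition 6.4 for every admissible partition of a u-set."""
    for partition in partitions_min2(list(range(u))):
        for k0 in range(2, u + 1):
            count = count_unions(partition, k0)
            if count > comb(u // 2, k0 // 2):
                return False
            if k0 % 2 == 1 and u % 2 == 0 and count > comb((u - 2) // 2, (k0 - 1) // 2):
                return False
            if k0 == u and count != 1:
                return False
            if k0 < u and count * (u - 1) > comb(u, k0):
                return False
    return True
```

Distinct collections of blocks have distinct unions, so counting collections of blocks with total size $k_0$ is the same as counting the subsets themselves. Partitions into pairs, such as three pairs with u = 6 and $k_0 = 2$, show that the bounds can be attained.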

Tracing k-subsets
For the remainder of this paper we assume that k is an integer with 2 ≤ k ≤ n/2. We use Δ(g), Σ(g) and other notation introduced in Tables 2 and 3. Further, we use without further reference the number M of independent uniformly distributed random k-subsets in Algorithm 2 TraceCycle, where M satisfies (2), in particular M ≥ 4.
Proposition 7.1. Let G, n, m, r be as in one of the lines of Table 1, and suppose that g ∈ G does not contain an m-cycle. Set v = |Δ(g)| and suppose that v ≤ n − k − 1. Then the proportion of k-subsets γ of Ω such that $c_k(\gamma, g) = r_0 m$, for some $r_0$ dividing r, is at most $\binom{v}{k}\big/\binom{n}{k} + \frac{1}{n-v-1}$.
Lemma 7.2. Let G, n, m, r be as in one of the lines of Table 1. Let g be a uniformly distributed random element of G, and suppose that g does not contain an m-cycle, and that v = |∆(g)| ≤ n − k − 1. Then the following both hold.
Proof. Now TraceCycle(g) = true if and only if c k (γ, g) = r 0 m, for some r 0 dividing r, for each of the M independent uniformly distributed random k-sets γ tested during the algorithm. Thus if g does not contain an m-cycle, the probability that TraceCycle(g) = true is p M , where p is the proportion of k-subsets γ such that c k (γ, g) = r 0 m for some r 0 dividing r. By Proposition 7.1, p ≤

For (b), we observe that
The last assertion now follows from part (b). Now we analyse the effect of TraceCycle applied to elements of R. Proposition 7.3. Let G, n, m, r be as in one of the lines of Table 1 and suppose that 12(rn) s + 6 ≤ n. Then, for a uniformly distributed random element g ∈ G, Proof. By definition, for g ∈ R, v = |∆(g)| ≤ 4(rn) s and g does not contain an m-cycle. By our assumptions on n and k and the hypothesis, Thus by Proposition 7.1, the proportion of k-subsets γ such that c k (γ, g) = r 0 m, for some r 0 dividing r, is at most v k n k + 1 n−v−1 ≤ (4r s ) k n k(1−s) + 1 n−4(rn) s −1 . Now TraceCycle(g) = true if and only if c k (γ, g) = r 0 m, for some r 0 dividing r, for each of M independent uniformly distributed random k-sets γ tested during the algorithm. Thus, given g ∈ R, the probability of this occurring is at most Now 12(rn) s < n, that is to say, 4r s n 1−s < 1 3 . Also k ≥ 2, r ≤ 3 and s < 1. Therefore ( 4r s n 1−s ) k ≤ ( 4r s n 1−s ) 2 < 4r s 3n 1−s < 4 n 1−s . Also, by assumption, n − 4(rn) s − 1 ≥ 8(rn) s + 5 > 8r s n s > 8r s n 1−s . Therefore, the probability that TraceCycle(g) = true is at most Next we analyse the effect of TraceCycle applied to elements of N good (defined in Table 3). Lemma 7.4. Let G, n, m, r be as in one of the lines of Table 1, and let k 0 be an integer satisfying 0 ≤ k 0 ≤ k. Let g ∈ N and let C be the m-cycle contained in g. Then the number of k 0 -subsets of C that can occur as γ ∩ C, for a k-subset γ of Ω such that c k (γ, g) is not divisible by m, is at most where ω(d) is the number of distinct prime divisors of an integer d.
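The chain of elementary inequalities in the proof of Proposition 7.3 can be checked numerically for concrete parameters. The Python sketch below is an illustration under stated assumptions: it uses floating-point arithmetic, the function name is hypothetical, and the final comparison with $(33/8)/n^{1-s}$ reads the constant $b_M(R)$ of Section 4 as $(33/8)^M$, which is an interpretation of the text rather than a quoted formula.

```python
def check_chain(n, r, s, k=2):
    """Truth values of the inequalities used for elements of R, given 12(rn)^s + 6 <= n."""
    rn_s = (r * n) ** s
    if 12 * rn_s + 6 > n:
        raise ValueError("hypothesis 12(rn)^s + 6 <= n fails")
    x = 4 * r**s / n ** (1 - s)          # the quantity 4 r^s / n^(1-s)
    tail = 1 / (n - 4 * rn_s - 1)        # the term 1/(n - 4(rn)^s - 1)
    return [
        x < 1 / 3,
        x**k <= x**2 < x / 3 < 4 / n ** (1 - s),
        n - 4 * rn_s - 1 >= 8 * rn_s + 5 > 8 * r**s * n**s > 8 * r**s * n ** (1 - s),
        x**k + tail <= (33 / 8) / n ** (1 - s),  # assumed per-subset bound
    ]
```

For example, with $n = 10^6$, r = 3 and s = 17/24 the hypothesis holds and all four checks succeed. The step $n^s > n^{1-s}$ in the third chain uses s > 1/2.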

8. Bounding $S_0$
Let $G, m, n, r$ be as in one of the lines of Table 1, so $G$ is $A_n$ or $S_n$. To estimate the probability that a uniformly distributed random element $g \in G$ lies in $S_0$ or $S_1^+$ and satisfies TraceCycle($g$) = true, we use the following result from [8]. Recall the definitions of an $s$-small and an $s$-large cycle and of $v$ from Notation 3.2. Let $i \in \{1, 2, 3\}$. In the next two sections we use the following notation:
(1) For $v \ge 1$ let $P(v, rm)$ denote the proportion of elements of $S_v$ of order dividing $rm$, and let $P(0, rm) = 1$.
(2) For $v \ge 1$ let $P_0(v, rm)$ denote the proportion of elements of $S_v$ of order dividing $rm$, all of whose cycles are $s$-small, and let $P_0(0, rm) = 1$.

Table 7. Note that, for the proof of Theorem 3.1, we have $n \ge 156$ by Lemma 5.3(ii), so $rm \ge 150$, as in line 2 of Table 7.

Table 7. Possible values of $a'_\delta$ for Lemma 8.3

Lemma 8.3. Let $m, n, r$ be as in one of the lines of Table 1. Further, let $v \ge 16$ and $s$, $\delta$, $c_\delta$ and $a_\delta$ be as in Notation 3.2. Let $a'_\delta$ and $m_0$ be as in one of the lines of Table 7 (or more generally as in Remark 8.2) and suppose that $rm \ge m_0$. Then

Proof. This result follows from [8, Lemma 2.4] and its proof. A direct application of [8, Lemma 2.4] would require that $rm \ge v$, which we cannot guarantee to hold. However, the proof of that lemma shows, without the assumption that $rm \ge v$, that

whenever $v \ge 3$. Statement (a) follows from this, since $m \le n$, $\delta < s$ and,

Hence the proportion in $S_v$ of such permutations is

Proposition 8.4. Let $G, m, n, r$ be as in one of the lines of Table 1. If $12(rn)^s + 6 \le n$ and $(rn)^s \log(n) \le n$ then, for a uniformly distributed random element $g \in G$,
$$\mathrm{Prob}(g \in S_0 \cap G \text{ and TraceCycle}(g) = \text{true}) \le \frac{72\, a_\delta\, d(rm)^2\, r^{2s}}{n^{3-2s}},$$
where $a_\delta$ is as in (6).
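For small parameters the proportions $P(v, rm)$ and $P_0(v, rm)$ introduced above can be computed exactly, which gives a way to sanity-check bounds of the kind stated in Lemma 8.3. The sketch below uses the standard recurrence $P(j) = \frac{1}{j}\sum_{d} P(j-d)$, where $d$ runs over the admissible cycle lengths not exceeding $j$; it is obtained by conditioning on the length of the cycle through a fixed point. The function name and the `max_cycle_len` parameter are hypothetical. The cap is taken as a strict upper bound on cycle lengths, on the assumption that an $s$-small cycle is one of length less than $(rn)^s$.

```python
from fractions import Fraction


def proportion_order_dividing(v, N, max_cycle_len=None):
    """Exact proportion of elements of S_v whose order divides N.

    A permutation has order dividing N exactly when every cycle length
    divides N. If max_cycle_len is given (an assumption of this sketch),
    only cycles of length strictly less than max_cycle_len are allowed.
    By convention the proportion for v = 0 is 1.
    """
    lengths = [d for d in range(1, v + 1)
               if N % d == 0 and (max_cycle_len is None or d < max_cycle_len)]
    P = [Fraction(1)]
    for j in range(1, v + 1):
        # Condition on the length d of the cycle containing a fixed point.
        total = sum((P[j - d] for d in lengths if d <= j), Fraction(0))
        P.append(total / j)
    return P[v]


if __name__ == "__main__":
    print(proportion_order_dividing(4, 2))   # involutions and identity in S_4
    print(proportion_order_dividing(3, 6, max_cycle_len=3))
```

Exact rational arithmetic is used so that the values can be compared directly with an explicit upper bound without rounding error.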
where $v$ ranges over all integers satisfying $4(rn)^s < v \le n$.
For $g \in S_0(v)$, the restriction $g|_{\Delta(g)}$ of $g$ to $\Delta(g)$ is a permutation in $\mathrm{Sym}(\Delta(g))$ of order dividing $rm$ with all cycles of length less than $(rn)^s$. Consider a fixed $v$-set $\Delta$. If $G = S_n$, then all elements of $\mathrm{Sym}(\Delta)$ are induced by permutations in $G$. On the other hand if $G = A_n$, then one of the lines 4-9 of Table 1 holds and hence $rm$ is odd; thus all elements of $\mathrm{Sym}(\Delta)$ of order dividing $rm$ actually lie in $\mathrm{Alt}(\Delta)$ and are therefore induced by elements of $G$. Therefore in all cases the number of possibilities for the restriction $g|_\Delta$ of elements $g \in G$, for a given $v$-subset $\Delta = \Delta(g)$, is $v!\,P_0(v, rm)$, and the restriction $g|_\Sigma$, where $\Sigma = \Omega \setminus \Delta$, lies in $\mathrm{Sym}(\Sigma)$ or $\mathrm{Alt}(\Sigma)$ according as $G = S_n$ or $A_n$, respectively. Hence the number of permutations in $S_0 \cap G$ corresponding to this value of $v$ satisfies

As $3(rn)^s < 4(rn)^s < v$, we have $n \ge 156$ by Lemma 5.3(ii) so $rm \ge 150$, and hence we can apply Lemma 8.3(b) with $a'_\delta = a_\delta$. Thus, for a random $g \in G$,

For any $g \in S_n$ with $|\Delta(g)| = v$ and $v \le n-k-1$, we have in particular

Hence, if $v \le n-k-1$, then the probability that $g \in S_0(v)$ and TraceCycle($g$) = true is at most
$$a_\delta\, d(rm)^2\, r^{2s} n^{2s}$$
if $n-k-1 < v \le n$, this probability is at most
$$a_\delta\, d(rm)^2\, r^{2s} n^{2s}$$
Summing over the values of $v$, we find $\mathrm{Prob}(g \in S_0 \cap G \text{ and TraceCycle}(g) = \text{true})$

We first consider $\Sigma_1$ and apply Lemma 5.6 with $a = 4(rn)^s$, $c = (rn)^s$, $t = \ell = 3$, and $n-k-1$ in place of $n$. We also use $a - 1 - c = 3(rn)^s - 1 > 2(rn)^s$, and find

The assumption $12(rn)^s + 6 \le n$ implies by Lemma 5.3(i) that $(rn)^s/n < 1/12$. Also, by our hypothesis, $(rn)^s \log(n) \le n$ and, therefore,
$$\Sigma_1 < \frac{16\, a_\delta\, d(rm)^2\, r^{2s}}{n^{3-2s}}.$$
Finally, we estimate $\Sigma_2$.
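The parity step used above for $G = A_n$ (a permutation of odd order has only cycles of odd length and is therefore even) can be confirmed by exhaustive search on a small symmetric group. The following is a minimal sketch; the helper names are hypothetical and the search is only feasible for small $v$.

```python
from itertools import permutations
from math import gcd


def cycle_lengths(p):
    """Cycle lengths of a permutation p of range(len(p)), given as a tuple."""
    seen = [False] * len(p)
    lengths = []
    for i in range(len(p)):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = p[j]
                length += 1
            lengths.append(length)
    return lengths


def order(p):
    """Order of p: the least common multiple of its cycle lengths."""
    result = 1
    for length in cycle_lengths(p):
        result = result * length // gcd(result, length)
    return result


def is_even(p):
    """p is even exactly when (number of points - number of cycles) is even."""
    return (len(p) - len(cycle_lengths(p))) % 2 == 0


def odd_order_implies_even(v):
    """Check over all of S_v that every element of odd order is even."""
    return all(is_even(p) for p in permutations(range(v)) if order(p) % 2 == 1)


if __name__ == "__main__":
    print(odd_order_implies_even(5))
```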
9. Bounding $S_1^+$

Let $G, m, n, r$ be as in one of the lines of Table 1, so $G$ is $A_n$ or $S_n$. Recall the definitions of an $s$-small and an $s$-large cycle and of $v$ from Notation 3.2 and the notation set out in Notation 8.1.

Proposition 9.1. Let $G, n, m, r$ be as in one of the lines of Table 1.
is the set of all $g \in S_1^+$ with $|\Delta(g)| = v$ and $v$ ranges over integers satisfying $4(rn)^s < v \le n$. For a given $v$, an analogous argument to that given in the second paragraph of the proof of Proposition 8.4 shows that

Thus applying Lemma 8.3(c) we have, for a random $g \in G$,

If $|\Delta(g)| = v$ and $v \le n-k-1$, then in particular $3 \le v \le n-3$. Hence by Lemma 7.2(b), given that

Thus, if $v \le n-k-1$, the probability that $g \in S_1^+(v) \cap G$ and

and if $n-k \le v$ this probability is at most

Summing over $v$ we find $\mathrm{Prob}(g \in S_1^+ \cap G \text{ and TraceCycle}(g) = \text{true})$

First we consider $\Sigma_1$. Interchanging the two summations and taking the sum up to $n$, we obtain the following upper bound, where $D_\ell$ denotes the set of all divisors $d$ of $rm$ satisfying $d \ge (rn)^s$. Note that $v > d + 3(rn)^s$ (see Notation 3.2).
Since $rm \ge 150$ by Lemma 5.3(ii), we may apply Lemma 8.3(b) with $a'_\delta = a_\delta$, and find that this expression is at most

Now we apply Lemma 5.6 with $t = \ell = 4$, $a = 3(rn)^s + d$ and $c = d + (rn)^s$. Noting that $a - c - 1 = 2(rn)^s - 1$, we obtain that this expression is at most

Note that $2(rn)^s - 1 > \frac{23}{12} r^s n^s$ by Lemma 5.3(iii) and, since $d \ge (rn)^s$, also $\frac{d + (rn)^s}{d} \le 2$. Note also that $d + (rn)^s < n$ and $n + 1 - d - (rn)^s < n$.

Since, by hypothesis, $(rn)^s \log(n) \le n$ and, by Lemma 5.3(i), $n^s/n \le r^s n^s/n \le 1/12$ and $r \ge 1$, the last expression is at most

We now consider $\Sigma_2 = \sum_{n-k \le v \le n}$

As $v - d > 3(rn)^s$ and $n - k \ge n/2$ we have by Lemma 8.3(b) (with $a'_\delta = a_\delta$) that

where $v(d) = \max\{\frac{n}{2}, d + 3(rn)^s\}$ since, by Notation 8.1, each $d \in D_1^+(v)$ is less than $v - 3(rn)^s$. By Lemma 5.5, this quantity is at most

In particular each $d \in D_1^+(v)$ is less than $m$. By Lemma 5.1, there are at most three divisors of $rm$ which are less than $m$ and greater than $2m/7$, and the sum of the reciprocals $\frac{1}{d}$ of these divisors is at most $\frac{7}{m}$, which is less than $\frac{7.3}{n}$ since $n \ge 156$ (by Lemma 5.3(ii)). Using $v(d) \ge d + 3(rn)^s$ and Lemma 5.3(iii), the contribution from these exceptional divisors is therefore at most

Thus altogether we get a proportion of at most $d(rm)^2 (rn)^{-2s}$.
11. Bounding $S_1^-$

Proposition 11.1. Let $G, m, n, r$ be as in one of the lines of Table 1.
By the remarks above

If $k_0 \le u - 1$ then, by Corollary 6.6 and our considerations above, $\sigma(k_0, \Sigma(g)) \le 1$

Thus (21) is proved if $k \le u - 1$, so suppose that $k \ge u$. Recall that $u > (rn)^{1-s} + 1$ by (19). Hence

To complete the proof of part (a) it remains to estimate the number $K_{=0} = K_{=0}(g)$ of $k$-subsets $\gamma \subseteq \Delta(g)$ such that $c_k(\gamma, g) = r_0 m$ for some $r_0$ dividing $r$. Since this number is zero if $v < k$, we assume that $v \ge k$. Recall that $C$ is the unique $s$-large cycle of $g$ contained in $\Delta(g)$ and $d = |C|$. By Lemma 5.1, $d \le 2m/3 < 2n/3$. Since $m$ divides $c_k(\gamma, g)$ it follows that $\gamma \subseteq C$. We prove
$$\frac{K_{=0}}{\binom{n}{k}} \le \frac{30.6}{(rn)^{1-s}}.$$