فهرست مطالب
Journal of Iranian Statistical Society
Volume:3 Issue: 2, 2004
- 180 صفحه،
- تاریخ انتشار: 1383/08/15
- تعداد عناوین: 10
-
-
Page 89We present new links between some remarkable martingales found in the study of the Binary Search Tree or of the bisection problem, looking at them on the probability space of a continuous time binary branching process. 1 Introduction This paper is a kind of game with martingales around the Binary Search Tree (BST) model (see Mahmoud [26]). The BST process, under the random permutation model is an increasing (in size) sequence of binary trees (Tn)n0 storing data, in such a way that for every integer n, Tn has n + 1 leaves; the growing from time n to time n+1 occurs by choosing uniformly a leaf and replacing it by an internal node with two leaves. In the BST we are interested in the profile, i.e. the number of leaves in each generation. A polynomial
-
Page 117The purpose of this article is to survey recent results on distributional properties of random binary search trees. In particular we consider the profile and the height.1 Introduction Probably the most widely used sorting algorithm is the algorithmQuicksort which was invented by C. A. R. Hoare [14, 15]. It is the standard sorting procedure in Unix systems, and the basic idea can be described as follows:A list of n (different) real numbers A = (x1, x2,. .. , xn) is given. Select an element (a pivot) xj from this list. Divide the remaining numbers into sets A,A> of numberssmaller (or equal) and larger than xj. Next apply the same procedure to each of these two sets if they contain more than one element. Finally, we end up with a sorted list of the original numbers.
-
Page 139Suffix trees are the most frequently used data structures in algorithms on words. In this paper, we consider the depth of a compact suffix tree, also known as the PAT tree, under some simple probabilistic assumptions. For a biased memoryless source, we provethat the limiting distribution for the depth in a PAT tree is the same as the limiting distribution for the depth in a PATRICIA trie, even though the PATRICIA trie is constructed from statistically independent strings. As a result, we show that the limiting distribution for the depth in a PAT tree built over n suffixes is normal.Abstract. Suffix trees are the most frequently used data structuresin algorithms on words. In this paper, we consider the depth of acompact suffix tree, also known as the PAT tree, under some simpleprobabilistic assumptions. For a biased memoryless source, we provethat the limiting distribution for the depth in a PAT tree is the sameas the limiting distribution for the depth in a PATRICIA trie, eventhough the PATRICIA trie is constructed from statistically independentstrings. As a result, we show that the limiting distribution forthe depth in a PAT tree built over n suffixes is normal.
-
Page 149We give an alternative treatment and extension of some results of Itoh and Mahmoud on one-sided interval trees. The proofs are based on renewal theory, including a case with mixed multiplicative and additive renewals. 1 Introduction Itoh and Mahmoud [2] have studied some one-sided versions of binary interval trees. These are obtained from full binary interval trees (see Section 2 for definitions) by pruning one of the two subtrees at each node; in other words, we are left with a single path in the binary interval tree. Five different such trees, defined by different pruning policies, are studied in [2]. Using an analytic method, Itoh and Mahmoud find explicit or implicit expressions for the moment generating function of the size of the tree, and they derive in each case asymptotic normality and asymptotic expressions for the mean and variance of the size.We will here give an alternative treatment using renewal theory, which enables us to generalize the results. In particular, the results
-
Page 165We investigate the distribution, mean value, variance and some limiting properties of an urn model of white and red balls under random multiple drawing (either with or without replacement) when the number of white and red balls added follows a schedule that depends on the number of white balls chosen in each drawing.
-
Page 175An ordered tree with height h is b-balanced if all its leaves have a level ` with h − b ≤ l≤ h, where at least one leaf has a level equal to h − b. For large n, we shall compute asymptotic equivalents to the number of all b-balanced ordered trees with n nodesand of all such trees with height h. Furthermore, assuming that all b-balanced ordered trees with n nodes are equally likely, we shall determine the exact asymptotic behaviour of the average height of such a tree together with the variance.
-
Page 219This paper deals with the amount of disorder that is left in a permutation after one of its elements has been selected with quickselect with or without median-of-three pivoting. Five measures of disorder are considered: inversions, cycles of length less than orequal to some m, cycles of any length, expected cycle length, and the distance to the identity permutation. “Grand averages” for each measure of disorder for a permutation after one of its elements has been selected with quickselect, where 1, 2,. .. , n are the elements being permuted, are computed, as well as more specific results.
-
Page 251A large number of results in analysis of algorithms contain fluctuations. A typical result might read “The expected number of. .. for large n behaves like log2 n + constant + (log2 n), where (x) is a periodic function of period one and mean zero.” Examplesinclude various trie parameters, approximate counting, probabilistic counting, radix exchange sort, leader election, skip lists, adaptive sampling. Often, there are huge cancellations to be noted, especially if one wants to compute variances. In order to see this, one needs identities for the Fourier coefficients of the periodic functionsinvolved. There are several methods to derive such identities, which belong to the realm of modular functions. The most flexible method seems to be the calculus of residues. In some situations, Mellin transforms help. Often, known identities can be employed. This survey shows the various techniques by elaborating on the most importantexamples from the literature.
-
Page 271We give an overview of the running time analysis of the random divide-and-conquer algorithm FIND or QUICKSELECT. The results concern moments, distribution of FIND’s running time, the limiting distribution, a stochastic bound and the key: a stochastic fixed point equation.
-
Page 297In 1986 S. Sattolo introduced a simple algorithm foruniform random generation of cyclic permutations on a fixed number of symbols. Recently, H. Prodinger analysed two important random variables associated with the algorithm, and found their mean and variance. H. Mahmoud extended Prodinger’s analysis by finding limit laws for the same two random variables.The present article, starting from the definition of the algorithm, is completely self-contained. After giving a simple new proof of correctness, we generalize the abovementioned probabilistic results by determining the “grand” probability generating functions of the random variables. The focus throughout is on using standard methods that give a unified approach, and open the door to further study.