Formal Languages, Automata and Numeration Systems 1 (eBook)
John Wiley & Sons (Verlag)
978-1-119-00822-4 (ISBN)
Combinatorics on words deals with problems that can be stated in a non-commutative monoid, such as subword complexity of finite or infinite words, construction and properties of infinite words, unavoidable regularities or patterns. When considering some numeration systems, any integer can be represented as a finite word over an alphabet of digits. This simple observation leads to the study of the relationship between the arithmetical properties of the integers and the syntactical properties of the corresponding representations. One of the most profound results in this direction is given by the celebrated theorem by Cobham. Surprisingly, a recent extension of this result to complex numbers led to the famous Four Exponentials Conjecture. This is just one example of the fruitful relationship between formal language theory (including the theory of automata) and number theory.
Contents to include: • algebraic structures, homomorphisms, relations, free monoid • finite words, prefixes, suffixes, factors, palindromes
• periodicity and Fine–Wilf theorem
• infinite words are sequences over a finite alphabet
• properties of an ultrametric distance, example of the p-adic norm
• topology of the set of infinite words
• converging sequences of infinite and finite words, compactness argument
• iterated morphism, coding, substitutive or morphic words
• the typical example of the Thue–Morse word
• the Fibonacci word, the Mex operator, the n-bonacci words
• wordscomingfromnumbertheory(baseexpansions,continuedfractions,...) • the taxonomy of Lindenmayer systems
• S-adic sequences, Kolakoski word
• repetition in words, avoiding repetition, repetition threshold
• (complete) de Bruijn graphs
• concepts from computability theory and decidability issues
• Post correspondence problem and application to mortality of matrices
• origins of combinatorics on words
• bibliographic notes
• languages of finite words, regular languages
• factorial, prefix/suffix closed languages, trees and codes
• unambiguous and deterministic automata, Kleene’s theorem
• growth function of regular languages
• non-deterministic automata and determinization
• radix order, first word of each length and decimation of a regular language
• the theory of the minimal automata
• an introduction to algebraic automata theory, the syntactic monoid and the
syntactic complexity
• star-free languages and a theorem of Schu ̈tzenberger
• rational formal series and weighted automata
• context-free languages, pushdown automata and grammars
• growth function of context-free languages, Parikh’s theorem
• some decidable and undecidable problems in formal language theory
• bibliographic notes
• factor complexity, Morse–Hedlund theorem
• arithmetic complexity, Van Der Waerden theorem, pattern complexity • recurrence, uniform recurrence, return words
• Sturmian words, coding of rotations, Kronecker’s theorem
• frequencies of letters, factors and primitive morphism
• critical exponent
• factor complexity of automatic sequences
• factor complexity of purely morphic sequences
• primitive words, conjugacy, Lyndon word
• abelianisation and abelian complexity
• bibliographic notes
• automatic sequences, equivalent definitions
• a theorem of Cobham, equivalence of automatic sequences with constant
length morphic sequences
• a few examples of well-known automatic sequences
• about Derksen’s theorem
• some morphic sequences are not automatic
• abstract numeration system and S-automatic sequences
• k − ∞-automatic sequences
• bibliographic notes
• numeration systems, greedy algorithm
• positional numeration systems, recognizable sets of integers
• divisibility criterion and recognizability of N
• properties of k-recognizable sets of integers, ratio and difference of consec-
utive elements: syndeticity
• integer base and Cobham’s theorem on the base dependence of the recog-
nizability
• non-standard numeration systems based on sequence of integers
• linear recurrent sequences, Loraud and Hollander results
• Frougny’s normalization result and addition
• morphic numeration systems/sets of integers whose characteristic sequence
is morphic
• towards a generalization of Cobham’s theorem
• a few words on the representation of real numbers, β-integers, finiteness
properties
• automata associated with Parry numbers and numeration systems
• bibliographic notes
First order logic
• Presburger arithmetic and decidable theory
• Muchnik’s characterization of semi-linear sets
• Bu ̈chi’s theorem: k-recognizable sets are k-definable • extension to Pisot numeration systems
• extension to real numbers
• decidability issues for numeration systems
• applications in combinatorics on words
Formal Languages, Automaton and Numeration Systems presents readers with a review of research related to formal language theory, combinatorics on words or numeration systems, such as Words, DLT (Developments in Language Theory), ICALP, MFCS (Mathematical Foundation of Computer Science), Mons Theoretical Computer Science Days, Numeration, CANT (Combinatorics, Automata and Number Theory). Combinatorics on words deals with problems that can be stated in a non-commutative monoid, such as subword complexity of finite or infinite words, construction and properties of infinite words, unavoidable regularities or patterns. When considering some numeration systems, any integer can be represented as a finite word over an alphabet of digits. This simple observation leads to the study of the relationship between the arithmetical properties of the integers and the syntactical properties of the corresponding representations. One of the most profound results in this direction is given by the celebrated theorem by Cobham. Surprisingly, a recent extension of this result to complex numbers led to the famous Four Exponentials Conjecture. This is just one example of the fruitful relationship between formal language theory (including the theory of automata) and number theory.
Michel RIGO, Full professor, University of Liège, Department of Math., Belgium.
FOREWORD ix
INTRODUCTION xiii
CHAPTER 1. WORDS AND SEQUENCES FROM SCRATCH 1
1.1. Mathematical background and notation 2
1.1.1. About asymptotics 4
1.1.2. Algebraic number theory 5
1.2. Structures, words and languages 11
1.2.1. Distance and topology 16
1.2.2. Formal series 24
1.2.3. Language, factor and frequency 28
1.2.4. Period and factor complexity 33
1.3. Examples of infinite words 36
1.3.1. About cellular automata 43
1.3.2. Links with symbolic dynamical systems 46
1.3.3. Shift and orbit closure 59
1.3.4. First encounter with beta-expansions 62
1.3.5. Continued fractions 69
1.3.6. Direct product, block coding and exercises 70
1.4. Bibliographic notes and comments 77
CHAPTER 2. MORPHIC WORDS 85
2.1. Formal definitions 89
2.2. Parikh vectors and matrices associated with a morphism
96
2.2.1. The matrix associated with a morphism 98
2.2.2. The tribonacci word 99
2.3. Constant-length morphisms 107
2.3.1. Closure properties 117
2.3.2. Kernel of a sequence 119
2.3.3. Connections with cellular automata 120
2.4. Primitive morphisms 122
2.4.1. Asymptotic behavior 127
2.4.2. Frequencies and occurrences of factors 127
2.5. Arbitrary morphisms 133
2.5.1. Irreducible matrices 134
2.5.2. Cyclic structure of irreducible matrices 144
2.5.3. Proof of theorem 2.35 150
2.6. Factor complexity and Sturmian words 153
2.7. Exercises 159
2.8. Bibliographic notes and comments 163
CHAPTER 3. MORE MATERIAL ON INFINITE WORDS 173
3.1. Getting rid of erasing morphisms 174
3.2. Recurrence 185
3.3. More examples of infinite words 191
3.4. Factor Graphs and special factors 202
3.4.1. de Bruijn graphs 202
3.4.2. Rauzy graphs 206
3.5. From the Thue-Morse word to pattern avoidance 219
3.6. Other combinatorial complexity measures 228
3.6.1. Abelian complexity 228
3.6.2. k-Abelian complexity 237
3.6.3. k-Binomial complexity 245
3.6.4. Arithmetical complexity 249
3.6.5. Pattern complexity 251
3.7. Bibliographic notes and comments 252
BIBLIOGRAPHY 257
INDEX 295
SUMMARY OF VOLUME 2 303
"This nice book is devoted to a quickly growing field, at the frontier between theoretical computer science, combinatorics, and number theory." (Zentralblatt MATH, 2016)
Introduction
This book, comprised of two volumes, is a somewhat extended version of lectures basically dedicated to combinatorics on words and numeration systems that I am giving at the University of Liège. The course is usually (but not necessarily) followed by students interested in discrete mathematics or theoretical computer science. The chosen level of abstraction should allow undergraduate students to study the exposed topics.
1.1. What this book is or is not about
In the long process of writing this book, I have expanded my initial notes with many examples and many extra concepts to acquire a consistent overview of the field. Nevertheless, this book is not intended to serve as an encyclopedic reference.
I have picked some of my favorite topics in the area and I have also decided to shorten the presentation of some items (not because there are less interesting but choices had to be made to keep this book reasonably short). Indeed, the book most probably reflects what I myself prefer: I am always more interested in the combinatorics and the underlying discrete structures arising from a problem.
When preparing this book, I chose to present a fairly large variety of basic notions and important tools and results. Sometimes, I only give an overview of a subject and proofs are, therefore, omitted. For the reader wanting to study a specific topic further, many pointers to the relevant bibliography are given and each chapter ends with notes and comments. Indeed, the main goal of this book is to give quick access to actual research topics at the intersection between automata and formal language theory, number theory and combinatorics on words.
1.2. A few words about what you will find
The notion of a word, i.e. a (finite or infinite) sequence of symbols belonging to a finite set, is central throughout this book. It has connections with many branches of mathematics and computer science: number theory, combinatorics, formal language theory, mathematical logic, symbolic dynamics, coding theory, computational complexity, discrete geometry, stringology, etc.
Combinatorics on words. We can be interested in the combinatorial properties of finite or infinite sequences of symbols over a finite alphabet: what the possible arrangements are, how many such configurations can be achieved and so on. As a trivial example, over a binary alphabet any word of a length of at least 4 contains a repeated factor of the kind uu (try to prove it). Therefore, we can look at patterns that are unavoidable in sufficiently long sequences or count the number of patterns or configurations that may appear in a particular context. These are some of the general questions that will be considered in this Volume. In particular, we will concentrate on infinite words that can be obtained by a simple procedure consisting of the iteration of a morphism over a free monoid. We will mostly deal with a large class of self-similar words: the so-called morphic words and, in particular, and with automatic sequences that are generated by a constant-length morphism.
Formal language theory. A language is merely a set of words. In this book, we will mostly encounter languages of finite words. One exception is a short incursion into symbolic dynamical systems with the language of the β-expansions of the real numbers in the interval [0, 1). Chomsky’s hierarchy introduced in the theory of formal languages provides a classification depending on the machine needed to recognize an infinite language of finite words. From a computational perspective, the simplest languages are the regular languages. They are accepted (or recognized) by finite automata, and described by regular expressions. Chapter 1, Volume 2 is a short chapter presenting the main properties of these languages. We will constantly see connections existing between regular languages, automatic sequences and numeration systems. For instance, quite often we associate a finite automaton with a morphism.
Number theory. A finite word can also be used to represent an integer in a given numeration system (e.g. integer base expansions and many other non-standard systems are discussed in depth in several chapters of this book). To quote A. Fraenkel: “There are many ways of representing an integer uniquely!” [FRA 85]. Similarly, an infinite word can represent a real number or the characteristic sequence of a set of integers. With that respect, a natural question is to study links existing between arithmetical properties of numbers (or sets of numbers) and syntactical properties of their expansions. Chapter 2, Volume 2 is dedicated to numeration systems with a particular emphasis on words representing numbers. Indeed, the chosen numeration system has a strong influence on the syntactical properties of the corresponding representations. A cornerstone is the notion of a recognizable set of numbers whose elements, when represented within a given numeration system, are recognized by a finite automaton.
Formal methods applied to infinite words and sets of numbers. In Chapter 3 of Volume 2, I describe a recent trend in combinatorics on words. Due to automata theory and Büchi’s theorem, we will see how formal methods enter the frame regarding decision problems, or automatic theorem-proving, relevant in combinatorics on words. If a property about some infinite words can be described by a well-written logical formula, then this property can be tested automatically. Such a procedure holds for a large class of infinite words generated by iterated morphisms (for automatic sequences and those stemming from Pisot numeration systems as presented in this book). The expressiveness of Presburger arithmetic (with an extra predicate) provides an interesting alternative to dealing with a sufficiently large class of problems about infinite morphic words. We can imagine automated certificates for several families of combinatorial properties. But the price to pay is that we would have to deal with fairly large automata. It is a field of research where combinatorists and computer scientists can work together fruitfully: on the one hand, it is well-known that, in the worst-case, the obtained decision procedures can be super-exponential, but on the other hand, the considered problems about words seem to be of relatively small complexity.
1.3. How to read this book
The goal is that, after reading this book (or at least parts of this book), the reader should be able to fruitfully attend a conference or a seminar in the field. I hope that the many examples presented along the text will help the reader to get some feeling about the presented topics even though we are not going too far in the technical aspects of the proofs. Also, prerequisites are minimal. We will not explore topics requiring measure theory or advanced linear algebra (we have avoided results related to Jordan normal form of matrices) or non-elementary number theory. Two sections are devoted to results in algebraic number theory and formal series. Sections 1.1.2 and 1.2.2 serve as references that the reader may consult when needed. Sections 3.1 and 3.2 of Volume 2 give a self-contained presentation of the concepts of mathematical logic needed in this book. Those rigorous and technical sections should not discourage the reader to pursue his/her study. Most of the material can be accessed without much background.
My initial aim was to quickly get to the point but it seems that the stories I wanted to tell were indeed quite longer than I initially thought. I have to confess that writing this book was a quite unexpected adventure (I was perpetually trying to meet the deadlines and also dealing with my other duties at the University and at home).
There are several paths that the reader can follow through this book. Some are quite long, some are shorter.
– For a basic introduction, I propose reading parts of Chapter 1 (skipping the reference sections), Chapter 2 up to and including section 2.4. If the reader already has some knowledge about automata, then we can conclude with Chapter 3, Volume 2, concentrating on results about integer base systems.
– For a one-semester course in combinatorics on words, I propose a reading of the first three chapters, not sacrificing the rigorous presentation of section 1.2.1.
– For a numeration system oriented reading, again organized over one semester: browse through the first chapter (with a careful reading of the examples related to numeration systems), then go to section 2.3 and conclude with the last two chapters of Volume 2.
– For a course oriented toward interaction between automata, logic and numeration systems, we can focus on Chapters 1 and 3 of Volume 2.
About other sources treating similar subjects, an excellent companion for this book is definitely Automatic Sequences: Theory, Applications, Generalizations [ALL 03a] written by Allouche and Shallit. I do hope that the two books can be read independently and can benefit from each other. There is also a non-zero intersection with several chapters of the Lothaire’s book Algebraic Combinatorics on Words (namely those about Sturmian words written by Berstel and Séébold and the one on numeration systems written by Frougny) [LOT 02]. Some chapters of the volume Combinatorics, Automata and Number Theory [BER 10] as well as [PYT 02] can also serve as a follow up for the present book. In particular, Cassaigne and Nicolas’s chapter on factor complexity is a natural continuation for our Chapter 2. I should finish by mentioning two papers that were very...
| Erscheint lt. Verlag | 10.9.2014 |
|---|---|
| Reihe/Serie | ISTE | ISTE |
| Sprache | englisch |
| Themenwelt | Mathematik / Informatik ► Informatik ► Programmiersprachen / -werkzeuge |
| Mathematik / Informatik ► Informatik ► Theorie / Studium | |
| Technik ► Elektrotechnik / Energietechnik | |
| Schlagworte | Automata • cellular • Computer Architecture • Computerarchitektur • Computer Science • cyclic • Factor • Factors • First Encounter • Formale Sprache • infinite words • Informatik • Introduction • Irreducible • Ix • matrices • Matrix • Morphism • Morphisms • Number • Proof • Properties • Scratch • Sequence • Structures • symbolic dynamical |
| ISBN-10 | 1-119-00822-0 / 1119008220 |
| ISBN-13 | 978-1-119-00822-4 / 9781119008224 |
| Informationen gemäß Produktsicherheitsverordnung (GPSR) | |
| Haben Sie eine Frage zum Produkt? |
Kopierschutz: Adobe-DRM
Adobe-DRM ist ein Kopierschutz, der das eBook vor Mißbrauch schützen soll. Dabei wird das eBook bereits beim Download auf Ihre persönliche Adobe-ID autorisiert. Lesen können Sie das eBook dann nur auf den Geräten, welche ebenfalls auf Ihre Adobe-ID registriert sind.
Details zum Adobe-DRM
Dateiformat: EPUB (Electronic Publication)
EPUB ist ein offener Standard für eBooks und eignet sich besonders zur Darstellung von Belletristik und Sachbüchern. Der Fließtext wird dynamisch an die Display- und Schriftgröße angepasst. Auch für mobile Lesegeräte ist EPUB daher gut geeignet.
Systemvoraussetzungen:
PC/Mac: Mit einem PC oder Mac können Sie dieses eBook lesen. Sie benötigen eine
eReader: Dieses eBook kann mit (fast) allen eBook-Readern gelesen werden. Mit dem amazon-Kindle ist es aber nicht kompatibel.
Smartphone/Tablet: Egal ob Apple oder Android, dieses eBook können Sie lesen. Sie benötigen eine
Geräteliste und zusätzliche Hinweise
Buying eBooks from abroad
For tax law reasons we can sell eBooks just within Germany and Switzerland. Regrettably we cannot fulfill eBook-orders from other countries.
aus dem Bereich