By Rens Bod

Over the last few years, a new approach to linguistic analysis has begun to emerge. This approach, which has come to be known under various labels such as 'data-oriented parsing', 'corpus-based interpretation' and 'treebank grammar', assumes that human language comprehension and production work with representations of concrete past language experiences rather than with abstract grammatical rules. It operates by decomposing the given representations into fragments and recomposing those pieces to analyze (infinitely many) new utterances. This book shows how this general approach can be applied to various kinds of linguistic representations. Experiments with this approach suggest that the productive units of natural language cannot be defined by a minimal set of rules or principles, but need to be defined by a large, redundant set of previously experienced structures. Bod argues that this outcome has important consequences for linguistic theory, leading to an entirely new view of the nature of linguistic competence.

**Additional resources for Beyond grammar: an experience-based theory of language**

**Sample text**

It is easy to see that for every derivation in G there is a unique derivation in G' with the same probability. Thus, the sum of the probabilities of all derivations of a string in G is equal to the sum of the probabilities of all derivations of this string in G'. This means that G and G' assign the same probability to every string in their string language. Thus, G and G' are weakly stochastically equivalent. □

From Propositions 1 and 2 the following corollary can be deduced.

Corollary 1. The set of stochastic string languages generated by STSGs is equal to the set of stochastic string languages generated by SCFGs.

A parse whose yield is equal to string W is called a parse of W. The probability of a parse is defined as the sum of the probabilities of its distinct derivations. A word string generated by an STSG G is an element of $V_T^+$ such that there is a parse generated by G whose yield is equal to the word string. For convenience we will often use the term string instead of word string. The set of strings, or string language, generated by G is given by

$$\mathrm{Strings}(G) = \{\, W \mid \exists T : T \in \mathrm{Parses}(G) \wedge W = \mathrm{frontier}(T) \,\}$$

The probability of a string is defined as the sum of the probabilities of its distinct parses.
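The two summations just defined can be illustrated with a minimal Python sketch. It is not taken from the book: the toy trees, the fragment probabilities, and the function names are all hypothetical, and the sketch assumes (as is usual for STSGs, though not stated in this excerpt) that the probability of a derivation is the product of the probabilities of the fragments it combines.

```python
from collections import defaultdict
from math import prod

# Trees are nested tuples: (label, child, child, ...); a leaf is a plain string.
def frontier(tree):
    """Return the yield of a tree: its terminal leaves, left to right."""
    if isinstance(tree, str):
        return (tree,)
    _label, *children = tree
    words = ()
    for child in children:
        words += frontier(child)
    return words

# Hypothetical toy data. Each derivation is a pair:
# (the parse it produces, the probabilities of the fragments it combines).
T1 = ("S", ("NP", "she"), ("VP", "runs"))
T2 = ("S", ("X", "she"), ("Y", "runs"))  # an invented second parse of the same string
DERIVATIONS = [
    (T1, [0.2]),            # T1 built from a single large fragment
    (T1, [0.4, 0.5, 0.5]),  # T1 built from three smaller fragments
    (T2, [0.1]),
]

def parse_probabilities(derivations):
    """P(parse) = sum of the probabilities of its distinct derivations."""
    probs = defaultdict(float)
    for parse, fragment_probs in derivations:
        # Assumption: a derivation's probability is the product of its fragments' probabilities.
        probs[parse] += prod(fragment_probs)
    return dict(probs)

def string_probabilities(parse_probs):
    """P(string) = sum of the probabilities of its distinct parses."""
    probs = defaultdict(float)
    for parse, p in parse_probs.items():
        probs[frontier(parse)] += p
    return dict(probs)
```

Calling `parse_probabilities(DERIVATIONS)` and passing the result to `string_probabilities` aggregates first over derivations and then over parses, mirroring the order of the definitions above.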

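However, as an example of how Formal Stochastic Language Theory may be used to formally articulate this, we will compare SCFG and STSG in the context of this theory.

Definition (Stochastic Context-Free Grammar). A Stochastic Context-Free Grammar G is a 5-tuple $\langle V_N, V_T, S, R, P \rangle$ where: $V_N$ is a finite set of nonterminal symbols. $V_T$ is a finite set of terminal symbols. $S \in V_N$ is the distinguished symbol. $R$ is a finite set of productions, each of the form $\alpha \rightarrow \beta$, where $\alpha \in V_N$ and $\beta \in (V_N \cup V_T)^+$. $P$ is a function which assigns to every production $\alpha \rightarrow \beta \in R$ a probability $P(\alpha \rightarrow \beta)$, for which it holds that $0 < P(\alpha \rightarrow \beta) \leq 1$ and $\sum_x P(\alpha \rightarrow x) = 1$.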
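As a rough illustration of this definition, the following Python sketch stores the five components and checks the constraints on the productions and their probabilities. The class name, field names, and toy grammars are hypothetical and not from the book. The rule dictionary plays the role of both R (its keys) and P (its values).

```python
from collections import defaultdict
from dataclasses import dataclass
from math import isclose

@dataclass
class SCFG:
    nonterminals: frozenset  # V_N
    terminals: frozenset     # V_T
    start: str               # S, the distinguished symbol
    rules: dict              # (lhs, rhs tuple) -> probability; keys act as R, values as P

    def is_proper(self):
        """Check the constraints of the definition on S, R and P."""
        if self.start not in self.nonterminals:
            return False
        symbols = self.nonterminals | self.terminals
        totals = defaultdict(float)
        for (lhs, rhs), p in self.rules.items():
            if lhs not in self.nonterminals:
                return False  # the left-hand side must be in V_N
            if len(rhs) == 0 or any(s not in symbols for s in rhs):
                return False  # the right-hand side must be in (V_N ∪ V_T)+
            if not 0 < p <= 1:
                return False
            totals[lhs] += p
        # Probabilities of productions sharing a left-hand side must sum to 1.
        # Nonterminals without any production are not checked here.
        return all(isclose(t, 1.0) for t in totals.values())

# Hypothetical toy grammar satisfying the definition.
toy = SCFG(
    nonterminals=frozenset({"S", "NP", "VP"}),
    terminals=frozenset({"she", "he", "runs"}),
    start="S",
    rules={
        ("S", ("NP", "VP")): 1.0,
        ("NP", ("she",)): 0.5,
        ("NP", ("he",)): 0.5,
        ("VP", ("runs",)): 1.0,
    },
)

# The same grammar with one NP rule dropped: NP's probabilities now sum to 0.5.
broken = SCFG(
    nonterminals=toy.nonterminals,
    terminals=toy.terminals,
    start="S",
    rules={k: v for k, v in toy.rules.items() if k != ("NP", ("he",))},
)
```

Here `toy.is_proper()` holds, while `broken.is_proper()` fails because the productions for `NP` no longer sum to 1.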