On September 30, Crystal Akers successfully defended her doctoral dissertation, “Commitment-Based Learning of Hidden Linguistic Structures”; abstract below. Congratulations, Crystal!
Commitment-Based Learning of Hidden Linguistic Structures
Learners must simultaneously learn a grammar and a lexicon from observed forms, yet some structures that the grammar and lexicon reference are unobservable in the acoustic signal. Moreover, these “hidden” structures interact: the grammar maps a given underlying form to a particular interpretation of an overt form. Learning one structure depends on learning the structures it interacts with, but if the learner knows one, its interactions can be exploited to learn the others. The Commitment-Based Learner (CBL) employs this strategy, using error-driven learning (Gold 1967, Wexler and Culicover 1980) and inconsistency detection (Tesar 1997) to determine when to make commitments and what kinds of commitments to make.
The CBL overcomes structural ambiguity by extending branches from a hypothesis and committing to a separate structural interpretation in each branch, as in the Inconsistency Detection Learner (Tesar 2000). It resolves lexical ambiguity by committing to a feature value only when certain of that value, following the Output-Driven Learner (Tesar, to appear). Each hypothesis branch has its own lexicon whose values reflect the interactions of underlying forms with the branch’s structural commitments.
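The branching strategy described above can be illustrated with a small toy sketch. This is not the dissertation's implementation. It assumes, purely for illustration, that each interpretation of an observed form contributes a set of pairwise ranking commitments, and that a branch is inconsistent exactly when those commitments form a cycle. The constraint names `A`, `B`, `C` and the functions `is_consistent` and `learn` are hypothetical, and the lexical side of the learner is left out entirely.

```python
def is_consistent(pairs):
    """Return True if the (higher, lower) ranking pairs contain no cycle.

    Toy stand-in for inconsistency detection: a set of pairwise
    commitments is consistent iff some total ranking satisfies them all.
    """
    graph = {}
    for higher, lower in pairs:
        graph.setdefault(higher, set()).add(lower)
    visiting, done = set(), set()

    def has_cycle(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node already on the current path
        visiting.add(node)
        for nxt in graph.get(node, ()):
            if has_cycle(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return not any(has_cycle(n) for n in list(graph))


def learn(data):
    """Branch on every interpretation of each datum; drop inconsistent branches.

    data: list of observations. Each observation is a list of candidate
    interpretations, and each interpretation is a set of (higher, lower)
    ranking pairs (a hypothetical encoding chosen for this sketch).
    """
    branches = [frozenset()]  # start from one hypothesis with no commitments
    for interpretations in data:
        extended = []
        for branch in branches:
            for interp in interpretations:
                candidate = branch | frozenset(interp)
                if is_consistent(candidate):
                    extended.append(candidate)
        branches = list(dict.fromkeys(extended))  # de-duplicate, keep order
    return branches


# Hypothetical data: the first and third observations are structurally ambiguous.
data = [
    [{("A", "B")}, {("B", "A")}],
    [{("B", "C")}],
    [{("C", "A")}, {("A", "C")}],
]
surviving = learn(data)
```

In this toy run, the branch that would combine `A` over `B`, `B` over `C`, and `C` over `A` is discarded as soon as the cycle appears, while the remaining branches carry their own commitments forward.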
In computer simulations, the CBL learns all 97 languages in a constructed typology whose linguistic system includes 370 million grammar and lexicon combinations. For each language learned, the CBL takes far fewer steps than an exhaustive search for a consistent and restrictive combination would require.
The dissertation also uncovers a previously unrecognized relationship: paradigmatic equality. Paradigmatic equals (PEs) have different maps, but because their morpheme behaviors are identical, their learning data is equivalent. Therefore, the learner cannot completely learn either PE unless it receives additional information. The CBL proposes resolving this persistent uncertainty by committing to an input-output mapping which is consistent with the language hypothesis yet yields an error on the current ranking. In the system investigated, there are always two such mappings, each a member of one of the PEs. Committing to a mapping adds new ranking information that allows the learner to derive the hypothesis consistent with the PE that includes the mapping. The learner can learn both PEs by extending branches and separately committing to each mapping.