Noam Chomsky, Fundamental Issues in Linguistics (April 2019 at MIT) - Lecture 2

Well, last time I talked about a number of things and got up to the point of beginning to discuss the problems that exist with the concept Merge, which was developed back in the '90s and has been used in many ways since. There's a simplest version of Merge, which was the original intention, which just had the two special cases, external and internal Merge. As I mentioned last time, the more primitive of the two is actually internal Merge, but because language has exocentric constructions, that can't suffice. I mentioned some of the things that you can explain on the basis of Merge, and I also wanted to make the point that a genuine explanation in linguistics, if we're viewing the study of language as part of the study of nature, basically the biolinguistics program, which I think has roots back to the 17th century, as I mentioned last time, although you have to skip the structuralist/behaviorist period, a genuine explanation will always have to meet the austere conditions of learnability and evolvability, which are very hard to meet anywhere in biology. In particular here, there's some reason to think that the conditions might be attainable, because of the specific conditions of human evolution which I mentioned briefly last time. If that picture is correct, there's some antecedent reason to believe that there might be success in the enterprise, which is rare in the biological sciences.

Well, the concept Merge does happen to meet those conditions. It meets the condition of learnability because there's nothing to learn. It meets the condition of evolvability because, since the basic principle does in fact exist, there had to be something to evolve the computational procedures that yielded it, and it would obviously be at least the simplest one. So we can be secure with explanations based on the concept Merge, but anything else is problematic. It's a very austere condition, but it's one that really has to be met.

Well, I then talked about some of the examples where you can get an explanation, some interesting cases, but there are problems. The problems are that the concept was very loosely defined, and all sorts of other applications and implementations have been given which more or less fall within the original loose definitions but are, I think, probably illegitimate. I'll talk about that today, and I think if we think through the matter carefully, we end up allowing just what was originally intended and none of the extensions, for good reasons; which leaves us with many problems, some of which have, I think, potential solutions, while others look quite mysterious.

Well, I mentioned last time something which is a kind of paradigm for many cases: the simplest case, where you have only two elements. First of all, since we do have exocentric constructions, it's going to be necessary for the operation Merge to actually operate on a workspace, not on elements, because you're always changing the workspace every time you apply Merge. So we have some sort of definition, call it capital MERGE, which takes the two things that you want to merge and the workspace that exists, and forms the new workspace, which will include at least these, and then other things. I suggested a notation: I'll use square brackets for the workspace and curly brackets for the syntactic objects. There's a crucial difference between them. The workspace is a set, but it's not an accessible object for operations, so we'll distinguish them by that notation, and the notation actually means something. So, for example, suppose the workspace consists of just X; we want to distinguish that from X itself. The singleton set is different from its member, because the workspace is not a syntactic object and X is. For syntactic objects, on the other hand, we want the opposite convention: the singleton set is the individual element. And there are good empirical reasons for this, which go
back to phrase structure grammar. In phrase structure grammar, just by convention, you didn't allow rules like, say, NP → N; that was assumed not to be a reasonable rule, because it's essentially saying that a singleton set is identical with its member. Now, this was often fudged in the use of phrase structure grammar, so rules like this were in fact allowed. When you move from phrase structure grammar, which is totally unacceptable for language for myriad reasons, as was recognized since the fifties, when you move from that to X-bar theory, then this becomes meaningless, because there's no VP if it's only V. Despite the fact that it's meaningless, it is used. So, for example, if you try to implement Richard Kayne's LCA, try to get it to work, you're forced to have rules like this, which is a serious problem in the LCA system: I think you have to argue that if you have a verb–object structure, the object, even if it's a pronoun, is still complex, otherwise you don't get the right ordering. But that's technically illegitimate in an X-bar-theoretic structure, and here the analog of that illegitimacy is this convention. So we want to accept this convention for syntactic objects, which has a number of consequences, and the other convention for workspaces, which are different kinds of things, though they're all sets.

Well, in the simplest case we just have a workspace consisting of these two guys, and we merge them. Here is a workspace which contains the set {A, B} which we've merged, and the question is: what else? Now, if it was normal recursion, like set theory, you would have here A and B as well, but you can't have that for language, for reasons which I mentioned last time, and this is a kind of paradigm that applies to a great many cases of the extensions of Merge. The reason it doesn't work is that this can be built up to an object of arbitrary complexity, since it's accessible, and then merged; that gives you a relation between the thing that's merged up here and the thing down here which violates every imaginable condition of movement. So that's illegitimate, and we do not want legitimate operations which yield illegitimate conclusions; that's elementary. So we conclude, surprisingly, that these things aren't here. Recursion for language is different from general recursion: namely, it has the property of restricting resources. What computation for language, and presumably for organisms generally, is doing is trying to keep the resources as limited as possible. You have to get something new, or you don't generate anything, but you want to generate as few new things as possible.

This really turns into a number of subcases. One subcase is limiting accessibility. Accessibility means: something is accessible if the Merge operation can see it and do something to it. We want to limit accessibility; if we allowed general recursion, we'd have too much accessibility here, so we want to limit it to the minimal amount. It's tempting, as I mentioned last time, to try to relate this to a more general property of brain computation: namely, the brain is pretty dumb and slow, so what it does is throw out tons of data that are coming in. In fact, its main activity is to get rid of lots of the stuff that's coming in. So in the visual system, the sensory part is essentially perfect; you get a cell responding to a photon of light, and you can't do better than that. But that's pouring tons of information into the brain that would totally overwhelm computation, so the sensory system throws out almost everything, getting down to just a limited part. The same is true in language acquisition: the phonetic system is throwing out just about all the noise that comes in, picking only very limited kinds of phonetic properties, and even those are being thrown out very quickly in early language acquisition; that's the main part of language acquisition. Charles Yang's model of general language acquisition kind of exploits this
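The picture just sketched, MERGE acting on a workspace and keeping accessibility to a minimum, can be rendered as a toy program. This is my own illustrative sketch, not a formalization from the lecture; the names `MERGE` and `terms` and the encoding (Python lists standing in for the square-bracket workspaces, frozensets for the curly-bracket syntactic objects) are all my assumptions.

```python
def terms(obj):
    """Everything accessible in a syntactic object: the object plus all its parts."""
    result = {obj}
    if isinstance(obj, frozenset):
        for part in obj:
            result |= terms(part)
    return result

def MERGE(P, Q, WS):
    """Form {P, Q} from material accessible in workspace WS.

    On the 'replace' reading, P and Q do not survive as independent
    workspace members, so exactly one new accessible object appears.
    """
    accessible = set()
    for x in WS:
        accessible |= terms(x)
    assert P in accessible and Q in accessible, "Merge only sees accessible items"
    new = frozenset([P, Q])
    # Carry over every workspace member distinct from P and Q unchanged
    # (no tampering), adding only the newly formed object.
    return [new] + [y for y in WS if y not in (P, Q)]

# External merge: two independent workspace members.
ws1 = MERGE("a", "b", ["a", "b"])   # new workspace holds only {a, b}

# Internal merge: Q is drawn from inside P. "a" now occurs twice (two
# copies), but no extra independent workspace member is left behind.
xp = frozenset(["a", "b"])
ws2 = MERGE(xp, "a", [xp])          # new workspace holds only {{a, b}, a}
```

On this replace reading, external merge leaves no leftover accessible copies, while internal merge leaves the lower copy inside the merged object, which is the copy/repetition distinction the lecture returns to later.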
generally. Notice, incidentally, speaking of Yang, that if you have this property, you get determinacy. It turns out, if you think it through, that when you limit the resources available, you also force determinacy, meaning the operation will be uniquely determined by what it's looking at. That's not a trivial property. It wasn't true, for example, of standard versions of phrase structure grammar: in a phrase structure grammar you could reach a point where you had something containing two NPs and a rule expanding NP, and which one you apply it to is indeterminate. But if you have this much narrower system with resource restriction, you get determinacy, and that's important, because Charles's work on the price of productivity, very important work, depends crucially on the assumption that the operations are determinate. So we want those rich consequences; we want that property. It's one of the few examples I know of work in computational and statistical linguistics that has real consequences, very rich consequences, very important work. So resource restriction makes sense as a property and has a lot of interesting consequences; lots follow from it. Now, resource restriction is going to have two components, sort of: one of them is restrict computation, the other is restrict resources. Restrict computation means: limit yourself to the minimal kinds of computation that are possible. Well, Merge is one case; it's the least possible computational operation. But it also ought to operate in the most limited way, so one of the consequences would be: keep to minimal search, don't use deep search. Minimal search already works, with many empirical consequences. Notice that limiting accessibility already includes things like the general conditions that limit accessibility. Minimal search is another: in movement, when you move to the next stage, you don't look down; you only take the first thing that you find. That raises many interesting questions about possible ambiguity: are
there ambiguous cases? I'll come back to that quite interesting question later on. But those are the things that are in the background.

Now, if you look at the original definition of Merge, back in 1995, it actually had this property, inadvertently; it wasn't particularly noticed, but the operation of Merge as it was defined was defined basically as replace, which says: don't keep what you already had, get rid of it. That was not particularly noticed, but it's a property of the original definition, and there are good reasons for it; we can now see good reasons for it. If you don't accept it, you do get legitimate operations which yield illegitimate conclusions, the clearest sign that something's radically wrong, and this is a paradigm case, but it extends to many others; I'll go into that.

Well, take something that I presume no one's ever proposed. I'll draw trees, but let me make it clear that trees are very misleading notations; one should beware of them. For one thing, a tree notation suggests that the root exists. But why does the root exist, if you just have Merge operations? In fact, what the root is understood to be is something that comes from some other source, namely projectability, but that should be a completely separate property. Projection seems to have nothing to do with the compositional operations; the tree notation kind of sticks them all together and misses many questions about what projectability is, a particularly interesting case being exocentric constructions: what's projected there? That has interesting consequences; the theory of labeling deals with it. I assume you're familiar with it, or put it aside. The other thing that tree notations allow you to do is draw lines in all kinds of complicated ways, draw a line from here to here. That seems to mean something in a tree; it means absolutely nothing in a Merge-based system, and you really have to be careful about that. So I'll use trees for exposition, but with the condition that you don't take them seriously.

Well, let's take something that I presume nobody's ever suggested. Suppose you have a structure like this, and you decide to merge these. The original definition doesn't say you can't. That would mean you're forming the set {XP, YP}, but it has nothing to do with the original set; it's just something added on. Notice that it has exactly the properties that are barred here: it adds accessibility, it's adding new accessible items, which will then be subject to exactly the problem I already mentioned, namely that this object can be made as complex as you like; you could then merge one of these guys, and you violate all conditions. So we're not allowed to do this. As far as I know, nobody ever proposed this, but there's a good reason why you can't do it. However, there are things that people have proposed, and I think they're all ruled out for the same reason. I'll kind of leave it to you as an exercise to work out why, but if you think about things like parallel Merge, which is usually written in tree notation, it seems to be saying something, but it's not, because there's no way to construct it. If you think about what parallel Merge is doing, it's increasing accessibility, and it runs into this very same problem. Parallel Merge is the basis for a lot of work in the literature which yields multidimensionality. The idea of multidimensionality goes back to the '70s, work by Jim McCawley and others, but if you try to reconstruct it in a Merge-based system, you can draw the trees with funny lines like this, but the way of constructing it is through parallel Merge, which has this lethal property that it yields illegitimate consequences from legitimate operations. So if you look at the handbooks of contemporary syntax, there's a chapter on multidimensionality, which has many interesting consequences about ATB (across-the-board movement), parasitic gaps, and so on, but none of them works, because they're all based on an illegitimate operation
which has this deficiency. The same is true of sideward Merge; it has the same problem. The same is true, in spades this time, of late Merge, which is widely used; I've used it a number of times, many others have. Late Merge, first of all, has this problem of creating a new object. When you draw a tree, it looks as if you can do it: you just add a line to the tree and you've got late Merge. But if you try to spell it out in terms of the Merge operation, you're forced first to create a new object, which is bad enough, because that's already illegitimate, but then you're adding a new operation, a substitution operation, which inserts what you've just created at just the right point inside the tree. That's a pretty tricky operation; try to formulate it. It's a new, complex operation. So late Merge has a double problem: one, the problem of not restricting accessibility; second, the problem of invoking a new operation, which is a really unfortunate thing about it, way out of the framework of anything we're talking about. Now, there are many very interesting results that follow from late Merge, very much like multidimensionality, but I think the way to look at those results is as problems, problems that have been constructed in an interesting way. So we have organized data instead of chaotic data, which is a step forward, but it's only a step toward eventual explanation in terms of something that meets the austere conditions that we're interested in, and that sets interesting empirical problems to address; I think there are some answers in some cases. [Aside about the lecture-hall lights.]

Well, there's a lot that follows from all of this. I'm kind of leaving it as an exercise to think through, but if you do, what you'll find is that all of these applications and extensions of Merge, including the kind that nobody's ever thought of, including others that have been used widely, all have the same problem. They all have exactly the problem that you see with the simplest case, and the problem, again crucially, is that they are constructing what are alleged to be legitimate operations, but when you apply them, you get illegitimate conclusions, and that is the sign that there's something seriously wrong; obviously that can't be so. So all of that, the entire literature, a big literature that yields a very interesting array of results, is not explanation; it's a posing of problems. Posing problems that are interesting and important is a big step forward; it's useful to have organization of data instead of just chaotic data, but that's not explanation, and explanation is the goal of linguistics, at least as a science.

Well, all of this suggests a kind of research program. First, determine which subclass of the loosely characterized operations of Merge is in fact legitimate. That's a research problem, and if you think it through case by case (again, I leave this as an exercise), I think what you end up finding is that the originally intended ones, not defined properly, but the originally intended ones, are probably the only ones; the rest of the extensions are not legitimate: interesting consequences, but not legitimate. The next problem is to formulate Merge so it gives you just the right ones, and then to try to explain why that definition of Merge is the kind that should be reached on general considerations, general conditions that any linguistic operation ought to meet; it should be deducible from them, along with third-factor properties, minimal computation, minimal resources. Well, I won't run through the cases, but we get something that looks like this, and we have to put various conditions on X1 through Xn. So what are the conditions? I won't bother spelling them out formally; I'll just say them intuitively. First, don't lose anything in WS. Spelled out, it means: if Y is in WS, and Y is distinct from P and
Q, it's got to be among the X's. So you don't lose anything; we don't want something to just disappear in the course of the operation. The second condition: limit accessibility, in fact limit it to one. It has to be at least one, because you're constructing a new object, otherwise you're not doing anything, but don't do anything beyond that: no new accessibility should be permitted by MERGE. And a third condition: don't throw in some new junk that has nothing to do with the operation.

Well, what we want to do, of course, is get rid of these conditions; we want a definition of MERGE which has no conditions. But notice that we basically already have that. The first one follows from the no-tampering condition. The no-tampering condition has to be revised now so that it applies not to syntactic objects but to the workspace, because all the operations are on the workspace. The no-tampering condition says: if something's in the workspace, don't change it. Well, the most extreme form of changing is to delete it, so you can't delete it. So on any reasonable interpretation of the general condition NTC, which is part of the strong minimalist thesis, it follows that you can't lose anything; that one we've gotten rid of. The second follows from resource restriction, which is a special, crucial property of organic computation, it seems, at least for language, but probably quite generally, probably related to the general brain activity of massively reducing the data available for computation. So we've gotten rid of that one. And this one implies the third: if you add any more junk, it will increase accessibility automatically, so we don't need that condition either. So we can get rid of all of these; they all follow from plausible, in fact necessary, conditions on general computational procedures for an organic object. So that gives us the best possible definition of MERGE, on principled grounds, and if you think it through, it restricts it just to the original intention, which was never captured by the actual formulations but was kind of in mind. It turns out that what was in mind was actually correct, and all of the extensions have to be barred.

Now, I should say one word about one of the general conditions, a kind of meta-condition: descriptive adequacy. We want the operations, of course, to be descriptively adequate, but that's not an innocent notion. You don't know from data whether they're the right data. Descriptive adequacy, and this is not just linguistics, it's all through rational inquiry, all through science, is a theory-determined notion; it's not innocent. You get a lot of data in, say, chemistry; you don't know: is this real data or not? There are two kinds of problems the data could have. One, it might involve too many variables, lots of other factors that you're not interested in. In fact, if you just look at the phenomena of the world, they're just worthless; too many things are going on. So you don't develop physics on the basis of mere observation of the phenomena of the world. If you're in Silicon Valley, that's the way you do it, but I'm talking about science, not Silicon Valley. So what you do is try to get rid of the data that don't really matter, that don't have to do with what you're interested in; that's a theory-internal notion. The other problem is that the phenomena around you usually don't include the relevant data; they don't include the critical experiments, the kind that matter. These are all problems that were faced in the early days of the Scientific Revolution, and they were sort of settled for the sciences, but linguistics and the soft sciences haven't really internalized them. If you go back to, say, the 17th century and the Galilean effort to rebuild science on firm grounds, throwing out neo-scholastic occult properties and so on, what was important... he had a hard time convincing the funders, the aristocrats, not the National
Science Foundation, that there was some point in this. It was very hard for the aristocrats to see why you should care about what happens when a ball rolls down a frictionless plane, which can't happen, and not about leaves blowing around in the wind, which you see all the time. That's a big move, actually, and if you think about the problem, it's not trivial. Why is the rate of fall independent of mass? If Galileo had done experiments, they wouldn't have worked; too many other things would have happened. So what he did was just a thought experiment, a neat thought experiment. Suppose you have two objects which are absolutely identical, and suppose they fall; obviously they'll fall at the same rate. Suppose you bring them a little bit closer together; that's not going to make any difference, they'll still fall at the same rate. Now suppose you bring them so close together that they actually touch at a point. Well, that can't change anything, but now you have double the mass. So we've proved the theorem without an experiment. Most of Galileo's experiments, if you run through the Dialogue and so on, are really like this. For example, another problem that was bothersome: if you have a sailboat sailing through the ocean and you drop something from the mast, where is it going to fall? Aristotelian physics says it will fall behind: the sailboat is moving forward, so of course it will fall behind where you dropped it from. Galileo wanted to argue that it's going to fall at the foot of the mast. If he had done experiments... I leave it to your imagination to see what you would find: junk. So you don't just do experiments; you do critical experiments, often just thought experiments. And for linguistics, that happens all the time. You read the literature these days, linguistics papers, cognitive science papers, stuff coming out of Silicon Valley, out of Google: one of the great achievements heralded is being able to parse 95% of the sentences that you find in the Wall Street Journal. Suppose you could parse a hundred percent of the sentences and get the right result with training; it would mean absolutely nothing. A sentence in the Wall Street Journal is just an experiment: is this an acceptable sentence or not? You don't care if you can match a hundred percent of random experiments; that's of no interest. First of all, a lot of the experiments have the wrong data, too many variables; and the other thing is that they don't include the critical experiments, the kind you're interested in. Can you get parasitic gaps, for example? Can you get garden-path sentences? Well, it turns out that when you look at the critical experiments, they fail almost totally. They get maybe 95 percent of the data, but that's a result of absolutely no interest, and a lot of the field is sort of going off in that direction; even in the linguistics literature you find it. Anyhow, without going on, this concept is not a trivial concept, not an innocent concept; a lot depends on, and follows from, trying to understand what descriptive adequacy means as a theory-internal notion. Anyway, we certainly want to achieve the level of understanding that was reached in the 17th century in the sciences; I think that's a fair goal, to understand that there's something significant, serious, and theory-internal about what we call descriptive adequacy.

There are other conditions that have to be satisfied, one of them so elementary that we just take it for granted; call it stability. By that I mean that in the course of a derivation, a syntactic object can't change its interpretation. So, for example, if you topicalize, say, "Mary's book, I want to read," in the internal system, the non-externalized system, there are going to be two occurrences of "Mary's book," and these two objects have to have the same interpretation. You can't be saying "I want to read the book that Mary owns" but I'm
talking about the book that she bought, let's say. That's sort of taken for granted. The same goes for ellipsis: if you say "I read Mary's book and so did Bill," what Bill read is the same book, with the same Mary and the same ownership relation in both cases. So you have to have a general principle telling you that nowhere in a derivation can you change the interpretation of an expression. It doesn't matter for right now how you express this fact, but it's got to be somewhere deep inside the theory, and it has a lot of consequences; we'll come back to that.

Notice that at this point we're getting into a very interesting area, the area where we have to distinguish what are called copies and repetitions. Here, the two occurrences of "Mary's book" are copies of each other. The term "copy" is a little misleading; you have to recognize that the relation is symmetrical. These are basically the same entity; they have precisely the same interpretation, for ellipsis, for any operation. But two occurrences could instead be repetitions: if I say "John saw John," then these two are repetitions. If you look at the generation of the sentence, you had the same formal object, but the two occurrences were generated independently and have nothing to do with each other; this one might just as well have been "Bill." That distinction between copy and repetition is a tricky one. There's an interesting paper by Chris Collins and Erich Groat, which went to LingBuzz (I don't think they published it), which goes through a lot of problems in trying to distinguish copies from repetitions. I think we can cut through all those problems in a nontrivial way, and again I'm going to leave it as an exercise for you to work out, but it requires appealing to a very general principle which seems overwhelmingly true. It looks false in some cases, but in that kind of situation it's reasonable to assume that if we understood enough, it would always be true. It comes down to saying that argument structure, theta theory in particular, is determined by pure compositionality. In fact, the strongest conceptual reason, I think, for Dominique Sportiche's predicate-internal subject hypothesis is that the subject gets a theta role; Dominique and Hilda Koopman have a lot of other arguments for it, but the basic conceptual argument, I think, is that it's a theta role, so it ought to be determined there, and in fact anything that gets a theta role ought to be. What about things that are outside argument structure? That includes functional categories: they have an argument structure of a sort, but they're going to be determined separately. Internal Merge always yields things which have no independent interpretation; internal Merge is determining aspects of semantics which have to do with discourse, with information structure, not argument structure. That looks like a very sharp distinction, the duality of semantics. It then follows that at the phase level, when interpretation is trying to determine what's a copy and what's a repetition, it has a simple criterion to stand on: if something's in a non-theta position, it's a copy; if it's in a theta position, it's a repetition. That cuts very sharply. Now, it does leave possible cases of ambiguity; if you think through the possible cases, there are some that seem not to be determined by this. But here I'll just give you a thesis and ask you to prove it in your spare time. It turns out, I think, that there is a kind of conspiracy of other principles that resolves the ambiguities. One thing that seems to matter is agreement, which makes distinctions that you don't always see in the morphology. Another is Rizzi's left-periphery theory, which assumes that there are actual positions, like topic, focus, and so on, that a raised element moves to. And third, connected with this, labeling theory, which tells you when those movements are legitimate, when they give you a real interpretation, when you have to move on and when you don't. I think all of these together probably solve the
ambiguity problems that all these artists another exercise very interesting you might think through the one of the tricky cases which is not easy to deal with there's small clauses so you might want to think about that so it has interesting consequences when you try to think how that would work for this problem but if it we can solve it this way then we can solve the copy repetition problem simply by looking at we'll say that every merge operation yields a copy nothing else yields a copy in the case of external merge the copies that are yielded disappear on the replace interpretation of merge so you don't have to see them with internal merge they remain and then this page level algorithm based on relative semantics along with the conspiracy that language is kind enough to provide us with should resolve the ambiguities of interpretation that's the details you know the two trees you put down at the start now you copy either see the way people draw it is like this we're not allowed trees what you're actually doing is forming a new object E so now you have two objects a B now you take this one you make it arbitrarily complex anything you want we're allowed to merge this this is accessible remember okay it's a copy of that so you can keep merging it again you merge it to this you now have a copy relation here but it violates every condition it's back to the initial case all the cases follow from the simple case so you're so parallel emerges out boxes not adding accessible things you're adding if you do this you're adding this thing which is accessible and you're also adding this which is accessible in fact this so you're adding three accessible things to the workspace I did this one these are old new remember they're not of the ones you started with they're new objects and they're now accessible okay this one personally because this is the one you've moved and now you have two copies of it both of them accessible okay plus the pair which is accessible so you're 
So you're violating a resource restriction, which is the crucial condition. Almost everything is following from resource restriction, which I think is probably a deep property of organic computation. The same is true of sideward merge; it collapses for the same reason. Notice that duality of semantics has consequences that may be objectionable. For example, it seems to rule out Norbert Hornstein's theory of control. Norbert's interesting theory of control relates control to raising: it raises a controlled element to a new theta position. So that's allowing internal merge to a theta position, violating duality of semantics. So here we have a problem: either Norbert's theory is wrong, or duality of semantics isn't properly formulated. [Question: Is this the same reason why sideward movement is ruled out?] It's ruled out for the same reason — in fact, for even more reasons, because while it does form a new object, it adds accessibility, and there's also the question of how you connect the two separate things. At least it has the problem of added accessibility. That runs across all of the extensions of merge: look through them, and all of them have this property. So they all basically reduce to this very simple operation that I had down here somewhere — the simplest case, which serves as a paradigm for just about everything. So we're now down to — with various problems hanging around on the side, like what about Hornstein's theory of control — we seem to have these converging on exactly what we want: the simplest possible operation, conceptually justified, which gives us exactly the cases we want and rules out the illegitimate ones. Well, what about all the problems that are left over? Take across-the-board movement: parallel merge and multidimensionality give you interesting ways of describing ATB — notice, describing, because none of them are legitimate operations, but at least you can draw graphs that look as if they were doing something.
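The resource-restriction bookkeeping above can be made concrete in a toy encoding — this is my own sketch, not a formal proposal; objects are atoms or nested tuples, and copies of the same object collapse, since a copy is not a *new* accessible item:

```python
# Toy model: Merge replaces its inputs in the workspace WS by {P, Q}.
# External Merge shrinks WS by one object; Internal Merge keeps it the
# same size; each adds exactly one new accessible object. The rejected
# extensions (parallel merge, sideward merge) would leave extra
# accessible material around, violating the restriction.

def terms(obj, acc=None):
    """Distinct subterms of a syntactic object (copies collapse)."""
    acc = set() if acc is None else acc
    acc.add(obj)
    if isinstance(obj, tuple):
        for part in obj:
            terms(part, acc)
    return acc

def accessible(ws):
    out = set()
    for obj in ws:
        out |= terms(obj)
    return out

def external_merge(ws, p, q):
    """Replace top-level P and Q by the new object {P, Q}."""
    return [o for o in ws if o not in (p, q)] + [(p, q)]

def internal_merge(ws, p, q):
    """Q is already a term of P; P is replaced by {P, Q}."""
    assert q in terms(p)
    return [o for o in ws if o != p] + [(p, q)]

ws0 = ["read", "what"]
ws1 = external_merge(ws0, "read", "what")            # EM: 2 -> 1 objects
ws2 = internal_merge(ws1, ("read", "what"), "what")  # IM: 1 -> 1 objects
```

Under this counting, each legitimate step adds exactly one new accessible object — the thing just formed — which is the meta-condition the lecture keeps invoking.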
[Question:] The point was: you want doing parallel merge, or sideward merge or whatever, to be bad due to the constraints imposed by whatever gives us the restriction of resources — there's not enough stuff accessible to do that. And the way we restrict resources is just by phases, if I understand things right? [Answer:] No — you can't have the operation at all. As soon as you look at the operation, it adds accessibility itself, and there's a meta-condition on linguistic operations — probably on organic computation generally, but at least on linguistic operations — which says you can't add resources. When you apply an operation, it can create one new accessible object, the object that you've constructed, and nothing else. That holds also for Agree and labeling and other things, but the interesting case is Merge. None of the operations should add new accessible items; this should be a general property. [Question:] The two elements, A and B — why do you call those accessible? [Answer:] There's no reason why they wouldn't be — unless you stipulate somehow that they're not. If this one was able to move from here, then it's able to move from here; it's the same structure, unless you impose some other rule that says "I'm not allowed to move," but there's no basis for that. Remember, this is a copy under internal merge, and copies are always allowed to move. So unless you simply stipulate "I'm not allowed to move" — which gives the game away — you're not allowed to block the movement, because there's no basis for that stipulation. Notice that it was originally allowed to move; that's the whole point. So it should still be allowed to move: it's the same structure. I don't think there's any way to salvage it. Let's just take a look at the empirical cases that were described by these illegitimate operations —
like, say, ATB: "What did John buy and read?" How do you generate this? Well, it would start with something like "John bought what and John read what" — remember, we're talking about the internal wh-element, not the external one; that's the thing that we generate internally. Why doesn't that give the right interpretation? The standard answer was that the thing that he read might be different from the thing that he bought, so it's not giving the right interpretation. However, think it through for a second. This is a copy, and this is a copy. What is the property of copies? The property of copies is that they all delete except for the top one. That's a general principle of computation — like in successive-cyclic movement, you delete all the copies. There are languages where you leave some residue somewhere; we'll forget about that. Basically, you delete all the copies; that's general computation. So that means that in the externalization, we delete this copy and we delete these. Notice that that gives you the right form: it gives you "What did John buy and read?" Does it give you the right interpretation? Well, it does, because of a general principle of interpretation: in ellipsis or topicalization or anything, at the CI interface you can only delete under absolute identity; otherwise you just can't delete. Deletion requires perfect identity. But notice that that gives you ATB automatically — nothing else to say. If you think about parasitic gaps, it essentially works the same way; there's an interesting paper which goes through the details, but the basic idea is pretty straightforward. So ATB and parasitic gaps come straight out. Notice also, in the case of parasitic gaps, it follows that A-movement won't yield parasitic gaps, because you don't have the initial wh-phrase which allows the second element to be a copy. A parasitic gap with A-movement would be something where the deletion can't go through at CI.
That's fine — there's nothing wrong with that interpretation; it's just not ATB. "...and Mary..." — you just decided not to call this a copy; call it, say, a repetition. You can always call something a repetition: "John saw John" — repetitions, generated separately. If you don't say anything, one of the options is ATB; another option is the other interpretation, which is all you want. And notice that if you use the ATB interpretation, then you're deleting: if you're giving the ATB interpretation, you're reading it as ATB with the deletion, because the deletion is dependent on identity. "What did John read?" and "What did Mary buy?" — totally different sentences. "What did John buy, and Tom went to the store" — it just has the interpretation it has. If you have two wh-phrases which are copies — by definition, because they're in internal merge non-theta positions — then there is the option for them to be interpreted as copies and, if they're identical, to delete, which gives you the correlation between the ATB semantic interpretation and the externalization with deletion. You don't have to delete — you could be saying "What did John buy, and what did Tom read?" But if it's ATB, it has to be the same thing; that's the point about ATB. If you delete, you can't get the different interpretation. "What did John buy and Mary read?" — either it's ATB or it's two independent constructions; that's what ATB is. Think about it: otherwise there wouldn't be any ATB phenomenon. [Question: "What did John drink and Mary read?" is fine.] Well, for those people, they don't have ATB. We're talking about ATB, the phenomenon. If somebody says "not in my language" — sorry, but we're talking about people who have ATB, which has this interesting property that you delete the elements and interpret them identically. That's the whole point of ATB.
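The correlation just described — deletion is licensed only under identity, and deletion forces the single-question reading — can be rendered in the same toy style (the function name and encoding are mine, purely illustrative):

```python
# Toy rendering of the ATB correlation: externalization may delete the
# second wh-phrase only under perfect identity, and deleting it yields
# the ATB (single-question) reading; without deletion the two
# conjuncts stay independent questions.

def interpret(wh1, wh2, delete=False):
    if delete:
        if wh1 != wh2:
            # deletion requires perfect identity at CI
            raise ValueError("deletion requires perfect identity")
        return "ATB"           # 'What did John buy and read?'
    return "independent"       # 'What did John buy, and what did Tom read?'
```

So the ATB reading and the externalization with deletion stand or fall together, which is the point of the exercise.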
We're not going into the question of somebody who has some different language; we're talking about the languages which have a problem to solve — namely ATB. If you don't have the problem to solve — "John bought a Cadillac and Mary drank..." — if that's your intuition, then you don't accept the multidimensionality analysis either. Fine; I don't care. If somebody has a different system, we can try to describe that one, but I'm talking about ATB. If you don't have it, fine — you have something different, something that differs from what is described in the ATB literature, and then we can talk about that. But I'm trying to say that what is captured by parallel merge and the multidimensionality interpretation is in fact yielded automatically, with nothing. Same with parasitic gaps. If you look at parasitic gaps, there are a million different problems about them, and this doesn't address those problems. It's just saying that the very basis for parasitic gaps we already have, without anything — including the fact that they are conditional on wh-movement in the first clause, not A-movement. All of that follows right away. Then you get into the morass of problems about different kinds of parasitic gaps. If you have two things in theta positions, they're either copies or repetitions, and you get to choose which — "John likes John." What I'm giving you as an exercise is to show the following, and I think it works: that if you accept duality of semantics, the general principle, and you look at these properties of language, you end up resolving the ambiguities, determining what are copies and what are repetitions. That's a claim; it's up to you to falsify it. [Question: No parasitic gaps with A-movement — does the claim extend to across-the-board A-movement?] This is just dealing with ATB under wh-movement. There are other questions about A-movement, but I'm not talking about those; the main cases are the ones in the
multidimensionality literature. You don't get A-movement parasitic gaps because the element in the gap — the operator in the gap — has nothing to be a copy of, so the two things are just totally independent of one another. It's like "John ate a sandwich before reading..." — a different problem, and it's not dealt with in the multidimensionality literature of the kind I'm discussing. I think it works, but it's basically the same. That's a fair question; we should look at it. [Question about the structure of parasitic gaps — the gap in "buying what" and the higher copy of it.] Well, there is something like "what John read, before what John buying" — and the two "what"s are copies, so you get the same phenomenon. If you want the details, look at the paper; but that's basically the structure, and it gives you the core properties of parasitic gaps, while leaving lots of questions open about different kinds. [Sabine raises a question about another example.] There are many other questions about interpretation here. Anybody who's interested in the kinds of questions that Barry is raising should look at the longest book in the literature with the shortest title — the author is sitting over there — hundreds of pages of very interesting examples of ways of interpreting conjunction and complex structures, based on a kind of neo-Davidsonian event calculus. There are tons of problems there that are really interesting to solve. But what we're interested in here is asking: what is the basis in the structures for yielding those interpretations? Barry actually doesn't go into that question in the book. I think — and you can tell me if I'm falsifying it — Barry describes it as fundamentally a kind of conjunction reduction. But if you think about that formally, it can't be — and I think this is a, I hope, friendly amendment to the book.
It can't mean that you first generate all these huge conjunction structures — infinitely many of them, for a short sentence — and then get the syntactic structure. It must mean — and that's the thing I'm going to get to next — that you have a syntactic structure, and there are some kinds of interpretive rules, which we have to figure out, that give you this mass of amazing stuff that you find there. That's the challenge to face; and if the event-calculus approach is the correct one, those rules will yield the event-calculus interpretations of the structures. Unfortunately, Barry tells me he's now working on another book. There are a lot of things I'm not talking about. [Question.] It's a fair question — not that there's any other way of talking. It's just that if the phenomenon exists, we're going to want an explanation of it in terms of operations which meet the austere conditions of explanation. That's the general point: whatever problem you're working on — phonology, semantics, syntax, morphology — if you want an actual explanation within the context of a program that regards language as part of the natural world, if that's your framework, you're going to have to have explanations in terms that meet this highly austere condition of learnability and evolvability. About the only thing we know that meets those conditions is Merge. So if we can account for things in those terms — like, say, ATB and parasitic gaps — we're in business; otherwise we have problems. Let me turn to another case which is problematic and is related to what Barry was just raising. But first, one more comment about this, about spelling it out. You'll notice that this is a sketch; I haven't really formalized it, but you can figure out how to formalize it. Once you're left with this definition of Merge — the simplest one, which I think is principled — you can reformulate it in the usual style of transitive closure, Frege's ancestral.
Like, if you're characterizing the set of integers, the standard way of doing it is to say that the set of integers is the least set containing 1 and the successor of any member of the set. That's a standard transitive-closure, ancestral-style definition. The analogue here would be: the set of workspaces for a given language is the set — not the least set; we leave out "least" — which includes the lexicon and Merge of any triple. That's the set of workspaces. We don't have to say "least," because that's already incorporated in the resource restriction. Otherwise you get the standard recursive definition of the set of workspaces; it sort of fits the norm. Now let's go to something else. There's pretty good reason to think that in addition to Merge — which maybe we've now got in the optimal form — there's probably at least one other operation, one that is asymmetrical. There are strictly asymmetrical structures, like, say, "young man." The structure is a noun phrase, and the elements are frozen: you can't extract the adjunct. So we have an asymmetric structure where "young" is attached to "man," and the whole result is still basically "man." Adjuncts — essentially all adjunct structures, I think — require pair-merge, which is the next operation to look at. Now, there's a very interesting property of pair-merge which has been a thorn in the side of all generative systems since the 1950s: namely, unbounded, unstructured coordination — things like "young, happy, eager...". You can have unbounded, unstructured coordination, and this is a real problem: you can't generate it by phrase structure grammar. Even unrestricted rewriting systems, which are universal, don't — on the standard interpretation — give you these structures. Notice that since they are universal, you can code it, but that's not interesting; they have universal Turing capability, so you can find a coding for it, but that's not of interest.
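Returning for a moment to the transitive-closure definition just sketched, one way to write it out is the following — the notation is mine, with `Lex` and `MERGE` purely schematic:

```latex
% The integers, by Frege's ancestral:
\mathbb{N} = \text{the least } S \text{ such that } 1 \in S
  \ \text{and}\ (n \in S \Rightarrow \mathrm{succ}(n) \in S).
% The analogue for workspaces ("least" omitted, since the resource
% restriction already enforces minimality):
\mathit{WS}(L) = \{\, W : W = \mathrm{Lex}(L) \ \text{or}\
  W = \mathrm{MERGE}(P, Q, W') \ \text{for some } W' \in \mathit{WS}(L)
  \text{ with } P, Q \text{ accessible in } W' \,\}.
```

The point of the comparison is that the same closure schema works for both, except that "least" can be dropped in the linguistic case.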
If you look at generation by phrase structure grammar, you need an infinite set of rules. It was thought for a while — by George Miller and me, in papers back in the fifties — that you could get around this with generalized transformations, but Howard Lasnik had a paper showing that the same problem arises: you'd have to have infinitely many of them. So you can't do it by phrase structure grammar, and you can't do it by transformations; there's no way of generating it. It's been a big problem all along. But notice there is a way of dealing with it in terms of pair-merge — namely, super-multidimensionality. So you have, say, "man," and you link to it any number of adjuncts, all on different dimensions, and there's no limit to the number of dimensions you can pair-merge to the element. There's no reason to believe that just because blackboards are two-dimensional, so is the mind; it does whatever it does, so it could have any number of possible dimensions attaching to a particular point. And that would give you something like unbounded, unstructured coordination. This can, incidentally, become extremely complex; here we get into various typological questions. For example, one of the conjuncts could be a disjunct: it could be "John is young, angry, either going to Harvard or going to MIT," and so on. And the disjunct could also be unbounded, so you can have unbounded, unstructured disjunction inside of unbounded, unstructured coordination, and this can yield incredibly complex structures — I leave it to you to give the semantic interpretation of them, but in the nature of the system you can see that this is possible. The formalism is not very difficult, so I won't go into it; just take the simplest case and look at that. If we can deal with unstructured, unbounded coordination, then the simplest cases of adjuncts are just automatic.
They're the case where there's only one element instead of an unbounded sequence of elements, so we get simple adjuncts if we can handle the unstructured case. So let's take a look at that. Notice a few properties of it. For one thing, it matters what the order is: if the order of the adjuncts is "young, angry," that's different from "angry, young." The reason is something that Jim McCawley noticed back in the seventies about the word "respectively." If you think of structures with "respectively" — "the young angry men ate the turkey sandwiches and the chicken sandwiches, respectively" — the order of the adjuncts determines the nature of the interpretation. So somehow the object that we have in an unbounded coordination is actually a sequence. Furthermore, you can have iterations: you can say "John is young, angry, young, tired," and so on — you can iterate them. So basically the problem is this: we have unbounded coordination or disjunction, which has a sequential structure with possible repetitions, and that sequence is interpreted both at the CI level and at the externalization level. Now, this does not tell us that linear order enters into syntactic operations. It just tells us there's some object being constructed which is going to be interpreted in terms of its order and spelled out that way. So we're not crossing the barrier into believing that externalization feeds CI — that's important, even though there's order involved. So what we have to have is something that works sort of like this. Thinking just of the general properties: you have to be able to pick out a set of things that are going to be adjuncts, and you have to form from that set a sequence, where the elements of the sequence are drawn from the set in any possible way. That requires an operator — actually, an operator that's familiar in logic: Hilbert's epsilon.
In Hilbert's formalization of metamathematics, the operator that he develops as a basis is this epsilon: out of a set, you can pick an element — basically, it's like the indefinite article. So we need an operator like this, which tells us that, given a set in the generation of an expression, you pick out a sequence. And then the elements of this sequence link to something — each of them is going to link to it independently. So if I say "young, angry man," the man is both young and angry; independently, they link to something. So what we're getting out of this is a sequence which, first of all, will have to be identified as either conjunction or disjunction. So we have an element here — call it a — which will be plus-or-minus conjunction, one or the other (and remember they can be interspersed, but that's just our formalism). And this sequence will include the pair-merged elements: Y1 and the link that it's linking to, all the way up to Yn and the link that it's linking to. These links have to be identical all the way through — if one of them is a wh-phrase, they all have to be wh-phrases; if you think about unbounded coordination, you can't stick a question into one of the positions — and of course you can't have different links. So we have an object that looks like that, and it has to be merged into the general expression. That's the formal problem of dealing with adjunction. I won't bother spelling it out, but it raises interesting questions. For example: what do you actually link to? Suppose you have coordinated noun phrases — "John, Bill, Tom, Mary, the guy I met yesterday," etc. Each of those things is going to link to something. What? The natural interpretation would be that the individual items, the Yi's, if it's a noun phrase, should all link to whatever is common to noun phrases — something, call it n.
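The sequence-forming step can be rendered in the same toy style — the function names, and the use of Python tuples for ordered sequences, are my assumptions, not the lecture's formalism:

```python
# From a SET of adjuncts, an epsilon-style choice picks an ORDERED
# sequence, repetition allowed; order matters ('respectively'
# readings), and every element links to the head independently.
from itertools import product

def sequences(adjuncts, length):
    """All ordered picks with repetition from the set."""
    return set(product(sorted(adjuncts), repeat=length))

def coordinate(head, seq, conj=True):
    return {"head": head,
            "conj": conj,                        # +/- conjunction
            "links": [(a, head) for a in seq]}   # each adjunct -> head

obj = coordinate("man", ("young", "angry"))
# 'young angry man': young(man) and angry(man), in that order.
```

Note that `("young", "angry")` and `("angry", "young")` are distinct objects, which is what the "respectively" facts require, and `("young", "young")` is a licit pick, which is what iteration requires.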
Notice here that I'm not using the DP analysis, which I think is a mistake. I won't go into it here, but it seems to me that nominal phrases should be regarded as nouns, not as determiners. Determiners are probably something that hangs off the outside, and definiteness is probably a feature of the whole noun phrase. So — assuming that Semitic is the universal language — the determiner is just a feature on the noun phrase, which distributes differently in different languages depending on externalization. So you have a feature of the noun phrase, specific or nonspecific, and you have a structure which is basically n, with determiners hanging on somewhere. What is this n? Here we get back to something suggested years ago: that the basic structure of language is, again, kind of like proto-Semitic — you have roots which are unspecified as to category, and then you have categorizers that determine what they are. So, for example, n and the root are probably pair-merged, probably in the lexicon; that's probably the first operation — going back to a paper of yours, the first operation is probably a lexical operation. There are many operations inside the lexicon that involve Merge-type operations, and one of them is probably categorization. But notice that this n that's determining that this root is a noun can't be the same as this one up here. Same with verbs: the v that's determining that something is a verb has to be distinct from what we usually call v or v*, up at the phase level. Those are just different elements; they shouldn't be confused. In fact, the ones at the phase level, I think, should simply be regarded as phase markers, independent of category. Category is decided down below, probably in the lexicon; at the phase level you have something saying "I'm a phase" — a phase marker. And notice that if you take a look at noun phrases and verb phrases, they have some interesting similarities and some differences.
One similarity is that both noun phrases and verb phrases can be either — call it — strong or weak with regard to extraction. The complex noun phrase constraint, as is well known, is strong for specific noun phrases but basically inoperative for nonspecific noun phrases. That sounds like the same distinction as between strong phases and weak phases: transitive verb phrases are strong with regard to extraction — you have to move to the edge — while with the weak ones you don't move to the edge. It looks like the same property as with weak noun phrases. So possibly what we have is something like this, going back to classical Greek grammar (philosophy — they didn't distinguish it from linguistics): we have the notions substantive and predicate, which give us a four-way classification. Substantive, non-predicate: the nominal phrases. Substantive, predicate: adjective phrases. Non-substantive, predicate: verbal phrases. And neither: all the junk — prepositional phrases and so on. Some structure like that. And the crucial phase operations seem to be restricted to the sort of perfect elements — pure substantive, pure verbal — with either the strong or the weak property. Now, one of the curious distinctions between noun phrases and verb phrases, which has prevented thinking of noun phrases as phases, is that you don't get the normal escape hatch. But I discovered a couple of days ago, thanks to somebody, that one language does have an escape hatch for noun phrases, so that fits into the gaps that we were worrying about. So let's say that noun phrases and verb phrases go together, with the n — though the idea of using the same notion for the categorizer and the phase marker is probably wrong; they're probably different notions. [Question:] I'd like to go back to the original definition of Merge — the original one, with P and Q in the workspace. If you run the clock backwards to the first Merge,
then Q becomes the workspace, because in that definition Q equals the workspace — but there's nothing in the workspace before you start. [Answer:] Well, there's no workspace until you've set one up. The workspace is the set containing P and Q in this definition — why do you have three elements, P, Q, and the workspace? The workspace can't have P and Q in it unless they get into the workspace; so how do you get them in? In this recursive way, Q is the prior workspace. At some point you finish a Merge and you have a workspace, before you start another step. Before you do any Merge, you have nothing; there's just the option of creating a certain set which you can put things into if you want. The workspace has nothing in it unless you put something in it. So let's imagine we're just beginning the computation. We take P and stick it into the workspace; now the workspace is the singleton set containing P. Now we put Q into the workspace; now it has two elements. Then we decide to merge them: we get a new element, the set {P, Q} — but not P and Q separately. There is an interesting empirical question here: how do you start? There are various options, and they have lots of consequences. One possibility is that the only things that go into the workspace at the beginning are things you've already merged in the lexicon. Remember, inside the lexicon there are operations going on — the words in the lexicon already have structure. Part of that structure, if Hagit is correct — and I'm assuming she is — involves taking a category like n or v, maybe broken down into substantive and predicate, and categorizing a root as one thing or another: an operation inside the lexicon which gives you a pair, like the pair ⟨v, hit⟩ —
say, the verb "hit." Then that thing can be put into the workspace — it already has two elements, paired. Then you can put something else into the workspace, begin to merge them, put more things in, build up the workspace. Every operation that you carry out is going to create something; that's what an operation is. And the resource restriction says: don't create too much — create as little as you can, at least the thing that you're forming, but nothing else. Notice that when you put in the pair ⟨root, v⟩, neither member is accessible, because it's an adjunct structure. That's what I said before: if you take "young man," you can't extract "young"; you can extract "man." Notice, incidentally, that — as I mentioned last time — there's a paper by Željko Bošković pointing out a way of putting the adjunct island and coordinate island problems into the same package, making them essentially the same problem, based again on the idea of a semantic event calculus which treats adjunction like coordination. So it kind of unifies the problems, and that happens automatically here: the pair-merge structure gives you both the islands of conjunction and the islands of adjunction. Now, notice that it leaves the mysteries open, just as Željko's paper does. If you look at, say, the adjunct island effect, which Jim talked about years ago, it has interesting properties: there are some languages where you can't extract the adjunct, there are other languages where you can't extract from the adjunct, and other distinctions of that sort. Those are interesting problems that remain. Furthermore, if you look at adjuncts, they're not uniform: there are some kinds of adjuncts which you can extract from, and there are other kinds which you can't extract from. So the notion "adjunct" is too diffuse; we have to sharpen it further, finding different kinds — probably different kinds of pair-merge. Those problems are all sitting out there — more problems to solve.
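The accessibility asymmetry of pair-merge can be rendered in the same toy style — the `PairMerge` class, and the choice to keep the host visible while freezing the adjunct (following the "young man" case), are my assumptions:

```python
# Pair-merge <adjunct, host> is opaque: the adjunct never enters the
# accessible set, while set-merge (plain tuples) exposes all terms.

class PairMerge:
    def __init__(self, adjunct, host):
        self.adjunct, self.host = adjunct, host

def accessible(obj, acc=None):
    acc = set() if acc is None else acc
    acc.add(obj)
    if isinstance(obj, PairMerge):
        accessible(obj.host, acc)   # the host ('man') stays visible
        return acc                  # the adjunct ('young') does not
    if isinstance(obj, tuple):
        for part in obj:
            accessible(part, acc)
    return acc

np = PairMerge("young", "man")   # 'young man'
vp = ("saw", np)                 # ordinary set-merge on top
acc = accessible(vp)
```

On this encoding the whole adjunct structure is visible to further operations, but "young" can never be extracted out of it — a crude stand-in for the adjunct (and, by the same mechanism, conjunct) island effects.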
But you begin to get a unification of the problems: the adjunct island and conjunct island effects do reduce to the same structure — this sequence that you pick out by the epsilon operator. Now, as far as the point that you're making, as I understand it: it does leave open some questions about how you get things from the lexicon into the workspace. There are a couple of ways of thinking about this, and they have different consequences. One approach is just to take something from the lexicon, insert it into the workspace, and go on from there. Another is to take something from the lexicon and merge it directly with something that's already in the workspace. That's formally slightly different; it has different consequences when you spell it out, and you may need both. Those are questions that you want to resolve, certainly, but I think it's plausible to believe that the whole system of operations begins by forming categorized roots inside the lexicon and then building up from there. [Question about accessibility.] The categorized root itself is accessible, but not its parts: you can't just raise the root and leave the categorizer, or conversely. Here we're talking about accessibility to Merge. There are questions you can raise about whether you can have agreement into an adjunct — that's a different question; here we're talking about accessibility to Merge. Lots of other questions. Okay, it's getting kind of late, so let me just mention a couple of other things that you might deal with in terms of pair-merge — there are lots of interesting questions hanging around that have a potential pair-merge analysis. Let's take one that's been a crazy problem for a long time: there's a strange restriction on extraction with causative-type verbs and perception verbs. If you look at structures like "John saw Bill walking down the street," you can passivize this: you can say "Bill was seen walking down the street." On the other hand, if it's a bare verb, "walk," then you can't do
it: you can't say "*Bill was seen walk down the street." This also holds for the causative-type verbs, verbs like "let" and "make." They don't have the full paradigm, but they have part of it: you have "I let John walk down the street," but you can't say "*John was let walk down the street." Now, there's a long-standing problem in the literature about how to deal with this. The only partial solution I've seen is a paper by Norvin — I don't know if it's even in print — in terms of contiguity analysis, which gives a description of how you can block it. (Come back in — I'm insulting you.) The only paper I know that says anything about it is Norvin's, which blocks passivization out of perception verbs and causative verbs in terms of contiguity theory. It's an interesting description, but it doesn't cover the whole set of data, because the fact of the matter is that you get the same property without extraction. For example — these structures are a little odd in English, though in other languages they're normal — you get things like "there were seen last night three men walking down the street," but you can't say "*there were seen last night three men walk down the street." So even without extraction you get the same property; it can't be based on extraction — it's got to be based on blocking passivization. Now, think about the "let"/"make"-type verbs: you can think of those as being basically causative — they have essentially a causative structure, and in fact the verb "cause" itself is kind of resistant to passivization ("?John was caused to leave," and that sort of thing). So suppose we think of "let" and "make" as essentially causative affixes, the kind that show up overtly in many languages. That would mean that they're pair-merged with the verb, probably in the lexicon. Well, that gives a unit that's invisible to the operation of passivization: it's a pair-merged element.
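That proposal can be sketched in the same toy style — the `CAUS` tag and the `passivize` function are hypothetical names of mine, just to show the shape of the account:

```python
# 'let'/'make' (and perhaps perception verbs with bare complements) as
# causative affixes pair-merged with the verb in the lexicon: the
# resulting unit is opaque, so passivization cannot apply to it,
# blocking '*John was let walk...' and the in-situ cases together.

def pair_merge_caus(verb):
    return ("CAUS", verb)            # opaque pair-merged unit

def passivize(verb_unit):
    if isinstance(verb_unit, tuple) and verb_unit[0] == "CAUS":
        return None                  # passivization can't see inside
    return ("PASS", verb_unit)

# 'Bill was seen walking down the street': plain verb, passivizes.
# '*Bill was let walk down the street': pair-merged unit, blocked.
```

Since the pair-merged unit is what passivization would have to target, both the raising cases and the expletive (non-extraction) cases are excluded by the same opacity.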
is resistant to whatever we think passive ization is maybe eliminating the case structure okay that would block both the in situ cases and the raising cases in fact it's the only way I know of dealing with that now it's natural for let and Cohen make but there's a very interesting question that goes way back as to why perception verbs should act the way the causative type verbs do actually Jim Higginbotham has interesting paper on this in the 60 70 80 s I guess in which he tries to argue that the complement of the perception type verbs is basically some kind of nominal expression with the very verb but maybe that's an avenue to explain it but at least using the device of pair merge you have an opening to try to account for this strange phenomenon I don't see any other way of dealing in another kind of case that's quite interesting is head movement a head movement has always been a terrible problem it doesn't have any of the right properties it doesn't fit anywhere or in the movement system for all sorts of reasons there is an approach an interesting approach by he said Keith de haro guess papers on this in terms of error merge I'll just give the simplest case take T to see movement ok so you have a structure C T V whatever and at some point this moves here how does that work it's one of the cases of notice that the thing that's moving is really not T it's V ok this is a error of the traditional head movement analyses but the thing here is usually described as a T with the V joined to it by an a junction operation but it's actually the other way around it's a V with a T adjoint to it one of the reasons the traditional Junction operations just don't give you the right result there are many reasons what heesu suggests is that when you get to this point and the you've created this object you have a C and then the next operation is to form C T notice that the elements of CT are not accessible because that's a pair pair emerged a junction structure so you've only added 
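Both cases turn on the same formal point: a pair-merged object is opaque to syntactic search, while set-merged structure is searched freely. Here is a minimal toy sketch of that distinction — my own illustration, with invented encodings and function names (`merge`, `pair_merge`, `accessible`), not a formalism from the lecture or the literature:

```python
# Toy model of the set-merge / pair-merge distinction (my own illustration;
# the encodings and names are invented for exposition).

def merge(x, y):
    """Set-merge: the unordered set {X, Y}, encoded as a frozenset."""
    return frozenset([x, y])

def pair_merge(x, y):
    """Pair-merge: the ordered pair <X, Y>, tagged so search treats it as opaque."""
    return ('<pair>', x, y)

def accessible(obj):
    """Every term visible to syntactic operations. Set-merged structure is
    searched recursively; a pair-merged object is returned as a single
    opaque unit, so its parts stay invisible."""
    if isinstance(obj, tuple) and obj[0] == '<pair>':
        return [obj]                      # don't look inside <X, Y>
    if isinstance(obj, frozenset):
        found = [obj]
        for part in obj:
            found.extend(accessible(part))
        return found
    return [obj]                          # a lexical item

# Causative case: 'let' pair-merged with V in the lexicon. The V inside the
# unit can never be targeted by passivization, whatever that operation is.
let_v = pair_merge('let', 'walk')
vp = merge(let_v, merge('John', 'down-the-street'))
assert 'walk' not in accessible(vp)       # V is invisible inside <let, V>
assert let_v in accessible(vp)            # the unit itself is visible

# Head-movement case: the elements of <C, T> are likewise inaccessible.
ct = pair_merge('C', 'T')
assert accessible(ct) == [ct]
```

The same opacity does double duty: it shields the verb inside the causative unit from passivization, and it keeps the parts of <C, T> from counting as newly accessible material in the derivation.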
So what have you added? You've actually enlarged the workspace, but you've only added one accessible object — actually, you haven't even done that, because you've taken C out of it, so you've kept the workspace the same size. The only accessible thing you've added is <C, T>, and that's permitted by resource restriction. Then the next operation: you've got this thing and this thing, and you just merge them. When you merge those two, you get what you wanted, the structure with <C, T> on top. So that gives you a possible way of looking at head movement. Notice that it has a problem — the same problem as all the examples, back to the original one: resource restriction. What happens if at this point you make some new thing, X? You start building it up, it ends up being of the form {T, V}, and then you decide to merge that. That's crazy, it doesn't make any sense — but what blocks it? It's the same paradigm we always have. Now, this gets kind of complicated, but I think there's a way out of this problem — I'll leave it to you as something to think about — a way out by sharpening the notion of restricting computation, so that it tells you at each point to add as little as possible to the workspace while still being able to continue. If you think about that, it gives an interesting direction toward perhaps blocking this option: it amounts to a condition saying you're going to have to merge this one before you create something new. Now, you can't make that condition too strong, or you won't be able to build the exocentric constructions; you have to put conditions on it that allow just the right ones and block the wrong ones. I'll leave that as another exercise for the reader. Hisa has a paper which doesn't go into this; it just gives the proposal. Well, there's a lot more that could be said, but I think I'll stop at this point. These are the kinds of problems that arise when you try to give a principled approach to the nature of explanation: you get some interesting results, and you get a horde of problems. The problems may at least be presented in an organized form, which is helpful, but we want to go on to try to find real explanations for them. Sometimes you can — as with unifying compositionality and movement, or structure dependence, or the basis for reconstruction, or things like ATB and parasitic gaps, maybe some of these things — but there's a mass of problems out there to try to deal with in a principled fashion. So that's why it's an interesting field.
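The "add as little as possible to the workspace" condition can be made concrete with a toy bookkeeping sketch — my own illustration; the counting function and the candidate operations are invented, and the pair-merged <C, T> is stood in for by an atom:

```python
# A toy statement of the sharpened resource restriction (my own sketch; the
# counting function and the candidate operations are invented for illustration).

def count_accessible(workspace):
    """Count every accessible term in the workspace: each root plus, for
    set-merged (frozenset) structure, all of its sub-terms. Atoms and
    pair-merged units count as single opaque objects."""
    def terms(obj):
        found = [obj]
        if isinstance(obj, frozenset):
            for part in obj:
                found.extend(terms(part))
        return found
    return sum(len(terms(root)) for root in workspace)

def best_step(workspace, candidates):
    """Among candidate operations (functions from workspace to workspace),
    pick the one that adds the fewest accessible terms: 'add as little as
    possible to the workspace while still continuing'."""
    return min(candidates, key=lambda op: count_accessible(op(workspace)))

# Workspace after forming <C, T>: the opaque unit (stand-in atom 'ct')
# alongside {T, vP}.
tp = frozenset(['T', 'vP'])
ws = ['ct', tp]

merge_now = lambda w: [frozenset([w[0], w[1]])]     # merge <C, T> with {T, vP}
start_new = lambda w: w + [frozenset(['T2', 'V'])]  # build an unrelated {T, V}

chosen = best_step(ws, [merge_now, start_new])
assert chosen is merge_now   # merging now grows the workspace less
```

Nothing here settles how strong the condition should be — too strong and the exocentric constructions can't be built, as the lecture notes — it only makes the bookkeeping behind the comparison explicit.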
Info
Channel: Abdessamia Bencheikh
Views: 11,580
Rating: 4.9571428 out of 5
Keywords: Noam Chomsky, Linguistics, MIT, Massachusetts Institute of Technology, Language, Generative Linguistics, Chomsky, Syntax, Phonology, Phonetics, Morphology, Grammar, Mental Grammar, Linguistics (Arabic), Language (Arabic), Universal Grammar (Arabic), Syntax (Arabic), Noam Chomsky (Arabic)
Id: GPHew_smDjY
Length: 112min 54sec (6774 seconds)
Published: Fri Jun 07 2019