In general XML parlance, a "vocabulary" is a set of element types and attributes designed to be used together for some purpose. A given vocabulary may be "encompassing", meaning that it is intended to be used as the main or only vocabulary for a given document, or "enabling", meaning that it is intended to be integrated into and used with encompassing vocabularies.
Examples of encompassing vocabularies are DocBook, XHTML, and NLM. Examples of enabling vocabularies are MathML and Dublin Core metadata.
Historically, a challenge with managing XML (and SGML) vocabularies has been that, while it is easy to define enabling vocabularies like MathML and possible to define "extensible" encompassing vocabularies like DocBook, there was no standard-defined mechanism for managing how encompassing and enabling vocabularies are combined or extended in a way that ensured understandability and interchange.1
In particular, before DITA, all mechanisms for combining or extending vocabularies were entirely syntactic: they provided no way to examine a document and determine how that document's vocabulary (its document type) related to any known vocabulary, so you could not know, for sure, whether your processing environment could handle the document or whether you could share its content with your own documents.
For example, DocBook provides for extension through syntactic hooks that allow local modification of content models (e.g., parameter entities in DTDs). This lets you define your own element types and use them in a nominally "DocBook" document. However, having defined your new element types, nothing in the markup tells a processor or a human observer how they relate to any known element types (that is, to the element types defined in the DocBook standard or in any other DocBook-based document type). There is therefore no reliable way for a general-purpose DocBook processor to know what to do with your document, and to say that your document is "DocBook" is neither accurate nor useful. At best your document is "DocBook based", but knowing that doesn't tell you what you'd need to know in order to process it reliably, because you have no way to know what to do with the non-DocBook-defined elements without actually talking to the developer of the custom markup.
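To make the purely syntactic nature of this concrete, here is a minimal sketch of a DocBook 4.x-style DTD customization layer. The "local" hook parameter entity follows the real DocBook 4.x naming convention, but the <partnumber> element is hypothetical and the details vary between DocBook versions:

    <!-- Hypothetical customization layer: declare a new inline element
         and splice it into DocBook's paragraph content model by
         redefining the empty "hook" parameter entity that the standard
         DTD provides for exactly this purpose. -->
    <!ENTITY % local.para.char.mix "| partnumber">
    <!ELEMENT partnumber (#PCDATA)>

    <!-- Pull in the standard DocBook DTD, which expands the hook
         entity wherever the paragraph character mix is referenced. -->
    <!ENTITY % docbook PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
       "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
    %docbook;

A document governed by this customized document type can validly use <partnumber> inside paragraphs, but notice that the declarations say nothing about what <partnumber> means or how it relates to any standard DocBook element; that knowledge exists only in the customizer's head.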
By the same token, there are no DocBook-defined constraints on how you can extend DocBook, so there is no way to predict what sort of changes a given "DocBook" document might reflect. As a result, a general-purpose DocBook processor cannot provide useful fallback processing, nor can a human observer understand the general nature of the extensions without understanding the details of the new markup (for which they would need some form of documentation).
That is, using only the syntactic tools provided by DTDs or XSD schemas (or any other available form of XML document constraint, such as RELAX NG), extension and customization of XML vocabularies are inherently unmanageable, because there is no machine-processable mechanism for communicating or understanding the relationship between any two vocabularies.
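For contrast, this is precisely the gap that DITA's specialization mechanism fills: every specialized element carries a @class attribute that declares its ancestry in machine-processable form. A minimal example, taken from DITA's standard user-interface domain (the element content here is illustrative):

    <!-- A DITA element specialized from <ph>. The @class value records
         the specialization hierarchy (the "+" marks a domain
         specialization, and the trailing space is significant), so a
         general-purpose processor that knows nothing about <wintitle>
         can still fall back to processing it as a topic/ph. -->
    <wintitle class="+ topic/ph ui-d/wintitle ">Save As</wintitle>

Nothing comparable exists in the DTD customization sketched above: the syntactic tools can enforce where the new element may occur, but they cannot say what it is.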
If customization is not manageable, then the only way to ensure interchange is to disallow customization. This was the approach taken by "interchange" document types such as ATA 2100. But of course invariant vocabularies suffer a number of serious, even fatal, problems. They tend to become very large, because they must reflect the union of the requirements of all current and expected interchange partners. At the same time, they tend not to satisfy the key requirements of individual interchange partners, either because you can never put everything in or because local requirements are at odds with interchange requirements (for example, markup that is specific to a given company's internal business processes, which might be trade secrets). I think it's fair to say that, as an industry, we have conclusively proved over the last 20 years or so that monolithic interchange document types do not work.