The Largest Vocabulary in Hip hop

"Literary elites love to rep Shakespeare's vocabulary: across his entire corpus, he uses 28,829 words, suggesting he knew over 100,000 words and arguably had the largest vocabulary, ever.

I decided to compare this data point against the most famous artists in hip hop. I used each artist's first 35,000 lyrics. That way, prolific artists, such as Jay–Z, could be compared to newer artists, such as Drake.

35,000 words covers 3–5 studio albums and EPs. I included mixtapes if the artist was just short of the 35,000 words. Quite a few rappers don't have enough official material to be included (e.g., Biggie, Kendrick Lamar). As a benchmark, I included data points for Shakespeare and Herman Melville, using the same approach (35,000 words across several plays for Shakespeare, first 35,000 of Moby Dick).

I used a research methodology called token analysis to determine each artist's vocabulary. Each word is counted once, so pimps, pimp, pimping, and pimpin are four unique words. To avoid issues with apostrophes (e.g., pimpin' vs. pimpin), they're removed from the dataset. It still isn't perfect. Hip hop is full of slang that is hard to transcribe (e.g., shorty vs. shawty), compound words (e.g., king shit), featured vocalists, and repetitive choruses.

It's still directionally interesting. Of the 85 artists in the dataset, let's take a look at who is on top."

(Matt Daniels, May 2014)
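
The counting approach described in the quotation can be sketched roughly as follows. This is a minimal illustration, not Daniels' actual code: the function name `unique_words`, the lowercasing step, and the rule that a token is any run of letters or digits are all assumptions made for this example. Only the apostrophe removal, the no-stemming rule, and the 35,000-word cutoff come from the text.

```python
import re

def unique_words(text, limit=35_000):
    """Count distinct word forms among the first `limit` words of `text`."""
    # From the text: apostrophes are removed, so "pimpin'" and "pimpin" match.
    # Assumption: lowercase first, so "Pimp" and "pimp" count as one word.
    cleaned = re.sub(r"['\u2018\u2019]", "", text.lower())
    # Assumption: a token is a run of letters or digits; anything else separates tokens.
    tokens = re.findall(r"[^\W_]+", cleaned)
    # No stemming: "pimp", "pimps" and "pimping" remain distinct entries.
    return len(set(tokens[:limit]))

print(unique_words("Pimps pimp, pimping, pimpin' and pimpin"))
```

Under these assumptions a compound such as "king shit" is still counted as two separate words, and repeated choruses use up part of the 35,000-word window without adding new entries, which matches the limitations the quotation acknowledges.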




Simon Perkins
21 JULY 2012

Reflection-in-action: framing, naming, moving and reflecting

"Reflection–in–action proceeds by a construction cycle of framing, naming, moving and reflecting. Framing and naming concern the problem–setting in that the designer constructs a problem out of a situation by naming the things to which she will pay attention whilst at the same time framing the way that the problem is viewed (Schön 1991). Framing in this sense imposes an order onto the problem; moves are made towards a solution in relation to how the situation is framed. However, the situation 'talks back'; surprise at the outcomes of moves leads to reflecting. Reflecting on outcomes may trigger either further moves or a new framing (Schön 1996). Reflection–in–action is not an interruption to fluid action; it is always embedded within action."

(Simone Stumpf and Janet McDonnell, CiteSeerX)

Simon Perkins
19 JUNE 2011

A comparable dichotomy between metaphor and metonymy

"Roman Jakobson found a comparable dichotomy between metaphor and metonymy in his seminal paper, 'Two Aspects of Language and Two Types of Aphasic Disturbances,' published in his monograph, Fundamentals of Language (Mouton & Co., 's–Gravenhage, 1956). Here Jakobson discussed two types of aphasia based on complementary disorders in comprehending language: (a) a similarity disorder whereby one primarily depends on syntactic context to draw words into use (pp. 63–64); and (b) a contiguity disorder whereby one's style becomes a telegraphic 'word heap' without much, if any, evidence of syntax (pp. 71–72). According to Jakobson, two faculties are thus involved in the use of language: (a) selection in the choice of words to express an idea (metaphoric); and (b) the combination of words, again to express an idea (metonymic). Elaborate sentences without a particularly impressive vocabulary (for example in the prose of Henry James) illustrates the similarity disorder, while big vocabulary in loosely constructed sentences (for example in the prose of James Joyce) illustrates the contiguity disorder. Joyce heaped together his words with apparent abandonment, while James strenuously belaboured his syntax to produce exactly the right effect––an effect he found difficult to articulate with words alone as opposed to their combination in intricate sentences. An inferior choice of words, Jakobson claimed, is at the sacrifice of metaphor, whereas an inferior combination of words is at the sacrifice of metonymy (p. 76)."

(Edward Jayne)

aphasia • choice of words • Claude Levi-Strauss • combination of words • comprehending language • constructed sentences • contiguity • deconstructionism • Ferdinand de Saussure • grammar • Henry James • ideas • Jacques Derrida • Jacques Lacan • James Joyce • John Langshaw Austin • language • langue • langue and parole • Louis Hjelmslev • metaphor • metaphoric • metonymic • metonymy • naming • paradigmatic relations • parole • Paul de Man • rhetoric • Roland Barthes • Roman Jakobson • selection • semiology • semiotics • sentences • signified • signifier • structuralism • syntactic context • syntagmatic relations • syntax • telegraphic • tropes • vocabulary • word heap • words


Simon Perkins
06 NOVEMBER 2009

Rehearsal as a Naming Process Central to the Development of Creative Identities

"Students in the Multimedia degree programme at Nottingham Trent University (NTU) are requested to keep online journals in the form of weblogs. They do so to document their evolving design practice and experimentation....

By maintaining the journals NTU Multimedia students engage in a naming process where they rehearse their creative identities into practice. Through doing so they script their individual narratives as they contribute to a shared discourse about the nature of their field. Through assimilating and reflecting upon new knowledge in this way, the students are able to participate in localised Communities of Practice that act as vehicles for naming, sharing and critiquing common practices. In doing so they become located within a broader network of symbolic exchange readied for forging new opportunities for collaboration and prepared for establishing individualised practices within a broader network of global interconnections."

(Julius Ayodeji and Simon Perkins, 2009)

Ayodeji, J. and S. Perkins (2009). Rehearsal as a Naming Process Central to the Development of Creative Identities. Designs on e–Learning International Online Conference. London, UK, University of the Arts London.



28 DECEMBER 2003

Natural History: System, Method

"Establishing character is at the same time easy and difficult. Easy, because natural history does not have to establish a system of names based upon representations that are difficult to analyse, but only to derive it from a language that has already been unfolded in the process of description. The process of naming will be based, not upon what one sees, but upon elements that have already been introduced into discourse by structure. It is a matter of constructing a secondary language based upon that primary, but certain and universal, language. But a major difficulty appears immediately. In order to establish the identities and differences existing between all natural entities, it would be necessary to take into account every feature that might have been listed in a given description. Such an endless task would push the advent of natural history back into an inaccessible never–never land, unless there existed techniques that would avoid this difficulty and limit the labour of making so many comparisons. It is possible, a priori, to state that these techniques are of two types. Either that of making total comparisons, but only within empirically constituted groups in which the number of resemblances is manifestly so high that the enumeration of the differences will not take long to complete; and in this way, step by step, the establishment of all identities and distinctions can be guaranteed. Or that of selecting a finite and relatively limited group of characteristics, whose variations and constants may be studied in any individual entity that presents itself. This last procedure was termed the System, the first the Method.
They are usually contrasted, in the same way as Linnaeus is contrasted with Buffon, Adanson, or Antoine–Laurent de Jussieu or as a rigid and simple conception of nature is contrasted with the detailed and immediate perception of its relations, or as the idea of a motionless nature is contrasted with that of a teeming continuity of beings all communicating with one another, mingling with one another, and perhaps being transformed into one another. . . . And yet the essential does not lie in this conflict between the great intuitions of nature. It lies rather in the network of necessity which at this point rendered the choice between two ways of constituting natural history as a language both possible and indispensable. The rest is merely a logical and inevitable consequence. From the elements that the System juxtaposes in great detail by means of description, it selects a particular few. These define the privileged and, in fact, exclusive structure in relation to which identities or differences as a whole are to be examined. Any difference not related to one of these elements will be considered irrelevant. If, like Linnaeus, one selects as the characteristic elements 'all the different parts related to fructification' (Linnaeus, Philosophie botanique, section 192), then a difference of leaf or stem or root or petiole must be systematically ignored. Similarly, any identity not occurring in one of these selected elements will have no value in the definition of the character. On the other hand, when these elements are similar in two individuals they receive a common denomination. The structure selected to be the locus of pertinent identities and differences is what is termed the character. According to Linnaeus, the character should be composed of 'the most careful description of the fructification of the first species. All the other species of the genus are compared with the first, all discordant notes being eliminated; finally, after this process, the character emerges' (Ibid., section 193)."

(Michel Foucault, The Order Of Things pp. 151–153)


