4 Results and Discussion
4.1 Introduction
In this chapter, we undertake a multifaceted examination of commencement addresses, combining Natural Language Processing (NLP) and text mining tools with qualitative analysis. The goal is to find patterns, similarities, and unique insights concealed within these rich narratives. After identifying recurring words and themes through frequency analysis, we use more complex modeling approaches to examine the underlying topics. We then connect these topics to the speakers’ occupations to build a more nuanced picture of the discourse. The chapter presents the findings, discusses their relation to the research questions, and draws conclusions based on the data.
4.2 Word Frequency Analysis (Unigrams, Bigrams, Trigrams)
In this section, we extract the most frequent words within each corpus (each professional field) and across all the corpora. We then extract the words unique to a single corpus and the top common words, defined as the words that appear in every corpus. The same process is then repeated for bigrams and trigrams.
To do that, after loading the required packages and importing our data frame, we split the text into single words using the unnest_tokens() function, which strips all punctuation and converts each word to lowercase for easy comparability. Our preprocess_f() function had already applied these and further cleaning steps when we built the data frame, so here the function serves only to tokenize the text. We follow the same process to tokenize bigrams and trigrams: in unnest_tokens(), the argument token = "ngrams" states that we want n-grams, with n = 2 for bigrams and n = 3 for trigrams. Table ?? presents the total number of unique n-grams. Here, by unique, we mean that no n-gram is counted more than once, so the figures represent the number of distinct word combinations.
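The tokenization step can be illustrated with a minimal, language-agnostic sketch. It is written in plain Python rather than with unnest_tokens(), so it only approximates that function's behavior; the helper names `tokenize` and `ngrams` and the toy sentence are assumptions made for illustration and are not part of the analysis pipeline.

```python
import re
from typing import List

def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return every run of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Toy text for illustration only; it is not taken from the corpus.
speech = "Make good art. Make it on the good days too."
tokens = tokenize(speech)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

# "Unique" in the sense used above: each distinct n-gram is counted once.
unique_counts = {n: len(set(ngrams(tokens, n))) for n in (1, 2, 3)}
```

In this sketch the n-grams run across sentence boundaries, because the text is treated as one token stream.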
Now we have a table with the columns speaker name, profession (field of activity), doc_id, year, and word. We can count the frequency of each word in each field. Since speeches vary in length, raw word counts grow with the length of the text. Therefore, we compute relative frequencies to get a fairer comparison. We have also defined a new column that counts the number of professions in which each word appears. This column helps us filter words that appear in only one profession or in all five professions: “Arts and Literature”, “Law and Politics”, “Business”, “Entertainment”, and “Academia”.
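The counting logic described above can be sketched as follows. This is a plain-Python illustration, not the code used for the analysis: the rows are made-up data, only three fields are shown, and normalizing by the total number of tokens in each field is an assumption about how the relative frequency is defined.

```python
from collections import Counter, defaultdict

# Hypothetical (profession, word) rows standing in for the tokenized table.
rows = [
    ("Business", "work"), ("Business", "rocket"),
    ("Business", "work"), ("Business", "people"),
    ("Academia", "people"), ("Academia", "work"), ("Academia", "hbcus"),
    ("Entertainment", "people"), ("Entertainment", "story"),
]

# Raw count of each word within each profession.
counts = defaultdict(Counter)
for profession, word in rows:
    counts[profession][word] += 1

# Relative frequency: count divided by the total tokens of that profession
# (assumed normalization, chosen so that longer corpora do not dominate).
relative = {
    profession: {w: c / sum(ctr.values()) for w, c in ctr.items()}
    for profession, ctr in counts.items()
}

# Number of professions in which each word occurs at least once.
n_professions = Counter(w for ctr in counts.values() for w in ctr)

# Words exclusive to one profession, and words shared by all of them.
exclusive = sorted(w for w, k in n_professions.items() if k == 1)
shared = sorted(w for w, k in n_professions.items() if k == len(counts))
```

The same profession count supports both filters used later in the chapter: a value of one marks an n-gram as exclusive to a field, and a value equal to the number of fields marks it as common to all.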
4.2.1 The Most Frequent Words
Figure 4.1 shows the word cloud of our most frequent words in all the corpora:

Figure 4.1: Most Frequent Unigrams in Corpora
In examining the most frequent words in these speeches, a pattern of emphasis on community, ethical behavior, and personal growth seems to emerge. The speakers appear to emphasize the potential and capacity each individual holds (“can”, “one”) and to highlight the power of proactive action (“get”, “go”, “make”, “work”).
Simultaneously, they underscore the importance of knowledge and understanding (“know”) throughout life’s journey (“life”, “year”). They also stress the interconnectedness of individuals within broader communities and humanity (“people”, “us”), and acknowledge the larger global context we are part of (“world”).
While analyzing the frequency of individual words (unigrams) provides a valuable first glimpse into these speeches’ main themes and topics, this approach has its limitations. It treats every word in isolation, ignoring the context in which words appear and the relationships between them.
Analyzing the most frequent words within each profession group may provide insights into the themes and subjects most relevant to each field, since each profession might have its own vocabulary and points of emphasis. Figure 4.2 shows the most frequent words for each field of activity of the speakers.

Figure 4.2: Top 10 Frequent unigrams in each Field
In the speeches across all professions, the words “can”, “people”, “good”, “one”, “life”, “year”, “know”, “make”, “think”, and “us” are the most common. These terms consistently appear in the top lists for each profession, which indicates their wide-ranging use in different fields. The prevalence of these words suggests that despite the diversity of professions, common themes resonate in the speeches made by individuals in these fields. These include the ability to do something (“can”), interactions and relations with others (“people”, “us”), quality and value assessments (“good”), individuality and singularity (“one”), time and experience (“year”, “life”), understanding (“know”), creation or action (“make”), and cognitive processes (“think”).
As we can see, numerous common unigrams are shared across different fields. However, to delve deeper into the specific linguistic nuances of each profession, we will next focus on words unique to each field, providing a more distinctive perspective on the language used within a particular domain.
4.2.2 The Most Frequent Unique n-grams
In this section, we extract the most common unigrams, bigrams, and trigrams that appear in only one of the professions. By “unique,” we mean “exclusive within a domain.” Analyzing unique n-grams within each profession allows us to capture the specific linguistic patterns and themes that characterize the discourse in each field.
Figures 4.3, 4.4, and 4.5 show unique unigrams, bigrams, and trigrams, respectively.

Figure 4.3: Top 10 unique unigrams in each field

Figure 4.4: Top 10 unique bigrams in each field

Figure 4.5: Top 10 unique trigrams in each field
Terms like “oberlin”, “dartmouth”, and “notre dame” name institutions where addresses were delivered. “new zealand” was probably used multiple times by Jacinda Ardern, the 40th prime minister of New Zealand.
Key terms in Academia, such as “hbcus”, “underfund”, and “minority serve institution” indicate the frequent discussion of historically black colleges and universities (HBCUs), the challenges of funding in academic institutions, and the importance of educational services for minority students.
The phrase “make good art” and the word “default” in Arts and Literature may suggest a strong emphasis on the creation of quality artwork and challenging the standard or conventional ways of creating art.
Words like “rocket” and phrases such as “big meaningful project” or “give everyone freedom” in the Business field suggest that speeches in the business sector often focus on innovation (symbolized by “rocket”), the importance of undertaking significant projects, and individual freedom.
Speakers in the field of Entertainment used phrases like “edge awareness”, “let light shine”, and “character define moment”. These phrases suggest that speeches in the entertainment sector often revolve around the importance of self-awareness, personal enlightenment, and the influence of defining moments on a person’s character.
Phrases like “can take grant” and “dr king say”, and terms like “cold war”, in Law and Politics suggest that political speeches often refer to historical events and figures, like Dr. Martin Luther King Jr., to discuss civil rights, freedom, and historical context, including international relations.
While the unique phrases and words have given us a detailed view of the distinct topics discussed in each profession, it is equally interesting to observe their common ground. Some words and phrases cut across all fields, revealing universal themes and concerns. The following section will delve into these shared elements that connect Academia, Arts and Literature, Business, Entertainment, and Law and Politics. The exploration of these commonalities will offer us insight into the collective narrative.
4.2.3 Common unigrams, bigrams and trigrams
4.2.3.1 Common unigrams
We can select the words that appeared in all the professions by keeping only the unigrams whose profession count equals five. These unigrams are shown in Figure 4.6.

Figure 4.6: Top Common Unigrams
Words like “get”, “go”, “make”, “work”, “know”, and “people”, which were among the most frequent words, also appear among the top common unigrams across all corpora. This observation is interesting because it suggests that, despite the diversity in the speakers’ backgrounds and professions, there is a commonality in the language they use to express their thoughts. Words like “get”, “go”, “make”, “work”, and “know” are action-oriented and reflect a shared focus on practicality, progress, and understanding. The speakers may focus on different aspects of the human experience, but the term “people” brings universality to their messages.
4.2.3.2 Common bigrams
The same process generates the common bigrams that appeared in all professions. The common bigrams are depicted in Figure 4.7.

Figure 4.7: Top Common Bigrams
Common bigrams are pairs of words that appear in speeches from all professional sectors. These shared bigrams can help us understand the shared vocabulary and ideas that cross professional boundaries in these talks. Given the context of commencement speeches, it is not surprising that the terms “commencement_address”, “family_friend”, “graduate_school” and “graduate_student” appeared frequently. They could be used when addressing the audience directly or discussing the speakers’ experiences.
Storytelling (“tell_story”) is a typical style in speeches to make the content more engaging and relatable. “year_ago”, “long_time”, “look_back” and “year_late” suggest that the speeches frequently reflect on past experiences and possibly discuss changes and developments over time.
“help_us” and “don_know” seem more related to the concept of wisdom. However, the actual meaning of these bigrams can depend heavily on their context within each individual speech. Further qualitative analysis would help to confirm and expand upon these initial interpretations. To do this, we extract all instances of these bigrams and choose some of the most relevant passages to analyze. We create a 40-word window around each keyword of interest in order to grasp the context and meaning of how the term is being used.
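A keyword-in-context extraction of this kind can be sketched as below. This is an illustrative plain-Python version, not the code behind the tables: the function name `context_windows` is hypothetical, the example sentence is invented, and taking the same number of words on each side of the match (`side`) is an assumption about how the 40-word window is split.

```python
from typing import List

def context_windows(tokens: List[str], phrase: List[str], side: int = 20) -> List[str]:
    """Return the words surrounding each occurrence of `phrase` in `tokens`.

    `side` is the number of words kept before and after the match; the
    window is clipped at the start and end of the text.
    """
    n = len(phrase)
    windows = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            start = max(0, i - side)
            end = min(len(tokens), i + n + side)
            windows.append(" ".join(tokens[start:end]))
    return windows

# Invented, already-cleaned token stream for illustration.
tokens = "be brave enough to say i don know when you do not".split()
hits = context_windows(tokens, ["don", "know"], side=3)
```

With `side=20` the window spans roughly 40 words around the bigram, and the same function works for trigrams by passing a three-word `phrase`.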
Table ?? shows the contexts in which the “don_know” bigram was used. Chimamanda Ngozi Adichie’s speech highlights the need to accept ignorance, which, according to Meacham (1983), is a vital part of wisdom. She encourages the audience to be brave enough to state, “I don’t know”, implying that wisdom frequently entails acknowledging one’s own limitations.
Engel’s speech highlights the significance of confronting unknown situations in order to learn and grow. This demonstrates a wise perspective on life, recognizing the value of experience and learning even in confusing or challenging situations, which links with Webster’s concept of grasping life’s unpredictable nature as a key to wisdom.
This part of Natalie Portman’s speech is connected with the balance theory of wisdom. It discusses the importance of understanding why you are pursuing achievement, indicating the wisdom in aligning actions with values. Acting without knowing one’s values might therefore not be considered wise, even if it brings great success.
4.2.3.3 Common trigrams
| Trigrams |
|---|
| president tessier lavigne |
| world war ii |
| deliver commencement address |
| every day life |
| board overseer member |
| happen right now |
| many year ago |
| take deep breath |
| can still remember |
| every single day |
| make difference life |
No trigram was shared by all five professions. However, the trigrams common to at least three professions in our dataset, presented in Table 4.1, provide another layer of insight into the recurring themes and specific phrases that the speakers tend to use. Some possible interpretations are as follows:
“Deliver commencement address” directly refers to the act of delivering the speeches being analyzed, so it is not surprising to find it as a common trigram among the addresses.
“President Tessier Lavigne” and “Board overseer member” relate to the administration of the universities where the speeches were given; the former refers to Marc Tessier-Lavigne, the president of Stanford University.
“World War II”, as the historical event, might be used as a reference point for lessons learned, historical context, or comparisons to current events.
“Many year ago” typically introduces a reflection on the past, whether personal memories or historical events.
“Every day life” and “Every single day” suggest a focus on daily routines or habits, implying that the speakers might be providing advice or insights that can be applied regularly.
“Happen right now” may suggest a focus on current events or recent developments, indicating that the speeches are responsive to contemporary issues.
“Take deep breath” could be a piece of advice given to the audience, often suggesting a moment of pause or reflection.
“Can still remember” is frequently used before the speaker shares a personal experience or reminiscence.
“Make difference life” is likely part of a larger discussion about impact, purpose, and the value of contributing positively to the world or to others’ lives.
Similar to common bigrams, context is crucial when interpreting these trigrams, especially “Make difference life”. While these interpretations can provide a general idea of the shared language and themes, the actual meaning will depend on each speech’s surrounding text. Therefore, we can take a look at the context around this specific trigram.
The windows around the trigram “make difference life” (“make a difference in life”) that appeared in “Law and Politics”, “Business”, and “Entertainment” (Table ??) provide good material for qualitative interpretation. Looking at the context can help us understand how this phrase is used across speeches.
Obama uses the phrase to speak about love, service, and making a difference in others’ lives. This is framed as a call to action for his audience, encouraging them to positively impact the world during their time on Earth. He thus uses the phrase to emphasize altruism and effecting change in the lives of others. This sentiment aligns with the views of McKenna, Karami, and Nonaka.
In his famous Stanford commencement address, Steve Jobs refers to his own life and how following his intuition and passion has made a difference in it. Here, the phrase is more inwardly focused, reflecting on personal growth and fulfillment. He underscores intuition’s role, reflecting McKenna’s definition of wisdom, which values non-rational and subjective elements when making decisions. However, based on the excerpts we are analyzing, Jobs does not explicitly advocate for a balanced approach to addressing intrapersonal, interpersonal, and extrapersonal interests, as Sternberg’s definition of wisdom would suggest. Jobs’ primary focus seems to lie more on personal passion.
Oprah Winfrey speaks about her aspiration to make a difference in people’s lives, which aligns with Nonaka’s perspective that wisdom is crucial for employing political power to initiate action: she discusses her ambition to positively influence people’s lives and the world via journalism as her [political] power.
We can see from these examples that the phrase “make a difference in life” is used to refer to making a difference in the lives of others (in the sense of service or influence) as well as in one’s own (in the sense of personal growth or fulfillment). This implies that the speakers are urging graduates to pursue both personal fulfillment and a broader societal impact in their future endeavors.
Having completed our analysis of word frequencies and common and unique n-grams, we have gained a valuable understanding of our corpus’s keywords and phrases. However, the frequency of certain words or phrases does not necessarily reveal the emotional tone or sentiment of the content. Therefore, it is essential to expand our analysis to investigate the underlying sentiments carried by speakers. This leads us to the next crucial aspect of our text analysis: sentiment analysis.
4.3 Sentiment Analysis
Before the sentiment analysis, we must create a customized sentiment lexicon that suits our data. This is an important step, as it allows us to refine our tools to better fit our needs. We identified words that could influence the sentiment score in a way that is not accurate or useful for our analysis. Terms such as “obama”, “musk”, “gates”, “jobs”, “harvard”, and “stanford” might be associated with positive sentiments due to the prestige of these individuals or institutions. In the context of our analysis of speeches, however, these are most likely neutral references to individuals or places and thus should not influence the sentiment of a speech. We have therefore set their sentiment scores to neutral (0) in our customized lexicon.
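The idea of neutralizing selected terms can be sketched in a few lines. The example below is a plain-Python illustration with a made-up lexicon: the words and scores in `base_lexicon` are invented and do not come from any published sentiment lexicon, and `neutralize` is a hypothetical helper rather than a function of the package used in the analysis.

```python
# Toy polarity lexicon; the scores are invented for illustration only.
base_lexicon = {"good": 0.75, "afraid": -0.75, "jobs": 0.25, "gates": 0.1, "harvard": 0.5}

# Names of people and institutions that should not carry sentiment.
neutral_terms = ["obama", "musk", "gates", "jobs", "harvard", "stanford"]

def neutralize(lexicon, terms):
    """Return a copy of `lexicon` in which every listed term scores 0.

    Terms missing from the lexicon are added with a score of 0, so they
    stay neutral even if a later lexicon update introduces them.
    """
    custom = dict(lexicon)
    for term in terms:
        custom[term] = 0.0
    return custom

custom_lexicon = neutralize(base_lexicon, neutral_terms)
```

Working on a copy keeps the original lexicon available, which makes it easy to compare scores with and without the customization.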
With the lexicon customized for our data, we are ready to run the sentiment analysis. We perform a sentence-level sentiment analysis on the original transcripts using a combination of the get_sentences() and sentiment_by() functions. That is, we do not use n-grams to generate sentiment scores, which preserves the speeches’ original context and semantic structure; this can be critical for properly identifying sentiment. Sentence-level sentiment analysis is frequently more accurate than n-gram-level sentiment analysis because it captures the context in which the words are used.
Valence shifters such as negators (e.g., “not”, “n’t”, “never”) can change the polarity of a phrase, and sentimentr’s algorithm handles this by default. For example, the phrase “I am not afraid” would typically be given a positive sentiment score, even though “afraid” is a negative word, because “not” negates it. The sentiment_by() function in the sentimentr package is designed to calculate text polarity at the sentence level quickly while taking negation into account.
The results are presented in two parts: the distribution of sentiment within each profession, and the way sentiment varies over the course of the speeches in each field.
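The effect of a negator on a sentence score can be made concrete with a deliberately simplified sketch. It is not sentimentr’s algorithm: the toy lexicon, the two-word lookback, the parity rule for repeated negators, and the scaling by the square root of the sentence length are all assumptions chosen to keep the example short.

```python
import math
import re

lexicon = {"afraid": -1.0, "good": 1.0, "happy": 1.0}  # toy scores
negators = {"not", "never", "no"}

def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def sentence_score(sentence, lookback=2):
    """Sum the polarity of known words, flipping the sign of a word when an
    odd number of negators appears in the `lookback` words before it."""
    words = re.findall(r"[a-z']+", sentence.lower())
    total = 0.0
    for i, word in enumerate(words):
        if word in lexicon:
            score = lexicon[word]
            negations = sum(w in negators for w in words[max(0, i - lookback):i])
            if negations % 2 == 1:
                score = -score
            total += score
    # Scale by sentence length so long sentences do not dominate.
    return total / math.sqrt(len(words)) if words else 0.0

text = "I am not afraid. I am afraid."
scores = [sentence_score(s) for s in split_sentences(text)]
```

In this toy setting the first sentence receives a positive score and the second a negative one, which mirrors the “I am not afraid” example in the text.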
4.3.1 Profession and Sentiment
To visualize the distribution of sentiment across professions, we use the geom_jitter() function, which adds a small amount of random variation to each point’s position. This makes the plot easier to read by reducing overlap. On top of that, we overlay box plots to show the median and the upper and lower quartiles. The result is shown in Figure 4.8. Each point on the plot represents a document, colored according to whether its sentiment is negative (less than -0.1, red), neutral (between -0.1 and 0.1, gray), or positive (greater than 0.1, green).

Figure 4.8: The distributions of sentiment across each field
The visualization presented in Figure 4.8 also illuminates the extreme sentiment values extracted from each profession. These values represent the documents associated with the most positive and negative sentiment scores. As mentioned above, the geom_jitter() function has been employed to improve the visibility of individual data points within each profession category. Consequently, the position of the labels might not perfectly align with the respective points horizontally. However, given that the labeled points correspond to the extreme values within each profession, their identification with the corresponding labels should be straightforward.
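The two plotting conventions described for Figure 4.8, the three sentiment bands and the horizontal jitter, can be sketched as follows. This is a plain-Python illustration rather than the plotting code itself: the document scores are hypothetical, and the jitter width of 0.2 is an arbitrary assumption, not the value used for the figure.

```python
import random

def sentiment_band(score, threshold=0.1):
    """Label a document score using the bands described for Figure 4.8."""
    if score < -threshold:
        return "negative"   # drawn in red
    if score > threshold:
        return "positive"   # drawn in green
    return "neutral"        # drawn in gray

def jitter(x, width=0.2, rng=random):
    """Shift a plotting position by a small random amount to reduce overlap."""
    return x + rng.uniform(-width, width)

# Hypothetical document scores for one profession (not thesis data).
scores = {"doc_a": 0.31, "doc_b": 0.04, "doc_c": -0.18}

rng = random.Random(42)  # fixed seed so the jitter is reproducible
# Each point: (jittered x position of the category, score, color band).
points = [(jitter(1.0, rng=rng), s, sentiment_band(s)) for s in scores.values()]
```

Because only the horizontal position is perturbed in this sketch, the vertical position of each point still equals its sentiment score, which is why labels may drift sideways from their points but not up or down.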
Speeches from all professions show both negative and positive sentiments. However, the sentiment analysis of the addresses across many professional sectors demonstrates a positive trend. There does not seem to be a clear correlation between profession and sentiment score. Within the scope of the investigation, the academic discourse was the least negative in comparison to the other professional sectors. This outcome aligns with expectations, as the realm of academia often addresses various life facets, encompassing psychological elements like personal growth, self-esteem, and life management, and economic factors such as operational skills and the pursuit of economic success. The entertainment industry followed closely behind, with a lower level of negative sentiment. In contrast, professions such as arts and literature, as well as law and politics, were shown to have higher levels of negativity.
To better understand the context, we can extract the original paragraphs that display intensely positive or negative sentiment. Appendix IV presents the original paragraphs corresponding to the most positive and most negative expressions in each profession.
In academia, the most positively rated paragraph (Simmons_13) celebrates Harvard’s impact, highlighting the university’s influence achieved through scholarly and professional contributions. This positivity for academia’s role in societal advancement stands in contrast with the negative sentiment found in Gawande_29, which expresses uncertainty and anxiety about the impact of a human-made crisis. It captures the inherent uncertainty of the real world.
In Arts and Literature, the very positive sentiment of Wallace_29 stems from a critique of modern civilization and a demand for real freedom. It aligns with several definitions of wisdom, emphasizing the importance of self-awareness, empathy, and a broader perspective that seeks the common good. For instance, the passage’s emphasis on being ‘truly able to care about other people and to sacrifice for them’ aligns with Grossmann’s definition, in which wisdom balances self-interest with the interests of others, values truth, and cares for humanity.
This capacity is not simply to act based on one’s self-interest but to consider the interests of others, a quality that Sternberg also identifies. However, when referring to “sacrificing for others”, the passage seems to strongly emphasize interpersonal interests, potentially at the expense of intrapersonal ones. The alignment proposed above therefore requires further clarification. The key to resolving this apparent contradiction lies in adopting a broader, more nuanced understanding of “sacrifice.”
Historically, numerous individuals have demonstrated such selflessness and commitment to a cause greater than themselves, such as the sacrifices made during civil rights movements worldwide. In these contexts, “sacrifice” takes on its most profound meaning, often involving the ultimate sacrifice of one’s life. The decisions they made and the actions they took required an understanding of the complex dynamics they faced. Much like many other aspects of human experience, wisdom is inherently subjective and multifaceted. It is not a one-size-fits-all concept but rather a deeply individual process that relies on personal experiences, values, and circumstances.
In addition, it is essential to note that while these extreme sacrifices are illustrative of certain aspects of wisdom, they are not a prerequisite for wisdom. Wisdom can be presented in numerous ways and does not always require such extreme demonstrations. It can be displayed in daily actions and decisions. Wisdom does not mandate an always future-oriented perspective or a preference for the interests of future generations over one’s own life. Rather, it requires awareness and consideration of these longer-term effects, even if the final decision ultimately prioritizes immediate or personal needs.
Therefore, “sacrifice” does not necessarily imply a severe loss to oneself (intrapersonal interest). Instead, it can be perceived as acts of kindness, empathy, and understanding towards others (interpersonal interests) that might require one to deprioritize immediate personal desires occasionally. This balance reflects the essence of phronesis. While these acts may seem to be “sacrifices” in the short term, they could potentially contribute to long-term intrapersonal interests by fostering relationships, personal growth, and a sense of purpose.
Still, a more explicit discussion of how one balances these differing interests in the context of wisdom is necessary. It is a complex task, and, as noted by Sheppard, leaders often need to navigate paradoxes and make complex decisions that balance various aspects of these interests. While not explicitly discussed in this section of Wallace’s speech, this delicate balance is inherent in many definitions of wisdom and is an integral part of navigating life in a wise manner.
On the other hand, the paragraph with the most negative sentiment (Wallace_17) criticizes the greed of consumer society, particularly of those who recklessly use fossil fuels without regard for future generations. Wallace voices his displeasure at wasteful conduct that contributes to climate change. It is interesting how he criticizes unwise actions: the reckless consumption of fossil fuels may satisfy short-term intrapersonal interests, but it neglects interpersonal interests (current societal needs) and extrapersonal interests (the well-being of future generations and the planet).
In business, Sandberg_70, characterized by positive sentiment, reflects on the importance of gratitude, even in challenging times. The concept of practical wisdom as presented by Aristotle, involving the capacity to determine suitable behavior in specific circumstances, can be recognized in Sandberg’s adaptation to the loss of her spouse. In contrast, paragraph 35 of Bill Gates’ speech, which receives a negative score, enumerates a series of infectious diseases such as malaria and hepatitis B.
In entertainment, OBrien_23 offers practical advice to graduates in a positively scored paragraph, aiming to equip them for life after graduation. Brown_44, showing a negative sentiment, illustrates a critique of society’s fear of enlightenment and change.
In Law and Politics, the positive sentiment of MichelleObama_10 stems from an appreciation of community, commitment to service, and social justice. Contrasting this is the negative sentiment in Baron_14, which expresses concern over threats to free expression and the importance of an independent press for a functional democracy. This depiction starkly contrasts with several elements of wisdom. For instance, both Sternberg’s and Karami’s definitions emphasize achieving a common good and positive ethical value. Suppressing free expression for the sake of personal power directly conflicts with these principles.
Having completed a comprehensive sentiment analysis of the speeches, evaluating positive, negative, and neutral sentiment scores and identifying the most emotionally charged paragraphs, we now look more closely at the emotional content of these speeches. We have seen how different sentiments are expressed, but understanding the specific emotions may help us better understand the speakers’ messages. The next phase of our investigation focuses on a detailed emotion analysis, exploring the presence and impact of particular emotions such as fear, anger, and trust.

Figure 4.9: Emotional spectrum across professions
Figure 4.9 shows the spectrum of emotions across different professions.
Fear is often considered a negative feeling in the general human experience, with its relevance varying greatly from individual to individual based on unique circumstances, experiences, and cognitive-emotional frameworks. This subjectivity is vital to emphasize since emotions do not have universally set valences. However, fear appeared as a highly prominent sentiment in our text-based study. Fear often signifies a threat or potential harm. It can be related to the wisdom dimension of “awareness and management of uncertainty” mentioned by the Berlin School, as recognizing fear and addressing it effectively requires an understanding of uncertainties and potential risks.
Trust, on the other hand, is among the most positive emotions and is highest in all professions. Trust can be associated with the “relativism of values” dimension of wisdom, as it often involves understanding and respecting diverse perspectives and values. It also relates to the “interpersonal” dimension outlined by Sternberg, emphasizing the importance of fostering healthy interpersonal relationships. In addition, trust implies a certain level of “moral maturity,” as mentioned by Karami, as it requires ethical behavior and a commitment to honesty and integrity.
Anticipation, another positive emotion, shows a forward-looking perspective. This is indicative of the “lifespan contextualism” facet of wisdom outlined by the Berlin School, as it requires a long-term perspective and understanding of the evolving context of life. Anticipation also suggests elements of Sternberg’s “adaptation to, shaping of, and selection of environments” dimension of wisdom, as it involves envisioning future scenarios.
Sadness, prevalent in Academia, Arts and Literature, and Business, often stems from experiences of loss or failure, which are inherent uncertainties in life. In this context, wisdom might involve acknowledging these feelings and learning to cope with such disappointments, enabling resilience and growth.
In Entertainment, the presence of both anger and sadness suggests a mix of emotional challenges. Anger can connect to the wisdom dimension of “relativism of values,” as it often arises from perceived injustices or violations of personal or social norms. In this case, wisdom might involve understanding diverse perspectives and finding constructive ways to express and manage this strong emotion.
In Law and Politics, the prominent emotion of anger may arise from injustices or conflicts. Sternberg’s dimension of wisdom regarding achieving a common good by balancing interests also resonates here, as it may imply striving for justice and the greater good.
This emotion-wisdom connection offers insights into how different professional contexts might cultivate wisdom, demonstrating the links between emotional experiences and the various dimensions of wisdom. In the next section, we try to understand how the speakers conveyed their wisdom to graduates.
4.3.2 Timing and Sentiment
As we continue our investigation of wisdom leadership in speeches, it is critical to understand not only what is said but also how it is presented. The tone of a speech may change over its course, creating a cadence that keeps the audience’s attention. In this part, we look at the timing of sentiment in speeches from the various professions. By evaluating how sentiment evolves from the beginning to the end of a speech, we can gain insight into how leaders plan their discourse for optimum impact, perhaps adding new dimensions to our knowledge of wisdom leadership. Figure 4.10 shows the results.
In this graph, the y-axis represents the average sentiment of each profession. Positive values indicate a generally positive sentiment, while negative values denote a negative sentiment. The x-axis represents the position of each paragraph within a speech, expressed as a percentage of the speech’s length. It allows us to see how sentiment changes over the course of the speeches.
The graph has been plotted using the geom_smooth() function. A smooth line, created by local regression using the LOESS method (Cleveland, 1979), is drawn for each profession to show the general trend of sentiment throughout the speeches. The span parameter of the LOESS function controls the degree of smoothing. Values of span typically range from 0.1 (a highly local model in which each regression is based on 10% of the data) to 1 (a global model in which each regression uses all data points). We have chosen span = 0.8 to reduce noise and to favor larger patterns over smaller, local variations.
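To make the role of the percentage axis and of span concrete, the sketch below places paragraphs from speeches of different lengths on a common 0 to 100 scale and then smooths them with a small LOESS-style routine. It is a plain-Python illustration under stated assumptions: the position formula, the two toy speeches, and the use of local linear fits with tricube weights are simplifications, and R’s loess() has its own defaults (for instance, it can fit local quadratics), so the numbers would not match those behind Figure 4.10.

```python
import math

def position_percent(n_paragraphs):
    """Position of each paragraph as a percentage of the speech length."""
    return [100 * (i + 1) / n_paragraphs for i in range(n_paragraphs)]

def loess(xs, ys, span=0.8):
    """Local linear regression with tricube weights, evaluated at each x.

    `span` is the fraction of the data used for each local fit.
    """
    n = len(xs)
    k = min(n, max(2, math.ceil(span * n)))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest neighbour.
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))  # weighted line evaluated at x0
    return fitted

# Two hypothetical speeches with different numbers of paragraphs.
speech_a = [0.30, 0.15, 0.05, 0.20]
speech_b = [0.28, 0.22, 0.12, 0.08, 0.06, 0.10, 0.18, 0.26]
points = sorted(
    (x, y)
    for speech in (speech_a, speech_b)
    for x, y in zip(position_percent(len(speech)), speech)
)
xs = [x for x, _ in points]
ys = [y for _, y in points]
trend = loess(xs, ys, span=0.8)
```

Lowering `span` makes each local fit depend on fewer neighbouring paragraphs and so follows short-lived fluctuations more closely, while values near 1 keep only the broad shape of the curve.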

Figure 4.10: Sentiment of the speeches in each profession, from start of speech to end
The findings from the sentiment analysis indicate commonalities in the sentiment path across speeches from diverse professions. Generally, each discourse appears to begin with a positive sentiment. This pattern could potentially be attributed to the typical structure of commencement speeches, which often start with congratulations aimed at the graduating students and their families. However, subsequent to this initial phase, there is a noticeable reduction in sentiment. Except for speeches from the academic domain, a gradual decline in average sentiment can be observed towards the midpoint of these discourses. Interestingly, this decline is followed by an increase as the speakers approach their conclusion. This trend suggests a characteristic sentiment structure in commencement speeches, despite the variation in the professional backgrounds of the speakers.
When these trends are interpreted through the storytelling framework outlined by O’Hara (2014), it is clear that the speakers have effectively used storytelling tactics in their talks. These commencement addresses appear to follow a narrative arc consisting of an introduction, a challenge, and a resolution, much like well-crafted stories.
In contrast to the other professional domains, speakers within the academic field demonstrate a distinct sentiment path. There appears to be a peak in positive sentiment around the midpoint of the speeches. A conservative interpretation of this anomaly could suggest that it might be related to the inclusion of more stories within the speech content. However, this is a preliminary interpretation that would require further investigation for substantiation.
The commencement speech, much like a story, is initiated with a message. This phase, marked by highly positive sentiments, often contains greetings, congratulations, and the establishment of a shared sense of achievement. As the speeches move into the challenging phase, sentiments become less positive. This could be reflective of the speakers sharing their personal experiences, struggles, and the lessons they learned. Thus, the speakers make the narrative more authentic and relatable to the audience.
Finally, the narrative arc is completed by the rising sentiment near the end of the talks. This is the resolution phase, in which the initial message is reinforced, typically in conjunction with a call to action or the transmission of wisdom to the graduates, thereby increasing engagement and connection between the speaker and the audience. This narrative format, which mirrors the arc of a story, can increase audience engagement by allowing listeners to relate to the speaker’s experiences, appreciate their insights, and retain the wisdom conveyed.
Whereas sentiment analysis helped us comprehend the emotional dynamics of the speeches, the following section will employ topic modeling techniques to delve deeper into their content, uncovering the principal themes that characterize these discourses.
4.4 Topic Modeling
Having analyzed unique and popular terms across the professional fields and the sentiment of the speeches, we now investigate their thematic structure. In this part, we use topic modeling approaches to find latent patterns in the talks. We begin with Latent Dirichlet Allocation (LDA), a popular topic modeling approach. We then move on to Structural Topic Modeling (STM), a more sophisticated approach that adds metadata to the analysis. Finally, we employ Top2Vec, a newer approach that uses word embeddings to identify topics. Through these steps, we aim for a better understanding of the complexity of the discourse in commencement addresses from various professions.
4.4.1 LDA
LDA is a popular topic modeling approach that assumes each text is a mixture of topics and each topic is a distribution over words. As mentioned in Section 2.10.1, it uses an iterative process to assign topics to words and documents. By applying LDA to our collection of commencement addresses, we aim to discover its key topics and gain an essential understanding of the major themes.
There is no definitive approach for determining the ideal number of topics for LDA. Several measures, such as perplexity and topic coherence, have been proposed and are widely used. We first use the FindTopicsNumber function from the ldatuning package (Murzintcev, 2021), which computes several metrics to assess the quality of LDA topics. Figure 4.11 displays the results. This method identifies the optimal number of topics as the point where the Griffiths2004 and Deveaud2014 curves reach their maximum values and the CaoJuan2009 and Arun2010 curves reach their minimum values. Considering all four curves together, five, six, and seven topics are the most favorable choices.

Figure 4.11: The optimum number of topics determined by FindTopicsNumber function.
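One way to read the four curves jointly is to rescale each metric to the range 0 to 1 and average them, reversing the two metrics that should be minimized. The sketch below, in Python and for illustration only, does this with invented metric values; the composite score is our own illustrative device and is not something FindTopicsNumber computes.

```python
def normalize(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def rank_topic_numbers(ks, maximize, minimize):
    """Return (k, score) pairs sorted from best to worst composite score."""
    scores = [0.0] * len(ks)
    for values in maximize.values():
        for i, v in enumerate(normalize(values)):
            scores[i] += v          # higher metric value is better
    for values in minimize.values():
        for i, v in enumerate(normalize(values)):
            scores[i] += 1 - v      # lower metric value is better
    n_metrics = len(maximize) + len(minimize)
    averaged = [s / n_metrics for s in scores]
    return sorted(zip(ks, averaged), key=lambda pair: pair[1], reverse=True)

# Invented values for illustration; they are NOT the values behind Figure 4.11.
ks = [3, 4, 5, 6, 7, 8]
maximize = {
    "Griffiths2004": [-9100, -9050, -9000, -8990, -8995, -9030],
    "Deveaud2014": [1.10, 1.20, 1.32, 1.30, 1.29, 1.15],
}
minimize = {
    "CaoJuan2009": [0.20, 0.15, 0.10, 0.11, 0.12, 0.18],
    "Arun2010": [60, 52, 45, 44, 46, 55],
}
ranking = rank_topic_numbers(ks, maximize, minimize)
```

Such a score only narrows the set of candidates; the final choice among them still depends on how interpretable the resulting topics are.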
Human judgment remains a significant factor in determining the number of topics for LDA. The appropriate number is a judgment call that depends on what makes the most sense for a given application. We therefore examined the results for different numbers of topics ourselves and selected five topics, which gave the most meaningful results.

Figure 4.12: Top words that represent each topic.
Figure 4.12 illustrates the top terms that define each topic. While the results reveal some themes in our corpora, interpreting them is not straightforward. To understand the topics better, we use a heatmap of the distribution of topics over professions, examine the relation between documents and topics, and draw on sentiment analysis.
4.4.1.1 Heatmap
For each profession in our data set, we can compute the mean of the document-topic probabilities (the posterior probabilities from the LDA model). The resulting values give the profession’s average topic distribution. For example, according to the LDA model, topic 1 accounts on average for 96% of the content of business speeches. Figure 4.13 illustrates these proportions, which represent the average degree to which each topic is present in the speeches of the respective profession. As can be seen, each profession covers every topic to some degree, yet one or two topics dominate each group.

Figure 4.13: Distribution of topics over different fields
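The averaging behind the heatmap can be sketched as follows, in Python and for illustration only. The function name, the toy probabilities, and the two-topic setup are hypothetical; in the analysis, the document-topic probabilities come from the fitted LDA model.

```python
from collections import defaultdict

def mean_topic_share(theta, professions):
    """Average the document-topic probabilities within each profession.

    theta: one list of topic probabilities per document (each sums to 1).
    professions: the profession label of each document, in the same order.
    """
    sums = {}
    counts = defaultdict(int)
    for row, prof in zip(theta, professions):
        if prof not in sums:
            sums[prof] = [0.0] * len(row)
        sums[prof] = [s + p for s, p in zip(sums[prof], row)]
        counts[prof] += 1
    return {prof: [s / counts[prof] for s in total] for prof, total in sums.items()}

# Toy example with three documents and two topics (invented numbers).
theta = [[0.9, 0.1], [1.0, 0.0], [0.2, 0.8]]
professions = ["Business", "Business", "Academia"]
heatmap_rows = mean_topic_share(theta, professions)
```

Because every document’s probabilities sum to one, each profession’s averaged row also sums to one, so rows of the heatmap are directly comparable even when professions contribute different numbers of speeches.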
4.4.1.2 Topics-Profession Network
For a more in-depth analysis, we can extract the network of relations between topics and professions. Figure 4.14 shows this network.

Figure 4.14: Network of topics and professions
Topic 5, serving as a universal theme, appears in all professions except business, showing its broad relevance. In contrast, Topics 2 and 4 are distinctly associated with Law and Politics and Arts and Literature, respectively, revealing a more exclusive focus on their respective fields. The Business profession is remarkably influenced by Topic 1, suggesting that the key bigrams defining other topics may not significantly contribute to our interpretation in the business context. In the Entertainment field, Topic 5 predominates. Law and Politics professions demonstrate a close relationship with Topics 2 and 3, with almost equal representation of these themes. Arts and Literature speeches predominantly reflect Topic 4, but also incorporate elements of Topics 1 and 5. Within the Academic profession, the speeches seem to span a diverse range, with a stronger emphasis on Topic 5, followed by Topics 1 and 3, demonstrating the nature of academic discourse. Table 4.2 displays the most probable document associated with each topic.
| doc_id | profession | topic |
|---|---|---|
| Lewis_11 | Law and Politics | Political Considerations(2) |
| MichelleObama_11 | Law and Politics | Civil Rights and Global Issues(3) |
| Wallace_10 | Arts and Literature | Art and Truth(4) |
| Winfrey_10 | Entertainment | Personal Development and Awareness(5) |
| Zuckerberg_73 | Business | Inspiration and Purpose(1) |
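Selecting the most probable document for each topic amounts to taking, for every topic, the document with the largest document-topic probability. A minimal Python sketch under that assumption follows; the function name, document identifiers, and probabilities are invented for illustration.

```python
def top_document_per_topic(theta, doc_ids):
    """Map each topic index to the id of the document with the highest probability."""
    n_topics = len(theta[0])
    best = {}
    for k in range(n_topics):
        # Pair every document id with its probability for topic k and keep the largest.
        best[k] = max(zip(doc_ids, theta), key=lambda pair: pair[1][k])[0]
    return best

# Invented probabilities for three hypothetical documents and two topics.
doc_ids = ["doc_a", "doc_b", "doc_c"]
theta = [[0.7, 0.3], [0.4, 0.6], [0.1, 0.9]]
representatives = top_document_per_topic(theta, doc_ids)
```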
The results of the Latent Dirichlet Allocation (LDA) indicate the presence of five distinctive topics. Using visualization techniques, we presented the relationship between the topics and professions.
Topic 1 has terms like “think_good”, “one_day”, “sense_purpose”, and “can_get”. These phrases suggest the encouragement of positive thinking, envisioning a better future, and having a clear purpose. This topic is particularly prevalent in speeches from the Business and Academia professions. We can label this topic as “Inspiration and Purpose”.
Topic 2 includes terms like “new_zealand” and “notre_dame”. The prominence of “new_zealand” reflects the speech of Jacinda Ardern, who spoke as prime minister of New Zealand, while “notre_dame” reflects Barack Obama’s speech, which was delivered at Notre Dame. Other bigrams in this topic are “prime_minister”, “last_year”, and “independent_press”. This topic therefore appears to concern the political and social context in which the speeches were delivered. Not surprisingly, it is dominant in speeches within the field of Law and Politics. We call this topic “Political Considerations”.
Topic 3 features terms such as “civil_right”, “dr_king”, “lead_us”, “human_shield”, and “climate_change”. It hints at the discussion of civil rights, leadership, and pressing global issues. This topic is equally represented in speeches from Law and Politics and Academia fields. We label it as “Civil Rights and Global Issues”.
Topic 4 includes terms like “tell_truth”, “make_good”, “default_set”, “liberal_art”, and “good_art”. It suggests discussions around the importance of truth-telling, making good art, and the value of liberal arts. Unsurprisingly, this topic is most prevalent in speeches from the Arts and Literature profession. “Art and Truth” can be a good label for topic 4.
Finally, Topic 5, with terms such as “say_yes”, “look_like”, “pay_attention”, “light_shine”, can hold the label of “Personal Development and Awareness”. It predominantly appears in speeches from the Entertainment field. This topic seems to encourage personal development, paying attention to the world, and allowing one’s “light” to shine.
In conclusion, each topic offers insight into the themes emphasized by different professions. To continue our investigation, we turn to STM, which lets the speakers’ professions enter the topic model as metadata.
4.4.2 STM
As mentioned in Section 2.10.3, STM allows metadata to enter the topic model. “Metadata covariates for topical prevalence allow the observed metadata to affect the frequency with which a topic is discussed” (Roberts et al., 2019). We fit the model with the stm function, using “profession” as the topical prevalence covariate.
4.4.2.1 Number of topics selection
| Number of topics | Exclusivity | Semantic coherence |
|---|---|---|
| 4 | 9.044 | -254.863 |
| 5 | 8.997 | -258.137 |
| 6 | 9.225 | -246.623 |
| 7 | 9.227 | -254.914 |
| 8 | 9.234 | -257.221 |
Exclusivity and semantic coherence are key components of evaluating the quality of topics generated by STM. In the context of topic modeling, exclusivity is a measure that helps identify words that are most exclusive or unique to each topic. High exclusivity shows that a word is used frequently in a given topic but rarely used in other topics. This helps to distinguish one topic from another. A topic is considered semantically coherent if its top words are related or tend to co-occur in the same context, which usually makes the topic easier to interpret.
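To illustrate what semantic coherence measures, the sketch below implements the co-occurrence score of Mimno et al. (2011), on which, to our understanding, the semantic coherence reported by stm is based. It is written in Python for illustration only; the function name and the toy documents are hypothetical, and the code assumes every top word occurs in at least one document.

```python
import math

def semantic_coherence(top_words, documents):
    """Sum of log((D(v_m, v_l) + 1) / D(v_l)) over ordered pairs of top words.

    D(v) is the number of documents containing v, and D(v, w) the number
    containing both. Values closer to zero indicate more coherent topics.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(all(w in doc for w in words) for doc in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / doc_freq(top_words[l]))
    return score

# Toy corpus: "a" and "b" usually co-occur, while "a" and "c" rarely do.
documents = [["a", "b"], ["a", "b"], ["a", "c"], ["c"]]
coherent = semantic_coherence(["a", "b"], documents)
less_coherent = semantic_coherence(["a", "c"], documents)
```

The score is higher when a topic’s top words tend to appear in the same documents, which is why it is usually read together with exclusivity rather than on its own.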
When applying STM to our dataset, we look for high exclusivity and high semantic coherence at each candidate number of topics. High values of both indicate that the topics are distinct from each other and that each topic represents a clear, understandable concept or theme. However, there is often a trade-off between exclusivity and semantic coherence in topic modeling. Increasing one may decrease the other, so it is important to find a balance that optimizes the interpretability of the topics. Table 4.3 shows the exclusivity and semantic coherence for each number of topics. These considerations led to the decision to pick a seven-topic model.

Figure 4.15: Lack of convergence for the model with 5 topics
Convergence is another essential consideration in topic modeling. A model has converged when it reaches a stable solution, so that further iterations of the estimation process no longer change the outcome. If a model does not converge, it may not be a good fit for the data, or it may need additional iterations. For example, Figure 4.15 shows that the five-topic model does not converge even with a high number of iterations. A non-converging model can produce unreliable and unstable topic assignments, undermining the validity of our findings. Figure 4.16 shows that the seven-topic model converged, which supports treating it as a stable model for our data.

Figure 4.16: Convergence of the model with 7 topics
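A common way to operationalize convergence is to stop when the relative change in the model’s objective between successive iterations falls below a small tolerance. The Python sketch below illustrates this rule; the function name and the sequences of bound values are invented, and the tolerance of 1e-5 is an assumption that matches, as far as we are aware, the default used by stm.

```python
def has_converged(bounds, tol=1e-5):
    """Check whether the last iteration changed the objective by less than tol (relative)."""
    if len(bounds) < 2 or bounds[-2] == 0:
        return False
    previous, current = bounds[-2], bounds[-1]
    return abs((current - previous) / previous) < tol

# Invented objective values over successive iterations.
stable_run = [-1000.0, -900.0, -899.999]   # the final step barely moves
unstable_run = [-1000.0, -900.0, -850.0]   # still changing noticeably
```

Under this rule, a run that reaches the maximum number of iterations without the check ever returning true would be reported as not converged.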
4.4.2.2 Topics correlation

Figure 4.17: Correlation between topics
The inter-topic correlation of the derived topics was computed with the topicCorr() function in the stm package and visualized with the plot.topicCorr() function. The resulting plot (Figure 4.17) revealed no significant correlation among the seven topics, suggesting that each topic captures a distinct theme in the dataset.
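The idea of correlating topics can be sketched as the Pearson correlation between the columns of the document-topic proportion matrix. The Python code below is an illustration under that assumption, with invented proportions and hypothetical function names; it is not a reimplementation of topicCorr(), which, to our understanding, additionally keeps only correlations above a cutoff when drawing edges.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx < 1e-12 or syy < 1e-12:
        return 0.0  # a (near-)constant column has no defined correlation; report 0
    return sxy / math.sqrt(sxx * syy)

def topic_correlations(theta):
    """Correlation matrix between topics; theta holds one row of proportions per document."""
    columns = list(zip(*theta))
    return [[pearson(a, b) for b in columns] for a in columns]

# Invented proportions for three documents and three topics.
theta = [[0.6, 0.1, 0.3], [0.2, 0.3, 0.5], [0.4, 0.2, 0.4]]
corr = topic_correlations(theta)
```

Because the proportions in each document sum to one, topics compete for share, so some negative correlation between topics arises mechanically and should not be over-interpreted.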
| Topic 1 | Topic 2 | Topic 3 | Topic 4 | Topic 5 | Topic 6 | Topic 7 |
|---|---|---|---|---|---|---|
| unite_state | heart_mind | year_ago | make_good | also_read | tell_truth | foot_mouse |
| year_late | human_being | last_year | high_school | take_grant | social_medium | hungry_stay |
| one_day | outside_world | human_right | every_single | say_yes | new_zealand | stay_foolish |
| want_make | don_know | let_us | didn_know | graduate_class | thank_much | stay_hungry |
| harvard_graduate | change_world | berlin_wall | go_college | can_take | today_want | go_back |
| get_one | president_faust | dr_king | look_back | silicon_valley | default_set | president_tessier |
| commencement_address | year_old | climate_change | even_though | full_transcript | think_good | tessier_lavigne |
| tell_story | find_someone | don_get | teach_think | along_way | notre_dame | personal_computer |
| president_unite | florentino_cullar | independent_press | thing_happen | world_need | want_talk | know_go |
| make_world | know_know | make_mistake | sense_purpose | take_responsibility | public_service | try_figure |
These topics provide insights into wisdom and leadership through themes such as learning from past experiences, addressing global challenges, fostering collaboration, personal growth, and resilience. Each topic emphasizes different aspects of wisdom and leadership. However, an interpretation of the topics extracted from STM topic modeling of commencement addresses is more valuable when we consider the context of defined topics. Consequently, we have extracted the top two documents associated with each topic in Appendix V.
Topic 1:
It is evident that Oprah Winfrey effectively incorporates various dimensions of wisdom into her narrative. One notable aspect is her wisdom of learning from failure. She persuasively argues that failure is not an end but a redirection. This notion aligns with Grossmann’s idea of wisdom as recognizing and managing uncertainty. She acknowledges the presence of uncertainty, setbacks, and disappointments but encourages learning and growth from these experiences, an essential component of wisdom, according to Webster.
Winfrey also emphasizes the importance of an internal moral and emotional GPS, essentially advocating for self-regulation and moral maturity. This idea resonates strongly with Karami’s wisdom dimension, which highlights the application of successful intelligence and virtues in pursuit of a common good. This moral compass serves as a guiding principle for ethical behavior and decision-making, reflecting Karami’s emphasis on self-regulation and moral maturity as elements of wisdom.
The story of the Angel Network exemplifies Nonaka’s dimension of wisdom that involves the altruistic use of power. Oprah Winfrey employed her influence not for personal gain but to mobilize resources for the public good, manifesting the principle of altruism and the benevolent use of political power that Nonaka underscores.
Topic 2:
Zakaria’s address includes various dimensions of wisdom while he emphasizes the integration of heart and mind, self-reflection, ethical understanding, and societal transformation.
To begin with, the appeal to “heart and mind” echoes Karami’s view of wisdom as involving both affective (heart) and cognitive (mind) components. According to Zakaria, the qualities honored by human beings are those that integrate intelligence and emotion. “Intelligence, hard work, discipline, courage, loyalty and, perhaps above all, love and a generosity of spirit” are the qualities, he proposes, that can lead not just to a successful life (rewarded by the ‘outside world’) but also a fulfilled life (appreciated by those who know us best). It resonates with McKenna’s studies that have noted the importance of “seeking intrinsic personal and social rewards.”
Moreover, Zakaria’s advice to “Trust yourself; you know what you should do” reflects Grossmann’s view of wisdom as a form of self-reflection and self-understanding. The notion that one “doesn’t need an ethics course to know what you shouldn’t do” underscores that wisdom includes an innate understanding of one’s ethical and moral boundaries. Another possible explanation for this is that he advises graduates to consider “non-rational and subjective elements when making decisions,” parallel to McKenna’s point of view.
Further, the final encouragement, where he urges the listeners that by living such a life, they will “change the world”, resonates with Nonaka’s idea of wisdom as a catalyst for societal transformation. The assumption here is that by embodying these virtues, individuals can influence the world positively, reinforcing Nonaka’s understanding of wisdom as a force for the common good.
Topic 3:
The document Merkel_1 contains Angela Merkel’s short story about East Germany during the Cold War. Simple as it is, the story is a profound reflection of her life experiences and, when analyzed carefully, showcases various dimensions of wisdom. It is not only about her personal journey but also about her learning, adapting, and using her wisdom for the betterment of her country and society.
As she reflected on her past experiences, Merkel exhibited the ability to recognize her own limitations and the uncertainty of her situation - an essential facet of wisdom, according to Meacham and Taranto. She had to acknowledge that she did not know when the oppression would end or when she would be able to live in freedom.
Her narrative illustrates her empathy for the people who were suffering, which aligns with Ardelt’s definition of wisdom involving affective traits like empathy and concern for the well-being of others.
Furthermore, her experiences in East Germany during the Cold War, and her subsequent reflections upon them, have shaped her into a leader who prioritizes the common good, human rights, and freedom, all aligning with Nonaka’s concept of wisdom.
In terms of the topic modeling words, phrases such as “year_ago”, “last_year”, “human_right”, “berlin_wall”, “don_get”, and “make_mistake” are all interconnected with Merkel’s recounting of her experiences during the Cold War. Her reference to the “Berlin Wall” and the violation of “human rights” in East Germany aligns with the values she upheld and the challenges she faced. Her phrase, “I don’t know how often I thought that I just couldn’t take it anymore” captures the sentiment of “don_get”, highlighting the emotional and psychological strain she endured.
Topic 4:
Spielberg_4 presents several dimensions of wisdom. Spielberg talks about life as a series of character-defining moments, illustrating the concept of “reflection,” as emphasized by Webster in understanding life’s dialectical and uncertain nature. He reflects on his own journey, which is a critical aspect of Ardelt’s “reflective” dimension of wisdom that underscores self-awareness.
The narrative also aligns with Meacham’s interpretation of Socratic wisdom, where acknowledging one’s ignorance or uncertainty is seen as a type of wisdom. Spielberg admits that while he knew what he wanted to do at the age of 18, he did not know who he was, suggesting an understanding of his own limitations and the acceptance of his evolving identity.
Spielberg’s speech also connects with the topic-associated words from our STM topic modeling. The narrative shares themes of self-discovery and evolution (“didn_know”, “go_college”), reflective moments (“look_back”, “every_single”), recognition of a unique perspective (“teach_think”), and the importance of making morally grounded decisions (“make_good”), which underpin the various dimensions of wisdom.
Topic 5:
Oprah Winfrey’s address carries the essence of several dimensions of wisdom as defined by various scholars. Winfrey encourages Harvard graduates to utilize their skills and knowledge to achieve a common good, aligning with Sternberg’s concept of wisdom as the balance of interpersonal and extrapersonal interests.
Her storytelling about Michael Stolzenberg is an example for the wisdom dimension defined by Karami et al. – the adequate use of knowledge, intelligence, creativity, self-regulation, and moral maturity to solve critical problems. Michael’s resilience and determination to help others after overcoming his personal tragedy exemplify these characteristics, while his actions underline Grossmann’s definition of wisdom as morally-grounded excellence in social-cognitive processing. This story also resonates with McKenna’s viewpoint of wisdom as valuing humane and virtuous outcomes in decision-making, and “being practical and oriented towards everyday life.”
The narrative also matches the topic-associated words from our STM topic modeling. For instance, the bigram “along_way” ties in with Winfrey’s mention of challenges or setbacks that the graduates may encounter. The bigram “world_need” aligns with her encouragement to pursue what makes them come alive, as the world needs such individuals. The act of Michael taking responsibility to help fellow amputees resonates with “take_responsibility”.
Topic 6:
Simmons’s address displays some dimensions of wisdom as discussed by various scholars. She underscores the importance of institutional responsibility, historical awareness, and societal engagement, all vital elements of wisdom.
She speaks about the responsibility of Harvard not to rest on its achievements but to address disparities. This is close to Sternberg’s definition of wisdom, where he posits that wisdom involves using skills and knowledge to achieve a common good by balancing different interests over the long term. Simmons’s call for Harvard to use its influential status to address these disparities also aligns with Nonaka’s conceptualization of wisdom as prioritizing the common good and creating societal value. This suggests an understanding of the importance of decisions for resource distribution and societal engagement, which mirrors the Berlin School’s facet of rich procedural knowledge of life.
Simmons’s recognition of the work of historically black colleges and universities (HBCUs) and other minority serving institutions, despite the systemic underfunding and isolation they have faced, aligns with Grossmann’s view of wisdom. Grossmann regards wisdom as morally grounded excellence in social-cognitive processing, requiring an understanding of various contexts and perspectives and care for humanity.
Topic 7:
In Jobs’ address, “Stay Hungry” suggests a continual thirst for learning and improvement, while “Stay Foolish” encourages risk-taking and embracing the unconventional or unexpected paths that life may present. Ardelt’s wisdom dimensions include a cognitive component (knowledge application, critical thinking) and a reflective component (self-awareness, introspection). In “Stay Hungry. Stay Foolish.”, there is an emphasis on continual learning (cognitive) and on embracing unconventional paths or challenging the norm (reflective).
However, the phrase “Stay Hungry. Stay Foolish” inherently implies a focus on one’s personal aspirations and continuous growth. This emphasis, while potentially fostering individual achievement, creativity, and resilience, doesn’t explicitly address the balance of personal pursuits with the consideration and understanding of others’ needs and the broader societal context, a dimension prominently featured in many of the wisdom definitions.
According to McKenna’s, Sternberg’s, Karami’s, and Grossmann’s definitions, wisdom necessarily involves moral judgment, altruism, an orientation toward a common good, and balancing self-interest with others’ values and the care for humanity. Such wisdom facets advocate for an empathetic and ethical approach in one’s actions, suggesting a concern for the needs and interests of others, not just oneself.
In contrast, the message “Stay Hungry. Stay Foolish” doesn’t directly refer to these aspects of ethical decision-making, care for others, or an orientation toward a common good. The focus on personal goals and continual learning, while crucial in many respects, may potentially overshadow the necessity of empathy, altruism, and moral consideration in one’s actions. As such, it could be argued that while the phrase “Stay Hungry. Stay Foolish” inspires elements of wisdom, it may not contain the full spectrum of wisdom’s dimensions as defined by various scholars.
So far, our analysis has been limited to the highest-scoring paragraph for each topic in our structural topic model (STM). The stmBrowser package lets us go beyond this point. With it, we can inspect each document’s scores for all topics, not just the highest one, which helps us better understand the distribution of topics across our corpus. It also lets us explore the semantic polarity of the documents, that is, the positive or negative sentiment associated with each topic, and the distribution of documents across professions or speakers.
The detailed results and interactive visualizations of the Structural Topic Modeling analysis can be found at the following link: https://vis.nylyn.com/res/stm/, providing a comprehensive exploration of topic distribution, semantic polarity, and document classifications across various professions, speakers, and even locations.
4.4.2.3 Heatmap
Given the varying number of commencement addresses delivered by individuals from each field, before we plot the distribution of topics across different professions, the normalization of the topic distribution is a necessary step. Without normalization, a profession with a higher number of addresses, like Business with 9, could falsely appear to have a higher representation of certain topics merely because of its larger number of transcripts. Normalization allows us to truly compare the topic focus across different professions, irrespective of the size of their representation in the dataset.
The distribution of topics across different professions and throughout the entire corpus, depicted in Figure 4.18, varies considerably. When examining the corpus as a whole, it becomes evident that Topic 6 dominates discussion across all professions, with the highest representation found in business, followed closely by arts and literature. Meanwhile, Topic 4 is also significantly represented across all fields, particularly in arts and literature, business, and entertainment. Topics 1 and 7 have the least representation overall. This indicates that certain subjects, namely Topics 6 and 4, tend to resonate across various professions, while others are more specific to particular fields.

Figure 4.18: Distribution of topics across professions
We may now summarize our analysis of the seven topics spread among five professions, each of which offered different aspects of wisdom.
Analyzing the topic distributions across different professions sheds light on the unique characteristics and dominant themes evident in these discourse communities. The following discussion will interpret the topic distributions and consider how they reflect various dimensions of wisdom.
Topic 1, prominently present in all professions, was especially prevalent in the Entertainment sector, followed by Business and Academia. This topic is characterized by an emphasis on learning from failure, moral maturity, and the altruistic use of power, reflecting wisdom dimensions such as Grossmann’s recognition and management of uncertainty, Karami’s emphasis on self-regulation and moral maturity, and Nonaka’s focus on altruism and benevolent use of political power. These characteristics suggest a universally recognized set of wisdom dimensions cutting across various fields, with Entertainment professionals potentially demonstrating a heightened sensitivity towards them due to their broad audience and influential public roles.
Topic 2 appears to be most strongly associated with the field of Law and Politics, followed by Academia and Entertainment. This topic underscores the integration of heart and mind, self-reflection, ethical understanding, and societal transformation, reflecting wisdom dimensions from scholars such as Karami, Grossmann, and Nonaka. The strong representation in Law and Politics may stem from the inherent necessity of these wisdom dimensions in positions of governance and policymaking. The bigram “don_know” also features prominently within this topic; it is a common bigram that was investigated in the word frequency section of this chapter.
Topic 3, where Angela Merkel’s narrative of resilience and empathy in East Germany exemplifies various wisdom dimensions, including recognizing one’s limitations, empathy, and commitment to the common good, is dominated by Law and Politics. This result is unsurprising as such wisdom dimensions, captured by scholars like Meacham, Taranto, Ardelt, and Nonaka, are often deemed essential for leaders in this field.
Topic 4, which is dominated by Business and Arts and Literature, reflects Spielberg’s journey of self-discovery, acceptance of uncertainty, and reflective life moments. This topic aligns with wisdom dimensions from Webster, Ardelt, Meacham, and Sternberg, possibly reflecting these professions’ need for personal creativity and adaptability in rapidly evolving fields.
Topic 5, most prevalent in Business and Entertainment, emphasizes Winfrey’s narrative of fulfilling the highest expression of oneself, utilizing skills for common good, and appreciating personal and social rewards. This topic encapsulates wisdom dimensions from scholars such as Sternberg, Karami, Grossmann, and McKenna, highlighting the need for personal development and societal contribution in these fields.
Topic 6, overwhelmingly represented in Business, resonates with Simmons’ emphasis on institutional responsibility, historical awareness, and societal engagement, reflecting wisdom dimensions from Sternberg, Nonaka, and the Berlin School. This finding underscores the critical role of wisdom in leading and managing organizations effectively in a business context.
Topic 7, while comparatively underrepresented across all professions, demonstrates the wisdom dimensions inherent in Jobs’ “Stay Hungry. Stay Foolish.” Despite its emphasis on personal aspirations and continuous growth, it arguably lacks a broader consideration of societal context, underscoring a potential limitation in representing the full spectrum of wisdom.
Overall, the topic distribution suggests that different wisdom dimensions may be more prominent or relevant in certain professional contexts. However, it also highlights the universal nature of wisdom, with all professions reflecting its various aspects to differing extents.
4.4.3 Top2Vec
Under the umbrella of topic modeling, we employ the innovative algorithm Top2Vec. This unique approach automatically extracts topics from the text data, not just by clustering words, but by leveraging their shared semantic meanings captured in word embeddings. These topics are semantically coherent and robust to the scale of the dataset.
In this section, we begin with the creation of Word2Vec embeddings, which lay the foundation for semantic relationships among words. Next, we turn to the construction of a ‘Terms Network’, a visualization tool that allows us to inspect these relationships. Following this, we delve into the extraction of Top2Vec topics, clusters of words that represent the unique themes in the corpus. Finally, we close this section by analysing the distribution of these topics across documents, using ‘Top2Vec Document Vectors’, providing a comprehensive view of the prevalence and significance of each theme. Each subsection represents a crucial step in our journey to uncover hidden thematic structures in relation to wisdom.
4.4.3.1 Terms Network
In the network analysis that follows, we use word2vec to investigate the notion of wisdom as conveyed in commencement addresses. At its core, the word2vec algorithm builds on the premise that words appearing in similar contexts (that is, surrounded by similar neighboring words) are semantically related. By representing the words of these transcripts as high-dimensional vectors, word2vec captures latent semantic relationships, which allows us to depict the interconnections among the terms that characterize the concept of wisdom.
Our chosen words are aimed at reflecting the multidimensional nature of wisdom as defined in various scholarly discourses. Our goal is to consolidate existing wisdom-related themes found in the literature while also being motivated to unveil potential new dimensions that might expand our understanding of wisdom. Figure ?? serves as a visual representation of these connections, offering a visual platform for exploring the concept of wisdom within our data set.

Word network generated using Word2Vec
This graph visualizes word embeddings (from the doc2vec model) and their similarity to each other. The nodes (points) represent words, and the edges (lines) represent the similarity between these words. It is not a correlation graph in the statistical sense: although it relies on a notion of similarity (such as cosine similarity), the exact measure is determined by the doc2vec model’s internal calculations. This kind of visualization helps to understand the semantic relationships among a set of words.
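The construction of such a network can be illustrated with a minimal sketch. It assumes that each word already has an embedding vector and that an edge is drawn whenever the cosine similarity of two words reaches a chosen threshold. The toy vectors, the threshold of 0.9, and the function names are illustrative assumptions and are not taken from the model used in this study.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_edges(embeddings, threshold):
    """Return (word_a, word_b, similarity) for every pair at or above threshold."""
    edges = []
    for a, b in combinations(sorted(embeddings), 2):
        sim = cosine(embeddings[a], embeddings[b])
        if sim >= threshold:
            edges.append((a, b, round(sim, 3)))
    return edges


# Toy 3-dimensional vectors, invented for illustration only.
toy_embeddings = {
    "wisdom": [0.9, 0.1, 0.2],
    "conscience": [0.8, 0.2, 0.3],
    "vote": [0.1, 0.9, 0.1],
}

# Hypothetical threshold; a real analysis would tune this value.
edges = similarity_edges(toy_embeddings, threshold=0.9)
for word_a, word_b, sim in edges:
    print(f"{word_a} -- {word_b} (similarity {sim})")
```

With these toy vectors only the pair "conscience" and "wisdom" is close enough to be linked. Lowering the threshold would produce a denser graph, so the choice of threshold shapes how the network reads.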
Exploring the context of these nodes (i.e., the specific paragraphs in our data where these words appear) may uncover additional insights about the nature of wisdom. Let’s delve into the specific contexts of selected nodes to discern how they further contribute to our understanding of wisdom.
BillGates_64: “The complexity makes it hard to mark a path of action for everyone who cares and makes it hard for that caring to matter. Cutting through complexity to find solutions runs through four predictable stages: Determine a goal. Find the highest impact approach. Deliver the technology ideal for that approach and in the meantime use the best application of technology you already have.”
Hastings_4: “The changes to the world since Stanford was founded are breathtaking. The change rate over the rest of your lives will be exponentially higher, creating opportunity – as well as risk – for you and humanity. As the world speeds up, will our wax wings melt? Or will we bend the arc of the moral universe toward justice? To find answers, let’s look to the past.”
Rowling_43: “Of course, this is a power, like my brand of fictional magic, that is morally neutral. One might use such an ability to manipulate, or control, just as much as to understand or sympathise.”
BillGates_116: “I hope you will judge yourselves not on your professional accomplishments alone but also on how well you have addressed the world’s deepest inequities. On how well you treated people a world away, who have nothing in common with you but their humanity.”
Spielberg_15: “But make sure this empathy isn’t just something that you feel. Make it something you act upon. That means vote. Peaceably protest. Speak up for those who can’t and speak up for those who may be shouting but aren’t being heard. Let your conscience shout as loud as it wants if you’re using it in the service of others.”
BillGates_113: “And with that awareness, you likely also have an informed conscience that will torment you if you abandon these people whose lives you could change with modest effort.”
Simmons_10: “I believe that each of us has a solemn duty to learn about and embrace that difference. That undertaking takes not a month or a year but a lifetime of concerted action to ensure that we are equipped to play a role in caring for and improving the world we inhabit together. This responsibility should encourage us to commit to our individual as well as professional role in advancing access, equality and mutual respect.”
Interpreting the results of the graph and the specific context in which these words are embedded reveals fascinating patterns that enrich our understanding of wisdom. The selected terms form a dense network of meanings depicting wisdom’s complexity and multifaceted nature, corroborating the richness and depth of our data selection. The network is organized around several central nodes, including “wisdom”, “moral”, “conscience”, “action”, and “judge.” These nodes are interconnected via paths that seem to suggest these concepts’ dynamic and interdependent nature in the discourse of wisdom.
The key terms and their interrelations in the semantic network clearly exhibit the components of Sternberg’s Balance Theory of Wisdom. The term “judge”, connected to both “humanity” and “goals”, embodies the idea of achieving a common good by balancing various interests. The ability to judge implies analytical and practical abilities and the capability to balance short and long-term goals while considering the broader implications for humanity. Furthermore, the term “moral”, associated with “humanity”, underscores the centrality of positive ethical values in wisdom, a primary principle of the Balance Theory. The use of “moral” in the context of the collected speeches implies a value system that considers the welfare of others and the collective good of humanity.
Moreover, the term “action”, associated with “complexity” and “improving”, mirrors the Berlin School’s idea of wisdom as a pragmatic concept and judgment about complex and uncertain matters. The term, as used in the speeches, implies a practical understanding of complex situations and an active pursuit of improvement—elements that resonate strongly with Sternberg’s understanding of practical intelligence or tacit knowledge.
For instance, “moral” directly connects to “humanity”, which further highlights the ethical implications of wisdom as underscored by McKenna, who argues that practical wisdom involves leading a morally upright life and developing healthy interpersonal interactions. Similarly, “wisdom” directly connects to “conscience”, and “conscience” further connects to “empathy” through the intermediary concept of “act.” This pattern appears to resonate with the definition by Grossmann, who describes wisdom as morally grounded excellence in social-cognitive processing and emphasizes the importance of empathy, reflection, and a sense of moral balance.
“Judge” stands as a vital node in the network, connecting to both “goals” and “action” through the concept of “improving.” This aligns with Sternberg’s definition that a person is wise to the extent that they use their knowledge to achieve a common good by balancing various interests over different temporal scales through ethical means. Notably, “action” also emerges as a crucial node in the network. This association is consistent with Karami’s assertion that wisdom involves adequate knowledge, intelligence, and creativity for problem-solving, again emphasizing the pragmatic aspects of wisdom.
Furthermore, the presence of the terms “democracy” and “vote” is also noteworthy. In its ideal form, democracy supports the selection and development of leaders who exemplify honesty and the best interests of the people. The concept of “democracy” being associated with “decision” underpins the societal dimensions of wisdom. As a result, it is analogous to Rooney and McKenna’s notion of wisdom as a collective obligation that prevents the development of toxic leaders, which accords with democratic norms.
Lastly, the direct link between “wisdom” and “conscience”, which is further connected to “respect” via “policies”, accentuates the importance of intrapersonal, interpersonal, and extrapersonal considerations in wisdom. The excerpts highlight that conscience motivates actions that respect the interests of others and confront global inequalities. These features resonate profoundly with the Balance Theory’s emphasis on wisdom as the ability to balance personal, others’, and broader societal interests.
In conclusion, the semantic network and contexts drawn from the speeches provide a robust illustration of Sternberg’s Balance Theory of Wisdom. The prominence of judgment, moral considerations, practical actions, and a conscientious approach towards balancing varied interests, all in service of addressing complex challenges, aligns remarkably well with Sternberg’s nuanced definition of wisdom.
4.4.3.2 The Most Semantically Relevant Paragraph to Wisdom
We can find the paragraph with the closest semantic connection to definitions of wisdom by using the word2vec model. The model allows us to measure the semantic distance between each paragraph of our corpus and the fed paragraph, a new paragraph that we prepared as a conceptual essence of wisdom as defined in the literature. The found paragraph, the one that most closely embodies the underlying principles of wisdom, is the paragraph with the highest cosine similarity to the fed paragraph. The fed and found paragraphs are displayed in the following grey boxes.
“moral and intellectual development virtue practical rich factual knowledge, rich procedural knowledge, lifespan contextualism, relativism of values, and awareness and management of uncertainty. common good through a balance of different interests and perspectives, long term as well as the short term ethical values, humility, and concern for others. the adequate use of knowledge, intelligence and creativity, self-regulation, openness and tolerance, altruism and moral maturity, and sound judgment to solve critical problems Morally grounded wisdom balances self-interest with others, values truth, cares for humanity. excellence in social-cognitive processing involves considering different contexts, perspectives, short and long-term effects, thinking reflectively and dialectically, and being aware of limitations and subjectivity of thinking. inspiring and empowering others, the use of reason and careful observation, allowing for non-rational and subjective elements when making decisions, valuing humane and virtuous outcomes, being practical and oriented towards everyday life, and being articulate. understanding the aesthetic dimension of work and seeking intrinsic personal and social rewards. ability to communicate effectively, build relationships, and inspire others in a balanced way, depending on the specific situation they are facing. prudent judgments lead to decision-making for the good of the organisation and society. perception and understanding of people, things, and events quickly. creating contexts for meaningful interactions. using metaphors and stories to convey tacit knowledge. employing political power to mobilise action. mentoring and cultivating practical wisdom in others. emotional regulation, humor, critical life experiences, reflectiveness/reminiscence, and openness to experience.”
Simmons_11: “Thus, I believe that the task of a great university is not merely to test the mettle and stamina of brilliant minds but to guide them toward enlightenment, enabling thereby the most fruitful and holistic use of their students’ intelligence and humanity. That enlightenment suggests the need for improving upon students’ self-knowledge but it also means helping them judge others fairly, using the full measure of their empathy and intelligence to do so. In an environment rich in differences of background, experience and perspectives, learning is turbo charged and intensified by the juxtaposition of these differences. Those open minded enough to benefit fully from the power of this learning opportunity are bound for leadership in this time of confusion and division. The Harvard model intentionally and successfully provides to students a head start in understanding how to mediate difference in an ever more complex reality in which some exploit those differences for corrupt purposes.”
In Ruth Simmons’ perspective, several key dimensions of wisdom are apparent:
Self-Knowledge and Self-Regulation: Simmons identifies the university’s role as guiding students towards enlightenment, which entails an enhanced understanding of one’s own capacities and limitations. This self-knowledge, along with the ability to regulate one’s own behavior, is a central aspect of wisdom. It facilitates the development of personal integrity and the ability to learn from past experiences and mistakes.
Empathy and Altruism: According to Simmons, universities must encourage students to judge others fairly using empathy and intelligence. This capacity to empathize, understand, and care for others is a significant dimension of wisdom. It nurtures an individual’s altruistic tendencies and promotes interpersonal relationships.
Sound Judgment and Decision-Making: Simmons posits that through exposure to diverse backgrounds, experiences, and perspectives, students are equipped to make fair judgments and mediate differences. This ability to make informed, balanced, and ethical decisions in complex situations is a crucial aspect of wisdom.
Understanding and Appreciating Diversity: The paragraph stresses the importance of being open-minded and appreciating differences in backgrounds, experiences, and perspectives. Understanding and appreciating diversity supercharges the learning process and prepares students for leadership in a time of confusion and division.
Leadership and Social Responsibility: Simmons highlights the potential for students who embrace this diverse learning environment to become leaders. Leadership, in this context, implies the capacity to use one’s wisdom to promote a common good, balancing various interests, and making decisions that consider both immediate and long-term consequences. This emphasis on leadership underscores the social responsibility aspect of wisdom.
Ethical Values: Simmons mentions the corrupt purposes for which differences can be exploited. This suggests the importance of “relativism of values” as a dimension of wisdom. It entails making morally upright decisions and behaving ethically even under pressure.
These dimensions of wisdom, as outlined by Simmons, are not only crucial in the academic setting but also apply broadly to personal and other professional contexts. They provide a solid foundation for nurturing wise individuals capable of effectively navigating an increasingly complex and diverse world.
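The retrieval step described at the start of this subsection can be sketched as follows. The sketch assumes that each paragraph is represented by the average of its word vectors and that the found paragraph is the one with the highest cosine similarity to the fed paragraph. The averaging scheme, the toy vectors, and the identifiers `doc_a` and `doc_b` are hypothetical, and the actual model may compose paragraph vectors differently.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(tokens, word_vectors):
    """Average the vectors of the tokens that have an embedding."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        raise ValueError("no token has an embedding")
    return [sum(column) / len(known) for column in zip(*known)]


def most_similar_paragraph(query_tokens, paragraphs, word_vectors):
    """Return the id of the paragraph closest to the query, plus all scores."""
    query_vec = mean_vector(query_tokens, word_vectors)
    scores = {
        pid: cosine(query_vec, mean_vector(tokens, word_vectors))
        for pid, tokens in paragraphs.items()
    }
    best_id = max(scores, key=scores.get)
    return best_id, scores


# Toy 2-dimensional word vectors, invented for illustration only.
toy_vectors = {
    "moral": [1.0, 0.0],
    "judgment": [0.8, 0.2],
    "empathy": [0.9, 0.1],
    "thank": [0.0, 1.0],
    "honored": [0.1, 0.9],
}
# Hypothetical paragraph ids and tokens.
toy_paragraphs = {
    "doc_a": ["empathy", "judgment"],
    "doc_b": ["thank", "honored"],
}
fed_tokens = ["moral", "judgment"]

best_id, scores = most_similar_paragraph(fed_tokens, toy_paragraphs, toy_vectors)
print(best_id, scores)
```

In this toy setting the paragraph containing "empathy" and "judgment" is retrieved, because its averaged vector points in nearly the same direction as the fed paragraph.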
4.4.3.3 Top2Vec Topics
Finally, we concentrate on Top2Vec’s key output: semantically coherent topics. Each topic is a cluster of words in the semantic space that conveys a distinct theme in the corpus. We determine the major themes in our dataset by analyzing these topics. This allows us to move away from individual words and toward a linked web of themes. The complete result of Top2Vec is shown in Appendix VI.
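How a topic and its words arise can be illustrated with a simplified sketch. In Top2Vec, documents and words share one embedding space, a topic vector is the centroid of a dense cluster of document vectors, and the topic words are the words nearest to that centroid. The sketch below assumes the cluster labels are already given and therefore omits the dimensionality reduction and density-based clustering steps. All vectors, labels, and function names are toy assumptions.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def topic_vectors(doc_vectors, labels):
    """Compute one centroid per cluster label from the document vectors."""
    groups = defaultdict(list)
    for vec, label in zip(doc_vectors, labels):
        groups[label].append(vec)
    return {
        label: [sum(column) / len(vecs) for column in zip(*vecs)]
        for label, vecs in groups.items()
    }


def top_words(topic_vec, word_vectors, n):
    """Return the n words whose vectors are most similar to the topic vector."""
    ranked = sorted(
        word_vectors,
        key=lambda w: cosine(topic_vec, word_vectors[w]),
        reverse=True,
    )
    return ranked[:n]


# Toy document vectors with assumed cluster labels (illustration only).
toy_docs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
toy_labels = [0, 0, 1, 1]
toy_words = {
    "happiness": [1.0, 0.1],
    "courageous": [0.9, 0.2],
    "thank": [0.1, 1.0],
    "honored": [0.2, 0.9],
}

centroids = topic_vectors(toy_docs, toy_labels)
topic_words = {label: top_words(vec, toy_words, n=2) for label, vec in centroids.items()}
print(topic_words)
```

Because topic words are chosen by proximity to a document cluster rather than by raw counts, a word can characterize a topic without being frequent in it.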
Topic 0 appears to emphasize the notion of ‘Development and Improvement’ with words such as “distance”, “gradually”, “end”, “enable”, and “reached”, possibly indicating the trajectory of personal and societal progression. Moreover, the documents related to this topic touch on the urgency of responsible algorithm development, the fluidity of plans and goals, and the importance of persisting in connecting communities. Concerning wisdom, these narratives echo elements of cognitive and reflective dimensions of wisdom. They suggest an understanding of a broader context, recognition of uncertainty, and an ability to adapt and respond to change, aligning with Sternberg’s “Wisdom as a form of Reasoning” perspective.
Topic 1 seems to center around ‘Personal Choices and Attitudes’ with words such as “compare”, “follow”, “happiness”, “courageous”, and “defined”. The documents in this topic invite reflections on personal choices, attitudes toward challenges, the essence of authenticity, and the balance between personal goals and pragmatic constraints. This topic likely resonates with the reflective and affective dimensions of wisdom. It captures reflections on self and others, emotions, and values, consistent with Ardelt’s ‘Three-Dimensional Wisdom Model’.
Topic 2 highlights ‘Communal Responsibility and Global Challenges’ with words like “stronger”, “potential”, “teams”, “governments”, and “diversity”. The associated documents underscore the power of relationships, the sense of community (conveyed by words such as “we”, “our”, and “teams”), global responsibilities, and the pursuit of social justice. This topic resonates with wisdom’s reflective and cognitive dimensions, suggesting a broad understanding of social matters and a capacity for empathy and compassion, particularly relevant to the ‘Berlin Wisdom Paradigm’.
Topic 3 focuses on “Personal Experiences and Challenges”, featuring words like “finished”, “wondered”, “kept”, “reading”, and “launched”. The documents reveal stories of personal loss, critical self-reflection, challenges, and aspirations. This topic, while not directly tied to a particular wisdom model, may represent a form of experiential wisdom, suggesting that wisdom could stem from personal experiences and the reflective process they prompt.
Topic 4, distinctly apart from the wisdom framework, emphasizes “Formality and Gratitude,” as illustrated by the words “thank”, “inviting”, “honored”, and “graduates”. The associated documents are largely expressions of gratitude and honor, customary to commencement addresses. Nonetheless, they offer a valuable insight into the contextual nature of the analyzed speeches.
These findings support the multifaceted character of wisdom, which encompasses personal decisions, communal obligations, personal narratives, and social betterment. While not every topic clearly ties to wisdom, taken together they present wisdom as a multidimensional construct and highlight the wisdom embedded in commencement addresses.
4.4.3.4 Heatmap
For each profession in our dataset, we can calculate the mean of the document-topic similarities generated by the Top2Vec model. These values give the average topic profile of each profession. The heatmap in Figure 4.19 depicts these averages, which represent the extent to which each topic is present in the speeches of the corresponding profession. As can be observed, each profession addresses every topic to a certain degree, but one or two themes predominantly characterize each group.

Figure 4.19: Distribution of topics over different fields
This heatmap visualizes the distribution of topics across the professions. Stronger colors indicate a higher average presence of that topic within speeches given by professionals of the corresponding field.
One of the most salient features of the heatmap is the strong prevalence of Topic 4 (‘Formality and Gratitude’) across all professions. This may be attributable to the ceremonial nature of commencement addresses, which often start with acknowledgments and expressions of gratitude.
Additionally, Topic 1 (‘Personal Choices and Attitudes’) appears to have a significant presence across the sectors of academia, arts and literature, business, and entertainment, suggesting these fields emphasize personal growth, decision-making, and individuality in their addresses.
In contrast, Topic 2 (‘Communal Responsibility and Global Challenges’) is notably more prominent in speeches from law and politics professionals. This reflects the societal and global focus inherent in these fields.
Overall, this heatmap serves as a visual summary of how different thematic elements are distributed across speeches from various professional sectors, revealing unique patterns that characterize each profession’s discourse.
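The averaging behind the heatmap can be sketched in a few lines. The sketch assumes that each document comes with a profession label and a list of similarities to each topic, and it returns one row of mean similarities per profession, which corresponds to one row of heatmap cells. The toy similarity values and the function name are illustrative assumptions, not results from this study.

```python
from collections import defaultdict


def mean_topic_similarity(rows):
    """Average per-topic similarities over all documents of each profession.

    rows: iterable of (profession, [similarity to topic 0, topic 1, ...]).
    """
    grouped = defaultdict(list)
    for profession, sims in rows:
        grouped[profession].append(sims)
    return {
        profession: [sum(column) / len(column) for column in zip(*sim_lists)]
        for profession, sim_lists in grouped.items()
    }


# Toy document-topic similarities for two topics (illustration only).
toy_rows = [
    ("Business", [0.2, 0.6]),
    ("Business", [0.4, 0.2]),
    ("Academia", [0.5, 0.1]),
]

heatmap_values = mean_topic_similarity(toy_rows)
for profession, means in heatmap_values.items():
    print(profession, [round(m, 2) for m in means])
```

Since the cells are means of similarities rather than counts, a profession with few speeches is weighted the same as one with many, which is worth keeping in mind when comparing rows.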
4.5 Conclusion
In conclusion, text mining techniques such as word frequency analysis, sentiment analysis, and topic modeling have been instrumental in deriving meaningful insights from the corpus of commencement speeches without the need to read every text in full. Each technique offers its own perspective and strengths, and each has its limitations. Furthermore, the results of each technique have been analyzed through the lens of various dimensions of wisdom, drawing connections between the findings and the wisdom concepts proposed by various scholars.
Word frequency analysis offers insights into the most frequently discussed terms and ideas within the corpus. It identifies the dominant narratives in different domains like Academia, Arts and Literature, Law and Politics, and others. However, it operates at a very surface level and lacks the ability to capture complex semantic and syntactic structures. The results from this technique often need to be interpreted with caution due to its context-agnostic nature. Therefore, the context of some of the most frequent words was extracted and analyzed.
Sentiment analysis unveils the emotional arcs of the speeches, revealing how the sentiment changes throughout the narrative. This gives us an understanding of how speakers create engagement, express empathy, and deliver wisdom throughout their stories. However, sentiment analysis, much like word frequency analysis, can sometimes fail to capture nuances, especially when sentiments are expressed subtly or indirectly.
Topic modeling, through methods such as LDA, STM, and Top2Vec, provides more nuanced and in-depth insights. It helps to identify latent themes across the speeches, bringing out the essence of the messages delivered by speakers from various fields. Yet, topic modeling can sometimes produce less intuitive results, demanding interpretation from the analysts.
We observed that each analytical approach provides distinct yet interconnected insights, so the techniques complement one another when applied in combination. Topic modeling surfaces latent themes, sentiment analysis adds the context of emotional dynamics, and word frequency analysis delivers a quick scan of the key terms. Together they provide a holistic perspective in which each technique reinforces the results of the others. For instance, while topic modeling might discover the themes around ‘civil rights’ and ‘global issues’, sentiment analysis can further expose the emotional nuances when such topics are discussed. Similarly, word frequency analysis can quickly verify these themes by indicating the high occurrence of relevant terms.
The analysis is enriched by interpreting the text mining results through the lens of various dimensions of wisdom proposed by scholars like Ardelt, Grossmann, Karami, Sternberg, and others. This approach bridges the gap between raw textual data and meaningful insights about wisdom, showcasing how commencement speeches encapsulate various aspects of wisdom as defined by these scholars.
The limitations of these techniques underline the importance of combining multiple techniques and incorporating human interpretation to derive the most accurate and comprehensive understanding from the data. While these techniques can capture various elements of the speeches, they cannot fully grasp the depth of wisdom conveyed. As a starting point, however, they provide us with a robust framework to systematically analyze a large volume of text and filter valuable insights. The interplay of these techniques, coupled with the integration of wisdom dimensions, offers an intriguing avenue for exploring wisdom within commencement speeches.