To hype or not to hype

From Tulip Mania in 1637, the first speculative bubble, to the more recent cryptocurrency craze, we have observed the effects of hype throughout our history. In technology specifically, every decade has its own set of buzzwords and trends: PCs in the 80s, the internet in the 90s, social media in the 2000s, the cloud and AI in the 2010s. With the internet spreading to every corner of the planet, the number of potential users (or consumers) keeps growing rapidly, and so do the opportunities for capital, intellectual and social gains. The fast adoption of new platforms is usually followed by enormous hype and misconceptions, which often lead to big systemic problems1. The internet bubble of the 90s ended in a market crash. The Cambridge Analytica scandal (and many others) exposed deep and impactful flaws in social media and data privacy practices. And now that we are in the initial phase of adopting Artificial Intelligence in real-world applications, the hype could once again put us in a bad position, as we observe a gap between the reality of the field and public concerns and perceptions.

Marketing strategies and terminology help build a false notion around technology subjects. In the case of AI, consumers are hit with endless advertising that promises smarter smart assistants and magical algorithms that understand them on an unprecedented level. Adding to that, the high technical complexity of these systems warrants the use of vague terms for communicating with the public. The "Intelligence" of different AI systems, for example, varies vastly in terms of capacity and complexity. This often leads to a false idea of equivalence between two fundamentally distinct ways of thinking2, as exemplified by sentiment analysis algorithms that are not capable of understanding emotions as humans do but try to label them anyway. The truthfulness of these promises is hard to evaluate because it is difficult to grasp how AI works. This lack of transparency about the inner operations of the systems means that sometimes even the system creators find it challenging to identify problems and ways of fixing them3.

In popular culture, there is a tendency to represent AI as anthropomorphised. Characters are usually humanoid robots, carrying person-like characteristics such as relationships and gender. WALL-E is a garbage-collecting robot and has no biological organs of any sort, let alone genitalia, but it is still perceived as male thanks to its behaviour, appearance, and relationship with another (seemingly female) robot4. This way of representation is distant from the reality of our uses for AI, which are much more focused on data than emotion, but using human-like entities in the narrative is better for provoking the viewer. Another recurrent theme is the dystopian trope of man versus machine, showing the destiny of humankind at the brink when confronted by a similarly intelligent being, as in productions such as Ex Machina, Westworld and Terminator5. This might make for good cinema, but again represents dangers that seem detached from the direction in which AI has been developing.

Unless someone works in technology fields related to AI or is deeply interested in the subject, they access it through the aforementioned media, which do not accurately represent the current (and probable future) state of AI technologies. Marketing delivers unmeasurable promises, and popular culture misrepresents. This problematic way of tackling complicated technical topics can have a real effect on how they develop.

Exaggerated expectations and fears about AI, together with an over-emphasis on humanoid representations, can affect public confidence and perceptions. They may contribute to misinformed debate, with potentially significant consequences for AI research, funding, regulation, and reception. (“Expectation vs. Reality: AI Narratives In The Media”)

Since 2009, there has been a rapid rise in media coverage of AI6. The expectations and perceptions created in popular culture and marketing leak into the news, where clickbait articles suffer from the same problems: a shallow understanding of the subject and a focus on unimportant aspects. New developments in the field are grossly misinterpreted and sensationalized as journalists try to position them in an apocalyptic timeline. A natural language generator developed by Facebook output conversations with English words: two bots would talk to each other and negotiate products. In the tests, however, the AI model generated a semantic alternative to English that made no sense to humans but was rated well by the AI system. Getting unexpected and nonsensical results is common in the process of developing an AI model, but a few months later news sources had taken these findings out of context and exaggerated them, comparing the episode to the human decimation setting of Terminator.

Meanwhile, there are pressing matters to address in the field of Artificial Intelligence. Prediction algorithms used by governmental and private institutions have shown bias or prejudice in many cases. And with the fast adoption cycle we are currently going through, these issues are often identified too late. We saw this happening as early as 1988. The St. George's Hospital Medical School in London was found guilty by the former Commission for Racial Equality of "practicing racial and sexual discrimination in its admissions policy"7. The software being used by the institution to screen candidates "unfairly discriminated against women and people with non-European sounding names". In more recent years, police departments in the US have started using AI systems to forecast criminal activity. According to Richardson et al., places like Chicago, New Orleans and Maricopa County employ these programs8. Because it is trained on skewed data, the prediction software echoes pre-existing biases in the police records. In 2011, the United States Department of Justice found that the police department of New Orleans repeatedly engaged in patterns of misconduct, including racial profiling and LGBTQ discrimination. Months later, the city deployed a predictive policing system trained on the data which, as revealed by the DOJ, was inaccurate and biased, and perpetuated the racial disparity problems9.

Amongst stakeholders involved in technology development, policy and law related to AI, there seems to be a consensus that the systems rely heavily on the source data used to train them10. Problematic AI systems likely suffer from bad data, an aspect that the software architects cannot control. When the datasets used to train models are small, fragmented, and heterogeneous, such as those used at St. George's Hospital Medical School and in New Orleans, the output is significantly limited.

[...] that model is limited to the data that it processed, the data it was trained on. That's its entire universe of knowledge. So if that data set is incomplete or if it's biased, it's going to result in a biased result for the end user. ("Good systems, bad data?: Interpretations of AI hype and failures")

For AI systems to avoid this problem, more homogeneous data is needed. Prediction software, especially software built on top of deep-learning models, requires a huge volume of data to work well. Better-represented groups of society, such as light-skinned, economically well-off men, have more accurate data available and are thus favoured by the AI when compared to racial minorities, who are less represented in data volumes11. To solve this, data collection needs to be revamped so as to increase its volume and resolution. However, after many episodes of data privacy violations, is it in the public interest to support this expansion of collection and control? Companies know this is a sensitive topic and try to find smart ways of gathering data from users. Google hides a data labelling program in its spam prevention system reCAPTCHA, making users train its models without realizing it. FaceApp serves as a photo filter for selfies, but in practice is an immense database of high-resolution facial data from its millions of users.

In a market where innovation, novelty and being the first are prioritized, failed products slip through the cracks left open by light regulation and a lack of technical and ethical standards, in an effort to feed the hype. If we are to continue developments in Artificial Intelligence in a responsible manner, we need better tools and resources to uphold accountability and good practices, especially for governments and law enforcement agencies. More data than ever will be needed for such endeavours, and all spheres of society should participate in deciding priorities. The misrepresentation of AI disrupts public attention and debate around the subject, directing criticism towards software developers when the focus should be on designing data standards that reinforce privacy and equality instead of undermining them. Academic, political, and governmental entities do not have enough open lines of communication with society, which directly affects how technologies are socialized and communicated. People who are directly affected by technologies need to understand what they are and how they are used, and that understanding may help avoid a repetition of history when it comes to this particular hype.


1 Alok Aggarwal, "The Birth of AI and The First AI Hype Cycle" (2018).

2 Adrian A. Hopgood, "Artificial Intelligence: Hype or Reality?", Computer 36, no. 5 (2003): 1-2, accessed on December 5, 2020.

3 Stephen C. Slota et al., "Good systems, bad data?: Interpretations of AI hype and failures", Proceedings of the Association for Information Science and Technology 57, no. 1 (2020): 2.

4 Roman Barrosse, "Is WALL-E Male or Female?", published June 25, 2020, eve-the-great-gender-debate-407befd7d932.

5 Raquel Magalhães, "Expectation vs. reality: AI narratives in the media", published October 18, 2019.

6 Ethan Fast and E. Horvitz, “Long-Term Trends in the Public Perception of Artificial Intelligence.” ArXiv abs/1609.04904 (2017).

7 Stella Lowry, Gordon Macpherson, "A blot on the profession", British Medical Journal 296, no. 6623 (1988): 657.

8 Rashida Richardson et al., "Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice", New York University Law Review 94 (2019). Accessed December 5, 2020, on

9 Richardson, 214.

10 Slota, 4-6.

11 Slota, 5.

Aggarwal, Alok. "The Birth of AI and The First AI Hype Cycle", KDnuggets (2018), accessed December 5, 2020.

Barrosse, Roman. "Is WALL-E Male or Female?", published June 25, 2020, the-great-gender-debate-407befd7d932.

Fast, Ethan and E. Horvitz. “Long-Term Trends in the Public Perception of Artificial Intelligence.” ArXiv abs/1609.04904 (2017).

Hopgood, Adrian A. "Artificial Intelligence: Hype or Reality?", Computer 36, no. 5 (2003): 1-2, accessed on December 5, 2020.

Lefford, F. and V. van Someren. “A blot on the profession.” British Medical Journal (Clinical research ed.) 296, no. 6623 (1988): 1065 - 1065. Accessed on December 5, 2020, at

Magalhães, Raquel. "Expectation vs. reality: AI narratives in the media". Published October 18, 2019.

Richardson, Rashida, J. Schultz and K. Crawford. “Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice.” (2019). Accessed December 5, 2020, on

Slota, Stephen C., Kenneth R. Fleischmann, Sherri R. Greenberg, Nitin Verma, Brenna Cummings, L. Li and Chris Shenefiel. “Good systems, bad data?: Interpretations of AI hype and failures.” Proceedings of the Association for Information Science and Technology 57 (2020).