The Use of Algorithms in Modern Society: How Data Collection and Organization Contribute to the Corporate Sector

Privacy, a concept long debated, dismissed, and fought for, is commonly believed to be a natural right, one whose infringement deserves harsh punishment regardless of who the violator is. This view holds especially true when it comes to the perceived intrusion of government agencies or commercial entities into the personal lives of individuals. Many people consider aspects of their lives such as the places they visit, the people they communicate with, the things they purchase, and their other online activity to belong exclusively to them, and they do not believe that companies should have access to that information without their approval.

Examples of what some see as this uninvited invasion of privacy include the tracking of online activity, such as browsing history, by companies like Google and Facebook. According to Ethan Cramer-Flood of eMarketer, Facebook and Google currently hold a “duopoly” over the advertising market (Cramer). They maintain it by collecting massive amounts of user data and using that data to personalize how individual users of their services are treated, for example by tailoring the content each user is shown. Objections to and endorsements of this profit model together call into question whether these practices are legal, sustainable, and, most importantly, ethical.

To assess the validity of both viewpoints on this contentious question, it is important to have a fundamental understanding of data. As defined by a prominent data operations center, data is “a collection of facts (numbers, words, measurements, observations, etc) that has been translated into a form that computers can process” (Import.io). Data can take a variety of forms and, as the Import.io definition suggests, often appears numerically in general statistics and other trends. As computing power has increased drastically over the past decade, however, data has increasingly, and rightly, come to be thought of as the unique, personalized information of individual people rather than merely the broad statistical trends in behavior that it once described.

Equally important as understanding the basic meaning of data is understanding the means by which it is collected, from the moment a “user” performs a documentable action to the moment that action is used to change that user’s experience. According to QuestionPro, a trusted service offering surveys and other feedback tools to companies like Google, Microsoft, Amazon, and Disney, “data collection is defined as the procedure of collecting, measuring and analyzing accurate insights for research using standard validated techniques” (Bhat). Data collection often focuses on the user base of a specific product or service, on the premise that this knowledge will improve how those individual users interact with that product or service.

Data collection can take place through explicit activities like surveys or through more discreet methods such as monitoring the usage of websites and applications. As stated by Wired, “Even seemingly benign activities, like staying in and watching a movie, generate mountains of information, treasure to be scooped up later by businesses of all kinds” (Matsakis). All of this data, according to Business News Daily, falls under one of four categories: personal data, engagement data, behavioral data, and attitudinal data (Freedman). Day-to-day examples of this kind of collection include a local business issuing customer satisfaction surveys to its clients in order to improve its processes; a hospital tracking ER patient statistics, such as injury types and severities, in order to create campaigns that raise awareness and decrease incidents; and a search engine tracking search queries both to analyze general trends and to personalize the search experience of individual users.

These practices of data collection serve many uses in society, yet there are also many cases in which possessing large amounts of user data creates problems, involving both the company’s ethical and legal use of that data and the increasingly looming risk of data breaches. According to The Verge, for example, Facebook has admitted to harvesting and storing the email addresses of contacts stored on users’ phones. While these email addresses were said to have been collected for the purpose of recommending friends, they were allegedly collected without the users’ consent. Situations such as this are usually agreed to be unethical because users never gave Facebook permission to use their contacts in this manner. Data breaches are another frequent problem, since they can expose a large amount of user data to use in illegal activities such as ransomware attacks and identity theft. One recent instance was a data breach of Facebook and Instagram earlier in 2021 in which the personally identifiable information of more than 200 million users, such as their phone numbers, email addresses, and locations, was leaked onto the internet (Security Magazine).

While there are certainly risks that come with the widespread collection of data by commercial entities, data is a resource that can prove invaluable to the many companies whose success hinges on understanding the customers they serve. Possessing data, as stated by the Ontario Human Rights Commission, can help to proactively address issues, measure progress and capitalize on opportunities (Ontario Human Rights Commission). Having the right information is useful in determining which products are most popular, for example, and then deciding to increase production of those products at the expense of less popular items. Accordingly, customer behavior is important for most businesses to assess and is a driving force behind both the relevance and the usefulness of many companies in existence today. Tasil, a company heavily involved in the data management space, lists the following as the four primary benefits that businesses receive from effective data collection: “a better understanding of your customers, easier identification of areas for improvement, prediction of future trends, and better personalization and targeting” (Goworek). Many companies would likely not be nearly as successful as they are today had it not been for these benefits of data collection and analysis. Overall, data collection creates a number of benefits for the companies that utilize it, while simultaneously alarming many who believe their information should not be collected by other parties.

Beyond the traditional methods of data collection and organization, both of which have existed for centuries, there is a more recent development in the sphere of data management: the use of algorithms both to derive meaning from data and to turn those automatically interpreted results into immediately implementable changes, a process which oftentimes requires no human intervention or oversight whatsoever. Algorithms add a unique set of questions and concerns to the equation by taking data that has already been collected and processing it to bring about a variety of outcomes. Once again, understanding the definition of this concept is important. According to My Coding Place, an educational institution in Texas, an algorithm is “a set of step-by-step procedures, or a set of rules to follow, for completing a specific task or solving a particular problem” (Pirzada). Algorithms are used in a number of situations to perform computational tasks on a scale that humans would simply be incapable of matching, especially with any significant degree of accuracy. In every case, the defining characteristic of an algorithm is that it carries out a sequence of steps in order to achieve the goal it was programmed to achieve.

One common example of an algorithm processing data is the “You might also like…” feature found on many media services like Netflix, YouTube, IMDb, and most recently, even TikTok. These features take previously collected data and use it to output recommendations through algorithms created by those services. Many users are in fact pleased with these types of algorithms, since they appear to save time spent searching for additional content that matches their interests. The only purported downsides to this particular type of algorithm, according to Gizmodo, are that they are sometimes inaccurate and in need of improvement and that at other times they are prone to creating a “filter bubble” in which users are not exposed to content of other types, topics, and viewpoints (Dvorsky).
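
To make the idea of a recommendation algorithm concrete, the following is a minimal sketch in Python of how a “You might also like…” feature could work in principle. It is not the method used by Netflix, YouTube, or any other service named above; the viewing histories, the user names, and the `recommend` function are all hypothetical and exist only for illustration. The sketch simply assumes that titles watched by people with overlapping histories make good suggestions.

```python
from collections import Counter

# Hypothetical viewing histories (illustrative data only): user -> titles watched.
HISTORY = {
    "ana": {"Alien", "Blade Runner", "Dune"},
    "ben": {"Alien", "Blade Runner", "Arrival"},
    "cal": {"Alien", "Dune", "Arrival"},
    "dee": {"Notting Hill", "Amelie"},
}


def recommend(user, history, top_n=2):
    """Suggest unseen titles watched by users whose histories overlap with `user`'s."""
    seen = history[user]
    scores = Counter()
    for other, titles in history.items():
        if other == user:
            continue
        overlap = len(seen & titles)  # number of shared titles, used as a weight
        if overlap == 0:
            continue  # users with nothing in common contribute nothing
        for title in titles - seen:
            scores[title] += overlap
    # Highest score first; ties are broken alphabetically so results are stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked[:top_n]]


print(recommend("ana", HISTORY))
print(recommend("dee", HISTORY))
```

Even this toy version shows both of the downsides described above. The user "dee" shares no titles with anyone and therefore receives no suggestions at all, which is one form of inaccuracy, while "ana" is never shown anything outside the tastes of similar viewers, which is the beginning of a filter bubble.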

While there are practical and efficient uses for algorithms that make life easier for many people by reducing, and in some cases even eliminating, the need for human labor, many people take issue with the seemingly “divine trust” that is often granted to these algorithms. Also under debate is whether algorithms should expand their reach into decisions that are critical to people’s standard of living. For example, a study conducted by Princeton University found that search results tended to be inconsistent in ways rooted in intrinsic bias, specifically with respect to the socioeconomic backgrounds of the people using the systems (Lee). Negative byproducts such as these, which accompany the development of any new technology, are usually substantial enough to merit adjustments, and in the case of algorithms, many of those changes are currently on the table. Regardless of how these adjustments are made, however, data collection, and the algorithms built to manage these vast amounts of data, will almost certainly continue to grow and advance. The extent to which they are regulated will continue to be largely up to legislators. Regulations such as the GDPR, CalOPPA, the CCPA, and the New York Privacy Act are early examples of where this is already the case, and as technologies improve, these regulations are likely to expand to keep up.