"Ethics in Generative A.I." is a personal exploration of the ethical implications surrounding the development, use, and impact of generative artificial intelligence technologies. I dive into what caused the rapid advancements in A.I., its widespread adoption in both commercial and personal environments, and the various ethical challenges that have arisen from how artificial intelligence is being used. Artificial intelligence is quite a complex topic, but don't worry. I will break down many aspects of A.I. development, how A.I. is reshaping industries and our personal lives, and the critical issues that have spurred debates over proper attribution and data usage. Furthermore, I will tackle common ethical considerations in A.I. deployment, how current laws and rights apply to A.I. technologies, and some strategies to mitigate potential misuse. We must understand the ethical landscape of generative A.I. and foster better-informed discussions and use of these rapidly advancing technologies; otherwise, information in the public and private domains will end up like the Wild West.
In recent years, development of generative A.I. has grown exponentially. Most of this growth can be attributed to the refinement of deep learning algorithms, exponential increases in processing power, and the vast amounts of data that have been compiled into large datasets.
A myriad of factors have pushed development forward since the breakthrough of AlexNet in 2012. AlexNet was a deep learning neural network which took the ImageNet competition by storm. Neural networks are computational models loosely inspired by how neurons in the human brain ingest and "learn" from information.
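To make the idea of a "neuron" concrete, here is a minimal sketch in Python. This is a toy illustration of the basic building block, not how AlexNet or any production network is actually implemented: each artificial neuron simply takes a weighted sum of its inputs and squashes the result through an activation function.

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: a weighted sum of inputs plus a bias,
    squashed through a sigmoid activation into the range (0, 1)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# With these (hand-picked) weights, the neuron fires strongly only
# when both inputs are active -- a rough "AND" detector.
output = neuron([1.0, 1.0], [4.0, 4.0], -6.0)
```

A real network stacks thousands or millions of these units in layers, and "training" means adjusting all of those weights automatically rather than by hand.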
As AlexNet provided a great example of how neural networks can be used and improved, it sparked a renewed interest in neural networks and artificial intelligence. A few other key elements also contributed to this renewed interest.
With the internet becoming exponentially vast and digital data taking the world by storm, data became increasingly easy to obtain and became the main focus for training A.I. By 2020, the amount of digital data created, captured, copied, and consumed reached roughly 64.2 zettabytes, according to IDC. To put this in perspective, one zettabyte is equivalent to 1,000,000,000,000 gigabytes. That's right, each zettabyte is one trillion gigabytes. For comparison, the typical modern hard drive holds around one terabyte (1,024 gigabytes).
Alongside the increase in digital data, hardware advancements, particularly GPUs (graphics processing units), drastically increased the processing capabilities needed in the training process. NVIDIA introduced CUDA in 2006, a parallel computing architecture which allowed GPUs to be used for general-purpose computing applications. In essence, this allowed your GPU to act as another, more powerful, processor in your system.
Breakthroughs in machine learning algorithms, mainly in deep learning, allowed for more efficient training of neural networks. Techniques like backpropagation, Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs) became mainstream. These techniques govern how data flows through a network and how the network learns from it. In more understandable terms, just as humans have various ways of learning, these concepts are applied to the training of A.I.
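The core idea behind backpropagation can be sketched with a single weight. This toy example (my own simplification, not a real training pipeline) repeatedly nudges the weight in the direction that reduces prediction error, which is exactly what happens, at enormous scale, inside a deep network:

```python
# Gradient-based learning in miniature: fit one weight w so that
# predictions w * x match the targets, by following the error gradient.
def train(samples, lr=0.1, epochs=100):
    w = 0.0  # start with an untrained weight
    for _ in range(epochs):
        for x, target in samples:
            pred = w * x                     # forward pass
            grad = 2 * (pred - target) * x   # gradient of squared error w.r.t. w
            w -= lr * grad                   # update step: "learning"
    return w

# The data follows y = 2x, so training should drive w toward 2.0
w = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

Backpropagation generalizes this one-weight update to millions of weights by propagating the error gradient backward through every layer of the network.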
Training A.I. is an extremely complex process which involves exposing machine learning models to large amounts of data. The neural networks begin to recognize patterns in the data and can make predictions based on those patterns.
There are various techniques used to monitor and train A.I. Supervised learning involves training models on datasets which contain labels. For example, OpenAI's GPT-3, a large language model (LLM), was initially trained on roughly 570 GB of text. This dataset included books, articles, websites, and other publicly available content. LLMs are trained to predict the next word in a sequence, which allows them to generate human-like text responses.
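The "predict the next word" idea can be illustrated with a drastically simplified model. This sketch just counts which word most often follows each word in a tiny toy corpus; real LLMs use neural networks over billions of examples, but the prediction objective is the same in spirit:

```python
from collections import defaultdict, Counter

def train_bigrams(corpus):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def predict_next(model, word):
    """Predict the most frequent follower of a word."""
    return model[word].most_common(1)[0][0]

# Toy "training data" -- two sentences stand in for 570 GB of text
model = train_bigrams([
    "the cat sat on the mat",
    "the cat chased the mouse",
])
```

Here `predict_next(model, "the")` returns "cat", because "cat" follows "the" more often than any other word in the corpus. Chaining such predictions, one word at a time, is how a language model generates text.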
Unsupervised learning is used when data is not labeled. For example, it can find structure in vast amounts of consumer data which varies across thousands of categories for hundreds of millions of consumers.
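A classic unsupervised technique is clustering: the algorithm groups similar data points together without ever being told what the groups mean. Here is a bare-bones one-dimensional k-means sketch (my own toy example with made-up spending figures, not a production clustering pipeline):

```python
def kmeans_1d(points, iters=10):
    """Split unlabeled 1-D data into two clusters by repeatedly assigning
    each point to its nearest center, then recomputing the centers."""
    c1, c2 = min(points), max(points)  # crude initial cluster centers
    for _ in range(iters):
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return sorted([c1, c2])

# Hypothetical monthly spending values with no labels attached --
# the algorithm still discovers two natural groups on its own.
centers = kmeans_1d([10, 12, 11, 95, 99, 102])
```

No one told the algorithm there were "low spenders" and "high spenders"; it discovered that structure from the data alone, which is the essence of unsupervised learning.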
Reinforcement learning is a more modern and "simplified" approach to training. Much like a rat in a lab being trained for specific tasks, reinforcement learning applies a penalty-and-reward system to the training regimen. This allows for high precision on the specific tasks the model was trained to complete.
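The reward-and-penalty loop can be shown in miniature. In this sketch (a deliberately trivial setup of my own, far simpler than real reinforcement learning systems), an agent tries two actions at random, receives +1 or -1, and gradually learns a value estimate for each:

```python
import random

def learn_action_values(trials=500, lr=0.1, seed=0):
    """Learn how good each action is purely from rewards and penalties."""
    random.seed(seed)
    values = {"good": 0.0, "bad": 0.0}    # the agent's estimates
    rewards = {"good": 1.0, "bad": -1.0}  # the environment's feedback
    for _ in range(trials):
        action = random.choice(["good", "bad"])       # explore both actions
        reward = rewards[action]                      # reward or penalty
        values[action] += lr * (reward - values[action])  # nudge the estimate
    return values

values = learn_action_values()
```

After enough trials, the estimate for "good" approaches +1 and the estimate for "bad" approaches -1, so an agent choosing the highest-valued action behaves correctly, exactly like the lab rat that learned which lever yields food.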
Of course, various other methods are in use. These methods apply to specific cases and are not as popular as the ones I have already covered. Training A.I. is a very expensive and time-consuming process. Organizations and individuals dedicate extensive investments to data gathering, organization, computing hardware, and training time.
Artificial intelligence has transformed many commercial industries and revolutionized their operations. Applications of A.I. are diverse and far-reaching. Many industries have shifted vast amounts of workloads from humans to specifically trained A.I. models, increasing efficiency and lowering overhead costs.
The financial, healthcare, customer service, and auto industries have seen the most use and benefit from implementing A.I. technologies. The models used can take over repetitive tasks and process large amounts of data to provide expedited services to these organizations.
The financial sector has used A.I. extensively for fraud detection, risk assessments, and stock market trading. For instance, JPMorgan Chase has implemented an A.I. system called COiN (contract intelligence) which uses machine learning to ingest legal documents. This model can review over 12,000 commercial credit agreements within seconds, a task which previously took lawyers and loan officers over 360,000 hours of work annually.
Healthcare has also seen extensive use of A.I. When COVID-19 began to spread out of control, healthcare research organizations used artificial intelligence to aid in finding a vaccine for the virus. Cancer research has also benefited from A.I. IBM's Watson for Oncology can analyze a patient's medical records and provide specialized treatment recommendations. A study published in the Annals of Oncology showed that some A.I. models can detect skin cancers with the same accuracy as expert dermatologists.
Furthermore, retail and customer service have found uses for various A.I. technologies. From handling customer service calls to keeping track of consumer records for targeted advertising, these models have drastically changed marketing strategies.
Last but not least, the auto industry has expanded the technologies used in modern self-driving vehicles. Tesla, the most notable example, has implemented an extremely well-trained and precise model which allows its various vehicles to self-drive. This model uses inputs from cameras and sensors on the vehicle to make decisions on how to drive safely, and it can notice dangers substantially faster than humans.
You may not be aware, but many devices in the modern home use A.I. to function. For example, home assistants like Alexa and Google Home use models trained specifically to aid in common household tasks. Although the main focus for these home assistants is targeted advertising, they have been expanded to tackle other needs of the average person.
Other personal assistants are available on mobile devices and computers which help with productivity in many areas of normal life. For example, musicians can use a model which helps with generating music ideas, authors can get aid when facing writer's block, and students can get condensed research completed in a short amount of time.
Models for many aspects of life have been trained and made available for the average user. Health and fitness applications, personal finance tracking, photography, and many others have become a normal implementation in modern applications. These all aim to provide aid and guidance based on the data which they were trained on.
Additionally, even for individuals who are not tech savvy, it has become quite easy to train your own model from scratch for any particular use case you could imagine, from image processing to audio and video creation. Given that you have compiled a dataset or have access to existing datasets, you can train your own model quickly and easily. This has made the technology widely accessible to the average individual.
As I have discussed previously, the training of an A.I. model is key to its functionality. With the ability to be trained on various data, and in massive quantities, there can be an A.I. made for practically any need. Each day, the amount of data available to the public grows exponentially. Users upload images, articles, videos, and many other forms of media in mind-boggling amounts. Any data which is available in the public domain can be used. The main question is: should it be used? I will cover that question in another section.
This is where A.I. technology development gets tricky. Over the past decade, many tools have been built and much research conducted to scrape as much public and private data as possible, in any usable form of data or media. Again, this data is compiled into large datasets, and an A.I. model is trained on it by various methods.
There are many organizations and individuals who have used this data to train A.I. models for commercial purposes. This is where our ethics challenges begin to take shape. For example, let's say we have trained our own private model on fantasy books from a specific author, with the goal of having the model write stories. With this model being trained on the author's intellectual property, all it will know is the writing style and idioms the author uses. In essence, if you were to have your model write you a chapter of a book, it would be heavily based on the author of the books it was trained on. Of course, this model may write a completely new story, but it is using that specific author's property to do so.
Individuals and organizations have done this with extensive amounts of media and provided models which generate media. For example, Suno, a music generation model, has been trained on millions of songs from various artists. When you use this technology, the A.I. may generate a unique song, but if you dig deeply into the music it has generated, you will find a vast number of similarities to the artists of the genre.
So how do you give credit? This is where many debates have taken place. Although much of this data is publicly available and can be easily accessed, that does not grant explicit permission to use the creator's content. Of course you can cite and credit the creators, but does that mean you are free to use their data? Not necessarily.
Going back to the previous example of training a model on a specific author's works, let's investigate. Let's say you used a web-scraping service or tool to gather an author's works and information about the author. How can you be sure you have not obtained paid or copyrighted content for free and used it in your training process? Have you explicitly asked the author for consent to use their works? Did you obtain licensed material illegally? This is a prime example of copyright infringement, which has been extremely difficult to track when applied to A.I. technologies.
Ethics is a very important aspect of AI development and deployment due to the technology's widespread implementations and security risks. Some of the basic considerations include fairness and bias mitigation, transparency and explainability, privacy and data protection, accountability, safety, and long-term societal effects. As AI systems make increasingly consequential decisions, ensuring they align with our human values and rights becomes extremely important. We must implement ethical frameworks to help guide AI's design to respect human autonomy, avoid harm, and promote better outcomes for humanity in the long term.
Here are a few ways we could use AI more ethically. Mostly, it boils down to: does it feel right or wrong? Your morals should give you a solid answer to this question.
Ensure diversity in training data and mitigate any bias
Prioritize transparency for AI applications and decision-making
Implement privacy and security measures which protect user data
Maintain accountability and oversight of AI applications
Perform in-depth testing to identify any unintended use cases
Design AI systems which reflect human values
Provide clear disclosures of when AI is being used
Ensure that humans maintain meaningful control and oversight
Consider any long-term impacts
Monitor and audit AI applications for ethics compliance
As AI technology becomes more integrated into various aspects of human life, it raises the question of how your rights as a citizen apply to AI. This is a very important and difficult aspect of creating ethical AI applications. While the U.S. government has recently started to address this intersection, many protections are still in the early developmental stage.
One key document I found which outlines how AI should respect American rights is called the "Blueprint for an AI Bill of Rights." This document was introduced by the White House in 2022 and set forth five key principles which were aimed at guiding the development and deployment of AI applications. These principles include ways of ensuring safety, protecting against algorithmic discrimination (bias), preserving our data privacy, providing clear and understandable explanations of an AI's decisions, and maintaining human oversight in the AI decision-making processes.
Although this blueprint highlights quite a few great topics, it is not legally binding; it serves as more of a guideline than enforceable law. Despite the efforts of the federal government, there is currently no overarching federal legislation governing AI or protecting citizens from potential harms posed by AI. Some states have started to fill this gap by enacting laws which target specific AI-related issues, such as Illinois' restrictions on AI in hiring processes, which I personally support, and California's regulations on AI chatbots. However, this has led to a patchwork landscape of AI regulation in the U.S. which could vary depending on where you live.
Censorship of AI often revolves around the control of information and training data, limiting AI technologies so that certain types of content cannot be generated. There is no need to go in-depth on this specific topic, but individuals have found ways around these limits to create derogatory content.
Some of the censorship concerns include the potential for AI systems to reinforce existing biases, especially when AI is used for moderating content or by government organizations. AI systems used by law enforcement, for example, have made errors in their decisions that have disproportionately affected marginalized communities. This raises questions about the fairness and civil liberties involved with AI. Furthermore, AI has been used to perform surveillance and monitor individuals or organizations. This cannot go unchecked, especially if the AI technology being used is in the hands of large organizations.
As we continue to evolve and press forward with AI technologies, ongoing debates about the ethical implications of content censorship and the need for regulation will more than likely intensify. This is especially true as more Americans become familiar with how these technologies may affect their civil rights.
AI is being leveraged by cybercriminals to expedite and create more complex attacks. One of the primary ways AI is utilized by cyber attackers is by developing polymorphic malware with the help of AI programming tools. This type of malware constantly alters its code, which helps it remain undetected by anti-virus software.
By using common techniques such as encryption, code mutation, and obfuscation, this malware can bypass signature-based detection methods, a common way malware is detected. Additionally, AI-generated phishing attacks have become far more convincing than the old Nigerian Prince attempts of recent decades.
Another significant threat comes from deepfake technology. Cybercriminals continue to use AI to create realistic, yet fake, audio and video content, impersonating public figures such as CEOs or government officials. The sophistication of these attacks makes them extremely difficult to detect without advanced tools. Deepfakes have come a long way in recent years. Now it is quite difficult to tell a real from a fake unless you are extremely experienced in the nuances of AI-generated content.
Cyber threats exploit AI's capabilities for various malicious purposes beyond traditional cybercrime. One trend is the use of AI to automate and enhance social engineering attacks. With AI, attackers can quickly generate convincing emails that precisely mimic official communications, which makes it easier to deceive their victims.
Moreover, AI tools can analyze our behavior and time these attacks for exactly when a target is most likely to respond. AI is also being used for targeted cyber operations. For example, state-affiliated threat actors have utilized AI to support espionage and influence operations.
To combat these threats, cybersecurity professionals are increasingly turning to AI to implement defenses and strengthen security measures against these attackers. These systems can dynamically detect and respond to threats by analyzing behavior and patterns in real time.
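The core idea behind behavior-based detection can be sketched very simply. This toy example (my own illustration with made-up traffic numbers, far simpler than commercial security products) flags any observation that deviates sharply from an established baseline:

```python
import statistics

def find_anomalies(baseline, observed, threshold=3.0):
    """Flag observations more than `threshold` standard deviations
    away from the mean of the normal baseline behavior."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    return [x for x in observed if abs(x - mean) / stdev > threshold]

# Hypothetical baseline: typical requests per minute for a service
normal_traffic = [20, 22, 19, 21, 20, 23, 18, 21]

# A sudden burst of 250 requests/min stands out against the baseline
alerts = find_anomalies(normal_traffic, [21, 20, 250, 19])
```

Real defensive systems learn far richer baselines across many signals, but the principle is the same: model what "normal" looks like, then alert on significant deviations in real time.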