Artificial intelligence technologies are perplexing, and their responses are often surprising! The sheer complexity of some algorithms (billions of parameters in some cases), and the size and composition of the datasets used to "train" the machine, mean that their modus operandi is far from transparent and raise questions about the ethical and legal implications of these technologies.
It is undeniable that artificial intelligence raises many ethical questions! These issues have been addressed in a number of books, including "Le mythe de la singularité" and "Servitudes virtuelles" by Jean-Gabriel Ganascia, professor at Sorbonne University, researcher at LIP6 and former chairman of the CNRS ethics committee. In them, he tackles the major ethical issues surrounding artificial intelligence and debunks certain myths. These books are an excellent gateway to these subjects.
With AI, ethical issues arise at every level, particularly in the creation of training datasets...
Indeed, in the ever-evolving world of artificial intelligence, one persistent problem remains: bias in algorithm training data. These biases, which can be societal, cultural or statistical in nature, have a significant impact on the performance and fairness of AI systems. For example, an AI trained on data predominantly from a certain category of population may not perform as well for other groups. Similarly, existing preconceptions in training data could cause an AI to reproduce those same preconceptions, creating a negative feedback loop. It is crucial for AI designers to take these biases into account when collecting training data in order to develop fairer and more effective AI systems.
To avoid bias, we need to ensure that the data is representative of the diversity that we find in the real world (gender, religion, culture, etc.).
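To make this concrete, here is a minimal sketch, in Python, of how the representation of a group in a training dataset could be compared against a reference distribution. The DataFrame, the "gender" column and the 50/50 reference are purely illustrative assumptions, not a prescribed method.

```python
# Minimal sketch: compare the share of each group in the training data
# against an assumed real-world reference distribution.
import pandas as pd

# Toy training data (illustrative only)
df = pd.DataFrame({"gender": ["F", "M", "M", "M", "M", "F", "M", "M"]})

observed = df["gender"].value_counts(normalize=True)
reference = pd.Series({"F": 0.5, "M": 0.5})  # assumed real-world distribution

# Negative gap = group is under-represented in the training data
gap = observed.reindex(reference.index, fill_value=0) - reference
print(gap)
```

The same kind of check can be repeated for any attribute deemed relevant (culture, age group, language, etc.) before the data is used for training.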
In extreme and relatively rare cases, moral problems may arise in the use of the algorithm itself.
The most telling example is surely the case of the Deadbot (Project December), which aims to train a language model on the textual data of deceased people in order to design a chatbot that mimics the way those people expressed themselves. The issues raised by such an initiative are immediately obvious.
In technology companies, ethics committees are increasingly being set up to put the appropriate safeguards in place. Some states are seeking to regulate AI, as in Europe, where the emphasis is on the need for algorithms to be explicable (why was this decision made?) and for their decisions to be non-discriminatory towards users.
From protecting individuals...
In a report published on 26 April 2023, the CNIL's Digital Innovation Laboratory (LINC) reaffirmed the fundamental principles that must apply to artificial intelligence, including generative AI: "There are certain fundamental rules (such as copyright or personal data protection) that apply to the design and use of content-generating AI systems".
Indeed, major technology companies very often use user feedback to improve the performance of their AIs. However, this is often done opaquely, and AI users often become, unwittingly, shadow workers for these technologies.
This is where problems of digital sovereignty can arise. Under the Patriot Act, a US law passed in 2001 in the wake of the September 11 terrorist attacks, US intelligence agencies were given expanded powers to monitor electronic communications on American soil. This law was reinforced in 2018 by the CLOUD Act (Clarifying Lawful Overseas Use of Data Act), which extends these powers overseas (the "O" in CLOUD). These agencies can require major platforms to hand over any information on a company or an individual in matters pertaining to US national security. Hosting data on French or European infrastructures is one way of preserving our digital sovereignty, provided that the company operating these infrastructures is not subject to US law. Since we are dealing here with intelligence gathering, the commercial and industrial risks remain limited.
Beyond the Patriot Act and the CLOUD Act, and despite ethical regulations in Europe, there is nothing to prevent a European supplier from using our data to enrich its AI or to gather intelligence. The only effective protection is data encryption.
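As an illustration, here is a minimal sketch of client-side encryption using the Fernet recipe from the Python "cryptography" package. The data shown is an illustrative assumption, and key management (where the key lives and who can read it), which is the hard part in practice, is not addressed here.

```python
# Minimal sketch: encrypt data before it leaves the organisation,
# so that a supplier or host only ever sees ciphertext.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # must be kept inside the organisation
cipher = Fernet(key)

plaintext = b"confidential project data"       # illustrative content
token = cipher.encrypt(plaintext)               # what could be stored externally

assert cipher.decrypt(token) == plaintext       # only the key holder can read it
```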
In addition, the question of unlawful uses of generative artificial intelligence is a major source of concern. Ill-intentioned individuals could use ChatGPT to create and propagate malicious information or fake news with the aim of manipulating public opinion. This threat could have devastating consequences for society, most notably in the fields of politics and finance.
Finally, cybercriminals have no qualms about using AI to generate ever more realistic and persuasive content to trick recipients, or to create Internet browser plugins and smartphone apps. Under the pretext of making our lives easier, being more user-friendly and providing better access to AI, these can be veritable Trojan horses: legitimately installed by users, they capture all of those users' sensitive data.
… to protecting businesses’ information assets
The entire ecosystem revolving around AI is a hotbed of particularly malevolent activity, and individual behaviour can overexpose companies' information assets.
As we have seen in previous episodes, the use of AI potentially exposes Egis' data or that of our clients. There is no guarantee that the data transmitted will remain confidential. It could be used by the companies that own these AIs for commercial purposes (resale of information), or passed on by the AI to other users. All these data breach risks - involving, for example, sensitive, innovation-related or safety-related data that may be protected by regulation or a contractual clause - could have a significant impact on a company's reputation.
The inadvertent use of these rogue plugins or apps, or a simple lack of awareness of them, can lead to the leakage or theft of sensitive data (such as technical connection details), providing cybercriminals with valuable information for breaking into a company's Information System. Once inside, they can steal data and resell it on the dark web and, when they can, block the entire Information System.
The development of artificial intelligence also raises a number of legal issues.
The first problem concerns legal liability. With AI, it is hard to find a responsible party. If an autonomous car "makes a decision" that affects a human being, who is responsible? The algorithm and its designers? The quality of the training data? The manufacturer? The driver?
This is why the explicability of the algorithm's decision is so important. Unfortunately, in many cases, the number of parameters is so high that it is almost impossible to clearly explain an AI "decision".
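By way of illustration, here is a minimal sketch of one common post-hoc explicability technique, permutation feature importance with scikit-learn. The dataset and model are illustrative assumptions and do not correspond to any specific system discussed here; such techniques only approximate what drove a model's decisions, which is precisely the limitation described above.

```python
# Minimal sketch: estimate which features a trained model relies on most
# by shuffling each feature and measuring how much the score drops.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)  # illustrative dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
top_features = sorted(zip(X.columns, result.importances_mean),
                      key=lambda pair: pair[1], reverse=True)[:5]
for name, importance in top_features:
    print(f"{name}: {importance:.3f}")
```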
Generally speaking, legal issues arise both upstream and downstream of artificial intelligence. Upstream first, with training data. Has it been collected or used in an uncontrolled way (which seems to be the case for ChatGPT), i.e., in breach of copyright under the terms of the European definition?
Another key question: who is the author of a text produced by ChatGPT or an image generated by Midjourney? Various countries are currently trying to define (or not define) copyright protection rules. At the time of writing, text produced by ChatGPT belongs to the person who wrote the prompt (ChatGPT being considered the artist's brush), while images generated by Midjourney are not protected by copyright.
As always in the field of technological innovation, legislation and ethics lag behind... We need to know how a technology is used before we can prevent abuses. However, this usage-based legal approach is less and less in line with the reality of new AI solutions, which have a multitude of possible uses.