Balance of Intellectual Property and AI

For at least 500 years, we have been arguing about intellectual property, and each new round of scientific and technological progress has only intensified the debate. Composers gained rights to their musical works, and even photography came to be protected as a human-made art. In the last century, with the widespread availability of music recordings and VHS, questions about the rights to sample music and distribute copies of movies became even more acute. The advent and rapid development of generative AI makes the situation more confusing than ever and will likely require new solutions.

Challenges of Balancing Intellectual Property and AI

Very soon, simple apps for our devices will be in widespread use, and any user will be able, for example, to sing any song in the voice of a famous artist. At first glance, that looks like a fun bit of entertainment, doesn't it? However, a number of obvious questions are bound to arise. What about intellectual property? What will the rights holder get out of it? Record companies, by the way, are already actively discussing this topic with Google.

The thing is that none of us can perfectly recreate someone's voice, but if you listen to a track a hundred or a thousand times, you may produce something like a "pastiche". In that case, we would not need to pay royalties to the author. What to do in the same situation when AI is involved remains an open question.

It's important to note that this problem is relevant not only to music, but to art in general. For example, asking Midjourney to create an image in the style of a famous artist can be perceived in different ways. Some people would consider it stealing, but most visitors to galleries in lower Manhattan or Mayfair would take the exact opposite stance. After all, if you make an image "in the style" of a famous artist, fans of that artist's work will not accuse you of plagiarism.

There is another problem with the use of AI. Obviously, no news outlet is going to demand payment from someone who posts a link to its story on their Facebook profile. The same newspaper, by the way, pays nothing to a cafe when it reviews the establishment. At the same time, if you ask ChatGPT to summarize a popular article for a corporate publication, the news outlet has grounds for indignation, because the company is now directly using its reporting. That is why it is understandable that news sites block the ChatGPT web crawler.
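
How a news site might block such a crawler can be sketched with a robots.txt rule. This is only a minimal illustration, assuming the crawler identifies itself with the user-agent token GPTBot (the name OpenAI publishes for its crawler) and assuming it honors robots.txt at all. The site address and the rules themselves are hypothetical and are not taken from any real publisher.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary news site: the AI crawler is
# shut out of the whole site, every other client is allowed everywhere.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article = "https://news.example/2023/some-article"  # hypothetical URL
    print(can_fetch("GPTBot", article))       # False: the crawler is blocked
    print(can_fetch("SomeBrowser", article))  # True: ordinary readers are not
```

Note that robots.txt is a request, not an enforcement mechanism: it only works if the crawler chooses to respect it.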

Now imagine an intern at a company who reads a number of articles and writes up a summary of them. In this case, as in the ChatGPT examples above, the content itself is not actually reproduced. The sites' terms of service may well be violated, but a summary (not to be confused with excerpts) usually does not fall under copyright. In fact, no one ever suggests that a newsletter of this kind infringes the copyrights of the source sites.

The Essence of AI

Today, AI is creating data for use on a scale never seen before. It is this difference in scale that matters most in the current situation. We therefore need to understand what kind of law should regulate the interaction between generative AI and intellectual property.

However, the real puzzle is not whether you can point ChatGPT at the actual headlines. It is that all the headlines are somewhere in the training data, yet they are not present in the model itself.

It is not known exactly what OpenAI is using today, or whether it is using pirated books, but it is clearly using Common Crawl in part, and that is a double-digit percentage of the entire Internet. However, training data is not a model, and LLMs cannot be considered databases. By trawling through massive amounts of text, they only identify certain patterns in the language, without preserving the texts themselves. For example, ChatGPT can analyze a lot of information from different publications without saving it. Thus, the goal of an LLM is to identify patterns in the output of all human intelligence, not to capture the content of any particular story.

OpenAI should not be compared to Napster: it does not steal texts, much less give them away for free, because it does not need any specific text at all. In addition, OpenAI could retrain ChatGPT without any newspaper excerpts if necessary. The model would probably lose the ability, for example, to answer detailed queries about the best public places in a particular location, but it was not originally intended to provide such functionality. Seen this way, "intelligence" is rather the result of analyzing indirect indicators of how people think.

It turns out that OpenAI does not use any particular textual material, but "all" textual material at once. So it makes little difference to it if someone removes, say, a few individual articles.

The Present and Future of AI

This raises a whole host of questions about whether we should change the laws on fair use where AI is concerned, if this core technology of the next decade actually depends on all of us.

This technology works because feeding the models a huge amount of open data produced results that exceeded all expectations. That being said, there is no way to add another equally large amount of data today. Given the scale and cost, research will now focus on improving algorithms so that they need as little data as possible.

However, what comes out of the model is no less interesting than what goes into it. You can use an engine built from the music of the last half century to create something completely unique and new. Despite the many questions raised above, one thing is obvious: these are tools that can be used to create works of art or just all sorts of pictures. Anyone can buy a professional camera and take pictures, but that is not enough to be called an artist. The art of photography lies in choosing the right angle and having your own vision of the process and the result. In the same way, any of us can type a query into Midjourney or ChatGPT without any skills, but it is hard to get something good out of them. For now, these tools are at an early stage, but eventually people will use them to create art.

The Collision of Intellectual Property and AI

Meanwhile, Spotify today hosts a lot of "white noise" tracks that reach listeners through recommendation algorithms and earn listening fees comparable to those of world-famous musicians. This raises a question: how are we going to pick out the really good music from the pile of content if mass production of "music in the style of anyone" suddenly begins?

In this regard, it is also useful to consider an illustrative example from the past. In London there is a copy of a Dürer engraving, entitled "Saint Anne under the golden gate in Jerusalem, after Dürer", made in the 16th century by Raimondi, a pupil of Raphael. It is known that Dürer was extremely dissatisfied with the copy and went to court over it. The trial ended with a decision that allowed Raimondi to keep making copies, but without Dürer's logo. It is a good example of an intellectual property ruling that shows how differently people view the idea of authenticity. The picture is essentially the same these days: some of us don't like ChatGPT, and others don't care at all.

Source: “Generative AI and intellectual property”, Benedict Evans.
