As generative AI tools have become more mainstream, it’s no surprise that we consumers are facing a slew of products supercharged with artificial intelligence (AI). Like any new technology, though, its uses range from game-changing (I’m looking at you, meeting transcribers) to downright questionable.
I recently had a conversation where I learned about an AI content tool I found rather dodgy. In essence, the AI model generates content and then rewrites it four or five times so (according to this individual) it “wouldn’t be considered plagiarism,” meaning that the writing wouldn’t be flagged by plagiarism checkers or AI detectors.
As I listened to this person gush about how a tool like this would help my content team, all I could think was, actually, this would hinder my team and any other creators who hold themselves to the high standard of producing unique, high-quality content.
That said, the claims piqued my interest because they brought up questions like, “Where is the line between AI-generated content and plagiarism?” and “What are the ethical implications of relying on AI in the content creation process?”
Ultimately, while AI tools can be helpful in generating ideas and streamlining parts of the content creation process, I believe we, as creators, have an ethical duty to ourselves and each other not to overly rely on these tools to the point where we lose originality and human insight. Here’s why.
Defining Plagiarism: What You Need To Know Before Assessing AI-Generated Content
According to the Oxford English Dictionary, plagiarism is “the act or practice of taking someone else’s work, idea, etc., and passing it off as one’s own.”
In practice, there are a few types of plagiarism that appear most frequently:
- Direct plagiarism: Copying another person’s work verbatim without giving credit or using quotations to distinguish their words.
- Mosaic plagiarism (also known as “patchwriting”): Taking phrases from a source without crediting them or copying someone else’s ideas and replacing some words with synonyms without crediting them.
- Self-plagiarism: Republishing segments of your own previous work in a new piece. Self-plagiarism in itself is not inherently good or bad. It’s typically only an issue if you’re being purposely deceptive.
- Accidental plagiarism: Plagiarism that results from carelessness, such as failing to cite sources, citing sources incorrectly, or neglecting to provide citations for paraphrased information.
In short, when you use someone else’s words or ideas, you need to provide correct citations. That means putting direct text in quotation marks and providing attribution for both quotes and paraphrased ideas. If you’re not doing this, it’s considered plagiarism.
Now, before I dive into the discussion of the potential plagiarism of AI writing, it’s important to understand how AI chatbots produce content.
How AI Generates Content
To generate content with an AI tool, you first have to provide instructions through a prompt. The AI tool interprets the prompt and generates content based on information in its training dataset.
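To make that flow concrete, here’s a minimal sketch of what prompt-driven generation looks like when calling a model programmatically. It uses OpenAI’s Python client; the model name and the prompt are placeholder assumptions for illustration only.

```python
# A minimal sketch of prompt-driven generation, using OpenAI's Python client.
# The model name and prompt are placeholder assumptions for illustration only.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any available chat model would work here
    messages=[
        {
            "role": "user",
            "content": "Write a 100-word introduction to an article about email marketing.",
        }
    ],
)

# The returned text is assembled from patterns in the model's training data,
# not pulled from a cited, verifiable source.
print(response.choices[0].message.content)
```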
Training datasets vary between tools, and we often don’t know exactly what content was used to train each tool. For example, OpenAI says its various ChatGPT models were “trained on vast amounts of data from the internet written by humans” but fails to elaborate any further.
Often, large language models (LLMs) like ChatGPT rely on information gathered by data scraping, which is the use of web crawlers to collect information from third-party websites.
While scraping data isn’t inherently wrong or illegal, in many cases, the data powering LLMs is obtained without explicit consent from the source or payment for its use.
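For a sense of what “data scraping” means in practice, here’s a minimal sketch of crawler-style collection: fetch a page and keep its text. The URL is hypothetical, and real training pipelines operate at a vastly larger scale.

```python
# A minimal sketch of crawler-style data scraping: fetch a page and keep its text.
# The URL is hypothetical; real training pipelines crawl a huge number of pages.
import requests
from bs4 import BeautifulSoup

url = "https://example.com/some-article"  # hypothetical third-party page
html = requests.get(url, timeout=10).text

soup = BeautifulSoup(html, "html.parser")
article_text = soup.get_text(separator=" ", strip=True)

# Text collected this way can end up in a training dataset with no citation,
# consent, or payment flowing back to the original author.
print(article_text[:500])
```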
For users, this means it’s possible for your AI-generated content to contain phrases or ideas that were plagiarized from the source material.
Is Using AI Plagiarism?
If your AI-generated text copies words from another source verbatim, paraphrases too closely, or presents another creator’s original thoughts or ideas without proper citation, it is plagiarism.
As humans, we often perform research and use the work of other creators to inform our own, ideally with proper credit and citation. The difference is that each human processes and interprets their research differently and uses it to create something new.
When an AI references human work to generate content, that same level of cognitive processing doesn’t happen. In addition to lacking citations, AI-generated content is simply a regurgitation of the data it’s been trained on without adding a unique and individual perspective.
So, the question becomes, what sources are these AI content generators using to create content? As we saw with OpenAI, these companies aren’t necessarily disclosing the details.
That said, lawsuits like the one brought against OpenAI by the New York Times have begun to shed light on some of the sources that make up ChatGPT’s training data.
The New York Times lawsuit claims that some of ChatGPT’s responses included “near-verbatim excerpts” from Times articles. If these claims prove true, it’s quite possible that tools like ChatGPT are plagiarizing the authors whose work makes up their training data by reproducing their ideas and words without citation.
As a result, authors and brands that use AI tools to create and publish content may unintentionally plagiarize other creators. At the end of the day, if it’s your name on the published piece, you can be held liable for plagiarism and copyright infringement.
Training Your Own AI Tools: The Potential Exception
One way to avoid issues with plagiarism is by taking more control over what information your AI application uses. For instance, you can use your brand style guide or previous articles to train AI-powered tools like Grammarly Custom Style Guides and custom GPTs.
Wondering how to build your own personal GPT? You can check out my step-by-step guide that shows how to create a brand editor based on your style guidelines.
By limiting these applications to just your own work, you can streamline your content process without the risk of plagiarizing someone else.
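To make that idea concrete, here’s a rough sketch, assuming hypothetical local files for a style guide and a past article. Custom GPTs and Grammarly Custom Style Guides accomplish something similar through their own interfaces rather than code.

```python
# A conceptual sketch of limiting generation to your own material.
# File names are hypothetical; custom GPTs and Grammarly Custom Style Guides
# accomplish something similar through their own interfaces rather than code.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

# Load only material you own: your style guide and previously published articles.
own_material = "\n\n".join(
    Path(name).read_text() for name in ["style_guide.md", "previous_article.md"]
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any available chat model would work here
    messages=[
        {
            "role": "system",
            "content": "Use only the following brand material as your source content:\n\n"
            + own_material,
        },
        {
            "role": "user",
            "content": "Draft an outline in our brand voice for a post about content ethics.",
        },
    ],
)

print(response.choices[0].message.content)
```

Note that this only narrows what the model is asked to draw from; the underlying model was still trained on broader web data, so human review for originality remains essential.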
Ethical Considerations for AI-Generated Content
As content creators and marketers, it’s crucial that we hold ourselves and our community to a high standard for content quality.
Here are the standards we need to consider when using AI to support consistent content creation.
Content Quality/E-E-A-T
The internet isn’t suffering from a content shortage. As creators, we must ensure we’re adding value, not simply recycling and reformatting what’s already there. If we sink to the level of “if it won’t get caught by an AI or plagiarism detector, it’s fine,” we do ourselves and our audiences a great disservice.
The SEO community might argue that Google hasn’t explicitly forbidden the use of AI content. To be fair, it doesn’t say you should, either, but let’s look at the AI-generated content guidelines and read between the lines.
AI may be able to generate content that sounds as if it’s written by a human, but it lacks real-life experience, expertise, or new ideas to add to the conversation. Not only are these the cornerstones of Google’s E-E-A-T guidelines, but they’re the elements that resonate with readers and make your content worth the time it takes to read or view.
That’s why we don’t use AI to generate content at The Blogsmith. We help our clients build meaningful connections with their audiences by getting to know and understand each brand we work with and creating content that delivers value for their target audiences.
We’re also seeing a clear trend away from AI-generated content reflected in Google’s algorithm updates. Specifically, the helpful content guidelines steer creators away from low-quality AI content, and the March 2024 core update resulted in the manual deindexing of websites built on irrelevant or low-quality AI content. One report analyzed 14 such publicly revealed, manually deindexed websites that published AI-generated content.
Fact-Checking and Google AI Overviews
AI writing tools may be able to craft text that sounds human, but the programs don’t go through the same process of fact-checking and editing that human experts use. Most modern AI tools (as of publishing this article) are liable to confidently deliver incorrect information, oftentimes without citing any sources.
At the March 2024 Build a Better Agency Summit, Jim Sterne of Targeting.com described AI as being useful for generating ideas but unreliable for determining facts. This straightforward and fundamental framework has stuck with me as I consider potential applications for AI in my work process.
For example, Google’s AI Overview feature has been delivering answers that range from odd to downright wrong or dangerous, including responses based on old forum discussions or satirical articles. Some of the more entertaining responses confirmed that astronauts have encountered cats on the moon, while another suggested the relaxing benefits of bathing with a toaster (absolutely do not try this at home!).
While Google is working on fixes for nonsensical answers (after a significant rollback in AI Overview appearances in the SERPs), issues with bias or discrimination can be more subtle and difficult to identify. That said, like any other tool, AI isn’t inherently good or bad. The more productive discussion focuses on finding the most responsible and ethical ways to integrate AI tools in the content creation process.
Given that, if we use that information without any additional human due diligence, the fault isn’t the machine’s. As content creators, the responsibility for fact-checking, verifying claims, and citing sources still lies with us.
Disclosure, Reputation, and YMYL
At the very least, if you’re publishing an article written entirely by an AI tool, you should be disclosing that fact along with any steps you took to fact-check or verify the information.
One example of this disclosure in action is CNET’s attempt at using AI writing to help with certain articles in late 2022 and early 2023. The posts originally came with a pop-up disclosure explaining that posts authored by “CNET Money” used automatically generated content.
While CNET went on to make the disclosure accessible and add information about the team member who edited each piece, this particular AI experiment received backlash early on, providing some valuable lessons for creators.
Notably, the brand had to issue corrections on over half of the AI-supported articles when readers found blatant plagiarism and confident incorrectness.
Slide from my June 2024 Denver Digital Summit Presentation about the volatility of SERP ranking.
In this situation, it appears that AI may have become too comfortable a crutch. Unfortunately, the resulting confident incorrectness damaged the brand’s reputation as a trustworthy source of personal finance expertise.
Trust is the most important thing you can have with readers and customers. These examples raise the question, “Why would you risk that?”
Furthermore, the topics that CNET was covering fall under Google’s “Your Money or Your Life” (YMYL) category, as they can impact the reader’s finances. Any articles that qualify as YMYL are evaluated even more strictly to ensure they meet E-E-A-T standards. While using AI may have increased some efficiency, the reputational risk with readers and search engines hardly seems worth the time savings in this case.
All that said, the spirit of disclosure that CNET displayed is a solid idea for ethically using AI. Brands can use this example and take it further by being more vigilant about editing for accuracy and originality and using disclosures that clarify whether the model was trained on the company’s internal content or undisclosed web-scraped data.
Certainly, readers deserve to know whether posts were generated by an AI tool or written by a human, especially when content is of an educational nature. And, at the end of the day, if you think that disclosing your AI usage would drive readers away from your content, it’s worth asking yourself, “Should I be using AI this way?”
Ultimately, if you post an article that claims to be written by a human but was generated by an AI tool, that’s a shaky foundation for building trust in important audience relationships.
Final Thoughts: Is Using AI Plagiarism?
There’s a place for AI in the world of content creation, and it’s best when used to help you get from step one to two, not skip from one to 10. Tomorrow’s most valuable workers are those who use human and artificial intelligences for their respective strengths.
If you go for one of the extremes — avoiding it entirely or relying on it completely — you can quickly become irrelevant. Either you refuse to embrace the new technologies, and you’re left in the past, or you become out of practice and lose your skills because you’re offloading them to a machine.
As content creators, we have an ethical duty to ourselves and each other not to overly rely on AI content creation, because doing so can hurt someone else’s ability to earn from or build a reputation on their original work.
At The Blogsmith, we see the value of AI as a support system for content creators, especially for transcribing interviews with experts or training tools that operate off our body of work and brand style guidelines. That said, we also hold ourselves to a high standard of quality and originality in everything we create. Even when AI content isn’t a verbatim copy of someone else’s work, we ultimately consider it plagiarism, and it’s not something we use in our process.