ChatGPT For Content and SEO?

Here are six things to know about ChatGPT before using it for SEO and content

Highlights:

  • Algorithmic watermarking may reveal ChatGPT content
  • How to use ChatGPT for SEO
  • Research on detecting AI content
    ChatGPT is an artificial intelligence chatbot that can take directions and accomplish tasks like writing essays. The quality of ChatGPT content is astounding, so the idea of using it for SEO purposes should be addressed. There are numerous issues to understand before making a decision on how to use it for content and SEO.

    Let’s explore.

    Why ChatGPT Can Do What It Does

    In a nutshell, ChatGPT is a type of machine learning model called a Large Language Model.

    A large language model is an artificial intelligence that is trained on vast amounts of text and learns to predict the next word in a sentence.

    The more data it is trained on, the more kinds of tasks it is able to accomplish (like writing articles).
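The next-word idea can be illustrated with a toy bigram model. This is a deliberately simplified sketch, not how GPT-3 actually works (real models use neural networks with billions of parameters, not word-frequency tables), and the tiny corpus here is made up:

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for each word, which words were seen following it."""
    following = defaultdict(Counter)
    words = corpus.lower().split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the word most frequently observed after `word`."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]

corpus = (
    "the cat sat on the mat "
    "the cat ate the fish "
    "the dog sat on the rug"
)
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # "cat" appears most often after "the"
print(predict_next(model, "sat"))  # "on" always follows "sat"
```

Scale the corpus up from three sentences to a large slice of the public web, and swap the frequency table for a neural network, and you have the rough intuition behind a large language model.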

    Sometimes large language models develop unexpected abilities.

    Stanford University writes about how an increase in training data enabled GPT-3 to translate text from English to French, even though it wasn’t specifically trained to do that task.

    Large language models like GPT-3 (and GPT-3.5, which underlies ChatGPT) are not trained to do specific tasks.

    They are trained with a wide range of knowledge which they can then apply to other domains.

    This is similar to how a human learns. For example, if a person learns carpentry fundamentals, they can apply that knowledge to build a table even though they were never specifically taught how to do it.

    GPT-3 works similarly to a human brain in that it contains general knowledge that can be applied to multiple tasks.

    The Stanford University article on GPT-3 explains:

    “Unlike chess engines, which solve a specific problem, humans are “generally” intelligent and can learn to do anything from writing poetry to playing soccer to filing tax returns.

    In contrast to most current AI systems, GPT-3 is edging closer to such general intelligence…”

    ChatGPT incorporates another large language model called InstructGPT, which was trained to take directions from humans and provide long-form answers to complex questions.

    This ability to follow instructions means ChatGPT can create an essay on virtually any topic and do it in any way specified.

    It can write an essay within constraints like word count and the inclusion of specific topic points.

    Six Things to Know About ChatGPT

    ChatGPT can write essays on virtually any topic because it is trained on a wide variety of text that is available to the general public.

    There are however limitations to ChatGPT that are important to know before deciding to use it on an SEO project.

    The biggest limitation is that ChatGPT is unreliable for generating accurate information. The reason is that the model only predicts which word should come next in a sentence on a given topic; it is not concerned with accuracy.

    That should be a top concern for anyone interested in creating quality content.

    1. Programmed to Avoid Certain Kinds of Content

    ChatGPT is specifically programmed not to generate text on topics such as graphic violence, explicit sex, and harmful content like instructions on how to build an explosive device.

    2. Unaware of Current Events

    Another limitation is that it is not aware of any content that is created after 2021.

    So if your content needs to be up to date and fresh then ChatGPT in its current form may not be useful.

    3. Has Built-in Biases

    An important limitation to be aware of is that ChatGPT is trained to be helpful, truthful, and harmless.

    Those aren’t just ideals, they are intentional biases that are built into the machine.

    It seems like the programming to be harmless makes the output avoid negativity.

    That’s a good thing, but it also subtly shifts an article away from one that might ideally be neutral.

    In a manner of speaking one has to take the wheel and explicitly tell ChatGPT to drive in the desired direction.

    Here’s an example of how the bias changes the output.

    I asked ChatGPT to write a story in the style of Raymond Carver and another one in the style of mystery writer Raymond Chandler.

    Both stories had upbeat endings that were uncharacteristic of both writers.

    To get an output that matched my expectations, I had to guide ChatGPT with detailed directions to avoid upbeat endings and, for the Carver-style story, to avoid a resolution, because that is how Raymond Carver’s stories often played out.

    The point is that ChatGPT has biases, and one needs to be aware of how they might influence the output.

    4. ChatGPT Requires Highly Detailed Instructions

    ChatGPT requires detailed instructions in order to output higher-quality content that has a greater chance of being highly original or taking a specific point of view.

    The more instructions it is given the more sophisticated the output will be.

    This is both a strength and a limitation to be aware of.

    The fewer instructions there are in the request, the more likely it is that the output will resemble the output of another, similar request.

    As a test, I copied the query and the output that multiple people posted about on Facebook.

    When I asked ChatGPT the exact same query, the machine produced a completely original essay that nonetheless followed a similar structure.

    The articles were different, but they shared the same structure and touched on similar subtopics with 100% different words.

    ChatGPT is designed to introduce randomness when predicting what the next word in an article should be, so it makes sense that it doesn’t plagiarize itself.
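The word choice is not purely random: language models sample from a probability distribution over candidate words, typically controlled by a "temperature" setting. Here is a minimal sketch of temperature sampling; the scores and the prompt are entirely hypothetical:

```python
import math
import random

def sample_next_word(scores, temperature=0.8, seed=None):
    """Sample one word from raw model scores using softmax with temperature.

    Higher temperature flattens the distribution (more variety);
    temperature near 0 approaches always picking the top-scoring word.
    """
    rng = random.Random(seed)
    words = list(scores)
    # Softmax with temperature over the raw scores (logits).
    exps = [math.exp(scores[w] / temperature) for w in words]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(words, weights=probs, k=1)[0]

# Hypothetical next-word scores after a prompt like "The weather today is".
scores = {"sunny": 2.0, "rainy": 1.5, "purple": -3.0}

# Repeated sampling varies between plausible words,
# while implausible ones ("purple") stay rare.
samples = [sample_next_word(scores, seed=i) for i in range(20)]
print(samples)
```

This randomness is why two people issuing the identical request get differently worded essays, while the underlying distribution is why those essays still share a structure.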

    But the fact that similar requests generate similar articles highlights the limitations of simply asking “give me this.”

    5. Can ChatGPT Content Be Identified?

    Researchers at Google and other organizations have for many years worked on algorithms for successfully detecting AI generated content.

    There are many research papers on the topic and I’ll mention one from March 2022 that used output from GPT-2 and GPT-3.

    The research paper is titled, Adversarial Robustness of Neural-Statistical Features in Detection of Generative Transformers (PDF).

    The researchers were testing to see what kind of analysis could detect AI generated content that employed algorithms designed to evade detection.

    They tested evasion strategies such as using BERT to replace words with synonyms and adding misspellings, among others.

    What they discovered is that some statistical features of the AI generated text such as Gunning-Fog Index and Flesch Index scores were useful for predicting whether a text was computer generated, even if that text had used an algorithm designed to evade detection.
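Both scores are simple functions of word, sentence, and syllable counts. The sketch below computes them from their published formulas; the syllable counter is a crude vowel-group heuristic, so the numbers are approximations of what a careful implementation (or the paper) would produce:

```python
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text):
    """Return (Flesch Reading Ease, Gunning-Fog Index) for `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Gunning-Fog counts "complex" words of three or more syllables.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)

    words_per_sentence = len(words) / len(sentences)
    flesch = 206.835 - 1.015 * words_per_sentence - 84.6 * (syllables / len(words))
    fog = 0.4 * (words_per_sentence + 100 * complex_words / len(words))
    return flesch, fog

flesch, fog = readability("The cat sat on the mat. It was a sunny day.")
print(round(flesch, 1), round(fog, 1))
```

The research finding, in these terms, is that machine-generated text tends to cluster in a statistically distinctive band of such scores even after synonym swaps and misspellings are applied.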

    6. Invisible Watermarking

    Of more interest is that OpenAI researchers have developed cryptographic watermarking that will aid in detection of content created through an OpenAI product like ChatGPT.

    A recent article called attention to a discussion by an OpenAI researcher which is available on a video titled, Scott Aaronson Talks AI Safety.

    The researcher states that ethical AI practices such as watermarking can evolve to be an industry standard in the way that robots.txt became a standard for ethical crawling.

    He stated:

    “…we’ve seen over the past 30 years that the big Internet companies can agree on certain minimal standards, whether because of fear of getting sued, desire to be seen as a responsible player, or whatever else.

    One simple example would be robots.txt: if you want your website not to be indexed by search engines, you can specify that, and the major search engines will respect it.

    In a similar way, you could imagine something like watermarking—if we were able to demonstrate it and show that it works and that it’s cheap and doesn’t hurt the quality of the output and doesn’t need much compute and so on—that it would just become an industry standard, and anyone who wanted to be considered a responsible player would include it.”
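OpenAI has not published the details of its scheme, but the general idea discussed in the research literature is to pseudorandomly bias word choice so that a detector holding the secret key can later measure that bias. The following "green list" sketch is purely illustrative; the vocabulary, function names, and parameters are all hypothetical:

```python
import hashlib
import random

VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "quickly", "slowly", "home", "away"]

def green_list(prev_word, fraction=0.5):
    """Pseudorandomly split the vocabulary, seeded by the previous word."""
    seed = int(hashlib.sha256(prev_word.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])

def watermark_score(text):
    """Fraction of words drawn from their predecessor's green list.

    Watermarked text scores near 1.0; ordinary text hovers near the
    green-list fraction (0.5 here).
    """
    words = text.lower().split()
    if len(words) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(words, words[1:]) if cur in green_list(prev))
    return hits / (len(words) - 1)

def generate_watermarked(start, length, seed=0):
    """Toy generator that always picks its next word from the green list."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length):
        words.append(rng.choice(sorted(green_list(words[-1]))))
    return " ".join(words)

text = generate_watermarked("the", 30)
print(watermark_score(text))  # 1.0: every word came from its predecessor's green list
```

The key property is that the bias is invisible to readers (any individual word choice looks normal) yet statistically unmistakable to anyone who knows the seeding scheme.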
