My Experience Using ChatGPT to Perform Testing Activities

ChatGPT is something you’re probably familiar with. It’s plastered all over my LinkedIn feed and was a hot topic of conversation during my holiday gatherings. On the off chance you’re not in the know, ChatGPT is a machine learning model that can perform a variety of tasks, provide information on endless topics, or just have a conversation with you. It’s pretty cool, albeit a little frightening too. With such power comes great risk.

As I played around with it, I seriously started considering how this might change my day-to-day, or my team’s day-to-day, as testers. Could it speed up some of the mundane tasks we don’t enjoy? Could it add depth and creativity to our testing plans? Could it help people up-skill and learn test automation? These questions, and more, led me to this blog post. How can I use ChatGPT to increase our efficiency and value as testers?

Writing Test Scenarios

I provided ChatGPT with an acceptance criterion: As a user I want to be able to recover the password of my account so that I will be able to access my account in case I forget the password. I then asked it to generate test scenarios for me based on that criterion. Here is the output I received:

“Here are some tests that could be used to validate the acceptance criteria you provided:

  1. Verify that the password recovery feature is available on the login page.
  2. Verify that the password recovery feature prompts the user to enter their email address.
  3. Verify that the system sends a password recovery email to the user when the correct email address is entered.
  4. Verify that the password recovery email contains a link to reset the password.
  5. Verify that the user is able to successfully reset their password using the link provided in the email.
  6. Verify that the user is able to login with their new password.
  7. Verify that the password recovery feature is only available to registered users.
  8. Verify that the password recovery feature is not available to unregistered users.
  9. Verify that the password recovery feature does not allow the user to reset the password of another user’s account.
  10. Verify that the password recovery feature is secure and does not allow unauthorized users to reset the password of an account.”

My Thoughts: I’m honestly quite impressed with the robustness of the test scenarios. Imagine being a beginner tester and having access to something powerful like this. Pushing it even further, this could greatly speed up an experienced tester’s ability to churn out test cases. What would have taken me an hour or two to think through and document took ChatGPT about 10 seconds. This could leave more time for my team to do exploratory testing!

That said, the acceptance criterion I gave it was a pretty straightforward and simple example. It would be interesting to feed it something more complex. Let’s try!

Acceptance Criteria: As a user I want to be able to place a $20 win/place/show bet on a horse so that I can gamble on horse racing.

“Here are a few scenarios that could be used to validate the acceptance criteria you provided:

  1. Verify that the user can select a horse from a list of available horses and place a bet on it.
  2. Verify that the user can enter the amount of the bet (in this case, $20) and choose the “win-place-show” option.
  3. Verify that the user is able to review and confirm their bet before it is placed.
  4. Verify that the user’s account is charged the correct amount for the bet.
  5. Verify that the user is able to view their bet history and see their past win-place-show bets.
  6. Verify that the user is able to cancel a bet before it is placed.
  7. Verify that the user is unable to place a bet if they do not have sufficient funds in their account.
  8. Verify that the user is unable to place a bet if the horse they have selected is no longer available for betting.
  9. Verify that the user is unable to place a bet if the race has already started.
  10. Verify that the user is unable to place a bet if the race has already finished”

My Thoughts: That’s actually pretty solid! It covered both positive and negative scenarios and accounted for the state of the race (already started or already finished).
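
To make the negative scenarios (7 through 10) a bit more concrete, here is a minimal Java sketch of how those checks might look against a validator. Everything in it is an assumption made for illustration: BetValidator, RaceStatus, rejectionReason, and the rejection messages are hypothetical names, not part of ChatGPT’s output or of any real betting system, and the order in which the checks run is just one reasonable choice.

```java
import java.math.BigDecimal;
import java.util.Optional;

/** Small demo; declared first so a single-file launch can find main. */
final class BetValidatorDemo {
    public static void main(String[] args) {
        BigDecimal stake = new BigDecimal("20.00");
        BigDecimal funded = new BigDecimal("100.00");
        BigDecimal underfunded = new BigDecimal("5.00");

        // Scenario 7: not enough money in the account.
        System.out.println(BetValidator.rejectionReason(underfunded, stake, true, RaceStatus.OPEN));
        // Scenario 8: the selected horse is no longer available.
        System.out.println(BetValidator.rejectionReason(funded, stake, false, RaceStatus.OPEN));
        // Scenarios 9 and 10: the race has started or finished.
        System.out.println(BetValidator.rejectionReason(funded, stake, true, RaceStatus.STARTED));
        System.out.println(BetValidator.rejectionReason(funded, stake, true, RaceStatus.FINISHED));
        // Positive case: nothing blocks the bet.
        System.out.println(BetValidator.rejectionReason(funded, stake, true, RaceStatus.OPEN));
    }
}

/** Hypothetical race states; a real system would likely have more. */
enum RaceStatus { OPEN, STARTED, FINISHED }

/** Hypothetical validator covering the negative scenarios. */
final class BetValidator {
    private BetValidator() {}

    /** Returns the reason a bet is rejected, or empty if it may be placed. */
    static Optional<String> rejectionReason(BigDecimal balance, BigDecimal stake,
                                            boolean horseAvailable, RaceStatus status) {
        if (status == RaceStatus.FINISHED) {
            return Optional.of("race finished");
        }
        if (status == RaceStatus.STARTED) {
            return Optional.of("race started");
        }
        if (!horseAvailable) {
            return Optional.of("horse unavailable");
        }
        if (balance.compareTo(stake) < 0) {
            return Optional.of("insufficient funds");
        }
        return Optional.empty();
    }
}
```

Returning a reason instead of a plain true/false keeps each scenario distinguishable in a test, which is what the generated list implies: four different ways to be turned away, each worth its own assertion.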

Writing Automation Scripts

Next, I took the first test scenario (Verify that the password recovery feature is available on the login page) and asked it to write me an automation script to test this using Java and Selenium. Here is the output I received:

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.Assert;
import org.testng.annotations.AfterTest;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;

public class PasswordRecoveryTest {
  private WebDriver driver;

  @BeforeTest
  public void setUp() {
    // Set up the web driver
    System.setProperty("webdriver.chrome.driver", "/path/to/chromedriver");
    driver = new ChromeDriver();
  }

  @Test
  public void testPasswordRecoveryAvailability() {
    // Navigate to the login page
    driver.get("http://www.example.com/login");

    // Verify that the password recovery feature is available
    WebElement passwordRecoveryLink = driver.findElement(By.id("password-recovery-link"));
    Assert.assertTrue(passwordRecoveryLink.isDisplayed());
  }

  @AfterTest
  public void tearDown() {
    // Close the web driver
    driver.quit();
  }
}

This test does the following:

  1. Sets up the web driver and navigates to the login page.
  2. Uses the findElement method to locate the password recovery link on the page.
  3. Uses the isDisplayed method to verify that the password recovery link is visible to the user.
  4. Tears down the web driver after the test is complete

My Thoughts: Clearly there is some additional cleanup that would need to be done to actually use this (element id and URL need to be personalized, for example). In addition, there are quite a few assumptions made in this script. However, it would be really nice for someone learning Java and Selenium to be able to have this example of the test script. I don’t think ChatGPT, on its own, is a viable solution for test automation as is. Nevertheless, I firmly believe that AI and machine learning will continue to push codeless testing capabilities and test automation will become more achievable for anyone.
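
One piece of that cleanup can be sketched quickly: pulling the hardcoded URL and element id out of the test so they can be changed per environment. The snippet below is only an illustration under my own assumptions. TestConfig and the property names test.loginUrl and test.recoveryLinkId are hypothetical, not something ChatGPT produced and not a Selenium or TestNG convention. The defaults are the placeholder values from the generated script.

```java
/** Small demo; declared first so a single-file launch can find main. */
final class TestConfigDemo {
    public static void main(String[] args) {
        // Prints the defaults unless overridden with -D on the command line.
        System.out.println(TestConfig.loginUrl());
        System.out.println(TestConfig.recoveryLinkId());
    }
}

/** Hypothetical settings holder that reads overrides from JVM system properties. */
final class TestConfig {
    private TestConfig() {}

    /** Login page URL; falls back to the placeholder from the generated script. */
    static String loginUrl() {
        return System.getProperty("test.loginUrl", "http://www.example.com/login");
    }

    /** Id of the password recovery link; falls back to the generated placeholder. */
    static String recoveryLinkId() {
        return System.getProperty("test.recoveryLinkId", "password-recovery-link");
    }
}
```

In the generated test, the two hardcoded strings would then become driver.get(TestConfig.loginUrl()) and By.id(TestConfig.recoveryLinkId()). How the -D values reach the test JVM depends on your build tool, so treat that part as something to verify for your own setup.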

Writing Defects

I gave ChatGPT a defect summary and asked it to write my defect: Write a defect for software where the password recovery feature isn’t working. I purposely left it very generic and didn’t provide details of how it wasn’t working. Here is the output I received:

“Title: Password recovery feature not working

Description:

The password recovery feature is not working as expected. When a user clicks the “Forgot Password” link on the login page and enters their email address, they do not receive a password recovery email. As a result, they are unable to reset their password and access their account.

Steps to reproduce:

  1. Go to the login page (http://www.example.com/login).
  2. Click the “Forgot Password” link.
  3. Enter a registered email address and click “Send Recovery Email”.
  4. Check the email inbox for a password recovery email.

Expected result:

The user should receive a password recovery email with a link to reset their password.

Actual result:

The user does not receive a password recovery email.

Notes:

This issue is affecting all users who are trying to use the password recovery feature. It is preventing them from accessing their accounts and is causing frustration for our users.”

My Thoughts: I think it did a pretty good job writing the defect and providing details. It even filled in details of its own based on assumptions, which is one of the pitfalls of machine learning software. A human would have asked more questions to make sure the defect captured the true issue. ChatGPT just made assumptions and ran with them.

What Else Can ChatGPT Be Used for in Software Testing?

Finally, I asked ChatGPT what it thinks its best use cases for software testing are. Here is the output I received:

“ChatGPT is a natural language processing (NLP) model developed by OpenAI that can generate human-like text based on a given prompt. ChatGPT can be used in software testing in a few different ways:

  1. Generating test inputs: ChatGPT can be used to generate test inputs that cover a wide range of possible scenarios. This can be useful for testing applications that have to handle different types of user input, such as chatbots or natural language processing systems.
  2. Testing chatbots: ChatGPT can be used to generate test cases for chatbots by providing prompts that are designed to test different aspects of the chatbot’s functionality. For example, you could use ChatGPT to generate test cases that check whether the chatbot can handle common requests or handle errors in a user’s input.
  3. Testing NLP systems: ChatGPT can be used to generate test cases for NLP systems by providing prompts that are designed to test the system’s ability to understand and respond to different types of input.
  4. Generating test documentation: ChatGPT can be used to generate test documentation, such as test plans or test reports, by providing it with prompts that describe the testing objectives and expected results.

Overall, ChatGPT can be a useful tool for generating test cases and test documentation in the software testing process. However, it is important to note that ChatGPT is a machine learning model and is not a replacement for human testers. It is recommended to use ChatGPT in combination with manual testing to ensure that the software application is thoroughly tested.”

My Thoughts: When I read the final paragraph I let out a sigh of relief. I remember when test automation really took off and became the de facto end goal. People thought that the art of testing could just be replaced by automation scripts. Manual testers were going to be an ancient artifact. I’m hopeful that people don’t jump on AI and machine learning and see it as the new “quick fix” trend. Using it as a complement to a human and their ability to think critically is reasonable. Using it as a replacement is not. In addition, I thought the list of ways ChatGPT could be useful to testing was pretty limited and high level. Perhaps I need to ask my question differently to get a more detailed answer.

Conclusion

Taking the time out to consider how machine learning tools like ChatGPT could add value to my day-to-day was fun and entertaining, to say the least. I was impressed by its ability to take simple sentences and requests from me and turn them into valuable outputs. Things that would typically take hours took ChatGPT seconds. There’s something to be said for efficiency gains like that.

However, this is a difficult conclusion for me to write. As much fun and value as I know ChatGPT can provide, it shouldn’t be considered a replacement for actual testing or a human’s ability to think for themselves. Sometimes people look for the easy way. They leave behind critical thinking and deductive reasoning in search of quick solutions or answers. ChatGPT can certainly provide insight and ideas to consumers of the tool. Where it gets risky is when people start relying on the tool for all of their insight or ideas. Keeping it as one tool in your toolbox is a good idea, and one that I would advocate for. Nonetheless, it should never be a replacement for human thought and critical thinking.

Something that stood out to me throughout this experiment was ChatGPT’s inability to think critically. When you’re in a conversation with another person you’re both able to listen critically (well, most of us). If that person asks you a question, you’re able to take the literal question, pair it with the context of the conversation or problem you’re trying to solve, and provide a response. Tools like ChatGPT are very literal. If I ask it a question it takes my literal words and provides output. If I rearrange the words in my question, still keeping the same meaning, I may get a different response. When dealing with ChatGPT, word choice and sentence structure matter more than ever and can greatly affect the response you get.

Finally, when asked about its own limitations and biases, ChatGPT listed the following as some of its main limitations: limited knowledge based on the information it received during training (no up-to-date information), no real knowledge of how the world actually works, and no ability to carry out tasks in the real world, only to provide information and respond to questions. I’ll leave it there.
