Quality Assurance for AI Products

Revolutionizing QA Practices for AI
A Workshop for Everyone in the AI Product Lifecycle

Introductory Video

Summary:
This workshop, hosted by a DLAI alumnus, helps bridge the communication gap between business stakeholders, software QA teams, and AI developers. We’ll cover QA strategies specific to AI, handling unexpected outputs, microservice architecture for better collaboration, and a hands-on project to solidify your understanding.

About Me:
I’m Ammar Mohanna, a DLAI alumnus with hands-on experience building an AI services company. During my time in the field, I saw first-hand the communication challenges that can arise within AI product development teams. That inspired me to create this workshop, drawing from my DLAI learnings and real-world experience.

What We’ll Learn:

  • Key distinctions between traditional software QA and AI product QA.
  • How to identify and address unexpected AI model outputs.
  • How to improve cross-team communication for better collaboration.
  • How to use microservice architecture for efficient versioning and testing.
  • Practical experience through framework explanations, real-world case studies, and a hands-on project.

Framework Introduction:
I’ll present a tailored framework emphasizing adaptive strategies for the unique challenges of AI development. Informed by my DLAI background and practical experience, this framework benefits all stakeholders involved in the AI product lifecycle.

Framework Explanations/Examples:

  • Explore the complexities of unexpected AI outputs and retraining processes.
  • Understand Internal Endpoints (IEP) and External Endpoints (EEP) for streamlined collaboration (a code sketch follows this list).
  • Examine real-world scenarios demonstrating successful framework implementation.
  • Facilitated discussion: “Enhancing Collaboration in Your AI Product Lifecycle”
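
To make the IEP/EEP idea concrete before we get to the examples, here is a minimal sketch of how a model might be wrapped in a microservice with separate internal and external endpoints. The FastAPI stack, the route names, and the run_model helper are my own illustrative assumptions, not the framework's prescribed implementation:

```python
# Minimal IEP/EEP sketch (assumed stack: FastAPI + pydantic).
# The IEP exposes raw model detail for AI/QA teams; the EEP offers a
# stable, versioned contract for external consumers.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
MODEL_VERSION = "face-detector-0.3.1"  # hypothetical version tag

class PredictRequest(BaseModel):
    image_url: str

def run_model(image_url: str) -> dict:
    # Stand-in for real inference; returns a fake detection.
    return {"boxes": [[10, 20, 110, 140]], "scores": [0.87]}

@app.post("/internal/predict")  # IEP: rich output for dev and QA
def internal_predict(req: PredictRequest):
    raw = run_model(req.image_url)
    return {"model_version": MODEL_VERSION, "raw": raw}

@app.post("/v1/predict")  # EEP: only what the public contract guarantees
def external_predict(req: PredictRequest):
    raw = run_model(req.image_url)
    return {"faces_found": len(raw["boxes"])}
```

The payoff for QA: the model behind the internal endpoint can be retrained and re-versioned freely, while tests against the external endpoint verify that the public contract never breaks.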

Hands-On Project: Face Detection

  • Step-by-step project to apply the learned QA strategies practically.
  • Sample dataset and basic face detection code provided (a rough sketch follows this list).
  • Improve QA by identifying unexpected outputs, suggesting retraining strategies, and using IEP/EEP architecture – informed by my DLAI learnings.
  • Collaborative discussion on challenges and solutions.
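
For orientation, the starter code will look roughly like the sketch below, which uses OpenCV's bundled Haar cascade. The exact dataset and code provided in the workshop may differ; the file name sample.jpg is a placeholder:

```python
# Baseline face-detection sketch (assumed library: OpenCV / opencv-python).
import cv2

def detect_faces(image_path: str):
    # Load the frontal-face Haar cascade that ships with OpenCV.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    img = cv2.imread(image_path)
    if img is None:
        raise ValueError(f"Could not read image: {image_path}")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Each detection is an (x, y, w, h) bounding box.
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

if __name__ == "__main__":
    faces = detect_faces("sample.jpg")  # placeholder image path
    print(f"Detected {len(faces)} face(s)")
```

A baseline this simple fails in instructive ways (profile faces, low light, partial occlusion), which is exactly what makes it a good subject for practicing unexpected-output analysis.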

Exercises for Further Practice:

  • Design scenarios with unexpected AI outputs for technical and non-technical teams (a sketch follows this list).
  • Interactive session demonstrating IEP/EEP implementation for diverse roles.
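
One way to run the first exercise: encode "unexpected output" as explicit assertions, so technical and non-technical teams share the same pass/fail vocabulary. Below is a sketch assuming a pytest harness and the hypothetical detect_faces function from the project sketch above; the image names and expected counts are invented for illustration:

```python
# Sketch: "unexpected output" scenarios as executable checks (assumed: pytest).
import pytest
from face_detector import detect_faces  # hypothetical module from the project

@pytest.mark.parametrize("image_path, expected_max", [
    ("empty_room.jpg", 0),       # no people present: any detection is unexpected
    ("single_portrait.jpg", 1),  # one person: more than one face is unexpected
])
def test_face_count_within_expectations(image_path, expected_max):
    faces = detect_faces(image_path)
    assert len(faces) <= expected_max, (
        f"Unexpected output: {len(faces)} faces in {image_path}"
    )

def test_bounding_boxes_are_sane():
    # Degenerate or negative boxes are bugs, not acceptable weaknesses.
    for (x, y, w, h) in detect_faces("single_portrait.jpg"):
        assert x >= 0 and y >= 0 and w > 0 and h > 0
```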

Feedback & Community Engagement:
I invite participants of all backgrounds to share their experiences, ask questions, and continue the learning process on the community.deeplearning.ai forum. Let’s build on the foundation DLAI has provided us!

Thank you for this workshop! Very important topic.
In my experience, companies struggle to navigate AI QA because AI systems are, by nature, never 100% accurate.
I believe AI QA should actively try to break the system rather than just waiting to stumble on its weaknesses, if that makes sense.
Do you have any advice on how to help QA teams and leadership decide what is an acceptable weakness and what is a bug?

Hello. I am interested in the workshop. When is the workshop and how long is it going to be?

Hello. I am interested in the workshop.

Interested!

Interested!

Interested!!

Hello @Jana_kabrit!
Welcome to the DeepLearning.AI community.

Thank you for sharing your insights! You raise an excellent point about how companies grapple with the inherent uncertainty in AI systems.

I wholeheartedly agree that AI QA should adopt a mindset focused on “breaking” the system. This means proactively seeking out edge cases and scenarios where the model might fail spectacularly. Here’s why this approach is crucial:

  • Understanding Risk: Pushing AI models to their limits helps us map out the range of potential errors and their severity. This knowledge then informs risk-based decision-making on where to set the acceptable performance thresholds.
  • Transparency with Stakeholders: Being upfront about potential failure modes fosters trust between technical teams and business stakeholders. It allows for more meaningful conversations about trade-offs and the cost of achieving a particular level of performance.
  • Proactive Retraining: “Breaking” the system provides insights into the kinds of data needed to improve the model and reduce its tendency to fail in critical ways.

Here’s how to help QA and leadership distinguish between acceptable weakness and a bug:

  • Define Severity: Classify errors based on their impact on the system and on downstream business processes. A minor visual glitch in a recommendation engine might be acceptable, while a systematic bias in a medical diagnosis model would be an unacceptable bug.
  • Set Performance Metrics: Have clear expectations and metrics for precision, recall, F1-scores, etc., in line with the domain and use case. Use these to establish baselines and track improvements, providing visibility on acceptable ranges (a runnable sketch follows this list).
  • Stress Testing: Routinely subject the AI models to stress tests designed to expose unexpected behaviors. Log and analyze these failures as a vital part of the QA cycle.
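
To make the second point concrete, agreed thresholds can be written down as an executable gate, so "acceptable weakness vs. bug" becomes an explicit, reviewable decision rather than a judgment call made in the moment. A sketch assuming scikit-learn, with toy data and thresholds chosen purely for illustration:

```python
# Sketch: performance thresholds as an executable QA gate
# (assumed library: scikit-learn; numbers are illustrative, not advice).
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]  # toy ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]  # toy model predictions

FLOORS = {"precision": 0.80, "recall": 0.75, "f1": 0.78}
scores = {
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
}

for metric, floor in FLOORS.items():
    verdict = "acceptable" if scores[metric] >= floor else "BUG: below agreed floor"
    print(f"{metric}: {scores[metric]:.2f} (floor {floor:.2f}) -> {verdict}")
```

Wiring a gate like this into CI turns the threshold discussion into a one-time business decision instead of a recurring argument.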

Remember: AI QA is an iterative process. Open communication, continuous monitoring, and a willingness to retrain and refine models are essential for navigating this evolving landscape. I encourage everyone involved to embrace a “seek to break” mentality for better, more trustworthy AI systems.

Let’s keep the conversation going!
Do you have real-world examples of setting acceptable performance thresholds for your AI products?

Hello @Shamiso,

Thank you so much for your interest in the workshop! I’m excited to hear you’re interested in this topic.

We’re still in the planning stages for the workshop, and your input is incredibly valuable. Gathering insights from the community helps us tailor the content to address the most pressing challenges people face.

Would you mind sharing a bit about:

  • Your role: Are you a software QA professional, a developer, a business stakeholder, or something else?
  • Your biggest challenge: What’s the most difficult aspect of QA for AI products that you encounter?

This will help me make sure the workshop covers topics relevant to your needs. We anticipate it being about one hour long.

I’ll keep you updated on the workshop date. In the meantime, let’s keep the discussion going!

Hello!

Thank you for your interest in the workshop! To make it the best it can be, could you tell me a little bit about yourself?

  • Your role
    What’s your job?
    Are you in software testing, a developer, on the business side, or something else entirely?

  • Your biggest challenge
    What’s your biggest challenge with testing AI products?
    What’s the most difficult aspect of QA for AI products that you encounter?

Your answers will help me make sure the workshop covers the things that matter most to you!

@Oksanna @CHITHRA @Vasili_Reikh @Shanjay_Nithiin

Thank you, Ammar. A good short video that helps explain your direction. I would love to join any future sessions.

I am the CEO of a very small UK-based charitable organisation. I am not a coder or even an advanced AI practitioner.

I am just someone who sees the benefit of helping small “not for profit” organisations harness and utilise AI and emerging technologies for social good. I like to think, jokingly, that I am an “inventor of solutions” to “society’s problems”.

Charitable-sector organisations all over the world are on the front line in our communities, helping economically and socially disadvantaged families, yet they are underfunded and under-resourced while the demand for their help is constantly rising.

I am not an advanced AI techy person, but I am already trying, with mixed results, to develop two web bots, both aimed at helping our communities. I will not launch, however, until I am comfortable that they provide accurate and up-to-date information.

One is a “Domestic Energy Advice” bot aimed at giving up-to-date information about how to reduce energy consumption to save money and help our environment, and the other is an assistant chatbot specifically designed to help “not for profit” organisations evaluate client data and write better funding bids.

I always worry, however, Ammar, that the information is up to date and accurate, so I am learning how to better prompt-engineer the bots, and learning as quickly as possible the ups and downs of using a knowledge-based bot, a website-based bot, or a combination bot that can access both.

I have already encountered issues with web bots being unable to read what I would consider easy lists of PDF data in uploaded documents: the bot reads half correctly, then stops. Other times the bot’s responses start at the wrong date.

So I would welcome the opportunity to collaborate, learn more and talk to others on this learning site.

Software QA, moving towards AI QA.

I, too, am interested in this workshop.

I am interested.

I am interested too.

Hello!
In my very humble experience, it’s sometimes the vague data given by the client that affects the interpretation of the results and leaves you with more questions than answers.

Is someone who is still in training eligible to attend the workshop?

Hello @MFangler1,

Thank you so much for your interest and insightful comment! It’s wonderful to see the drive for using AI for social good within the charitable sector. Your commitment to helping under-resourced organizations truly resonates with me.

I’d love to learn more about your experience with developing web bots. I’m particularly interested in the following:

  • Specific challenges: Can you elaborate on the issues you’ve faced with bots misinterpreting PDF data and providing responses based on the wrong dates? Are there any patterns you’ve noticed?
  • Data sources: What types of data sources do you primarily rely on for your “Domestic Energy Advice” and funding assistance bots (e.g., static knowledge bases, live websites, PDFs, a combination)?
  • Collaboration interest: Would you be open to exploring ways we could potentially collaborate or share knowledge to help refine your bots’ accuracy and address those challenges?

I believe your experience offers valuable insights for everyone interested in applying AI for positive impact. Let’s keep the conversation going!

Hello!

Thank you for your interest in the workshop! To make it the best it can be, could you tell me a little bit about yourself?

  • Your role
    What’s your job?
    Are you in software testing, a developer, on the business side, or something else entirely?
  • Your biggest challenge
    What’s your biggest challenge with testing AI products?
    What’s the most difficult aspect of QA for AI products that you encounter?

Your answers will help me make sure the workshop covers the things that matter most to you!

@Lenny_Tevlin @LuvsCurves @Kalyani_G @Girijesh

Hello @olasadek, welcome to the deeplearning.ai community forum!

You bring up an excellent point! Vague or incomplete data can absolutely throw a wrench into the best-laid AI plans. It highlights the importance of clear communication from the beginning of any project.

As for your question, absolutely! This workshop is designed to help everyone involved in the AI product lifecycle, regardless of their current training level. We’ll be focusing on communication, collaboration, and practical strategies – all essential skills even for those still building their technical expertise.

Do you have any specific examples of how vague client data has caused issues in past projects? I’d love to hear more and see how we can incorporate these insights into the workshop.
