Clarification Needed on Techniques for Handling Structured vs. Unstructured Data in AI

YuanX · August 16, 2024, 6:04am

Hi everyone!

I’m currently taking the course “AI for Everyone” (Week 1: What is Data). The lecture is at https://www.coursera.org/learn/ai-for-everyone/lecture/dLSWR/what-is-data.

I came across a few statements that I’m hoping to get some help understanding:

“The techniques for dealing with unstructured data are different than the techniques for dealing with structured data. Germs of AI today are used primarily to generate unstructured data, rather than structured data. Supervised learning can work very well for both structured and unstructured data.”

I’m curious about the techniques used for unstructured data. Are these techniques meant to transform unstructured data into a structured format before processing? Is supervised learning the best approach for both structured and unstructured data?

Also, I don’t understand the sentence “Germs of AI today are used primarily to generate unstructured data, rather than structured data”. Could someone explain what that means?

Thanks in advance for any insights you can provide!

TMosh · August 16, 2024, 6:32am

To answer part of your question:

“Germs of AI today…” makes no sense to me.

I believe there was a mistake in editing the video, and part of the word was clipped off by accident.

I do not know what the intended word might have been.

TMosh · August 16, 2024, 6:33am

If anyone else wants to try to decode the intended dialog, it’s at time mark 10:16 in the linked video.

YuanX · August 16, 2024, 6:35am

Thanks @TMosh

ai_curious · August 16, 2024, 4:20pm

Using the housing price example from the video, the position of a particular value in the data has meaning. You can think of the data as having columns, which convey the meaning, and rows, which are the values (conceptually true regardless of the exact mechanism for storing the data). The data is said to have a structure, because there is a regular pattern. This is why the spreadsheet analogy is useful. The columns tell you the structure and the rows all have the same pattern. Another word used in this context is schema.

Unstructured data might be the pixels in an image, the words in a document, or the frequencies in an audio file. Unlike in structured data, there is no required or explicit regularity, so the position of a value does not tell you what it means the way a column does. Any pixel value, word, or sound can occur at any position.

Generally there is no reason, ability, or purpose in attempting to impose such regularity, so my answer to your question is ‘No, you are not generally attempting to transform unstructured data into a structured format before processing.’
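To make the contrast above concrete, here is a minimal Python sketch. The column names, numbers, pixel values, and review text are all made up for illustration (they are not taken from the video), and `column` is a hypothetical helper. The point is only that a structured row can be read by column name, while raw pixels or words have no such schema.

```python
# Structured data: every row follows the same schema, so a value's
# position (its column) tells you what it means.
# Assumption: these column names and values are invented for illustration.
housing_schema = ("size_sqft", "bedrooms", "price_usd")
housing_rows = [
    (523, 1, 100_000),
    (645, 1, 150_000),
    (708, 2, 200_000),
]


def column(rows, schema, name):
    """Return every value stored under one named column (hypothetical helper)."""
    idx = schema.index(name)
    return [row[idx] for row in rows]


prices = column(housing_rows, housing_schema, "price_usd")

# Unstructured data: raw values with no schema attached.
# A tiny 2x3 grayscale "image" and a short piece of free text.
image_pixels = [[0, 255, 34], [12, 90, 200]]
review_text = "The mug arrived chipped but support was great"
words = review_text.split()

# There is no "price" column to look up here: no single pixel or word
# position has a fixed meaning shared across all images or documents.
print(prices)
print(words)
```

Asking for `column(housing_rows, housing_schema, "price_usd")` is meaningful for every row, whereas there is no equivalent question you can ask of `image_pixels` or `words` by position alone.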

If you’re just starting your machine learning and AI journey (my assumption), it may be premature in this thread to go into detail about how the techniques of supervised and unsupervised learning differ; there are plenty of opportunities and resources elsewhere to go deep on this topic. But I would say the decision of whether to use supervised or unsupervised learning depends more on what kind of meaning or story you want to extract from the data, and less on whether the data itself is structured or unstructured. Both supervised and unsupervised learning apply broadly to both structured and unstructured data. Hope this helps.
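As a rough illustration of that last point, the sketch below runs one supervised and one unsupervised technique on the same made-up house sizes. Everything here is an assumption for illustration: the numbers are invented, and `fit_line` (least-squares line fit) and `two_means` (1-D k-means with k=2) are simple stand-ins I picked, not the methods used in the course.

```python
# Assumption: invented data. Sizes in square feet, prices in thousands of dollars.
sizes = [500.0, 600.0, 700.0, 1500.0, 1600.0, 1700.0]
prices = [100.0, 120.0, 140.0, 300.0, 320.0, 340.0]


def fit_line(xs, ys):
    """Supervised: least-squares fit of y = slope * x + intercept, using labels ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def two_means(xs, iterations=10):
    """Unsupervised: split values into two groups with 1-D k-means (k=2), no labels."""
    centers = [min(xs), max(xs)]
    assignments = [0] * len(xs)
    for _ in range(iterations):
        # Assign each value to its nearest center.
        assignments = [
            0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1 for x in xs
        ]
        # Move each center to the mean of the values assigned to it.
        for k in (0, 1):
            members = [x for x, a in zip(xs, assignments) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assignments, centers


# Supervised question: "what price goes with a 1000 sq ft house?" (needs labels)
slope, intercept = fit_line(sizes, prices)
predicted = slope * 1000.0 + intercept

# Unsupervised question: "do the houses fall into natural groups?" (no labels)
clusters, centers = two_means(sizes)
print(predicted, clusters, centers)
```

The data is the same in both cases; what changes is the question being asked, which is why the choice depends on the goal rather than on whether the data is structured.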

Regarding the ‘Germ…’ sentence, I have no idea what is going on there. I find it odd that Prof Ng talks about generating data, because the remainder of the video focusses on the type of data that AI consumes. Modern AI is certainly capable of generating content, but that just isn’t what this video is about. Sorry, can’t help here.
