Hey people, I am doing an audio ML project and would love your feedback on how to progress.
I am building a model that takes mel spectrograms of audio recorded by an old piece of software, with shape (128, 128, 1), and enhances their quality.
To make sure the model is working at all, I am pulling a sound from YouTube and passing it as both x and y (input and target), so the model only has to learn the identity mapping.
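Concretely, the sanity-check setup looks something like this (the file name and normalization here are just illustrative, not my exact pipeline):

```python
import numpy as np

# Hypothetical data setup for the identity sanity check: the target is
# literally the input, so the model only needs to learn an identity map.
mels = np.load("mels.npy")  # assumed shape: (num_clips, 128, 128, 1)
mels = mels.astype("float32")

# Illustrative min-max normalization to [0, 1]; the actual scaling
# of the spectrograms matters a lot for reconstruction quality
mels = (mels - mels.min()) / (mels.max() - mels.min() + 1e-8)

x_train = mels
y_train = mels  # identical target for the sanity check
```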
The model is a deep network similar to the U-Net architecture: an encoder that downsamples the mel spectrogram representation, and a decoder that upsamples it back to the original shape.
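Roughly, the architecture looks like this (a simplified Keras sketch, assuming TensorFlow; the real model is deeper and the filter counts here are placeholders):

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_unet(input_shape=(128, 128, 1)):
    """Minimal U-Net-style sketch; depth and filter counts are illustrative."""
    inp = layers.Input(shape=input_shape)

    # Encoder: downsample 128 -> 64 -> 32
    c1 = layers.Conv2D(32, 3, padding="same", activation="relu")(inp)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = layers.Conv2D(64, 3, padding="same", activation="relu")(p1)
    p2 = layers.MaxPooling2D(2)(c2)

    # Bottleneck
    b = layers.Conv2D(128, 3, padding="same", activation="relu")(p2)

    # Decoder: upsample 32 -> 64 -> 128, with skip connections
    u2 = layers.UpSampling2D(2)(b)
    u2 = layers.Concatenate()([u2, c2])
    c3 = layers.Conv2D(64, 3, padding="same", activation="relu")(u2)
    u1 = layers.UpSampling2D(2)(c3)
    u1 = layers.Concatenate()([u1, c1])
    c4 = layers.Conv2D(32, 3, padding="same", activation="relu")(u1)

    # Linear output head: no squashing activation, since the mel values
    # may not lie in [0, 1] depending on how they are normalized
    out = layers.Conv2D(1, 1, padding="same", activation="linear")(c4)
    return tf.keras.Model(inp, out)

model = build_unet()
```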
From my understanding, a properly defined model should be able to reconstruct this audio almost perfectly. Instead, the results are terrible: the reconstructed audio isn't even recognizable as speech anymore.
I have been working on this for a couple of weeks and would love to hear your feedback.
Note: the loss function I am using is SI-SDR plus 0.5 × MSE (I tried each alone and the results were equally bad).
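Roughly, my formulation of the combined loss looks like this (a sketch: I negate the SI-SDR so that minimizing the loss maximizes it, and I flatten each spectrogram into a vector, even though SI-SDR is usually defined on waveforms; `model`, `x_train`, and `y_train` are from the sketches above):

```python
import tensorflow as tf

def si_sdr_loss(y_true, y_pred, eps=1e-8):
    """Negative SI-SDR per example, computed on flattened spectrograms."""
    t = tf.reshape(y_true, (tf.shape(y_true)[0], -1))
    e = tf.reshape(y_pred, (tf.shape(y_pred)[0], -1))
    # Project the estimate onto the target to get the scaled target component
    dot = tf.reduce_sum(t * e, axis=-1, keepdims=True)
    t_energy = tf.reduce_sum(t * t, axis=-1, keepdims=True) + eps
    s_target = (dot / t_energy) * t
    e_noise = e - s_target
    ratio = (tf.reduce_sum(s_target**2, axis=-1) + eps) / (
        tf.reduce_sum(e_noise**2, axis=-1) + eps
    )
    # SI-SDR in dB, negated so lower loss = higher SI-SDR
    return -10.0 * tf.math.log(ratio) / tf.math.log(10.0)

def combined_loss(y_true, y_pred):
    mse = tf.reduce_mean(tf.square(y_true - y_pred), axis=[1, 2, 3])
    return si_sdr_loss(y_true, y_pred) + 0.5 * mse

model.compile(optimizer="adam", loss=combined_loss)
model.fit(x_train, y_train, epochs=50, batch_size=16)
```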
What do you think is causing this? Would love to hear new perspectives!