Hey, I'm relatively new to DL, but I have some hands-on experience with ANNs, CNNs, and LSTMs. For my new project I'm looking to build an image de-noising model, one that would noticeably improve the quality of an image or, in a real-life scenario, make an image legible (of course I'm not expecting it to turn a totally blurred image into an exceptional one). But I don't know where to start. I want to train the model from scratch using TensorFlow; I don't want to use any pretrained model or fine-tune another one (the main purpose of this project is to learn, not commercial use). So I would be glad if any of you could point me to topics, resources, or a pipeline I should look into. I have never generated an image as the output of a model, so any help would be appreciated. Thank you.
I do not have any domain knowledge on the particular topic you mention, but here are a couple of thoughts triggered by your “where to start” question:
- Have you done any googling on the topic? I'll bet there is quite a bit of previous work on this in signal processing and image processing. There are ways to add and remove noise from input signals that are purely mathematical, as opposed to ML-based (see the sketch after this list).
- From a purely ML perspective, a critical "where do I start" question is: where are you going to get good and plentiful training data? Even if you come up with a great model architecture, you won't get anywhere without a (probably) large amount of good-quality training data.
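For instance, a minimal non-ML sketch, assuming OpenCV is installed and a hypothetical `noisy.png` input file, could look like this:

```python
import cv2

img = cv2.imread("noisy.png")  # BGR uint8 image; the file name is a placeholder

# Non-local means: a classical, purely mathematical denoiser (no training data).
# The two 10s set the filter strength for luminance and color; 7 and 21 are the
# template and search window sizes.
denoised = cv2.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 21)
cv2.imwrite("denoised.png", denoised)
```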
Just throwing out ideas here, but maybe question 1) is actually relevant to question 2): you could start with the project of building an image-processing function that adds noise to input images, and use that to generate training data (a minimal sketch follows). There is an inherent risk in that strategy: your "deblurring" model may learn to handle only the specific types of noise that your "make it blurrier" function creates. Of course, that is another version of the question "what does real training data look like?"
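Here is a minimal sketch of such a corruption function, assuming images are already loaded as float NumPy arrays scaled to [0, 1]; the noise type and parameters are placeholder assumptions, not recommendations:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(clean, sigma=0.05):
    """Return a noisy copy of `clean`, a float array in [0, 1]."""
    noisy = clean + rng.normal(0.0, sigma, size=clean.shape)
    return np.clip(noisy, 0.0, 1.0)

def make_pair(clean):
    """One training example: the corrupted image is the input, the original is the label."""
    return add_gaussian_noise(clean), clean
```

Keep the caveat above in mind: a model trained only on this kind of synthetic Gaussian noise may not generalize to real blur or sensor noise.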
Basically I was going for motion-blurred images that are then de-noised or enhanced into a fairly static image, which could be further preprocessed. For example:
- If one needs to check a license plate and has a motion-blurred image of the car, with the plate blurred (of course legible to humans, but not to OCR models), I was hoping to somehow get a clearer image of it so that it can be passed to the OCR model.
- Some portrait that was smudged, blurred, or taken with a low-res camera, and one needs to convert it to a higher resolution.
Of course these are two different scenarios and different models would presumably be used, but this is what I have in mind (a rough sketch of how such motion blur could be simulated follows).
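For example, a simple way to fake horizontal motion blur, as a rough sketch assuming float NumPy arrays and SciPy (real camera blur follows curved paths and is more complex, so this is only a starting point):

```python
import numpy as np
from scipy.ndimage import convolve

def motion_blur(image, length=9):
    """Blur along the horizontal axis with a normalized line kernel."""
    kernel = np.zeros((length, length))
    kernel[length // 2, :] = 1.0 / length  # a horizontal line of equal weights
    if image.ndim == 3:  # color image: filter each channel separately
        return np.stack([convolve(image[..., c], kernel)
                         for c in range(image.shape[-1])], axis=-1)
    return convolve(image, kernel)
```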
I don't think data would be much of an issue. I'm not looking to make a commercially available product that has to be significantly better; this is just for learning and understanding.
Well, even if you are just doing this for personal education, it’s going to require at least a few hundred images to run training and validation. And the point is that you need the “labels” as well, right? When you’re doing “supervised learning”, you need the blurry inputs and you need the “sharpened” outputs. Otherwise what does the model learn from? How do you tell it what your goal is?
Of course as I mentioned above, there may well be non-ML approaches to this which use purely mathematical algorithms, in which case perhaps you would not need the labeled data.
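To make the "inputs plus labels" point concrete, here is a minimal sketch of a supervised (blurry input, sharp target) pipeline in TensorFlow. The folder names, file format, and batch size are assumptions, and it expects all images to share the same dimensions (otherwise resize or crop inside `load`):

```python
import tensorflow as tf

def load_pair(blurry_path, sharp_path):
    def load(path):
        img = tf.io.decode_png(tf.io.read_file(path), channels=3)
        return tf.image.convert_image_dtype(img, tf.float32)  # scale to [0, 1]
    return load(blurry_path), load(sharp_path)

# Matching file names in "blurry/" and "sharp/" give the (input, label) pairs.
blurry_files = sorted(tf.io.gfile.glob("blurry/*.png"))
sharp_files = sorted(tf.io.gfile.glob("sharp/*.png"))

dataset = (tf.data.Dataset.from_tensor_slices((blurry_files, sharp_files))
           .map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)
           .shuffle(256)
           .batch(8)
           .prefetch(tf.data.AUTOTUNE))
```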
That’s not going to work. De-noising cannot create missing information.
De-noising processes need a good model of the noise source you’re trying to reverse. Low-res isn’t a noise process - it’s just insufficient data.
I see your point, but my initial thought process was: we use central tendency to fill in missing information in a dataset, right? If we have a missing entry, we can replace it with an appropriate value based on the other features. Similarly, why can't we train a model to give me values that would be good enough to increase the resolution?
Of course the image would look somewhat artificially sharpened and would add some noise to the overall image, but would it be good enough to call a pre-processed image?
Of course this sounds like a stretch, but I am here for opinions and discussion.
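For what it's worth, this is roughly the kind of model I had in mind as a starting point: a minimal sketch of a small convolutional denoiser in Keras, trained on (noisy, clean) pairs. The layer sizes, loss, and optimizer are arbitrary starting assumptions, not tuned choices:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_denoiser(input_shape=(None, None, 3)):
    inputs = tf.keras.Input(shape=input_shape)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)  # clean image in [0, 1]
    return tf.keras.Model(inputs, outputs)

model = build_denoiser()
model.compile(optimizer="adam", loss="mse")  # pixel-wise loss against the clean target
# model.fit(dataset, epochs=10)  # `dataset` yields (noisy, clean) batches
```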
I mean, we already have an abundance of that data available online, already classified, so that would not be a problem.
And thanks for the heads-up on the mathematical approach; I will definitely look into it and see what I can find.
That will increase the number of pixels, but those pixels do not include new information. So it’s a false increase. Essentially you’d just be making a larger version of the same image.
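A quick way to see this (a small illustrative sketch, assuming TensorFlow and a placeholder low-res tensor): plain interpolation multiplies the pixel count, but every new pixel is just computed from its existing neighbours.

```python
import tensorflow as tf

img = tf.random.uniform((64, 64, 3))  # stand-in for a low-res image
upscaled = tf.image.resize(img, (256, 256), method="bicubic")  # 4x the height and width, no new content
```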