Hybrid Database

Any ideas on using a deep neural network to link two databases together? The closest concept that I know of is a Siamese network. Please give me some suggestions or papers to look into!

What do you mean by?

This is normally done in data preprocessing (before the data enters the model), either in SQL, with a dataframe library such as pandas, or in plain Python.
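As a minimal sketch of what linking two sources in preprocessing could look like with pandas: the table contents and the column names `case`, `value_1`, and `value_2` below are hypothetical, made up only for illustration, and the sketch assumes both sources share a key column.

```python
import pandas as pd

# Hypothetical tables: each source records one measurement per case.
source_1 = pd.DataFrame({"case": ["A", "B", "C", "D"],
                         "value_1": [0.1, 0.4, 0.7, 0.9]})
source_2 = pd.DataFrame({"case": ["A", "B", "C", "D"],
                         "value_2": [1.2, 3.4, 5.6, 7.8]})

# Join the two sources on the shared key, like a SQL INNER JOIN.
linked = source_1.merge(source_2, on="case", how="inner")
print(linked)
```

Each row of `linked` then holds the values from both sources for one case, which can be fed to a model as a single record.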

Source 1 produces Case A, B, C, D.
Source 2 produces Case A, B, C, D.
I would like the neural network to learn a distance such that cases (1-A, 1-B), (2-A, 2-B), … are closer together, while cases (1-A vs. 2-B, 2-C, 2-D), (1-B vs. 2-A, 2-C, 2-D), … are farther apart.
Or identify case A, B, C, D, regardless of the source.
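The "closer / farther" objective described here resembles the contrastive loss commonly used with Siamese networks. Below is a rough NumPy sketch of that loss, not a full training setup. The function name `contrastive_loss`, the `margin` value, and the toy embeddings are assumptions for illustration; which pairs get the label 1 ("should be close") is up to whoever builds the pairs.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, should_be_close, margin=1.0):
    """Mean contrastive loss over a batch of embedding pairs.

    should_be_close: 1 for pairs to pull together, 0 for pairs to push apart.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    y = np.asarray(should_be_close, dtype=float)

    # Euclidean distance between the two embeddings of each pair.
    dist = np.linalg.norm(emb_a - emb_b, axis=1)

    # Pairs labeled 1 are penalized by their squared distance.
    pull = y * dist ** 2
    # Pairs labeled 0 are penalized only while closer than the margin.
    push = (1.0 - y) * np.maximum(0.0, margin - dist) ** 2
    return float(np.mean(pull + push))

# Toy embeddings (hypothetical): first pair should be close, second far.
loss = contrastive_loss([[0.0, 0.0], [0.0, 0.0]],
                        [[0.0, 0.0], [3.0, 4.0]],
                        [1, 0])
print(loss)
```

In a real network the embeddings would come from the shared encoder, and each batch could mix pairs drawn from both sources.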

How should I do this? Should I just mix all the sources of data in a batch?

Why can’t your problem be treated as a multiclass classification since we want to categorize the input as A or B or C or D?

I do not understand the question. In case it helps to clarify the exact application I am working on, here it is:
I am building a multi-class classification network with 6 classes. I have three main sources of information (one produced by simulation, two produced by on-site experiments). I would like to establish a connection between the simulation and the two experiment types, which would produce an accurate estimate of the 6 classes.

Sorry about that. The reply is fixed. I missed a 't'.

What are the inputs to the NN? I assume the output is class probabilities for the 6 classes.

A few things to clarify here:

  1. Your data can be manipulated to produce compound data that gives your model better information.

  2. You need to have clear labels to train the model.

With these two you can produce compound features, and you can also make compound labels if you don't want exact measurements, but you must have labels to train the model.

The inputs are 600 spectrogram images in 6 classes, 100 cases each! Currently they all come from the same source with the same sample frequency (1600 Hz).
But due to my implementation, the other source has a much lower sample frequency (~30 Hz), so its data do not look similar even though the classes are the same.

  1. Could you clarify “compound data”, does it mean having two types of data in a training case?
  2. What are your recommendations for having clear labels? I use the Keras API, so it writes the labels in directly.

For example, from data x1 and x2 you can produce x3 = x1*x2, or any other transformation that may be helpful. Especially if the data depend on each other, there is no need to use many different inputs; you can create one variable.
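The post does not say whether `*` means an element-wise product or a convolution, so the sketch below shows both readings with NumPy. The arrays `x1` and `x2` are made-up toy values, and the names `x3_elementwise` and `x3_convolved` are only illustrative.

```python
import numpy as np

# Toy inputs (hypothetical values).
x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([4.0, 5.0, 6.0])

# Reading 1: element-wise product, same length as the inputs.
x3_elementwise = x1 * x2

# Reading 2: full discrete convolution, length len(x1) + len(x2) - 1.
x3_convolved = np.convolve(x1, x2)

print(x3_elementwise)  # [ 4. 10. 18.]
print(x3_convolved)    # [ 4. 13. 28. 27. 18.]
```

The element-wise version keeps the original shape, so it can be stacked next to `x1` and `x2` as an extra feature; the convolved version changes the length and would need its own handling.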

By clear labels I mean labels that are correct and unambiguous.

Since your goal is to classify the 30 Hz signal, how about downsampling the 1600 Hz signal to 30 Hz?
This way, your training set becomes the downsampled 1600 Hz signal and your test set is the 30 Hz signal from the other source.
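One possible way to do this downsampling is polyphase resampling with `scipy.signal.resample_poly`, sketched below. It assumes the raw time series behind the spectrograms is available; the 5 Hz sine wave is a stand-in signal, and the constant names are made up for the example. Since 30/1600 reduces to 3/160, the signal is upsampled by 3 and downsampled by 160.

```python
import numpy as np
from scipy.signal import resample_poly

FS_HIGH = 1600  # sample rate of the first source, in Hz
FS_LOW = 30     # approximate sample rate of the other source, in Hz

# Stand-in signal: 2 seconds of a 5 Hz sine sampled at 1600 Hz.
t = np.arange(2 * FS_HIGH) / FS_HIGH
signal_high = np.sin(2 * np.pi * 5.0 * t)

# 30 / 1600 == 3 / 160; resample_poly applies an anti-aliasing
# low-pass filter as part of the rate change.
signal_low = resample_poly(signal_high, up=3, down=160)

print(len(signal_high), len(signal_low))  # 3200 60
```

After resampling, only content below 15 Hz (half of 30 Hz) can be represented, so the spectrograms would need to be recomputed from `signal_low` rather than reused from the 1600 Hz data.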

I have been gradually leaning toward this idea recently, and I’ve also consulted my mentor. This should be the way forward, thank you!

What an interesting concept! By “*”, do you mean element-wise multiplication or convolution? This could also be worth a shot!