Multivariate array classification using neural networks

Dear Forum,

I have really enjoyed the course so far. Inspired by the “Practice Lab: Neural Networks for Handwritten Digit Recognition, Multiclass (C2_W2_Assignment)”, I would like to apply its concepts in real life and have two questions:

  1. Is it possible for a neural network to accept more than one array as its input, e.g. an array plus a feature which is itself also an array?

To give a little background to what I am trying to accomplish:
I would like to recognize abnormal time-domain waveforms from a collection of many waveforms and classify them as either pass or fail.
As the main input I have a time series, which represents a waveform X_1 captured on an oscilloscope. This could be represented as a numpy array, as in the Practice Lab. Secondly, I would like to include a feature X_2, itself a time series, which has an influence on every X_1.
Is the neural network in the Practice Lab the right approach for this use case or do I need to learn more advanced concepts?

  2. How can I combine both scalar (e.g. boolean) and vector (array) features for classification using a neural network?

Thank you all in advance.

Yes, you can have more than one input source.

However, you cannot effectively model a time series using just a simple flat NN. A flat NN does not understand event sequences. These topics are outside the scope of what the MLS course teaches.

Time series require a different machine learning model, one that uses Recurrent Neural Networks.

Those are covered in the Deep Learning Specialization, Course 5 “Sequence Models”.

Hello, @sugeosan,

I believe that both questions are about how to form an input from everything you get. If you have time series X_1, time series X_2, and scalars X_3, X_4, …, you may do it like below:
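For example, a minimal numpy sketch of that concatenation (the shapes and variable names here are just placeholders for illustration):

```python
import numpy as np

# Placeholder data; shapes are for illustration only
x1 = np.random.randn(50000)   # time series X_1 (the waveform)
x2 = np.random.randn(50000)   # time series X_2
x3, x4 = 1.0, 0.0             # scalar features, e.g. booleans cast to float

# Concatenate everything into one flat input vector for one sample
x = np.concatenate([x1, x2, [x3, x4]])
print(x.shape)                # (100002,)
```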

While you may form the input like above and feed it to train an ordinary NN like the one presented in the assignment, I second Tom that the better and more advanced approaches are covered only in the Deep Learning Specialization (DLS).

However, if you are eager to give it a try with what you have already learned, I don’t see why not. One possible concern is that, with one pretty long time series (not to mention two), you are going to have a pretty large weight matrix in your first neural network layer. More trainable weights usually call for more training samples (which are usually limited), or for some regularization. Either way, I am sure it will be a good exercise to follow Andrew’s lectures on bias and variance and see how well your model does on some validation sets.
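To make that concrete, here is one possible sketch of such an ordinary NN with L2 regularization in TensorFlow/Keras, in the spirit of the assignment (the layer sizes and the regularization strength are arbitrary assumptions, not recommendations):

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Flat (fully connected) pass/fail classifier; sizes are placeholders
model = tf.keras.Sequential([
    tf.keras.Input(shape=(100002,)),        # the concatenated vector from above
    layers.Dense(64, activation='relu',     # ~6.4M weights in this layer alone
                 kernel_regularizer=regularizers.l2(0.01)),
    layers.Dense(16, activation='relu',
                 kernel_regularizer=regularizers.l2(0.01)),
    layers.Dense(1, activation='sigmoid'),  # probability of "fail"
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# model.fit(X_train, y_train, validation_split=0.2, ...)
```

Note how the first layer alone already carries millions of trainable weights, which is exactly the concern above.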

If you move on to the DLS, your work today may well serve as a baseline against which your more advanced models can be compared in the future.

Cheers,
Raymond

Hello @TMosh,
thank you so much for confirming. I do plan to continue with the Deep Learning Specialization after I am done with MLS. For my specific application, do you think it is necessary to first complete Courses 1-4 of DLS in order to understand Course 5?

@rmwkwok Thank you for your detailed answer as well.
I am thinking about the constraints posed by the long time series I have (50,000 samples).

If I concatenate all the features into one array, will the impact of the scalar features on the outcome be negligible, since they are only single elements in an array of 100,000+ elements?

Will such a long time series also be an issue for the advanced models in the DLS?


Hello, @sugeosan,

That could happen. You may use a different architecture, such as one with multiple branches, assigning one branch to the time series and another to the scalars. You may train these branches separately, apply different regularizations to each branch, and so on.
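As a rough sketch of such a two-branch architecture with the Keras functional API (all the names and sizes below are assumptions for illustration):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Branch 1: the (possibly downsampled) time series
ts_in = tf.keras.Input(shape=(50000,), name='time_series')
ts = layers.Dense(32, activation='relu')(ts_in)

# Branch 2: the scalar features, kept in their own small branch
sc_in = tf.keras.Input(shape=(2,), name='scalars')
sc = layers.Dense(4, activation='relu')(sc_in)

# Merge the branches and classify pass/fail
merged = layers.Concatenate()([ts, sc])
out = layers.Dense(1, activation='sigmoid')(merged)

model = tf.keras.Model(inputs=[ts_in, sc_in], outputs=out)
model.compile(optimizer='adam', loss='binary_crossentropy')
# model.fit({'time_series': X_ts, 'scalars': X_sc}, y, ...)
```

Because the scalars get their own branch and their own weights, they cannot simply be drowned out by the 100,000+ time-series elements.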

You probably won’t go through any examples of multiple branches in Courses 1 and 2 of the MLS, but you will in Course 3 Week 2.

They are interesting exercises and I encourage you to give them a try.

You have time-series and non-time-series features. Your question is about the time series.

With the ordinary neural network that you learned in MLS, each neuron needs one separate weight for each time step. With the more advanced recurrent neural network, each neuron needs only one small set of weights that is shared across all time steps.

Here, the RNN saves us many weights.

Besides, an RNN takes the temporal order of the data into account by incorporating results from previous time steps into the current input when producing an output. In contrast, an ordinary NN just flattens out all the time steps and forgets which came first.

Here, the RNN is built to respect temporal order.
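A toy Keras comparison makes the difference in weight counts concrete (the 16 units are chosen arbitrarily):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Flat NN: one weight per time step per neuron
flat = tf.keras.Sequential([
    tf.keras.Input(shape=(50000,)),    # flattened time series
    layers.Dense(16),
])
print(flat.count_params())             # 800016

# RNN: one small set of weights, shared across all 50,000 time steps
rnn = tf.keras.Sequential([
    tf.keras.Input(shape=(50000, 1)),  # same series, kept as a sequence
    layers.SimpleRNN(16),
])
print(rnn.count_params())              # 288
```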

However, when you get to the Transformer architecture, you will see that it gives up the first advantage above while still preserving the second one through a different technique. You are back to a bulky network with many weights, but it turns out to do a great job in language modeling.

Both RNNs and the Transformer are covered in Course 5 of the DLS, and both can be applied to sequential problems like yours.

In DLS C5, you will learn about the challenges that lengthy inputs pose to RNNs. 50,000 is quite long for an RNN, but if, for example, from your understanding of the data you knew that it didn’t need all 50,000 samples, or that it didn’t need that much resolution, then you could reduce the problem in an informed way. As for the Transformer architecture, an input length on the order of 10^4 isn’t surprising in language models, but still, it would be better for you to justify the 50,000 first.
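If you did conclude that you don’t need the full resolution, simple numpy reductions like these would shorten the series in an informed way (the factor of 10 is arbitrary):

```python
import numpy as np

x = np.random.randn(50000)        # placeholder waveform

# Decimation: keep every 10th sample -> 5,000 points
x_dec = x[::10]

# Block averaging: mean over non-overlapping windows of 10 -> 5,000 points
x_avg = x.reshape(-1, 10).mean(axis=1)
```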

Cheers,
Raymond