Event/Activity Detection on Timeseries Data

Hello everyone,

I’m working on a project to detect an event in a manufacturing process. The process is tracked using a photodiode, so my data is a time series, which I have to use to detect the event in real time, or with as little delay as possible.

So far, they have used a rolling average that triggers once the value stays below a certain limit for a specific amount of time.
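For reference, that baseline might look something like the following in NumPy. This is only my sketch of the method as described; the `window`, `limit`, and `hold_samples` values are placeholders, not the actual production settings:

```python
import numpy as np

def rolling_average_trigger(signal, window=50, limit=0.2, hold_samples=100):
    """Return the first index at which the rolling mean has stayed below
    `limit` for `hold_samples` consecutive samples, or -1 if it never does."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(signal, kernel, mode="valid")
    below = 0
    for i, value in enumerate(smoothed):
        below = below + 1 if value < limit else 0
        if below >= hold_samples:
            return i + window - 1  # map back to an index in the raw signal
    return -1
```

Writing it out like this makes the two tunable ideas explicit: the smoothing window and the hold time before triggering.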

I have used some 2D CNNs for image classification. For this project, it was recommended that I also try a CNN model and compare it with some more traditional signal-analysis techniques.

Are there any 1D CNN models that I could use, or should I use a different type of model? Any recommendations on which kind of output layer to use?

I would really appreciate any recommendations, or pointers on which directions to look in.

Thank you.


One common way to deal with time series data is to use a Sequence Model or RNN. Are you familiar with those? If not, they are covered in DLS Course 5 here.

But maybe that’s putting the cart before the horse. It would help if you told us a bit more about what your data looks like and what this “event” is that you’re trying to detect. From your initial description it sounds like you’re just getting a single numeric value out of the photodiode. And what you described as the current detection method sounds pretty straightforward. What kind of performance (accuracy) do you get from the current “moving average vs threshold” method? Why is that not good enough and how much better does it need to be to satisfy the system requirements? Or to put it another way, why do you believe a neural network will do a better job if all you’re trying to find is just the average value of some signal over the last n occurrences?


@mike89 I would agree with Paul that more information is needed here.

Also, as he seems to suggest, there is a point I can stress from experience: yes, neural networks can do some amazing things, but they are not an ‘all-purpose sledgehammer’ for every occasion. In the end, being a good data scientist, in my mind, is not just harnessing the latest and greatest techniques, but knowing which is the best/right tool for the job.

As an example: as part of a capstone project from another course, before I started the Deep Learning Specialization, I decided to work on a binary classifier for malware detection, based on something like 128 features and ~4,500 examples.

Part of the requirement was that we perform the analysis with two different models. I hadn’t worked with neural nets yet at that time, though I wished to learn, so I originally chose one as my ‘second’ model (wrongly, as we will see, expecting its results to be superior). For the first, I just took a guess among the traditional ML classification models and ran an SVM (Support Vector Machine).

Lo and behold, with just the SVM I was getting 98% accuracy on my test set! I was rather blown away. Maybe I could have achieved 99% with a neural net (or maybe not), but all the overhead of a neural net just didn’t seem worth it at that point. Instead, I ran a KNN (K-Nearest Neighbors); I didn’t do as extensive a hyperparameter search, and as expected it achieved slightly lower accuracy at 97%-- but still pretty good.

Further, both of these models ran really fast.
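For anyone curious, that kind of SVM-vs-KNN comparison is only a few lines in scikit-learn. The data below is synthetic (the actual malware features obviously aren’t reproduced here), just shaped like the ~4,500 × 128 dataset described:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the ~4,500 x 128 malware dataset:
# class 1 is shifted slightly in every feature, so both models can separate it.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=4500)
X = rng.normal(size=(4500, 128)) + 0.3 * y[:, None]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

svm = SVC(kernel="rbf").fit(X_tr, y_tr)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

svm_acc = svm.score(X_te, y_te)
knn_acc = knn.score(X_te, y_te)
```

On real data the gap between the two will depend heavily on feature scaling and the hyperparameter search, as noted above.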

Plus, in the case of CNNs, I might be wrong, but their structure and application really don’t seem like an appropriate fit for time-series analysis (assuming you’re acquiring ADC values from the photodiode). A CNN seeks to ferret out features and at the same time condense or compress them-- not exactly something you want to be doing with values that have a direct relationship to time, which is obviously a regular/fixed parameter.

Finally, in that vein, here is another good video I found at the time that explains in detail why bigger != always better:


Thank you for the great input. I’m not too familiar with Sequence or RNN models. I will take a look at the course for sure.

The event we are trying to detect is the moment the tool has finished piercing the material. With previous machines, it was very clear when the piercing was done, since the signal dropped significantly. With the upcoming machines and new materials, this is no longer the case. Often the signal drops only very slightly, and it is a lot more varied overall, rising and falling much more randomly throughout the process. In these new cases, the current method doesn’t work at all in most situations.

So the goal is to find a solution where we can detect when the machine has finished the piercing process in these new edge cases. This should also work in real time and with little delay.

We will have labeled time series data marking the moment the piercing has actually finished.
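Once that labeled data arrives, one common way to turn it into training examples is to slice the recording into fixed-length windows around the labeled breakthrough index: windows containing the event become positives, the rest negatives. A sketch, where the window length and stride are placeholders to tune against the sampling rate:

```python
import numpy as np

def make_windows(signal, event_index, window_len=256, stride=32):
    """Slice `signal` into overlapping windows; label a window 1 if the
    labeled event index falls inside it, else 0."""
    X, y = [], []
    for start in range(0, len(signal) - window_len + 1, stride):
        end = start + window_len
        X.append(signal[start:end])
        y.append(1 if start <= event_index < end else 0)
    return np.array(X), np.array(y)
```

In practice you would likely also balance the classes, since most windows in each run will be negatives.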

Thanks a lot for all your help.

Hello @mike89,

I think the following study is going to be useful for your development:

  • find a few graphs of typical time series that contain the signal(s) to detect
  • find a few graphs of each of the false positives and false negatives under the current approach
  • find out the minimum time between two signals
  • find out the maximum tolerable delay

Finding out those time scales helps you test whether your model’s response time is acceptable. The graphs may tell you what preprocessing steps are needed; you may also need them to decide on the sample length.

Personally, I believe simple approaches like the rolling average still have their place even in the era of neural networks, even when we are talking about making improvements. There is a chance that a non-neural-network approach will be more effective and efficient. I have done R&D in a manufacturing environment, building systems with motors and sensors, and even if I had to redesign what I had done, I would still use those fast and simple algorithms, given the stringent requirements on response time and budget.

If you would like to share some of those graphs, we could take a look at them together!

As for 1D CNNs, I have never seen a readily available model for direct use or for transfer learning, but they should be fairly easy to train from scratch, and I suppose you can get as much data as you like.
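To make that concrete: a small from-scratch 1D CNN for a binary “event in this window / no event” decision might look like the following in Keras. The layer sizes are arbitrary starting points, and the single sigmoid unit addresses the earlier question about the output layer for a binary decision:

```python
from tensorflow import keras
from tensorflow.keras import layers

window_len = 256  # samples per input window; tune to your sampling rate

model = keras.Sequential([
    layers.Input(shape=(window_len, 1)),      # univariate photodiode signal
    layers.Conv1D(16, 7, padding="same", activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(32, 5, padding="same", activation="relu"),
    layers.GlobalMaxPooling1D(),              # collapse the time axis to features
    layers.Dense(1, activation="sigmoid"),    # P(event occurred in this window)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

Global max pooling keeps the model small and makes it somewhat tolerant of where the event lands inside the window.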



Thanks for the more detailed description of the problem. As Raymond suggests, having a good visual representation of the signals might give you some insight into what you need to do here. There is a general “rule of thumb” if you’re going to use an image-recognition-style algorithm like a CNN: if you can’t detect the pattern with the human eye/brain combo, then it’s not that likely that the CNN will be able to either. But that is not an ironclad rule: there is at least one really famous case in which an image algorithm learned to recognize things that even highly trained human doctors can’t. That’s the famous retinal scan algorithm from Google AI that can recognize the sex of the patient from a retinal scan image. Ophthalmologists and radiologists had previously believed that was not possible.

If you’re dealing with numerically controlled machine tools here, then of course there’s yet another whole approach: why can’t you get information directly from the machine about the instantaneous position of the drill head or whatever it is?


@mike89 Thanks for the extra info. Hmmm. So, from a non-stats/ML/deep-learning but more engineering-type position, I am curious how exactly you are capturing this event with a photodiode. Obviously the tool must be blocking the ambient light somehow (or the tool gives off light when contact happens). I don’t know what either the tool or the contact material is made of, yet it seems likely that once the tool makes contact with the substrate you would have a measurable change in resistance in the tool itself (especially if you feed it a low voltage).

Might you be able to measure that in your application instead?

(* Or a prox, hall-effect, or TOF sensor, perhaps-- measuring light with just a photodiode is tricky because it can come from anywhere and thus be noisy; plus, at least ‘unmodified’, they are not very directional components)


Sorry for not being clearer about this. It is a laser-based cutting process, so we basically measure the energy emissions of the material with the photodiode in the cutting head. We will mount a second diode to detect the moment we cut through and to label the data. Unfortunately, it’s impossible to use this second diode during production.

Thanks for all the very helpful input. So far I don’t have any actual data from the experiments. It will take a few more weeks until we will be able to transfer the data from the machine.


A good thing about the manufacturing environment is that it is under control, so we definitely need to leverage that. It seems to me that the objective you want to achieve with the first photodiode (in the cutting head) may be achieved by the second diode. I would expect a very strong signal-to-noise ratio in the second diode. Why not just mount the second diode permanently and use that, @mike89?



@mike89 Oh no, that’s okay-- I mean, for all we know the process could be something entirely proprietary, so we are not going to demand ‘Show us the plans!’ :grin:

Though, speaking only for myself, I am a firm believer in:

  1. Whatever ML/AI model you choose in the end is not completely divorced from having a complete understanding of the problem itself. To this end, I know there are many data scientists who will just break out the ‘cookbook’ of general methods, but if you don’t have a good grounding in the problem at hand, you may very well miss certain variables/features that are perhaps not obvious, yet crucial to a successful solution.

  2. When it comes to actually ‘getting things done’, I am rather more agnostic about the methods used. To that end, rather than cooking up your own model, have you looked into OpenCV? It has a bunch of CV features already baked in and can readily handle streaming video for convolution-type operations.

Yet, lastly, now that the problem is clearer, my previous sensor suggestions just wouldn’t work (nor, straight away, would plain video, unless you wanted to turn the place into a rave and fill it with smoke).

But an IR sensor might, and it would be much less affected by noise than the diode. The beam itself would only have a nominal heat signature, but once it hits the workpiece, temperatures will suddenly spike.

Further, if you really wanted to get fancy and run a true ConvNet, they now make inexpensive (sub-$100, which is cheap considering these things used to be crazy expensive) thermal sensors that provide output as a grid array (basically an ‘image’). Rather than the ‘1-bit’ you spoke of, this is the perfect size for a ConvNet.



The issue is that the environment of the second diode is very harsh. So it will only survive for a very short time and needs to be cleaned constantly. That’s why it’s only used in experiments, but not in production.


So it’s a problem of the distance? If the second diode was farther away, would that be better?

Was the first diode’s operating frequency not matched to the emission?

Was the first diode not fully exposed to the emission? Its angle, or something like that? If the first diode is embedded in the cutting head and you can’t just take it off and investigate, what about mounting another diode next to it and cross-checking? (The check can be done while the cutting head is configured not to move, if it is movable.)

I mean, as far as data is concerned, we need to explore all possibilities that can increase the signal-to-noise ratio, and there are a few more weeks to try more things before the data is ready.

I think you’re faced with just collecting a lot of data and seeing whether you can create a model that will detect the event you’re looking for.

Your need for real time and low latency may create a need for a very high sampling rate and a very fast computation loop.
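To illustrate the latency constraint: at inference time, one common pattern is a ring buffer that feeds the most recent window to the detector as each new sample arrives, so latency is bounded by one prediction per sample. The `predict` function below is a stand-in (a simple mean threshold), since no model has been trained yet:

```python
from collections import deque

import numpy as np

def predict(window):
    """Stand-in for a trained model: here, just a threshold on the mean."""
    return float(np.mean(window) < 0.2)

def stream_detect(samples, window_len=256, threshold=0.5):
    """Feed samples one at a time; return the index at which the detector
    first fires, or -1 if it never does."""
    buf = deque(maxlen=window_len)  # ring buffer of the latest samples
    for i, s in enumerate(samples):
        buf.append(s)
        if len(buf) == window_len and predict(np.array(buf)) >= threshold:
            return i
    return -1
```

If one prediction per sample is too slow for the sampling rate, running `predict` every k samples trades a bounded extra delay for a k-fold reduction in compute.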

Given your description of the problem to be solved, I am now less inclined to think of this as a sequence model situation.

However it may be the case that a very fast sequence model might be able to learn the signature of the transients that are characteristic of the event you are trying to detect.