Assumption: In order to check whether or not we are meeting a satisficing constraint, we need to “port the code to the target devices to evaluate if your model meets or exceeds satisficing metrics.”
The quoted part in the above assumption is from a quiz question in Week 1 of Course 3.
At which stage would be a good point in time to check satisficing constraints (metrics)?
E.g. the quiz question mentions the following stage:
human perf: 0.1 %
train error: 2.0 %
dev error: 2.1 %
Also given is that we have a satisficing metric:
memory <= 10 MB
Obviously, the avoidable bias is much larger than the remaining room to improve variance, and hence we should prioritise actions that decrease bias over actions that decrease variance.
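To make that comparison concrete, here is the arithmetic as a minimal sketch, using the quiz numbers and treating human-level performance as a proxy for Bayes error:

```python
# Quiz numbers; human-level performance serves as a proxy for Bayes error.
human_perf = 0.1  # % error
train_err = 2.0   # % error
dev_err = 2.1     # % error

avoidable_bias = train_err - human_perf  # 1.9 % -> the big gap
variance = dev_err - train_err           # 0.1 % -> already small

print(f"avoidable bias: {avoidable_bias:.1f} %")
print(f"variance:       {variance:.1f} %")
```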
However, wouldn’t this be a good point in time to check whether or not we fulfil this satisficing constraint? That would tell us which actions we can consider next to decrease the bias. E.g. if we are already violating the 10 MB memory constraint, then we would know that we shouldn’t train a bigger model, which would further increase memory consumption, but should look for other options to decrease bias.
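Just as a toy illustration of that gating logic (the 9.4 MB measurement is an assumed value):

```python
# Hypothetical gate: let the satisficing constraint filter the bias-reduction options.
MEMORY_BUDGET_MB = 10.0
current_model_mb = 9.4  # assumed measurement of the current model

options = ["train longer", "tune hyperparameters", "change architecture"]
if current_model_mb < MEMORY_BUDGET_MB:
    # Still headroom, so a bigger model stays on the table.
    options.insert(0, "train a bigger model")

print(options)
```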
I know that 2% error might be too much for certain applications, but without knowing the application, 2% error is in general not too bad, so at this stage I’d have a look at the satisficing metrics before continuing to reduce the avoidable bias.
In general, if it is not trivial to check the satisficing metrics for your model (e.g. you need to port the model to the target devices and check the memory consumption), at what stage would you do this check?
I think your question isn’t about whether “Porting the code to the target devices” should be chosen as a correct answer in the quiz, but rather about when to take that action.
First, why would we want to take that action? Here are two possibilities:
1. We want to know whether real-world data is different from our training data, because “porting” means we will collect additional data.
2. We want to verify our performance metrics on real-world data.
For (1), I think it is good to start scheduling regular tests once a model is ready, because it is dangerous to isolate ourselves from the real world. If something changes, we need to know about it, and we need enough time to react. It is also a good way to keep reporting our progress.
For (2), that will come after we have a mature model, both before and after we deliver the model to the client.
The timings of the real-world tests discussed above can overlap, but the scale of those tests can differ. It really depends on the resources we have and on what kinds of problems we come across over the course of development and delivery.
I think the 1st aspect is also important, thanks for discussing that one.
I was mainly considering the 2nd aspect, and especially those satisficing metrics:
Satisficing metrics (constraints) affect which actions we can take next to improve the optimizing metric.
E.g. if we are already (close to) violating the 10 MB memory constraint, then we wouldn’t train a bigger model.
So I think that if I have such a memory constraint and I consider, e.g., training a deeper network, I’d check the memory size of my model at that point, whatever the stage.
Nevertheless, I also think that in most cases porting the code would not be needed for this evaluation.
I guess in most cases we can just check the memory consumption on the development system, which is trivial and simple. Only if the expected memory consumption on the target system is close to the satisficing constraint would we port the code to the target devices and do the exact evaluation there.
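For example, a development-system check could be as simple as this sketch, assuming a PyTorch model (the model here is a hypothetical stand-in, and the true footprint on the target device, including runtime and activations, can differ):

```python
import torch.nn as nn

def model_size_mb(model: nn.Module) -> float:
    """Rough estimate: bytes of all parameters and buffers, in MB."""
    param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    buffer_bytes = sum(b.numel() * b.element_size() for b in model.buffers())
    return (param_bytes + buffer_bytes) / (1024 ** 2)

# Hypothetical stand-in model; substitute your actual network.
model = nn.Sequential(nn.Linear(1024, 512), nn.ReLU(), nn.Linear(512, 10))

MEMORY_BUDGET_MB = 10.0  # the satisficing constraint from the quiz
print(f"{model_size_mb(model):.2f} MB of {MEMORY_BUDGET_MB} MB budget")
```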
This is also just brainstorming motivated by the Quiz question.
Actually, I would try some models > 10 MB myself, because I want to know how much we have to trade off for the required performance. I think we need that number in our pocket. Of course, we should also look into other solutions, both ones that stay under the 10 MB line and ones that compromise it.
Yes, that’s true; it would always be nice to know, even if not strictly necessary, how much accuracy we sacrifice because of a 10 MB constraint.
So I would also try models > 10 MB if I have time to do so, but I would prioritise trying models that fulfil the 10 MB constraint.
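If I were to do that systematically, a sweep over model sizes could map the trade-off. Here is a minimal sketch under the same PyTorch assumption as above; the architecture family, the widths, and the commented-out train_and_eval call are hypothetical placeholders for your real pipeline:

```python
import torch.nn as nn

def model_size_mb(model: nn.Module) -> float:
    return sum(p.numel() * p.element_size() for p in model.parameters()) / (1024 ** 2)

def build_model(width: int) -> nn.Module:
    # Hypothetical architecture family; swap in your real one.
    return nn.Sequential(nn.Linear(1024, width), nn.ReLU(), nn.Linear(width, 10))

MEMORY_BUDGET_MB = 10.0
for width in (256, 512, 1024, 2048, 4096):
    model = build_model(width)
    size = model_size_mb(model)
    status = "within budget" if size <= MEMORY_BUDGET_MB else "over budget (reference point)"
    # dev_err = train_and_eval(model)  # hypothetical call into your training pipeline
    print(f"width={width:5d}  size={size:6.2f} MB  -> {status}")
```

The over-budget runs are exactly the “number in our pocket”: they tell us what accuracy the 10 MB line is costing us, even though only the within-budget models are candidates for delivery.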