Could this be a data mismatch problem?

Hi everyone,

I am currently working on a project where I deal with audio samples. Multiple audio samples have been recorded in different buildings, on different floors. I’m tasked with classifying which floor each audio sample was recorded on.

Currently, I’ve assigned different buildings to my dev and test sets, so no building appears in both.

I’m having a hard time getting a good correlation between dev and test error.

I am wondering whether this is caused by a data mismatch problem, and whether I should instead use the same buildings (with different audio samples for each floor) in both the dev and test sets.
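
To make the two options concrete, here is a rough sketch of the splits I’m comparing. It is only an illustration: the `samples` list, the dict keys (`building`, `floor`, `clip`), the file names, and both function names are made up for this post and are not my actual pipeline.

```python
import random
from collections import defaultdict


def split_by_building(samples, dev_frac=0.5, seed=0):
    """Option 1 (current setup): dev and test get disjoint buildings."""
    buildings = sorted({s["building"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(buildings)
    n_dev = max(1, int(len(buildings) * dev_frac))
    dev_buildings = set(buildings[:n_dev])
    dev = [s for s in samples if s["building"] in dev_buildings]
    test = [s for s in samples if s["building"] not in dev_buildings]
    return dev, test


def split_within_buildings(samples, dev_frac=0.5, seed=0):
    """Option 2: every building appears in both sets, with different clips."""
    rng = random.Random(seed)
    by_building = defaultdict(list)
    for s in samples:
        by_building[s["building"]].append(s)
    dev, test = [], []
    for building in sorted(by_building):
        clips = list(by_building[building])
        rng.shuffle(clips)
        n_dev = int(len(clips) * dev_frac)
        dev.extend(clips[:n_dev])
        test.extend(clips[n_dev:])
    return dev, test


# Hypothetical toy data: 4 buildings, 3 floors each, 2 clips per floor.
samples = [
    {"building": b, "floor": f, "clip": f"{b}_floor{f}_{i}.wav"}
    for b in ["A", "B", "C", "D"]
    for f in range(3)
    for i in range(2)
]

dev_a, test_a = split_by_building(samples)
dev_b, test_b = split_within_buildings(samples)

# Option 1: no building is shared between dev and test.
print({s["building"] for s in dev_a} & {s["building"] for s in test_a})
# Option 2: both sets cover the same buildings.
print({s["building"] for s in dev_b} == {s["building"] for s in test_b})
```

With option 1, dev and test each contain only a few buildings, so they may follow different distributions from each other. With option 2, both sets are drawn from the same buildings, so their errors should track each other more closely, but neither tells me how the model does on an unseen building.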