C4W1_Assignment differenced_moving_average

Hi, I am facing problems in doing the correct slicing after calculating the differenced moving average.

I cannot understand how to do that.

Please help me. I am attaching a screenshot.


There are 2 things to track:

  1. When computing diff_series, you subtracted the previous year's data point from the current data point. The first year's worth of points was dropped because there are no points 365 days before them.
  2. The moving average takes window_size points and produces their mean as the forecast for the next timestep, so the first forecast lands at position window_size + 1.

Looking at it in reverse: to get the value at a given timestep, I must have taken the moving average of the previous window_size timesteps of the diff series. Take it one more step back: each value in the diff series was computed using a point 365 days earlier. So, how many steps should I move back?
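To make the two offsets concrete, here is a minimal sketch on a synthetic series. The helper moving_average_forecast below is my own stand-in and may differ from the one in the assignment; the constants 365 and 50 are the ones used in this thread.

```python
import numpy as np

SEASON = 365       # differencing lag used in this thread
WINDOW_SIZE = 50   # moving average window used in this thread

def moving_average_forecast(series, window_size):
    """Hypothetical helper: element i is the mean of series[i:i + window_size],
    i.e. the forecast for position i + window_size of the input."""
    return np.array([series[t:t + window_size].mean()
                     for t in range(len(series) - window_size)])

# Synthetic stand-in for the real series (1461 daily points).
series = np.arange(1461, dtype=float)

# Element i of diff_series belongs to timestep i + SEASON of the original series.
diff_series = series[SEASON:] - series[:-SEASON]

# Element i of diff_moving_avg belongs to timestep i + SEASON + WINDOW_SIZE.
diff_moving_avg = moving_average_forecast(diff_series, WINDOW_SIZE)

# Total number of timesteps lost at the start of the original series.
offset = len(series) - len(diff_moving_avg)
print(len(diff_series), len(diff_moving_avg), offset)
```

Each operation shortens the series from the front, and the two losses add up.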

But once diff_moving_avg has been computed by subtracting the previous year's data and taking a moving average with a window size of 50, how do we slice it to match the validation period?

I am not able to understand that.

Did you read this part?

Answering your question of how many steps should we move back, I think the answer is 365 + 50 (window_size).

If the answer is correct, how does it translate into slicing the diff_moving_avg variable for the validation period?

Do I slice that many values from the start or the end of the diff_moving_avg variable? And why?

I need more clarity on this. Please guide me here.

The 1st forecast in moving_average_forecast (i.e. at index 0) is 365 + 50 timesteps ahead of the start of the input time series.
Now that we’ve figured out the offset, to get the forecast for SPLIT_TIME, you’ll have to start from the same number of timesteps before SPLIT_TIME inside moving_average_forecast.
Once you understand that, slicing is straightforward.
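One way to check the offset without touching the real data is to track which original timestep each forecast element belongs to. This is only a sketch: forecast_timesteps is an illustrative array I made up, and the sizes 1461 and 1100 are the values of len(TIME) and SPLIT_TIME quoted later in this thread.

```python
import numpy as np

N = 1461           # length of the full series
SPLIT_TIME = 1100  # first timestep of the validation period
SEASON = 365
WINDOW_SIZE = 50

offset = SEASON + WINDOW_SIZE

# Illustrative array: entry i is the original timestep that the forecast
# at index i corresponds to (index 0 corresponds to timestep 415).
forecast_timesteps = np.arange(offset, N)

# The forecast for SPLIT_TIME therefore sits at index SPLIT_TIME - offset.
start = SPLIT_TIME - offset
valid_timesteps = forecast_timesteps[start:]

print(start, valid_timesteps[0], valid_timesteps[-1], len(valid_timesteps))
```

The slice starts at the forecast for SPLIT_TIME and runs to the end, so it has the same length as the validation period.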

According to what I understood from your explanation, I did diff_moving_avg[SPLIT_TIME - 365 - 50:] and the resulting graph was the same as the expected output.

Thanks for your help there.

In the next cell, we have to add the past values from t - 365 to diff_moving_avg to get the actual values.

To do that, I think we need to do SERIES[: -365] because we want to get the data of one year earlier.

Please tell me where I am wrong here.

You should move window_size steps backward as well.

SERIES[: -365 - 50]. Is this what you are suggesting?

From SPLIT_TIME, go back 365 plus 50 steps and iterate till the end of the series.

SERIES[:SPLIT_TIME] will give us data till SPLIT_TIME. So, going back 365 + 50 steps from SPLIT_TIME means we wanna do SERIES[:SPLIT_TIME - 365 - 50].

Where am I wrong?

Please excuse me for asking so many questions, but this is not getting through to me.

The starting point should be the position you’ve specified. Starting from that offset, go till end of series.

SERIES[SPLIT_TIME - 365 - 50:]. Is this right?

The range is correct. You are looking inside the wrong series. What’s the point of looking into the original series when you have to look inside the series created using diff and moving average?

Are you sure? Because we want to add past values to the differenced values. So, why would we be looking into the series created using diff and moving average to get past values?

To get the past value, shouldn’t we be looking into the original series?

I want you to make use of the result created using diff followed by moving average and look at the range you pointed out.

Okay, it got resolved. Thank you very much!
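For anyone following along, here is a rough sketch of how the pieces line up on a synthetic series. It assumes the past value for a validation timestep t is the original series at t - 365, as described above. The helper, the synthetic data and the variable names are all my own and are not taken from the assignment.

```python
import numpy as np

SEASON, WINDOW_SIZE, SPLIT_TIME = 365, 50, 1100

def moving_average_forecast(series, window_size):
    """Hypothetical helper: element i forecasts position i + window_size."""
    return np.array([series[t:t + window_size].mean()
                     for t in range(len(series) - window_size)])

# Synthetic series: linear trend plus a pattern that repeats every 365 steps.
t = np.arange(1461)
series = 0.1 * t + 10 * np.sin(2 * np.pi * t / SEASON)

# Difference against the previous year, smooth, then keep the validation part.
diff_series = series[SEASON:] - series[:-SEASON]
diff_moving_avg = moving_average_forecast(diff_series, WINDOW_SIZE)
diff_moving_avg = diff_moving_avg[SPLIT_TIME - SEASON - WINDOW_SIZE:]

# Past values: the original series one season before each validation timestep.
past_series = series[SPLIT_TIME - SEASON:-SEASON]

# Both pieces now cover the same timesteps and can be added element-wise.
forecast = past_series + diff_moving_avg
print(len(diff_moving_avg), len(past_series), len(series[SPLIT_TIME:]))
```

On this toy series the seasonal part cancels exactly in the difference, so the sum reproduces the validation values. Real data will not match this closely.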

@Kaleem_U_Allah

train_val_split is the 1st graded cell in the assignment and the grader says that there’s a shape issue.


It’d help for you to add the print statement shown below after calling train_val_split to see the details:

print(f"""
len(TIME)={len(TIME)}
len(time_train)={len(time_train)}
SPLIT_TIME={SPLIT_TIME}
time_train.shape={time_train.shape}
series_train.shape={series_train.shape}
time_valid.shape={time_valid.shape}
series_valid.shape={series_valid.shape}
""")

Expected output:

len(TIME)=1461
len(time_train)=1100
SPLIT_TIME=1100
time_train.shape=(1100,)
series_train.shape=(1100,)
time_valid.shape=(361,)
series_valid.shape=(361,)
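As a point of comparison, a plain split at a fixed index produces exactly these shapes. This is only a sketch with placeholder data; the function signature here is my assumption and the one in the assignment may differ.

```python
import numpy as np

def train_val_split(time, series, split_time):
    """Hypothetical signature: split both arrays at index split_time."""
    time_train, series_train = time[:split_time], series[:split_time]
    time_valid, series_valid = time[split_time:], series[split_time:]
    return time_train, series_train, time_valid, series_valid

# Placeholder data with the same length as in the expected output.
TIME = np.arange(1461)
SERIES = np.zeros(1461)
SPLIT_TIME = 1100

time_train, series_train, time_valid, series_valid = train_val_split(
    TIME, SERIES, SPLIT_TIME)

print(time_train.shape, series_train.shape,
      time_valid.shape, series_valid.shape)
```

If your shapes differ from the expected ones, comparing them against a split like this can help narrow down where the slicing goes wrong.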

When sharing grader feedback, please expand all feedback points and share it as a screenshot or as text.