C2W3 - Exercise 12


Here is a summary of the instructions:

  1. Get id of input artifact
  2. Get events associated with input artifact
  3. Get execution id for events subset: OUTPUTS only
  4. Get events for these ids?
  5. Get subsets of these events: INPUTS only
  6. Get artifacts for these ids
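For readers following along, the six steps above can be sketched roughly as below. This is only an illustration, not the graded solution: `get_parent_artifact_ids`, `FakeStore`, and the `Event` tuple are hypothetical stand-ins so the snippet runs without MLMD installed, and artifact ids 1 and 2 are invented. Only the artifact 15 / execution 16 pair comes from the dumps in this thread. The sketch assumes the store exposes `get_events_by_artifact_ids` and `get_events_by_execution_ids`, as the MLMD `MetadataStore` does. With a real store you would pass the event-type enum values instead of the string placeholders.

```python
from collections import namedtuple

# Stand-in for an event record (hypothetical); the field names mirror the
# event dump in this thread: artifact_id, execution_id, type.
Event = namedtuple("Event", ["artifact_id", "execution_id", "type"])

INPUT, OUTPUT = "INPUT", "OUTPUT"  # placeholders for the real event-type enum


def get_parent_artifact_ids(store, artifact_id, input_type=INPUT, output_type=OUTPUT):
    """Return ids of the artifacts that were inputs to whatever produced artifact_id."""
    # Steps 1-2: events associated with the artifact.
    artifact_id_events = store.get_events_by_artifact_ids([artifact_id])
    # Step 3: keep OUTPUT events only and collect their execution ids.
    execution_ids = {e.execution_id for e in artifact_id_events if e.type == output_type}
    # Step 4: fetch ALL events of those executions (inputs and outputs alike).
    execution_id_events = store.get_events_by_execution_ids(execution_ids)
    # Step 5: keep INPUT events only and collect their artifact ids.
    return sorted({e.artifact_id for e in execution_id_events if e.type == input_type})


class FakeStore:
    """In-memory stand-in for a metadata store (assumption, for illustration only)."""

    def __init__(self, events):
        self._events = list(events)

    def get_events_by_artifact_ids(self, artifact_ids):
        ids = set(artifact_ids)
        return [e for e in self._events if e.artifact_id in ids]

    def get_events_by_execution_ids(self, execution_ids):
        ids = set(execution_ids)
        return [e for e in self._events if e.execution_id in ids]


store = FakeStore([
    Event(artifact_id=1, execution_id=16, type=INPUT),    # invented input
    Event(artifact_id=2, execution_id=16, type=INPUT),    # invented input
    Event(artifact_id=15, execution_id=16, type=OUTPUT),  # pair shown in this thread
])

parent_ids = get_parent_artifact_ids(store, 15)
print(parent_ids)
```

Step 6 would then pass `parent_ids` to the store's artifact lookup (in MLMD, `get_artifacts_by_id`). Note that step 4 is what widens the search: step 3 only sees events that mention the one artifact, while step 4 returns every event of the producing execution, including its INPUT events.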

Could someone help verify these instructions?
Specifically:
Step 4: we already have the output events from step 3, so what kind of events are we retrieving in step 4?
Step 5: this is an empty set if I use the same events in steps 3 and 4. (Step 6 then returns an empty list.)

Thank you

For additional context, there are no Input events in execution_id_events:

[id: 16
type_id: 13
properties {
  key: "component_id"
  value {
    string_value: "Transform"
  }
}
properties {
  key: "custom_config"
  value {
    string_value: "null"
  }
}
properties {
  key: "module_file"
  value {
    string_value: "/home/jovyan/work/cover_transform.py"
  }
}
properties {
  key: "pipeline_name"
  value {
    string_value: "interactive-2021-06-07T22_18_37.439749"
  }
}
properties {
  key: "pipeline_root"
  value {
    string_value: "./pipeline"
  }
}
properties {
  key: "preprocessing_fn"
  value {
    string_value: "None"
  }
}
properties {
  key: "run_id"
  value {
    string_value: "2021-06-07T22:43:54.572707"
  }
}
properties {
  key: "splits_config"
  value {
    string_value: "None"
  }
}
properties {
  key: "state"
  value {
    string_value: "new"
  }
}
create_time_since_epoch: 1623105835036
last_update_time_since_epoch: 1623105835467]

Versus the artifact_id_events, it has an Output event:

[artifact_id: 15
execution_id: 16
path {
  steps {
    key: "transform_graph"
  }
  steps {
    index: 0
  }
}
type: OUTPUT
milliseconds_since_epoch: 1623105835478]

This was a typo on my part: I used get_executions instead of get_events. :)

Hi @tbucci1

Just to be sure I have understood: did that solve the problem for you?

Happy learning

@luigisaetta thanks for the follow up.

Actually, I would still appreciate an explanation of the events we are retrieving in step 4. It seems we get events (step 2), subset them (step 3), get the execution id (step 3), and then use that id to get more events?

What I think would really help is a relational diagram, such as an ERD or a topic map. I recall there was a table presented in the lesson, but I think a different graphic could be more helpful for understanding.

And yes, changing the function from get executions to get events in step 4 did result in the correct solution.

Here’s a graphic, I see. So to check my understanding,

Artifacts and Executions each have events, correct? Events are the relations, yes?
Does every event have both an artifact and an execution?
Is there a list of all event, artifact and execution types?

Thanks

[image]

@tbucci1
If I understand it correctly, you’re talking about the Data Model of the Metadata Store (MLMD).
You can find a more detailed description here: https://www.tensorflow.org/tfx/guide/mlmd

  • An execution is a record of a component run or a step in an ML workflow

  • An event records a relationship between artifacts and executions.

For example: artifacts used by an execution, or artifacts produced by it. Events are (or can be) used for tracking, that is, to identify which step used or produced an artifact.
So, basically, you’re right: an event has an associated execution and an associated artifact, because by design an event records the relationship between the two.
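One way to picture this is to treat each event as a row in a join table between artifacts and executions. The snippet below is a plain-Python sketch of that idea, not MLMD code: the `Event` dataclass and both helper functions are hypothetical, and artifact id 1 is invented. Only the artifact 15 / execution 16 OUTPUT row comes from this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One relation row: which artifact, which execution, and in what role."""
    artifact_id: int
    execution_id: int
    type: str  # "INPUT" (artifact used) or "OUTPUT" (artifact produced)


events = [
    Event(artifact_id=1, execution_id=16, type="INPUT"),    # invented for illustration
    Event(artifact_id=15, execution_id=16, type="OUTPUT"),  # row shown in this thread
]


def executions_touching(artifact_id, events):
    """Walk the relation from the artifact side: (execution_id, role) pairs."""
    return [(e.execution_id, e.type) for e in events if e.artifact_id == artifact_id]


def artifacts_touched_by(execution_id, events):
    """Walk the relation from the execution side: (artifact_id, role) pairs."""
    return [(e.artifact_id, e.type) for e in events if e.execution_id == execution_id]


from_artifact = executions_touching(15, events)
from_execution = artifacts_touched_by(16, events)
print(from_artifact)
print(from_execution)
```

Because every row carries both ids, the same list of events can be queried from either direction, which is why the exercise alternates between looking up events by artifact id and by execution id.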

Yes, great thank you. We’re on the same page.