Coursera Analyze Datasets and Train ML Models using AutoML

Week 1 Question 3.2
I am getting a syntax error on a simple SELECT statement. Please clarify how to fix this.
thanks
SL

Hi @priti,

Your kernel is running Python 3, but the statements in cells 43 and 53 are SQL statements, which is why Python throws a syntax error. Also, the initial code in cell 43 only prints the SQL statement; it does not run it.
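
To make the distinction concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for Athena (an assumption for illustration only; the lab itself sends queries to Athena), and the tiny `reviews` table and its rows are made up. It shows that printing the SQL string only displays text, while handing the same string to a SQL engine actually runs it.

```python
import sqlite3

# The SQL lives inside a Python string; Python itself cannot execute it.
statement_count_by_sentiment = """
SELECT sentiment, COUNT(sentiment) AS count_sentiment
FROM reviews
GROUP BY sentiment
ORDER BY sentiment
"""

# Printing only shows the text of the query; nothing is executed.
print(statement_count_by_sentiment)

# Stand-in engine: an in-memory SQLite database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (sentiment INTEGER, review_body TEXT)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [(1, "great"), (1, "love it"), (0, "okay"), (-1, "poor")],
)

# Handing the string to the engine is what actually runs the query.
rows = conn.execute(statement_count_by_sentiment).fetchall()
print(rows)
conn.close()
```

Typing the SELECT directly into a Python cell, without the surrounding quotes, is what produces the syntax error.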

Best Regards,
A. Sriharsha

Oh, so how do I run that SQL in the notebook?

Hi @priti,

You run the SQL statement by passing it to Athena. SQL statements cannot run on their own in a Python environment; you need an external service or library to execute them. In this case, that is Athena. You can also use AWS Glue.
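
As a rough sketch of what "passing the statement to Athena" can look like from Python: the helper below hands the SQL string to a query function. It assumes the AWS SDK for pandas (`awswrangler`) is installed and AWS credentials are configured; `run_on_athena` and `database_name` are hypothetical names for illustration, and the lab notebook may do this differently.

```python
def run_on_athena(sql, database, query_fn=None):
    """Send a SQL string to Athena and return the result.

    query_fn is an optional hook so the helper can be tried without AWS access;
    by default it uses awswrangler's Athena reader (assumed to be installed),
    which returns a pandas DataFrame.
    """
    if query_fn is None:
        import awswrangler as wr  # imported lazily; needs AWS credentials at call time

        query_fn = wr.athena.read_sql_query
    return query_fn(sql=sql, database=database)


statement = """
SELECT sentiment, COUNT(sentiment) AS count_sentiment
FROM reviews
GROUP BY sentiment
ORDER BY sentiment
"""

# In a notebook with AWS access (database_name is a placeholder for your Athena database):
# df = run_on_athena(statement, database=database_name)
```

The point is that the SQL stays a plain Python string, and a separate call is what sends it to Athena.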

Best Regards,
A. Sriharsha

Hi, @sriharsha0806
AWS Glue is installed, but how do I use it? I read this page but still didn’t know what to do.
https://aws.amazon.com/blogs/machine-learning/run-sql-queries-from-your-sagemaker-notebooks-using-amazon-athena/

Hi @priti,

I hope this link helps.

Best Regards,
A. Sriharsha

More confusion

from pyspark.context import SparkContext
from awsglue.context import GlueContext
from pyspark.sql import SQLContext

# Build a Glue context on top of the Spark context and get its Spark session.
glueContext = GlueContext(SparkContext.getOrCreate())
spark_session = glueContext.spark_session
sqlContext = SQLContext(spark_session.sparkContext, spark_session)

# Load a table from the Glue Data Catalog and convert it to a Spark DataFrame.
DyF = glueContext.create_dynamic_frame.from_catalog(database="{{database}}", table_name="{{table_name}}")
df = DyF.toDF()
# Register the DataFrame as a temporary table so it can be queried with SQL.
df.registerTempTable('{{name}}')
df = sqlContext.sql('{{your select query with table name that you used for temp table above}}')
# Write the query result to S3, partitioned by the given columns.
df.write.format('{{orc/parquet/whatever}}').partitionBy("{{columns}}").save('path to s3 location')

How do I find the S3 location? There are many other things here that I didn’t understand.
Can you please help me solve this?

statement_count_by_sentiment = """
SELECT sentiment, COUNT(sentiment) AS count_sentiment
FROM reviews
GROUP BY sentiment
ORDER BY sentiment
"""

print(statement_count_by_sentiment)

Hi @priti,

You can solve the exercise by passing the SQL statement to Athena in the very next cell.
I provided the link so you could understand how to use AWS Glue, but you don’t need AWS Glue for this exercise.

Best Regards,
A. Sriharsha

Hi, @sriharsha0806
1. matplotlib is installed, but I am still facing an error.

2. Is this correct?

Hi @priti,

I have seen the matplotlib error twice of late, and I believe it’s an issue on our end. Try restarting the kernel, or run the installation line for matplotlib before executing the cell.
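
A small sketch of the "install before executing the cell" step, assuming a standard pip-based kernel. `is_installed` and `ensure_installed` are hypothetical helper names; in a notebook the equivalent one-liner is usually `%pip install matplotlib` (or `!pip install matplotlib`), followed by a kernel restart if the import still fails.

```python
import importlib.util
import subprocess
import sys


def is_installed(package):
    """Return True if the current kernel's Python can find the package."""
    return importlib.util.find_spec(package) is not None


def ensure_installed(package):
    """Install the package into the running kernel's environment if it is missing."""
    if not is_installed(package):
        # Use the kernel's own interpreter so pip installs into the right environment.
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])


# Example: run this before the cell that imports matplotlib.
# ensure_installed("matplotlib")
```

Using `sys.executable` matters when the notebook kernel and the terminal use different Python environments, which is one common reason a package looks installed but still fails to import.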

Regarding the second one, read the exercise statement once again. Does the statement ask for a category column? (The spelling of “category” is wrong in your SQL statement.)
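
To illustrate why a misspelled column name matters, here is a sketch using Python's built-in `sqlite3` as a stand-in engine (an assumption; the lab uses Athena, whose error text differs). The table, its rows, and the `product_category` column name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (product_category TEXT, sentiment INTEGER)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [("Dresses", 1), ("Dresses", 0), ("Knits", 1)],
)

# Misspelled column name: the engine rejects the query.
try:
    conn.execute("SELECT product_catagory FROM reviews")
    misspelled_ok = True
except sqlite3.OperationalError as err:
    misspelled_ok = False
    print("Query failed:", err)

# Correct spelling: the query runs.
categories = conn.execute(
    "SELECT DISTINCT product_category FROM reviews ORDER BY product_category"
).fetchall()
print(categories)
conn.close()
```

Comparing the column names in the query against the ones in the exercise statement, character by character, is usually the quickest way to catch this.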

Best Regards,
A. Sriharsha

My notebook existed around 8-9 hours ago, but now it doesn’t, and when I run the notebook it says the specified bucket does not exist.

now no notebook

no bucket

Since yesterday I have been facing a lot of problems…

Is there no simpler way to perform all the tasks?

Hi @priti,

Sorry to hear that you are facing these issues. As I mentioned in this post, your resources are cleaned up after the session expires, so that may be why you are seeing these errors.
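
If it helps to confirm whether the bucket is really gone, a minimal sketch of such a check follows. It assumes `boto3` is available and credentials are configured; `bucket_exists` is a hypothetical helper, and it only sees buckets owned by the current account.

```python
def bucket_exists(bucket_name, s3_client=None):
    """Return True if bucket_name is among the buckets owned by this account."""
    if s3_client is None:
        import boto3  # imported lazily; needs AWS credentials at call time

        s3_client = boto3.client("s3")
    response = s3_client.list_buckets()
    return any(b["Name"] == bucket_name for b in response.get("Buckets", []))


# In the notebook ("bucket" stands for whatever bucket name your cells use):
# print(bucket_exists(bucket))
```

If this returns False after a new session, the bucket name stored in the notebook most likely belongs to the expired session.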

Best Regards,
A. Sriharsha

But I restarted the session from the beginning and deleted the Jupyter notebook, and it still says the bucket does not exist.

bucket exist

I uploaded the screenshot above after 2 hours because of the community’s posting restriction.

Hi @priti,

What is the output of cell 2.1? Can you post the value you are passing to the path variable in Exercise 2? Also, can you post the full screenshot of the error?

Best Regards,
A. Sriharsha

After deleting all the buckets, the dataset, and the Python notebook, I restarted the console again and again, and then it finally ran.

score view

The bar plot score shows 5/15, but I produced all 4 bar plots.

Hi @priti,

Congratulations on passing the assignment. Check whether the CSV file is located in the S3 bucket. Can you post a screenshot of the bar plot?

Best Regards,
A. Sriharsha





Yes, the CSV is also in S3.

@priti If you are facing the matplotlib problem, could you please send us the lab item link and the AWS account number so we can investigate it:


Hi @priti,

It seems your product category SQL statement is wrong. Your average sentiment plot should look like this. Hope this helps.

Regarding the Test 2 failure: does your submission report state the following reason for the failure? “Test 2 failed: The command ‘aws s3 ls’ was not correctly used to view the dataset files in the S3 bucket. Please try again.”
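
For reference, a sketch of running `aws s3 ls` from Python. In a notebook cell the usual form is `!aws s3 ls <s3-path>`, which is presumably what the grader looks for; the helper names and the example path below are hypothetical, and the real path comes from the exercise.

```python
import shutil
import subprocess


def build_s3_ls_command(s3_path):
    """Build the argument list for `aws s3 ls <s3_path>`."""
    if not s3_path.startswith("s3://"):
        raise ValueError("expected a path that starts with s3://")
    return ["aws", "s3", "ls", s3_path]


def list_s3_path(s3_path):
    """Run `aws s3 ls` and return its text output (requires the AWS CLI and credentials)."""
    if shutil.which("aws") is None:
        raise RuntimeError("AWS CLI not found on PATH")
    result = subprocess.run(
        build_s3_ls_command(s3_path), capture_output=True, text=True, check=True
    )
    return result.stdout


# Example with a placeholder path:
# print(list_s3_path("s3://example-bucket/data/"))
```

A trailing slash on the path lists the contents of that prefix rather than the prefix entry itself.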

Best Regards,
A. Sriharsha

As you suggested, I corrected the product category SQL statement, but my score for the bar plot still comes out as 5/20.

I posted the other bar plot pictures previously.
thanks