ChatGPT - Summarizing web article

Hi - I am wondering how I can give ChatGPT (in Python) a URL to an article that I want to summarize.
I tried the following, but it does not summarize my article and does something else instead.

a_twc = "Global Markets Weekly Update | T. Rowe Price"

prompt = f"""
Your task is to generate a short summary of the article in the website below.

Summarize the article below between triple backticks in at most 5 sentences, focusing on any aspects \
that are relevant to US banks. Please put each sentence on a new line with a bullet point.

Summarize: {a_twc}
"""

response = get_completion(prompt)
print(response)

Does anyone have an idea what is wrong?


I think ChatGPT cannot access the URL. You have to provide the text, not a link.

Hi @rieman,
Just copy and paste your article inside the prompt. As of now, ChatGPT still doesn’t have access to the internet.

Thanks for your reply. It is not the easiest solution, since the article can sometimes be quite long.
Thanks,


You should think ahead about two things. First, how to strip out HTML tags so they aren’t included in what you send to ChatGPT. Second, what tokens are and how to deal with the token limits.
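To make the token point a bit more concrete, here is a minimal sketch of a size check before sending text. It assumes the common rule of thumb of roughly 4 characters per token for English text, which is only an approximation; a real tokenizer gives exact counts. The function names and the budget of 1000 tokens are made up for illustration.

```python
import math

# Rough rule of thumb for English text: about 4 characters per token.
# This is an approximation only, not an exact count.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for English text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_budget(text: str, max_tokens: int) -> bool:
    """Check whether text is likely to fit in a given token budget."""
    return estimate_tokens(text) <= max_tokens


article = "word " * 1000  # 5000 characters of stand-in text
print(estimate_tokens(article))    # 1250
print(fits_budget(article, 1000))  # False
```

Raw HTML inflates this number a lot, because tags, scripts and styles are counted like any other text.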

There is a section on managing tokens here:

and some examples of dealing with rate limits here:


Thanks! quite useful.

import requests

url = 'https://my_url_here/'
r = requests.get(url)

web_text = r.text

prompt = f"""
Your task is to generate a short summary of this article from a website.

Summarize the text below, delimited by triple backticks, in at most 100 words. 

Review: ```{web_text}```
"""

response = get_completion(prompt)
print(response)

That will get you started, but it doesn’t strip the HTML tags and doesn’t handle long articles.

Thanks - as you correctly pointed out, I then need to manage the token count.
I will try to split the text into several chunks.
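One possible way to do the chunking is sketched below: split on whitespace into pieces of at most max_chars characters, summarize each piece, then summarize the combined partial summaries. The names split_into_chunks and summarize_long_text are hypothetical, and summarize stands for any function that takes text and returns a summary (for example a wrapper around get_completion from the code above). The 4000-character default is an arbitrary assumption, not a documented limit.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 4000) -> List[str]:
    """Split text into chunks of at most max_chars, breaking on whitespace."""
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for word in text.split():
        # A single word longer than max_chars is cut into hard slices.
        while len(word) > max_chars:
            if current:
                chunks.append(" ".join(current))
                current, current_len = [], 0
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        extra = len(word) + (1 if current else 0)  # +1 for the joining space
        if current_len + extra > max_chars:
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += extra
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_long_text(
    text: str, summarize: Callable[[str], str], max_chars: int = 4000
) -> str:
    """Summarize each chunk, then summarize the joined partial summaries."""
    chunks = split_into_chunks(text, max_chars)
    if len(chunks) <= 1:
        return summarize(text)
    partial = [summarize(chunk) for chunk in chunks]
    return summarize("\n".join(partial))


# Stand-in summarizer for demonstration: keeps the first 30 characters.
print(summarize_long_text("lorem ipsum " * 50, lambda t: t[:30], max_chars=100))
```

For very long articles the joined partial summaries can themselves be too long, in which case the same step would need to be repeated.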


I did a quick and dirty test a few days ago and just grabbed the first 4000 characters of the article’s text. It did produce a summary of what I gave it, but obviously of limited value since it didn’t cover the entire article. It isn’t clear to me whether the HTML tags are stripped automatically from the inbound text or whether they count toward the token total. I also didn’t think much about workarounds for the total size limit. Maybe if you’re willing to actually pay for their service you get more? I don’t know, but let us know what you figure out.

EDIT: here’s one reason you want to do some processing on the HTML before sending it to chatGPT. Modern web pages are FULL of this kind of stuff…

and that is from inside the <body> </body>. The header is even worse. There is some discussion and some code in this Stack Exchange thread that I just used successfully to strip out most of the HTML and CSS code from my test web page:

I still had to do a little manual inspection and slicing, but it allowed me to summarize the entire article using my existing free OpenAI account and key, which I couldn’t do with the HTML junk in it. Hope that helps.
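For anyone who wants to try this kind of stripping without extra packages, here is a minimal sketch using only Python's standard html.parser module. It is not the code from the linked thread; the class and function names are made up for illustration, and real pages will usually still need some manual cleanup afterwards.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script>, <style> and <noscript>."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the text content of an HTML string, joined by spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


page = (
    "<html><head><style>p{color:red}</style>"
    "<script>var x = 1;</script></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
)
print(html_to_text(page))  # Title Hello world
```

The result of html_to_text(r.text) could then be used in place of web_text in the prompt above.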

Try this. You need GPT-4 to read URLs. You could open an account on the OpenAI website and use the Playground. It is chargeable after the initial free use. I have been using it for a month with no charges so far.
I used the prompt "Scan [URL] and summarise the article" and got this summary:
The T. Rowe Price Global Markets Weekly Update provides an overview of the markets for the week ending April 2, 2021. Global stocks were mixed, as investors remained worried about the outlook for the global economy. The US markets were mixed, with the S&P 500 closing slightly higher for the week, while the NASDAQ Composite and Dow Jones Industrial Average both closed lower. In Europe, the Euro Stoxx 50 index gained ground, while the UK’s FTSE 100 index fell. In Asia, the Nikkei 225 index in Japan rose, while the Hang Seng index in Hong Kong and the Shanghai Composite index in China both fell. Oil prices rose on the week, while gold and silver prices fell. The US dollar rose against a basket of major currencies. Overall, investors appeared to be taking a wait-and-see approach to the markets, as they weigh the risks of a potential economic downturn against the potential for further economic growth.


Take a look at this HF Space, which uses newspaper3k to “read” the text.


Use Bard; it will do it.


I’m just brainstorming here. If you can pass a webpage’s text to ChatGPT, could you just use the shell command cat to pipe the text to it?
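As a rough sketch of that idea, a small Python script could read the piped text and wrap it in a prompt. The function names are hypothetical and the demo uses an in-memory stream instead of real piped input; the actual call to the model is left out and would be something like get_completion(prompt) from the code earlier in the thread.

```python
import io
import sys

FENCE = "`" * 3  # triple backticks used as the delimiter in the prompt


def prompt_from_stream(stream, max_words: int = 100) -> str:
    """Read article text from a file-like object and build a summary prompt."""
    text = stream.read().strip()
    return (
        "Summarize the text below, delimited by triple backticks, "
        f"in at most {max_words} words.\n\n"
        f"{FENCE}{text}{FENCE}"
    )


def main() -> None:
    # Intended for piped input, e.g.: cat article.txt | python summarize.py
    # (file names are placeholders). Not called in this demo.
    print(prompt_from_stream(sys.stdin))


# Demo with an in-memory stream standing in for piped text.
demo = io.StringIO("Some article text.\n")
print(prompt_from_stream(demo))
```

This only moves the text into the prompt; it still has to be plain text, not a URL, and it still counts against the token limit.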

ChatGPT cannot access URLs. Instead, try providing the article text itself, or describe the key points you want covered.