How to use AI for extracting text highlights?

adilevin1972 · January 30, 2024, 6:37am

I am looking for a way to extract highlights from a long body of text. The input is a full document, and the output is a sequence of references into the text (e.g. lines 100-130, lines 200-220, …) that represent the most interesting, relevant, or important parts of the document, based on a subjective relevance criterion that can either be trained for specifically or expressed in a text prompt.

Can this be achieved with GPT-4 by some clever prompt engineering? Are there any task-specific models for this?

gent.spah · January 30, 2024, 6:48am

This can possibly be done with ChatGPT and some prompt engineering. Give it a try and let us know how it goes!
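One way to set this up, as a minimal sketch: prefix every line with its number so the model can cite ranges, then parse the ranges out of the reply. The helper names (`number_lines`, `build_prompt`, `parse_ranges`), the prompt wording, and the expected reply format are all assumptions for illustration. The model call itself is left out, so no particular API is implied.

```python
import re

def number_lines(text: str) -> str:
    """Prefix each line with its 1-based number so the model can cite it."""
    return "\n".join(
        f"{i}: {line}" for i, line in enumerate(text.splitlines(), start=1)
    )

def build_prompt(text: str, criterion: str) -> str:
    """Assemble a highlight-extraction prompt (the wording is an assumption)."""
    return (
        "Below is a document with numbered lines.\n"
        f"Select the passages that are most {criterion}.\n"
        "Answer only with line ranges, e.g. 100-130, 200-220.\n\n"
        + number_lines(text)
    )

def parse_ranges(reply: str, n_lines: int) -> list[tuple[int, int]]:
    """Extract 'start-end' pairs from the reply, dropping invalid ranges."""
    ranges = []
    for start, end in re.findall(r"(\d+)\s*-\s*(\d+)", reply):
        s, e = int(start), int(end)
        # Keep only ranges that actually exist in the document.
        if 1 <= s <= e <= n_lines:
            ranges.append((s, e))
    return ranges
```

Validating the ranges against the document length matters because a model can cite lines that do not exist.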

gent.spah · January 30, 2024, 6:54am

You may also want to check the Generative AI with Large Language Models course; parts of it might be relevant to you!

marconi · January 30, 2024, 11:18am

MIT and Columbia researchers found that the best way to obtain a high-quality summary of a text is to ask the LLM to improve its own output by adding information that meets certain criteria. All of this is done with a single prompt.

Article: [Insert article here]

You will generate increasingly concise entity-dense summaries of the above article. Repeat the following two steps 5 times.

Step 1: Identify 1-3 informative entities (delimited) from the article which are missing from the previously generated summary.

Step 2: Write a new denser summary of identical length which covers every entity and detail from the previous summary plus the missing entities.

A missing entity is:
- Relevant: to the main stories.
- Specific: descriptive yet concise (5 words or fewer).
- Novel: not in the previous summary.
- Faithful: present in the article.
- Anywhere: located anywhere in the article.

Guidelines:

- The first summary should be long (4-5 sentences, ~100 words), yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose language and fillers (e.g., “this article discusses”) to reach ~100 words.
- Make every word count. Rewrite the previous summary to improve flow and make space for additional entities.
- Make space with fusion, compression, and removal of uninformative phrases like “the article discusses”.
- The summaries should become highly dense and concise, yet self-contained, e.g., easily understood without the article.
- Missing entities can appear anywhere in the new summary.
- Never drop entities from the previous summary. If space cannot be made, add fewer new entities.

Remember: Use the exact same number of words for each summary.
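Since the prompt asks for identical length and for no entity to be dropped, it can help to check the model's output afterwards. Here is a rough sketch of such a check. It assumes you have already split the reply into one summary per step plus the list of entities added at each step; `check_chain` and its inputs are hypothetical names, and the substring match for entities is deliberately crude.

```python
def word_count(summary: str) -> int:
    """Count whitespace-separated words."""
    return len(summary.split())

def check_chain(summaries: list[str], entities_per_step: list[list[str]]) -> list[str]:
    """Return a list of problems; an empty list means the chain looks consistent.

    entities_per_step[i] holds the entities the model reported adding in step i.
    """
    problems = []
    target = word_count(summaries[0])  # every summary should match the first
    seen: list[str] = []
    for i, (summary, new_entities) in enumerate(
        zip(summaries, entities_per_step), start=1
    ):
        if word_count(summary) != target:
            problems.append(
                f"summary {i}: {word_count(summary)} words, expected {target}"
            )
        # Entities accumulate: earlier ones must survive in later summaries.
        seen.extend(new_entities)
        for entity in seen:
            if entity.lower() not in summary.lower():
                problems.append(f"summary {i}: entity '{entity}' is missing")
    return problems
```

If the list comes back non-empty, re-running the prompt or asking the model to repair the flagged step are both reasonable options.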

H/T Azeem Azhar and EV team

marconi · January 30, 2024, 11:20am

Prompt 1:
My goal is to understand the most interesting/relevant/important parts of the document.
Write 10 different prompts for me to reach this goal, and the outputs that are generated from these prompts.
Then, evaluate these prompts out of 100 (100 is highest quality, 0 is lowest), according to the following criteria: relevance to a professional in the [field], tangibility, and clarity.

After you get the result, try this second prompt:
Prompt 2:
Based on this output, generate 3 prompts and their outputs, aiming to maximise the score in all three criteria.

This method offers multiple advantages:

- Reduces your cognitive load: You can start with only a vague idea of what you want, and you’ll only need to write the starting prompt.
- Speed: It drastically accelerates how fast you can find great prompts.
- Quality: The LLM will experiment and learn how to design prompts to get the best results.

H/T Azeem Azhar and fellows Team at EV

haroldc · February 3, 2024, 8:08pm

Take a look at RAG (retrieval-augmented generation) techniques. With them, you can retrieve the pieces of text that are relevant to your query.
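To make the retrieval step concrete, here is a toy sketch that splits the document into fixed-size line chunks and ranks them against a query. It uses plain bag-of-words cosine similarity as a stand-in for the learned embeddings a real RAG setup would normally use. The function names, `chunk_size`, and `k` are assumptions for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    """Lowercased bag of words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def top_chunks(lines: list[str], query: str, chunk_size: int = 5, k: int = 2):
    """Return the k best (start_line, end_line, score) chunks, 1-based inclusive."""
    q = tokenize(query)
    scored = []
    for start in range(0, len(lines), chunk_size):
        chunk = lines[start:start + chunk_size]
        score = cosine(tokenize(" ".join(chunk)), q)
        scored.append((start + 1, start + len(chunk), score))
    # Highest-scoring chunks first.
    return sorted(scored, key=lambda c: c[2], reverse=True)[:k]
```

The returned line ranges have the same shape as the references asked for in the original question, so the top-scoring chunks can serve directly as candidate highlights or be passed to an LLM for a final selection.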
