Thank you for reply and sorry for late answer. I was retesting all codes I tried so far, but it seems like Im not on right path. I did not post anything on Git since I dont have anything good. If code does not work, I post error msg to chatgpt and try to solve that way.
When I didnt get outputs I wanted, following given codes I tried to get at least summary from input, but that was also not good. Most of times output is none or mostly whole input.
Example code I tried to get summary with model Bart. There is no summary, just part of inputs “copy/pasted” in output.
from transformers import BartForConditionalGeneration, BartTokenizer
model_name = "facebook/bart-large-cnn"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)
input_text = "Two happy dogs, Max and Bella, bounded across the green meadow, tails wagging furiously. With tongues lolling out and eyes sparkling with joy, they chased each other playfully, their laughter echoing through the crisp morning air."
input_ids = tokenizer.encode(input_text, return_tensors="pt", max_length=1024, truncation=True)
# Generate summary
summary_ids = model.generate(input_ids, max_length=50, min_length=10, num_beams=4, early_stopping=True)
summary_text = tokenizer.decode(summary_ids[0], skip_special_tokens=True) # Decode the summary tokens back to text
print("Summary:", summary_text)
I get no answer from Bert model:
from transformers import BertForQuestionAnswering, BertTokenizer
import torch
# Load the question answering model and tokenizer
model_name = "bert-large-uncased-whole-word-masking-finetuned-squad"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForQuestionAnswering.from_pretrained(model_name)
# Define the input text and question
input_text = "Once upon a time, there was a little prince who lived on a tiny planet called B612."
question = "Where did the little prince live?"
# Tokenize and encode the input text and question
input_ids = tokenizer.encode(input_text, add_special_tokens=True)
question_ids = tokenizer.encode(question, add_special_tokens=False)
# Prepare inputs for the model
inputs = {
"input_ids": torch.tensor([input_ids]),
"token_type_ids": torch.tensor([[0] * len(input_ids)]),
"attention_mask": torch.tensor([[1] * len(input_ids)])
}
# Perform inference
with torch.no_grad():
outputs = model(**inputs)
# Extract and decode the answer
start_index = torch.argmax(outputs.start_logits)
end_index = torch.argmax(outputs.end_logits)
answer_tokens = input_ids[start_index:end_index + 1]
answer = tokenizer.decode(answer_tokens)
print("Answer:", answer)
Then another example of Bert. Seems ok on shorter input, but not on longer.
# Bert
from transformers import pipeline
qa_pipeline = pipeline("question-answering", model="bert-base-uncased")
passage = "Once upon a time, there was a little prince who lived on a tiny planet called B612."
question = "Where little prince lived?"
answer = qa_pipeline(question=question, context=passage)
print("Answer:", answer['answer'])
I tried about 20 codes like those (models Bert, Bart, Albert, Roberta, Google Language API) that were generated by chatgpt, but none worked as expected. While end goal is to add whole book as input, for now I try to give small input like 2-4 sentences.