Advice Needed: Evaluating Thesis Topics in RAG

khanhh · December 28, 2025, 1:02pm

I’m choosing a topic for my thesis and have attached two drafts. I would be grateful if anyone could review them, point out which one looks less promising, and suggest additional aspects to consider.

Thesis_topic.pdf (379.4 KB)

g15713 · December 31, 2025, 10:07pm

Copilot responds:

High‑Level Verdict

Both topics are strong, but Topic 1 (Enhanced RAG + Semantic Cache) is much more feasible for a master’s thesis and has a clearer evaluation path.

Topic 2 (GraphRAG for Cybersecurity) is more ambitious and novel, but significantly heavier in data engineering, graph construction, and evaluation complexity.

If you want:

A strong, clean, finishable thesis → Topic 1
A research‑heavy, riskier, more novel thesis → Topic 2

Topic 1: Enhanced RAG with Semantic Search & Caching

Strengths

Very clear problem statement
Uses well‑known datasets (StackOverflow)
Straightforward pipeline: embeddings → FAISS → RAG → semantic cache
Easy to measure improvements (latency, cost, retrieval quality)
Strong engineering + research balance
Highly reproducible and publishable
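
As a rough illustration of the semantic-cache step at the end of that pipeline, here is a minimal sketch. It is not taken from the thesis draft: `embed` is a toy bag-of-words stand-in for a real embedding model, the linear scan stands in for a FAISS index, and `SemanticCache`, `answer_with_cache`, and the 0.8 threshold are hypothetical names and values chosen for the example.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns a stored answer when a new query is similar enough to an old one."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed value; tune it in the ablation
        self.entries = []           # list of (query embedding, answer)

    def get(self, query):
        q = embed(query)
        best_score, best_answer = 0.0, None
        # Linear scan; a real system would query a vector index instead.
        for emb, answer in self.entries:
            score = cosine(q, emb)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def put(self, query, answer):
        self.entries.append((embed(query), answer))


def answer_with_cache(query, cache, rag_pipeline):
    """Try the cache first; fall back to the (expensive) RAG pipeline on a miss."""
    cached = cache.get(query)
    if cached is not None:
        return cached, True
    answer = rag_pipeline(query)
    cache.put(query, answer)
    return answer, False
```

Here `rag_pipeline` is any callable that takes a query string and returns an answer, so the retrieval and generation stages can be swapped without touching the cache logic.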

Why it’s promising

This topic is practical, scoped, and finishable. It aligns well with current industry interest in:

RAG optimization
Latency reduction
Semantic caching
LLM‑as‑Judge evaluation

Potential Weak Spots

Needs a clear novelty statement (e.g., “We evaluate how semantic caching reduces latency in technical QA RAG systems.”)
Must avoid becoming “just an implementation project”; a rigorous evaluation section helps prevent that.

Suggestions to strengthen it

Add ablation studies:
- No cache vs. cache
- Different similarity thresholds
- Different embedding models
Add cost analysis (tokens saved)
Add cache eviction strategies (LRU, semantic clustering)
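
The ablation and eviction ideas above could be prototyped along these lines. This is only a sketch under stated assumptions: `jaccard` is a toy word-overlap similarity standing in for embedding cosine similarity, the answer string is a placeholder for a real LLM call, and `tokens_per_call=500` is an invented constant used to show how a "tokens saved" figure might be derived.

```python
from collections import OrderedDict


def jaccard(a, b):
    """Toy similarity on word sets; a real study would compare embeddings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


class LRUSemanticCache:
    """Similarity-threshold cache with least-recently-used eviction."""

    def __init__(self, threshold, capacity):
        self.threshold = threshold
        self.capacity = capacity
        self.store = OrderedDict()  # query -> answer, least recently used first

    def get(self, query):
        best_key, best_score = None, 0.0
        for key in self.store:
            score = jaccard(query, key)
            if score > best_score:
                best_key, best_score = key, score
        if best_key is not None and best_score >= self.threshold:
            self.store.move_to_end(best_key)  # mark as recently used
            return self.store[best_key]
        return None

    def put(self, query, answer):
        self.store[query] = answer
        self.store.move_to_end(query)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry


def run_ablation(queries, thresholds, capacity=100, tokens_per_call=500):
    """Replay the same queries per setting; threshold None means 'no cache'."""
    results = []
    for t in thresholds:
        cache = LRUSemanticCache(t, capacity) if t is not None else None
        hits = 0
        for q in queries:
            if cache is not None and cache.get(q) is not None:
                hits += 1
                continue
            answer = f"answer to: {q}"  # placeholder for the real LLM call
            if cache is not None:
                cache.put(q, answer)
        results.append({
            "threshold": t,
            "hit_rate": hits / len(queries),
            "tokens_saved": hits * tokens_per_call,
        })
    return results
```

A hit rate on its own says nothing about whether the cached answer was actually correct for the new query, so each threshold setting would still need to be paired with an answer-quality measurement.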

Topic 2: Graph‑Based RAG for Cybersecurity

Strengths

More novel and research‑oriented
Multi‑hop reasoning is a hot topic
Cybersecurity domain adds real‑world relevance
Knowledge graph + LLM pipeline is academically strong

Why it’s interesting

This topic has higher research value and could lead to a publication if executed well.

Major Risks

Building a high‑quality cybersecurity knowledge graph is non‑trivial. It requires:

Entity extraction
Relation extraction
Graph schema design
Cypher query generation
Multi‑hop evaluation

Evaluation is also harder and less standardized. Overall, it’s significantly more time‑consuming.

Suggestions to strengthen it

Narrow the domain (e.g., OWASP Top 10)
Use an existing MITRE ATT&CK graph instead of building everything manually
Reduce multi‑hop depth to 1–2 hops
Focus on evaluation of graph‑grounded reasoning
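
To make the "1–2 hops" suggestion concrete, here is a minimal sketch of depth-limited traversal over a tiny in-memory graph. The nodes and relations in `GRAPH` are invented for illustration and are not taken from MITRE ATT&CK data; `k_hop_paths` and `paths_to_context` are hypothetical helper names, and a real system would likely run the equivalent query against a graph database.

```python
from collections import deque

# Illustrative toy graph: node -> list of (relation, neighbor).
# These entries are made up for the example, not real ATT&CK records.
GRAPH = {
    "technique:phishing": [
        ("belongs_to", "tactic:initial-access"),
        ("mitigated_by", "mitigation:user-training"),
    ],
    "tactic:initial-access": [("precedes", "tactic:execution")],
    "mitigation:user-training": [],
    "tactic:execution": [],
}


def k_hop_paths(graph, start, max_hops=2):
    """Breadth-first search returning every path of 1..max_hops edges from start."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) >= max_hops:
            continue  # depth limit reached; do not expand further
        for relation, neighbor in graph.get(node, []):
            visited = {start} | {step[2] for step in path}
            if neighbor in visited:
                continue  # skip cycles
            new_path = path + [(node, relation, neighbor)]
            paths.append(new_path)
            queue.append((neighbor, new_path))
    return paths


def paths_to_context(paths):
    """Flatten paths into text lines that could be placed in an LLM prompt."""
    return [
        " ; ".join(f"{src} -[{rel}]-> {dst}" for src, rel, dst in path)
        for path in paths
    ]
```

Capping `max_hops` keeps the number of candidate paths small, which makes both the prompt size and the evaluation of each reasoning chain easier to control.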

Which Topic Is More Promising?

Topic 1 is the safer, cleaner, more achievable thesis.
Topic 2 is more ambitious but riskier.

If you’re early in your research journey, Topic 1 is the better recommendation.

khanhh · January 9, 2026, 11:19am

Thank you for sharing! I also think Topic 2 is quite risky…

