I’m in the process of choosing a topic for my thesis and have shared two drafts. I would be grateful if anyone could review them, point out which one might not be promising, or suggest additional aspects to consider.
Thesis_topic.pdf (379.4 KB)
Copilot responds:
High‑Level Verdict
Both topics are strong, but Topic 1 (Enhanced RAG + Semantic Cache) is much more feasible for a master’s thesis and has a clearer evaluation path.
Topic 2 (GraphRAG for Cybersecurity) is more ambitious and novel, but significantly heavier in data engineering, graph construction, and evaluation complexity.
If you want:
A strong, clean, finishable thesis → Topic 1
A research‑heavy, riskier, more novel thesis → Topic 2
Topic 1: Strengths
Very clear problem statement
Uses well‑known datasets (StackOverflow)
Straightforward pipeline: embeddings → FAISS → RAG → semantic cache
Easy to measure improvements (latency, cost, retrieval quality)
Strong engineering + research balance
Highly reproducible and publishable
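To make "retrieval quality" concrete, here is a minimal sketch of two common retrieval metrics, recall@k and reciprocal rank. The answer above does not name specific metrics, so the choice of these two, the function names, and the document IDs are assumptions for illustration only.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant document IDs that appear in the top-k retrieved IDs."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant ID, or 0.0 if no relevant ID was retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


if __name__ == "__main__":
    # Hypothetical IDs: ranked retriever output vs. the gold relevant set.
    ranked = ["d3", "d1", "d7"]
    gold = ["d1", "d2"]
    print(recall_at_k(ranked, gold, k=2), reciprocal_rank(ranked, gold))
```

Averaging these per-question scores over a held-out question set would give one retrieval-quality number per configuration, which can then be reported next to latency and cost.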
This topic is practical, scoped, and finishable. It aligns well with current industry interest in:
RAG optimization
Latency reduction
Semantic caching
LLM‑as‑Judge evaluation
Topic 1: Weaknesses
Needs a clear novelty statement (e.g., “We evaluate how semantic caching reduces latency in technical QA RAG systems.”)
Must avoid becoming “just an implementation project” — the evaluation section helps prevent that.
Topic 1: Suggested Improvements
Add ablation studies:
No cache vs. cache
Different similarity thresholds
Different embedding models
Add cost analysis (tokens saved)
Add cache eviction strategies (LRU, semantic clustering)
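The threshold ablation and LRU eviction suggested above could be prototyped roughly as follows. This is a toy sketch, not a reference implementation: the `SemanticCache` class, the `hit_rate` helper, the 2-D vectors, and the placeholder answer string are all hypothetical, and a real study would use an embedding model plus a vector index such as FAISS instead of a linear scan.

```python
import math
from collections import OrderedDict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Toy semantic cache: similarity-threshold lookup with LRU eviction."""

    def __init__(self, threshold, capacity):
        self.threshold = threshold
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> (embedding, answer), oldest first
        self._next_key = 0

    def get(self, embedding):
        # Linear scan for the most similar cached query (sketch only).
        best_key, best_sim = None, -1.0
        for key, (cached_emb, _) in self.entries.items():
            sim = cosine(embedding, cached_emb)
            if sim > best_sim:
                best_key, best_sim = key, sim
        if best_key is not None and best_sim >= self.threshold:
            self.entries.move_to_end(best_key)  # mark as most recently used
            return self.entries[best_key][1]
        return None

    def put(self, embedding, answer):
        self.entries[self._next_key] = (embedding, answer)
        self._next_key += 1
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used


def hit_rate(queries, threshold, capacity=2):
    """Replay queries through a fresh cache and return the fraction of hits."""
    cache = SemanticCache(threshold, capacity)
    hits = 0
    for emb in queries:
        if cache.get(emb) is not None:
            hits += 1
        else:
            # Placeholder for the answer the full RAG pipeline would produce.
            cache.put(emb, "answer from the full RAG pipeline")
    return hits / len(queries)


# Hypothetical 2-D "embeddings" standing in for real query embeddings.
QUERIES = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.7, 0.7), (1.0, 0.05)]

if __name__ == "__main__":
    for t in (0.5, 0.9, 0.999):
        print(f"threshold={t}: hit rate={hit_rate(QUERIES, t):.2f}")
```

Each cache hit is one LLM call avoided, so multiplying the hit count by an average tokens-per-call figure gives a first estimate for the cost analysis. A loose threshold raises the hit rate but risks returning answers to questions that only look similar, which is why answer quality should be measured alongside it.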
Topic 2: Strengths
More novel and research‑oriented
Multi‑hop reasoning is a hot topic
Cybersecurity domain adds real‑world relevance
Knowledge graph + LLM pipeline is academically strong
This topic has higher research value and could lead to a publication if executed well.
Topic 2: Challenges
Building a high‑quality cybersecurity knowledge graph is non‑trivial. It requires:
Entity extraction
Relation extraction
Graph schema design
Cypher query generation
Multi‑hop evaluation
Evaluation is also harder and less standardized. Overall, it’s significantly more time‑consuming.
Topic 2: Ways to Reduce Risk
Narrow the domain (e.g., OWASP Top 10)
Use an existing MITRE ATT&CK graph instead of building everything manually
Reduce multi‑hop depth to 1–2 hops
Focus on evaluation of graph‑grounded reasoning
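Capping multi-hop depth at 1–2 hops can be illustrated with a bounded breadth-first traversal. This is only a sketch over an in-memory adjacency list: the graph below is a hypothetical toy with placeholder node names (not real MITRE ATT&CK entries), and in a graph database the same limit would normally be expressed in the query itself.

```python
from collections import deque

# Hypothetical toy graph; node names are placeholders, not real ATT&CK data.
GRAPH = {
    "threat_group_X": ["technique_A", "technique_B"],
    "technique_A": ["mitigation_M1"],
    "technique_B": ["software_S1"],
    "software_S1": ["mitigation_M2"],
}


def neighbors_within(graph, start, max_hops):
    """Return {node: hop_count} for nodes reachable in at most max_hops edges."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]  # report only the neighbors, not the start node
    return seen


if __name__ == "__main__":
    for hops in (1, 2):
        print(hops, neighbors_within(GRAPH, "threat_group_X", hops))
```

The returned hop counts also give a simple way to bucket evaluation questions by the depth they require, so 1-hop and 2-hop accuracy can be reported separately.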
Final Recommendation
Topic 1 is the safer, cleaner, more achievable thesis.
Topic 2 is more ambitious but riskier.
If you’re early in your research journey, Topic 1 is the better recommendation.
Thank you for sharing! I also think Topic 2 is quite risky…