2025-08-13
This paper proposes 'Document Haystack,' a benchmark that measures the ability of Vision Language Models (VLMs) to find specific information in long documents of up to 200 pages. The benchmark intentionally embeds text or image 'needles' in a document and evaluates how accurately a VLM can retrieve them. The experiments show that current VLMs perform well on text-only documents, but their performance degrades significantly on documents rendered as images or when the needles combine text and images. This highlights open research challenges in the long-context and multimodal document understanding capabilities of VLMs.
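To make the protocol concrete, here is a minimal sketch of a needle-in-a-haystack style evaluation as described above: hide known facts at random pages, ask the model about each one, and score by exact match. The helper names and the `ask_vlm` client are hypothetical placeholders, not the paper's actual harness.

```python
import random
from typing import Callable

def insert_needles(pages: list[str], needles: dict[str, str], seed: int = 0) -> list[str]:
    """Embed each needle sentence at a random page of the document.

    `needles` maps a question to the hidden answer sentence, e.g.
    {"What is the secret fruit?": "The secret fruit is a mango."}.
    """
    rng = random.Random(seed)
    pages = list(pages)
    for hidden_sentence in needles.values():
        idx = rng.randrange(len(pages))
        pages[idx] = pages[idx] + "\n" + hidden_sentence
    return pages

def evaluate(pages: list[str],
             needles: dict[str, str],
             ask_vlm: Callable[[list[str], str], str]) -> float:
    """Query the model once per needle and return retrieval accuracy.

    `ask_vlm(document_pages, question)` is an assumed VLM client that
    answers a question over the full document (text or page images).
    """
    hits = 0
    for question, hidden_sentence in needles.items():
        answer = ask_vlm(pages, question)
        # Exact-match scoring: the answer must contain the hidden value.
        if hidden_sentence.lower() in answer.lower():
            hits += 1
    return hits / len(needles)
```

The same loop works for image needles by swapping the inserted sentence for an image patch on the page; only the insertion and matching steps change.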
2025-08-08
This paper proposes a hybrid Top-k recommendation system that combines a traditional recommender with a large language model (LLM). Users are categorized as "active users" and "weak users," and the LLM is applied to the weak-user group to improve recommendation accuracy and fairness for them. Restricting LLM usage to this group also bounds computational cost, keeping the system practical to deploy.
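The routing idea can be sketched as follows. This is a plausible illustration, not the paper's actual method: the activity threshold, the candidate-pool size, and the `cf_recommender` / `llm_reranker` callables are all assumptions introduced here.

```python
from typing import Callable

# Hypothetical cutoff on interaction count separating active from weak users.
ACTIVITY_THRESHOLD = 20

def recommend_top_k(
    user_id: str,
    history: list[str],
    k: int,
    cf_recommender: Callable[[str, int], list[str]],
    llm_reranker: Callable[[list[str], list[str], int], list[str]],
) -> list[str]:
    """Hybrid Top-k routing: traditional recommender for active users,
    LLM re-ranking for weak users only, so LLM cost stays bounded."""
    if len(history) >= ACTIVITY_THRESHOLD:
        # Active user: the traditional recommender already has enough signal.
        return cf_recommender(user_id, k)
    # Weak user: cheaply fetch a wider candidate pool, then let the LLM
    # re-rank it using the sparse interaction history as context.
    candidates = cf_recommender(user_id, 5 * k)
    return llm_reranker(history, candidates, k)
```

Because the LLM only sees weak users, total inference cost scales with the size of that group rather than the whole user base, which is the practicality argument the paper makes.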