EMNLP 2024: On Fake News Detection with LLM Enhanced Semantics Mining
- LLM-enhanced semantic features (news, entities, topics)
- Generalized Page-Rank model
- Learning criterion for mining the local and global semantics
Can LLMs effectively produce news representations for fake news detection?
- When fake news mimics the linguistic style of real news, approaches that simply use LLMs break down.
GAP
- Irregular co-occurrence of meaningful entities in topic-specific fake news.
- Simply applying news embeddings from LLMs is ineffective for fake news detection. (The essential goal is to capture the deviating patterns of fake news: high-level semantics among named entities and topics, which reveal these deviating patterns, have been ignored.)
"Irregular co-occurrence" refers to the irregular or inconsistent joint appearance of meaningful entities (e.g., people, places, events) in fake news on a specific topic. It means these entities appear in the text in ways or at frequencies that deviate from expected or typical patterns, possibly reflecting confused or misleading information. For example, if a fake news article repeatedly mentions an unrelated person or event that rarely appears in genuine reporting on the same topic, that can be treated as an irregular co-occurrence. This phenomenon can serve as an important clue for identifying fake news.
- The work is organized around the following two questions:
- P1. How can we apply LLMs to explore high-level news semantics? (summarized topic-to-graph)
- P2. How can we identify the irregular semantics in fake news? (local and global news semantics)
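For P1, a minimal sketch of how topic and entity extraction via prompting might look; the prompt wording and the `call_llm` helper are illustrative assumptions, not the paper's actual prompts or API.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical helper: send `prompt` to any chat-style LLM and
    return the raw text response (placeholder, not the paper's setup)."""
    raise NotImplementedError

def extract_entities_and_topic(news_text: str) -> dict:
    # Ask the LLM for named entities and a summarized topic in JSON,
    # so the output can later be attached to the news piece as graph nodes.
    prompt = (
        "Read the news article below.\n"
        "1. List the real named entities (people, places, organizations, events).\n"
        "2. Summarize its topic in one short phrase.\n"
        'Answer as JSON: {"entities": [...], "topic": "..."}\n\n'
        f"Article: {news_text}"
    )
    return json.loads(call_llm(prompt))
```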
Idea
We propose a topic model together with a set of specially designed prompts to extract topics and real entities from LLMs and model the relations among news, entities, and topics as a heterogeneous graph to facilitate investigating news semantics. We then propose a Generalized Page-Rank model and a consistent learning criterion for mining the local and global semantics centered on each news piece through the adaptive propagation of features across the graph.
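To make the news-entity-topic structure concrete, here is a minimal sketch of the heterogeneous graph as described above; the node/edge type names and input format are my own assumptions for illustration.

```python
import networkx as nx

def build_hetero_graph(news_items):
    """news_items: list of dicts such as
    {"id": "n1", "text": "...", "entities": ["WHO", "Wuhan"], "topic": "covid-origin"}.
    Builds a graph with news, entity, and topic nodes, where an edge means
    'mentions' (news-entity) or 'belongs to' (news-topic)."""
    g = nx.Graph()
    for item in news_items:
        news_node = ("news", item["id"])
        g.add_node(news_node, text=item["text"])
        for ent in item["entities"]:
            g.add_node(("entity", ent))
            g.add_edge(news_node, ("entity", ent), relation="mentions")
        g.add_node(("topic", item["topic"]))
        g.add_edge(news_node, ("topic", item["topic"]), relation="belongs_to")
    return g
```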
- Generalized PageRank (GPR) for adaptive feature propagation across the graph.
- Global and local semantics mining: a small propagation step (e.g., 2) captures local semantics, while a larger step (e.g., 20) captures global semantics (see the sketch after this list).
- Cross-entropy loss for labeled data; KL-divergence loss for unlabeled data.
- Label y = 1: fake; otherwise: true (real) news.
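A minimal PyTorch sketch of GPR-style propagation and the two losses as I read them from the bullets above; the weight parameterization, step counts, and loss weighting are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gpr_propagate(h, adj_norm, K, gamma):
    """Generalized PageRank propagation: out = sum_k gamma[k] * A^k h.
    h: (N, d) node features, adj_norm: (N, N) normalized adjacency,
    gamma: (learnable) weights of length K + 1.
    Small K ~ local semantics, large K ~ global semantics."""
    out = gamma[0] * h
    x = h
    for k in range(1, K + 1):
        x = adj_norm @ x
        out = out + gamma[k] * x
    return out

def total_loss(logits_local, logits_global, labels, labeled_mask):
    # Cross-entropy on labeled news; KL consistency between the local and
    # global views on unlabeled news (my reading of the learning criterion).
    ce = F.cross_entropy(logits_local[labeled_mask], labels[labeled_mask])
    unlabeled = ~labeled_mask
    kl = F.kl_div(
        F.log_softmax(logits_local[unlabeled], dim=-1),
        F.softmax(logits_global[unlabeled], dim=-1),
        reduction="batchmean",
    )
    return ce + kl
```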
Datasets
MM COVID, ReCOVery, MC Fake, LIAR, PAN2020
Experimental Results
- Pairwise t-test at a 95% confidence level (α = 0.05); see the evaluation sketch after this list.
- potential data contamination
- Silhouette Score
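A small sketch of the evaluation protocol mentioned above: a paired t-test at α = 0.05 over per-run scores, and a silhouette score over learned news embeddings. The numbers and arrays below are stand-ins, not results from the paper.

```python
import numpy as np
from scipy.stats import ttest_rel
from sklearn.metrics import silhouette_score

# Paired t-test over per-run F1 scores of two models (illustrative numbers only).
ours = np.array([0.91, 0.93, 0.92, 0.90, 0.94])
baseline = np.array([0.88, 0.90, 0.89, 0.87, 0.91])
t_stat, p_value = ttest_rel(ours, baseline)
print(f"paired t-test: t={t_stat:.3f}, p={p_value:.4f}, significant={p_value < 0.05}")

# Silhouette score: how well the learned embeddings separate fake vs. real news.
embeddings = np.random.rand(100, 64)        # stand-in for learned representations
labels = np.random.randint(0, 2, size=100)  # 1 = fake, 0 = real
print("silhouette:", silhouette_score(embeddings, labels))
```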