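Text Classification Using Label Names Only: A Language Model Self-Training Approach

Name: Text Classification Using Label Names Only: A Language Model Self-Training Approach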
Rating: 4.5 (25 reviews)
Author: grasp_57750

Uploaded by grasp_57750 on 2021-01-22 01:50:41 | PDF file | 710.64 KB | Popularity: 25


Text Classification Using Label Names Only: A Language Model Self-Training Approach

Current text classification methods typically require a good number of human-labeled documents as training data, which can be costly and difficult to obtain in real applications. Humans can perform classification without seeing any labeled examples but only based on a small set of words describing the categories to be classified. In this paper, we explore the potential of only using the label name of each class to train classification models on unlabeled data, without using any labeled documents. We use pre-trained neural language models both as general linguistic knowledge sources for category understanding and as representation learning models for document classification. Our method (1) associates semantically related words with the label names, (2) finds category-indicative words and trains the model to predict their implied categories, and (3) generalizes the model via self-training. We show that our model achieves around 90% accuracy on four benchmark datasets including topic and sentiment classification without using any labeled documents but learning from unlabeled data supervised by at most 3 words (1 in most cases) per class as the label name.
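
The third step in the abstract, self-training, can be illustrated with a small sketch. The abstract does not say how the self-training targets are built, so the code below assumes one common choice: square the model's current class probabilities and normalize by each class's total probability mass, which pushes confident predictions closer to 0 or 1. The function name `sharpen_targets` and the toy predictions are hypothetical and are not taken from the paper's code.

```python
from typing import List


def sharpen_targets(probs: List[List[float]]) -> List[List[float]]:
    """Turn soft predictions into sharper self-training targets.

    probs[i][j] is the model's current probability that document i
    belongs to class j. Each target is proportional to
    probs[i][j] ** 2 / f_j, where f_j is the total probability mass
    assigned to class j over all documents (an assumed weighting).
    """
    num_classes = len(probs[0])
    # Total soft count per class, used to keep large classes from dominating.
    class_mass = [sum(row[j] for row in probs) for j in range(num_classes)]
    targets = []
    for row in probs:
        weighted = [p * p / class_mass[j] for j, p in enumerate(row)]
        total = sum(weighted)
        # Renormalize so each document's targets sum to 1.
        targets.append([w / total for w in weighted])
    return targets


# Toy predictions for four unlabeled documents and two classes (made up).
predictions = [[0.7, 0.3], [0.9, 0.1], [0.2, 0.8], [0.2, 0.8]]
targets = sharpen_targets(predictions)
for before, after in zip(predictions, targets):
    print(before, "->", [round(t, 3) for t in after])
```

In a full self-training loop, the classifier would be trained to match these targets on the unlabeled documents, and the targets would be recomputed periodically from the updated predictions.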

More downloads

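An Analysis of ChatGPT Language Model Training Methods

ChatGPT Model Training: Technical Details and Application Guide. This guide takes an in-depth look at how ChatGPT is trained and provides practical...

Size: 37.56 KB | 2024-06-07 15:42:55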
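Text Classification with LibSVM

Includes a program that calls LibSVM for classification, along with the text preprocessing code. For details, see: http://www.cnblo...

Size: 0 B | 2019-08-02 06:59:59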
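Text Classification: Source Code for Classifying BBC Articles with scikit-learn

Text classification: uses scikit-learn to sort BBC articles into several categories. How it works: there are two datasets. With 12...

Size: 5.68 MB | 2021-02-06 08:35:45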
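Adversarial Training Methods for Semi-Supervised Text Classification

Adversarial training methods for semi-supervised text classification; a paper related to adversarial generative models.

Size: 0 B | 2018-12-21 07:45:49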
A Text Classification Method Based on N-gram Language Models

Size: 0 B | 2019-01-01 17:40:37
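Thai Language Model for Text Classification, part1

A Thai language model that can be used for downstream tasks such as text classification, sequence labeling, and out-of-domain detection. This model's...

Size: 800 MB | 2020-09-18 21:25:50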
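An Example of Text Classification with TensorFlow 2.0

In this example, we will learn how to perform text classification with TensorFlow 2.0. We will use the IMDB movie review...

Size: 427.67 KB | 2023-05-02 08:54:52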
How to Get the Tag Name of the Current Object in jQuery

How to get the tag name of the current object in jQuery

Size: 16.45 KB | 2022-01-01 14:24:31
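Paper Research: A Self-Training Method Using Clustering Information from Unlabeled Samples.pdf

To make effective use of structural information, a new self-training algorithm is proposed, in which a clustering method is used to select high-confidence samples from the self-labeled samples...

Size: 377 KB | 2020-08-05 01:46:53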
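textflow: Source Code for Text Classification with Metaflow and AWS

textflow: a training pipeline for natural language processing using [...] and [...]. Learning tasks: Word2vec pre-training; text classification; neural mach...

Size: 5 KB | 2021-04-21 05:29:56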
An Example of Text Classification with pytorch and torchtext

Today the editor shares an example of text classification with pytorch and torchtext, which is a useful reference...

Size: 99 KB | 2020-11-29 04:47:04
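Text Classification Algorithm Analysis: A Very Good Text Classification Algorithm

A very good classification algorithm. The description has to be longer than 20 characters, damn it. Good stuff is good stuff.

Size: 0 B | 2019-05-06 05:08:33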
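An Improved Text Classification Algorithm

Text classification is one of the active research topics in text mining, but the traditional KNN classification algorithm has high time complexity, and on samples with non-uniform density...

Size: 1.1 MB | 2021-02-01 09:42:46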
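Paper Research: A Text Classification Method Combined with the Cloud Model.pdf

To reduce the effect that the uncertainty of natural language has on classification performance in traditional text classification methods, a text classification method combined with the cloud model is proposed...

Size: 475 KB | 2020-07-22 19:28:37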
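adversarial_text: Source Code for Adversarial Training Methods for Semi-Supervised Text Classification

Adversarial training methods for semi-supervised text classification. This code reproduces [...]. To set up the environment, install [...] and [...]. You can use this to easily set up the env...

Size: 25 KB | 2021-04-25 05:31:16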