CogAGENT: A Multimodal, Knowledgeable and Controllable Toolkit for Building Conversational Agents

A demo system and more information are available at https://synehe.github.io/CogAGENT/

A short illustration video is available at https://youtu.be/SE0SEeiAmXI

Description

CogAGENT is a toolkit for building multimodal, knowledgeable and controllable conversational agents. It provides 17 models and integrates a variety of datasets covering the features above. Models and datasets are decoupled into flexible modules, making development and research more convenient for users.

This package has the following advantages:

  • A multimodal, knowledgeable and controllable conversational framework. We propose a unified framework named CogAGENT, incorporating a Multimodal Module, a Knowledgeable Module and a Controllable Module to conduct multimodal interaction, generate knowledgeable responses and keep replies under control in real scenarios.
  • Comprehensive conversational models, datasets and metrics. CogAGENT implements 17 conversational models covering task-oriented dialogue, open-domain dialogue and question-answering tasks. We also integrate widely used conversational datasets and metrics to verify model performance.
  • Open-source and modularized conversational toolkit. We release CogAGENT as an open-source toolkit and modularize conversational agents to provide easy-to-use interfaces. Users can therefore modify the code for their own customized models or datasets.
  • Online dialogue system. We release an online system that lets conversational agents interact with users, and we provide a video illustrating how to use it.

Install

Install from git

# clone CogAGENT
git clone git@github.com:CogNLP/CogAGENT.git

# install CogAGENT
cd CogAGENT
pip install -e .
pip install -r requirements.txt
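
To check that the installation succeeded, run a quick import smoke test (a minimal check; it only assumes the package installs under the name cogagent, as used in the Quick Start below):

# smoke test: the import should succeed without errors
python -c "import cogagent"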

Quick Start

Programming Framework for Training Models

from cogagent import *
import torch
import torch.nn as nn
import torch.optim as optim

# the paths below are placeholders; point them at your own data and output directories
raw_data_path = "path/to/wizard_of_wikipedia"
datapath = "path/to/experiment_output"

# init the logger, device and experiment result saving dir
device, output_path = init_cogagent(
    device_id=8,
    output_path=datapath,
    folder_tag="run_diffks_on_wow",
)

# choose utterance reader
reader = WoWReader(raw_data_path=raw_data_path)
train_data, dev_data, test_data = reader.read_all()
vocab = reader.read_vocab()

# choose data processor
# In the training phase, no retriever is needed because the knowledge is provided by the dataset
processor = WoWForDiffksProcessor(max_token_len=512, vocab=vocab, debug=False)
train_dataset = processor.process_train(train_data)
dev_dataset = processor.process_dev(dev_data)
test_dataset = processor.process_test(test_data)

# choose response generator
model = DiffKSModel()
metric = BaseKGCMetric(default_metric_name="bleu-4", vocab=vocab)
loss = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.0001)

# use the provided Trainer class to start the model training process
trainer = Trainer(model, train_dataset, dev_data=test_dataset, n_epochs=40, batch_size=2,
                  loss=loss, optimizer=optimizer, scheduler=None, metrics=metric,
                  drop_last=False, gradient_accumulation_steps=1, num_workers=5,
                  validate_steps=2000, save_by_metric="bleu-4", save_steps=None,
                  output_path=output_path, grad_norm=1,
                  use_tqdm=True, device=device,
                  fp16_opt_level='O1',
                  )
trainer.train()
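
After training, the Trainer writes checkpoints under output_path, with the best one selected by save_by_metric. The snippet below is a minimal sketch of reloading a checkpoint for inference using standard PyTorch calls; the checkpoint file name is a hypothetical placeholder, so substitute the file the Trainer actually saved:

import torch
from cogagent import DiffKSModel

# rebuild the model skeleton, then load the trained weights
# NOTE: the checkpoint path is a hypothetical placeholder;
# use the file saved by the Trainer under your output_path
model = DiffKSModel()
state_dict = torch.load("path/to/experiment_output/best_model.pt", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()  # switch to inference mode (disables dropout, etc.)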

AVAILABLE MODELS OF COGAGENT

| Model | Category | Reference |
| --- | --- | --- |
| SUMBT | Fundamental | SUMBT: Slot-Utterance Matching for Universal and Scalable Belief Tracking |
| SC-LSTM | Fundamental | Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems |
| BERTNLU | Fundamental | ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems |
| MDRG | Fundamental | Towards End-to-End Multi-Domain Dialogue Modelling |
| UBAR | Fundamental | UBAR: Towards Fully End-to-End Task-Oriented Dialog System with GPT-2 |
| GPT2 for Chinese chitchat | Fundamental | Chinese chitchat |
| TransResNet-Ret | Multimodal | Image-Chat: Engaging Grounded Conversations |
| MMBERT | Multimodal | Selecting Stickers in Open-Domain Dialogue through Multitask Learning |
| MAE | Multimodal | MMCoQA: Conversational Question Answering over Text, Tables, and Images |
| PICa | Multimodal | An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA |
| LingUNet | Multimodal | Where Are You? Localization from Embodied Dialog |
| DiffKS | Knowledgeable | Difference-aware Knowledge Selection for Knowledge-grounded Conversation Generation |
| KE-Blender | Knowledgeable | Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in Dialogue Generation |
| NPH | Knowledgeable | Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding |
| BERTQA | Knowledgeable | Dense Passage Retrieval for Open-Domain Question Answering |
| KEMP | Controllable | OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs |
| RobertaClassifier | Controllable | On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark |

AVAILABLE DATASETS OF COGAGENT

| Dataset | Category | Reference |
| --- | --- | --- |
| MultiWOZ 2.0 | Fundamental | MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling |
| MultiWOZ 2.1 | Fundamental | MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines |
| Chinese chitchat Dataset | Fundamental | Chinese chitchat |
| MOD | Multimodal | DSTC10-Track1 |
| MMConvQA | Multimodal | MMCoQA: Conversational Question Answering over Text, Tables, and Images |
| OK-VQA | Multimodal | OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge |
| VQAv2 | Multimodal | Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering |
| WAY | Multimodal | Where Are You? Localization from Embodied Dialog |
| Wizard of Wikipedia | Knowledgeable | Wizard of Wikipedia: Knowledge-Powered Conversational Agents |
| Holl-E | Knowledgeable | Towards Exploiting Background Knowledge for Building Conversation Systems |
| OpenDialKG | Knowledgeable | OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs |
| DIASAFETY | Controllable | On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark |
| EmpatheticDialogues | Controllable | Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset |
