CogAGENT: A Multimodal, Knowledgeable and Controllable Toolkit for Building Conversational Agents

A demo system and more information are available at https://synehe.github.io/CogAGENT/

A short illustration video is available at https://youtu.be/SE0SEeiAmXI

Description

CogAGENT is a toolkit for building multimodal, knowledgeable and controllable conversational agents. It provides 17 models and integrates a variety of datasets covering the features above. Models and datasets are decoupled into flexible modules, making development and research more convenient for users.

This package has the following advantages:

  • A multimodal, knowledgeable and controllable conversational framework. We propose a unified framework named CogAGENT, incorporating a Multimodal Module, a Knowledgeable Module and a Controllable Module to conduct multimodal interaction, generate knowledgeable responses and keep replies under control in real scenarios.
  • Comprehensive conversational models, datasets and metrics. CogAGENT implements 17 conversational models covering task-oriented dialogue, open-domain dialogue and question-answering tasks. We also integrate widely used conversational datasets and metrics to verify model performance.
  • Open-source and modularized conversational toolkit. We release CogAGENT as an open-source toolkit and modularize conversational agents to provide easy-to-use interfaces. Users can therefore modify the code for their own customized models or datasets.
  • Online dialogue system. We release an online system that lets conversational agents interact with users, and we provide a video illustrating how to use it.

Install

Install from git

# clone CogAGENT
git clone git@github.com:CogNLP/CogAGENT.git

# install CogAGENT
cd CogAGENT
pip install -e .
pip install -r requirements.txt
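
To check that the installation succeeded, run a quick import smoke test (a minimal check; it only assumes the package installs under the name cogagent, as used in the Quick Start below):

# smoke test: the import should succeed without errors
python -c "import cogagent"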

Quick Start

Programming Framework for Training Models

from cogagent import *
import torch
import torch.nn as nn
import torch.optim as optim

# the paths below are placeholders; point them at your own data and output directories
raw_data_path = "path/to/wizard_of_wikipedia"
datapath = "path/to/experiment_output"

# init the logger, device and experiment result saving dir
device, output_path = init_cogagent(
    device_id=8,
    output_path=datapath,
    folder_tag="run_diffks_on_wow",
)

# choose utterance reader
reader = WoWReader(raw_data_path=raw_data_path)
train_data, dev_data, test_data = reader.read_all()
vocab = reader.read_vocab()

# choose data processor
# In the training phase, no retriever is needed because the knowledge is provided by the dataset
processor = WoWForDiffksProcessor(max_token_len=512, vocab=vocab, debug=False)
train_dataset = processor.process_train(train_data)
dev_dataset = processor.process_dev(dev_data)
test_dataset = processor.process_test(test_data)

# choose response generator
model = DiffKSModel()
metric = BaseKGCMetric(default_metric_name="bleu-4", vocab=vocab)
loss = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.0001)

# use the provided Trainer class to start the model training process
trainer = Trainer(model, train_dataset, dev_data=test_dataset, n_epochs=40, batch_size=2,
                  loss=loss, optimizer=optimizer, scheduler=None, metrics=metric,
                  drop_last=False, gradient_accumulation_steps=1, num_workers=5,
                  validate_steps=2000, save_by_metric="bleu-4", save_steps=None,
                  output_path=output_path, grad_norm=1,
                  use_tqdm=True, device=device,
                  fp16_opt_level='O1',
                  )
trainer.train()
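
After training, the Trainer writes checkpoints under output_path, with the best one selected by save_by_metric. The snippet below is a minimal sketch of reloading a checkpoint for inference using standard PyTorch calls; the checkpoint file name is a hypothetical placeholder, so substitute the file the Trainer actually saved:

import torch
from cogagent import DiffKSModel

# rebuild the model skeleton, then load the trained weights
# NOTE: the checkpoint path is a hypothetical placeholder;
# use the file saved by the Trainer under your output_path
model = DiffKSModel()
state_dict = torch.load("path/to/experiment_output/best_model.pt", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()  # switch to inference mode (disables dropout, etc.)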

AVAILABLE MODELS OF COGAGENT

| Model | Category | Reference |
| --- | --- | --- |
| SUMBT | Fundamental | SUMBT: Slot-Utterance Matching for Universal and Scalable Belief Tracking |
| SC-LSTM | Fundamental | Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems |
| BERTNLU | Fundamental | ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems |
| MDRG | Fundamental | Towards End-to-End Multi-Domain Dialogue Modelling |
| UBAR | Fundamental | UBAR: Towards Fully End-to-End Task-Oriented Dialog System with GPT-2 |
| GPT2 for Chinese chitchat | Fundamental | Chinese chitchat |
| TransResNet-Ret | Multimodal | Image-Chat: Engaging Grounded Conversations |
| MMBERT | Multimodal | Selecting Stickers in Open-Domain Dialogue through Multitask Learning |
| MAE | Multimodal | MMCoQA: Conversational Question Answering over Text, Tables, and Images |
| PICa | Multimodal | An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA |
| LingUNet | Multimodal | Where Are You? Localization from Embodied Dialog |
| DiffKS | Knowledgeable | Difference-aware Knowledge Selection for Knowledge-grounded Conversation Generation |
| KE-Blender | Knowledgeable | Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in Dialogue Generation |
| NPH | Knowledgeable | Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding |
| BERTQA | Knowledgeable | Dense Passage Retrieval for Open-Domain Question Answering |
| KEMP | Controllable | OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs |
| RobertaClassifier | Controllable | On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark |

AVAILABLE DATASETS OF COGAGENT

| Dataset | Category | Reference |
| --- | --- | --- |
| MultiWOZ 2.0 | Fundamental | MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling |
| MultiWOZ 2.1 | Fundamental | MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines |
| Chinese chitchat Dataset | Fundamental | Chinese chitchat |
| MOD | Multimodal | DSTC10-Track1 |
| MMConvQA | Multimodal | MMCoQA: Conversational Question Answering over Text, Tables, and Images |
| OK-VQA | Multimodal | OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge |
| VQAv2 | Multimodal | Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering |
| WAY | Multimodal | Where Are You? Localization from Embodied Dialog |
| Wizard of Wikipedia | Knowledgeable | Wizard of Wikipedia: Knowledge-Powered Conversational Agents |
| Holl-E | Knowledgeable | Towards Exploiting Background Knowledge for Building Conversation Systems |
| OpenDialKG | Knowledgeable | OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs |
| DIASAFETY | Controllable | On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark |
| EmpatheticDialogues | Controllable | Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset |
