This blog is based on the paper A Generalization of Transformer Networks to Graphs, co-authored with Xavier Bresson and presented at the 2021 AAAI Workshop on Deep Learning on Graphs: Methods and Applications (DLG-AAAI’21).
We present Graph Transformer, a transformer neural network that can operate on arbitrary graphs.
BLOG OUTLINE:
- Background
- Objective
- Key Design Aspects for Graph Transformer
- Proposed Graph Transformer Architecture
- Remarks From the Experiments
1. Background
Let’s start with a background on the two keywords: Transformers and Graphs.
Transformers
Transformer-based neural networks [1] are the most successful architectures for representation learning in Natural Language Processing (NLP), overcoming the sequential-processing bottleneck of Recurrent Neural Networks (RNNs). The core building block of Transformers is the multi-head attention mechanism, in which each head computes scaled dot-product attention:

Attention(Q, K, V) = softmax(QK^T / √d_k) V
Using multi-head attention, a word attends to every other word in a sentence and combines the received information to generate its abstract feature representation.
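To make this concrete, here is a minimal NumPy sketch of multi-head attention over a sentence of word features. This is illustrative only, not the paper’s code; the projection matrices `Wq`, `Wk`, `Wv` and the `num_heads` argument are assumptions for the sketch.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(H, Wq, Wk, Wv, num_heads):
    # H: (n, d) word features; Wq, Wk, Wv: (d, d) learned projections
    n, d = H.shape
    dk = d // num_heads
    # project, then split the feature dimension into heads: (n, heads, dk)
    Q = (H @ Wq).reshape(n, num_heads, dk)
    K = (H @ Wk).reshape(n, num_heads, dk)
    V = (H @ Wv).reshape(n, num_heads, dk)
    # scaled dot-product scores per head: (heads, n, n)
    scores = np.einsum('ihd,jhd->hij', Q, K) / np.sqrt(dk)
    A = softmax(scores)  # each word attends to every other word (full attention)
    # weighted sum of values, heads concatenated back: (n, d)
    return np.einsum('hij,jhd->ihd', A, V).reshape(n, d)
```

Note that the softmax runs over all n words: this is the full attention discussed below, where a sentence is treated as a fully connected graph of words.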
Graphs
Graphs are ubiquitous data structures, and a wide range of application domains have datasets that can be represented as graphs: molecular graphs in chemistry, interactions among particles in physics, drug-protein interactions in medicine, users and their social and commercial connections in social media, problems in combinatorial optimization, etc.
For learning on graphs, graph neural networks (GNNs) have emerged as the most powerful tool in deep learning.
In short, GNNs consist of several parameterized layers. Each layer takes in a graph with node (and edge) features and builds abstract feature representations of nodes (and edges) by taking the available explicit connectivity structure (i.e., the graph structure) into account. The resulting features are then passed to downstream classification layers, usually MLPs, to predict the target property.
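The per-layer computation described above can be sketched as a simple message-passing layer. This is an illustrative NumPy sketch of a generic GNN layer with mean aggregation, not any specific published architecture; the function and argument names are assumptions.

```python
import numpy as np

def gnn_layer(H, adj, W):
    # H: (n, d_in) node features; adj: (n, n) binary adjacency; W: (d_in, d_out)
    # mean-aggregate each node's neighborhood (self-loops added so a node
    # also keeps its own features)
    adj_sl = adj + np.eye(adj.shape[0])
    deg = adj_sl.sum(axis=1, keepdims=True)
    agg = (adj_sl @ H) / deg
    # linear transform + nonlinearity gives the layer's output features
    return np.maximum(agg @ W, 0.0)
```

Stacking several such layers, each using only the explicit graph connectivity, is the pattern that the downstream classification layers then build on.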
2. Objective
Now, the target is to generalize transformer neural networks to graphs, so that they can learn on graphs and datasets with arbitrary structure rather than just sequential structure (which is how NLP Transformers can be interpreted).
To proceed with the objective, we focus on extending the key design principles of Transformers from NLP to graphs in general.
3. Key Design Aspects for Graph Transformer
We find that attention using graph sparsity and positional encodings are two key design aspects for the generalization of transformers to arbitrary graphs.
Now, we discuss these from the contexts of both NLP and graphs to make the proposed extensions clear for Graph Transformer.
Graph Sparsity
In NLP, Transformers consider full attention while building feature representations for words. That is, a transformer treats a sentence as a fully connected graph of words. This choice of full attention can be justified for two reasons:
First, it is difficult to find meaningful sparse interactions or connections among the words in a sentence. For instance, the dependency of one word on another can vary with context, the user’s perspective, and the application at hand. There can be numerous plausible ground-truth connections among words in a sentence, and therefore text datasets of sentences often do not have explicit word interactions available. It thereby makes sense to perform full attention and let the model decide how the words depend on each other.
Second, the so-interpreted fully connected graph in NLP often has fewer than tens or hundreds of nodes (i.e., sentences are often fewer than tens or hundreds of words long). On graphs of this size, attending to every node is computationally feasible in both memory and time.
For these two reasons, full attention can be performed in NLP transformers and subsequent works [2,3,4] have shown it to be fruitful in language modeling and several NLP tasks.
However, actual graph datasets have arbitrary connectivity structure determined by the application domain, and node counts ranging up to millions or even billions. The available structure presents a rich source of information to exploit while learning, whereas the node counts make it practically impossible to use a fully connected graph for such datasets.
On these accounts, it is practical (for feasibility) and advantageous (for utilizing sparse structure information) to have a Graph Transformer where a node attends to its local neighbors, similar to Graph Attention Networks (GATs) [5].
In fact, the local information aggregation is at the core of GNNs, indicative of the fact that sparsity is a good inductive bias for generalization.
Positional Encodings
The attention mechanism in the Transformer is invariant to the ordering of its inputs: it has no notion of where in the sequence (or the sentence) a word is located. This means that the Transformer sees a multi-set of words rather than a sequence of words, as illustrated in the accompanying comparison.
That would mean losing some information about the ordering of the words, wouldn’t it?
To avoid this and make the Transformer aware of sequential information, some kind of positional encoding is necessary. The original Transformer by Vaswani et al. [1] uses sinusoidal positional encodings that are added to each word’s feature vector at the input. This encodes the prevalent (sequential) relationships among the words into the model.
We extend this critical design block of positional information encoding to Graph Transformer. In fact, a line of research in GNNs [6,7,8] has recently shown that positional information improves GNNs and overcomes their failure on several fundamental tasks.
We therefore leverage the success of the recent works on positional information in GNNs and use Laplacian Positional Encodings [8] in Graph Transformer. We use precomputed Laplacian eigenvectors [9] to add into the node features before the first layer, similar to how positional encodings are added in the original Transformer [1].
Laplacian PEs are a natural generalization of the sinusoidal PEs used in the original Transformer, since the sinusoidal PEs can be interpreted as eigenvectors of the line (path) graph that a sentence forms in NLP.
Hence, sparse graph structure during attention and positional encodings at the inputs are the two important things we consider while generalizing transformers to arbitrary graphs.
4. Proposed Graph Transformer Architecture
We now present the proposed architecture: the Graph Transformer Layer and the Graph Transformer Layer with edge features. The schematic diagram of a layer consists of the major components: the input with PEs, the multi-head attention mechanism with attention restricted to local neighbors, and the feed-forward module.
Compared to the standard Transformer of Vaswani et al. [1], the key differences (or extensions) that generalize the transformer to graphs, resulting in Graph Transformer, are:
i) The attention mechanism is a function of the neighborhood connectivity for each node: for a node i, the attention weights are computed only over its neighbors N(i), i.e., w_ij = softmax over j ∈ N(i) of ( (Q h_i) · (K h_j) / √d_k ).
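A minimal single-head NumPy sketch of this neighborhood-restricted attention follows. Masking non-neighbors before the softmax is one common way to realize the restriction and is an illustrative choice here, not the paper’s exact implementation; the self-loop added to the mask is also an assumption of the sketch.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def graph_attention(H, adj, Wq, Wk, Wv):
    # H: (n, d) node features; adj: (n, n) binary adjacency
    n, d = H.shape
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = (Q @ K.T) / np.sqrt(d)
    # node i may attend only to j with adj[i, j] == 1 (plus itself)
    mask = (adj + np.eye(n)) > 0
    scores = np.where(mask, scores, -np.inf)  # non-neighbors get zero weight
    A = softmax(scores)
    return A @ V
```

With a sparse adjacency matrix, each softmax row only spreads weight over a node’s local neighborhood, exactly the inductive bias argued for above.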
ii) Positional encoding is represented by Laplacian PEs. In particular, the eigenvectors of the graph Laplacian are precomputed for every graph before training, and each node’s components of the k smallest non-trivial eigenvectors are assigned as that node’s PE.
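One way to precompute such PEs is sketched below, using the symmetric normalized Laplacian L = I − D^{-1/2} A D^{-1/2} and NumPy’s dense eigendecomposition. This is a sketch under that assumption; the paper additionally randomly flips the eigenvectors’ signs during training (eigenvectors are defined up to sign), which is omitted here.

```python
import numpy as np

def laplacian_pe(adj, k):
    # symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros(n)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    L = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigh returns eigenvalues (and matching eigenvectors) in ascending order
    vals, vecs = np.linalg.eigh(L)
    # drop the trivial first eigenvector (eigenvalue 0), keep the next k
    return vecs[:, 1:k + 1]  # (n, k): row i is node i's positional encoding
```

The resulting (n, k) matrix is added to the node features before the first layer, mirroring how sinusoidal PEs are added at the input in the original Transformer.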
iii) The feed-forward module employs batch normalization [10] instead of the layer normalization [11] used in the original Transformer [1]. This is supported by our empirical evidence that batch normalization leads to better performance than layer normalization here.
iv) Graph Transformer is extended to have edge representations (see the Graph Transformer Layer with edge features at the right of the architecture diagram). This architecture can be critical for datasets with rich information along the edges, for instance bond information along the edges in molecular graphs, or relationship types in knowledge graphs. There are two things to note in this edge-extended architecture: the edge features are fused with the corresponding pairwise implicit attention scores, and there is a designated edge feature pipeline at every layer, as reflected in the layer update equations (given in full in the paper).
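The first of those two points can be sketched as follows. Note the simplification: the paper fuses projected edge features elementwise into the pre-softmax attention scores, while this NumPy sketch collapses each projected edge feature to a scalar before adding it, purely for illustration; all names here are assumptions, and the edge-feature update pipeline is omitted.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def edge_fused_attention(H, E, adj, Wq, Wk, Wv, We):
    # H: (n, d) node features; E: (n, n, d) edge features; We: (d, d) edge projection
    # adj: (n, n) binary adjacency, assumed to include self-loops
    n, d = H.shape
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = (Q @ K.T) / np.sqrt(d)
    # fuse edge information: add a scalar summary of each projected edge feature
    edge_term = (E @ We).sum(axis=-1) / np.sqrt(d)
    scores = scores + edge_term
    scores = np.where(adj > 0, scores, -np.inf)  # attention stays local
    A = softmax(scores)
    return A @ V
```

Even in this simplified form, the edge features directly modulate which neighbors a node attends to, which is why the design helps on datasets such as molecular graphs.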
This concludes the description of the proposed Graph Transformer. We refer to the paper for the complete layer update equations.
5. Remarks From the Experiments
We evaluate Graph Transformer on benchmark datasets and validate our design choices, alongside an attempt to answer some open research questions: i) whether to use local or full attention when generalizing transformers to graphs, ii) how to encode sparse structure information, and iii) which positional encoding candidate to use.
We note the following results:
a) As already discussed, graph sparsity is critical, and its use in the attention mechanism consistently gives better performance than full attention. That sparsity is a good inductive bias for graph datasets has already been shown by the success of GNNs across several application areas.
b) Among several combinations of design choices for attention, PEs, normalization, etc., the architecture using i) attention over local neighbors, ii) Laplacian PEs, and iii) batch normalization in the feed-forward module has the best performance across all datasets used for evaluation. This empirically validates the choice of these components for the targeted generalization of transformers.
c) Laplacian PEs have the desirable properties of i) carrying distance-aware information, ii) producing distinguishable node features, and iii) generalizing the original Transformer’s PEs to arbitrary graphs. They therefore fare, even empirically, as a suitable PE candidate for Graph Transformer compared to the PE candidates used in the literature (the related works are discussed in detail in the paper).
d) Overall, Graph Transformer achieves competitive performance against the GNNs it is compared with on the evaluated datasets. The proposed architecture performs significantly better than the baseline GNNs (GCNs [12] and GATs [5]) and helps close the gap between the original transformer and transformers for graphs. Thus, Graph Transformer emerges as a fresh, powerful attention-based GNN baseline that, we hope, can easily be extended in future research, given its simplicity and straightforward generalization from transformers.
The tables of numerical experiments are in the paper, the code implementation is open-sourced on GitHub, and an accompanying video presentation is on YouTube, at the following links:
Paper: https://arxiv.org/abs/2012.09699
GitHub: https://github.com/graphdeeplearning/graphtransformer
Video: https://youtu.be/h-_HNeBmaaU?t=240
[1] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł. and Polosukhin, I. (2017). Attention is all you need.
[2] Devlin, J., Chang, M.W., Lee, K. and Toutanova, K., (2018). Bert: Pre-training of deep bidirectional transformers for language understanding.
[3] Radford, A., Narasimhan, K., Salimans, T. and Sutskever, I., (2018). Improving language understanding by generative pre-training.
[4] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A. and Agarwal, S., (2020). Language models are few-shot learners.
[5] Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P. and Bengio, Y. (2018). Graph attention networks.
[6] Srinivasan, B. and Ribeiro, B., (2019). On the equivalence between positional node embeddings and structural graph representations.
[7] You, J., Ying, R. and Leskovec, J., (2019). Position-aware graph neural networks.
[8] Dwivedi, V. P., Joshi, C. K., Laurent, T., Bengio, Y., and Bresson, X. (2020). Benchmarking graph neural networks.
[9] Belkin, M. and Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation.
[10] Ioffe, S. and Szegedy, C., (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift.
[11] Ba, J.L., Kiros, J.R. and Hinton, G.E., (2016). Layer normalization.
[12] Kipf, T. N., and Welling, M. (2016). Semi-supervised classification with graph convolutional networks.
This article was originally published on Towards Data Science and re-published to TOPBOTS with permission from the author.