Posted: 2022-05-08 18:13:18 RSS feed digest for 2022-05-08 18:00 (18 items)

Category | Site | Article title / trend word | Link URL | Frequent words / summary (search volume) | Date registered
AWS New posts tagged lambda - Qiita Summary of where I got stuck in "AWS Lambda 実践入門" (Practical Introduction to AWS Lambda), 2nd Edition https://qiita.com/hk220/items/64c4d41b16331af1994b awslambda 2022-05-08 17:01:29
python New posts tagged Python - Qiita Introducing a small self-made Blender add-on https://qiita.com/SaitoTsutomu/items/5db2c9fbdf5126315aef blender 2022-05-08 17:41:09
python New posts tagged Python - Qiita [SCC edition] Reading the AtCoder Library, through to a Python implementation https://qiita.com/R_olldIce/items/a2c789cebdd098dcb503 atcoder 2022-05-08 17:37:33
python New posts tagged Python - Qiita My experience passing the Python 3 Certified Engineer Basic Examination https://qiita.com/MikaYamasaki/items/9eea862cbdfff45bbb91 certification 2022-05-08 17:24:16
js New posts tagged JavaScript - Qiita Me: "What!? You can write types in plain JavaScript!?" https://qiita.com/Yametaro/items/d86ae59e8b07aff1d4f5 javascript 2022-05-08 17:05:53
AWS New posts tagged AWS - Qiita Trying out the Simple NFT Marketplace on Amazon Managed Blockchain https://qiita.com/sugimount-a/items/297724a64ebf05612e5e amazonmanagedblockchain 2022-05-08 17:38:36
AWS New posts tagged AWS - Qiita Sending camera video over WebRTC to Amazon Kinesis Video Streams from a Raspberry Pi Zero https://qiita.com/tetsuoMikami/items/e321a6f386775e353753 amazonkinesisvideostreams 2022-05-08 17:38:00
Git New posts tagged Git - Qiita Workflow for fixing conflicts after a pull request on GitHub https://qiita.com/yukomaki/items/cfb514e0a3530fe4ab03 github 2022-05-08 17:12:53
Overseas TECH DEV Community How do Machines understand language? A look into the architectures behind Natural Language Understanding https://dev.to/ashwinscode/how-do-machines-understand-language-a-look-into-the-architectures-behind-natural-language-understanding-37ni

Machines are rapidly getting better at understanding our languages. Personal assistants like Google Assistant, Siri, and Alexa can effectively understand user prompts and carry out whatever instructions they are given. Google Translate, a tool that lets you translate between several different languages, is powered by deep learning techniques. And the most impressive of all is OpenAI's GPT, which has produced unbelievable results: given any text from the user, it provides what is known as a completion of that text. GPT can complete texts to write its own little movie scripts and song lyrics, translate between languages, write code, and much more. Google has also recently announced its vision for a similar model of its own, PaLM, which is said to have many times as many parameters as GPT, an exciting prospect since it suggests a machine that can produce even more human-like text. But how do all of these models understand what we say? There are quite a few different approaches to understanding language, which we will go through today.

Encoders and Decoders: A common theme among the models that understand what we say well is that they consist of two parts, an encoder and a decoder. The job of the encoder, as the name suggests, is to embed (encode) the user's input into a vector that captures its meaning. The job of the decoder is to take the vector produced by the encoder and decode it into a meaningful output. For translation between English and German, this means the decoder produces a sentence in German given the vector embedding of the input English sentence.

Sequence-to-Sequence (Seq2Seq) RNN architecture: Sequence-to-sequence simply means that the architecture takes in one sequence and outputs another sequence, which is exactly what language translation and question answering are. In this architecture both the encoder and the decoder are LSTM models, though GRUs can also be used; if you are unsure how RNNs (of which LSTMs and GRUs are variants) work, it is worth reading up on them first.

Encoder: The encoder reads over the input sequence one time step at a time, as RNNs always do. The per-step outputs of the LSTM/GRU are discarded and only the final internal state is preserved (LSTMs have two internal states, while GRUs have only one). This final internal state is called the context vector, and it captures the meaning of the input sentence.

Decoder: The decoder, also an LSTM/GRU model, takes the context vector as its initial hidden state. It is first fed a start token and outputs a token; the start token essentially acts as a trigger for producing the output sequence. The outputted token is fed back into the decoder as input, which produces another token, and this process repeats until the decoder produces a stop token, signalling that no more tokens need to be generated. All the tokens produced along the way form the output sequence. Note that a token may be a word or a character, but most models produce the output sequence at the word level. Putting the encoder and decoder together gives the whole pipeline: the encoder compresses the input sentence into the context vector, and the decoder unrolls that vector into the output sequence.
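To make that loop concrete, here is a minimal sketch of a Seq2Seq encoder and decoder in PyTorch; the layer sizes, the greedy_decode helper, and the start/stop token IDs are illustrative assumptions rather than code from the article.

```python
# Minimal Seq2Seq sketch (illustrative only): an LSTM encoder whose final
# state becomes the context vector, and an LSTM decoder that generates
# tokens one at a time until it emits a stop token.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # Per-step outputs are discarded; only the final hidden and cell
        # states (the "context vector") are kept.
        _, (h, c) = self.lstm(self.embed(src))
        return h, c

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token, state):
        # One step: previous token + current state -> next-token logits + new state.
        output, state = self.lstm(self.embed(token), state)
        return self.out(output), state

def greedy_decode(encoder, decoder, src, start_id, stop_id, max_len=20):
    """Feed the start token, then feed each output back in until the stop token."""
    state = encoder(src)
    token = torch.tensor([[start_id]])
    result = []
    for _ in range(max_len):
        logits, state = decoder(token, state)
        token = logits.argmax(dim=-1)      # greedy choice of the next token
        if token.item() == stop_id:
            break
        result.append(token.item())
    return result
```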
While this architecture can produce solid results, its reliance on recurrent units means it takes a long time to train, because data must be passed to the recurrent units sequentially, time step by time step. In other words, training recurrent nets cannot be parallelised, and with modern GPUs designed for parallel computation this is a lot of wasted potential. This is where Transformers come in.

Attention mechanism: Before getting into how Transformers work, we should look at what attention is, since it is core to Transformers and has also been used with Seq2Seq RNN models to improve their results. One problem with encoder-decoder models in general is that long sequences are difficult to decode: because of the nature of recurrent units, the context vector ends up capturing information mostly from the end of the input sequence rather than from the whole of it. Attention was introduced to overcome this limitation. It not only allows information from the whole sequence to be captured, it also lets the model see which parts of the sequence matter more than others. Take the sentence "He is from Germany, so he speaks German." To predict the word "German" we obviously have to pay more attention to "Germany", which appeared earlier in the sentence. In the Seq2Seq model described above, Bahdanau attention is commonly used: it performs a weighted sum over the internal states of all the recurrent units to capture information from the whole sequence, and the weights of this weighted sum are learned during training. The attention mechanism used in Transformers, however, is slightly different.

Self-attention: Self-attention is a form of attention that finds how much weight each token in a sequence gives to the other tokens in the same sequence. Take the sentence "The boy did not want to play because he was tired." To us it is obvious that "he" refers to "the boy", but for a neural network this relationship is not as straightforward; self-attention enables the network to discover such relationships within a sequence. How does it work? The aim is to give each token a list of scores, one score per token in the sequence, saying how strongly the two relate. Self-attention takes in three matrices, Query, Key, and Value, through which each token is given a query vector, a key vector, and a value vector. When scoring a token, its query vector is scored against the key vectors of all the other tokens (after the softmax, scores range from 0 to 1). The value vectors of the other tokens, which represent the tokens themselves, are then multiplied by their respective scores; the idea is to keep the values of relevant tokens and to wipe out the tokens with little relevance. The maths looks like this:

$$Z = \mathrm{softmax}\left(\frac{Q \otimes K^{T}}{\sqrt{d}}\right)V$$

where Q is the Query matrix, K is the Key matrix, V is the Value matrix, d is the number of dimensions (the length of each row of the Q, K, and V matrices), and Z is the output matrix of the tokens' scores.
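As an illustration of that formula, here is a small NumPy sketch of scaled dot-product self-attention; the random Q, K, and V matrices stand in for the learned projections and are not taken from the article.

```python
# Scaled dot-product self-attention: Z = softmax(Q K^T / sqrt(d)) V
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, K, V):
    d = Q.shape[-1]
    scores = softmax(Q @ K.T / np.sqrt(d))   # how much each token attends to every token
    return scores @ V                        # weighted sum of the value vectors

# 4 tokens with 8-dimensional query/key/value vectors (arbitrary sizes).
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
Z = self_attention(Q, K, V)
print(Z.shape)   # (4, 8): one re-weighted representation per token
```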
Transformers: Now that we have attention down, let's look into Transformers. Transformers do the same job as Seq2Seq models, but much more efficiently, and they are not limited to sequence-to-sequence modelling: they can be used for several classification tasks too. Transformers also have an encoder-decoder architecture and use a combination of attention mechanisms and feed-forward neural networks to produce internal representations of their input sequences. Both the encoder and the decoder are built from modules that can be stacked as many times as needed (the "Nx" next to each module in the standard architecture diagram).

Positional encoding: You may notice that there are no recurrent units in a Transformer, which is what makes it special. The lack of recurrent units means it can be trained much faster than a Seq2Seq RNN model, since inputs can be processed in parallel and there is no backpropagation through time. Does this mean positional and contextual information cannot be captured? No: Transformers still capture positional information without recurrent units. Before the input sequences enter the encoder/decoder they are first embedded into vectors (neural networks work with vectors, not with the words themselves) and then positionally encoded. The standard sinusoidal formulas are $PE_{(pos,\,2i)} = \sin\!\left(pos/10000^{2i/d}\right)$, applied to the even positions of the embedding vector, and $PE_{(pos,\,2i+1)} = \cos\!\left(pos/10000^{2i/d}\right)$, applied to the odd positions (a small sketch follows at the end of this summary). You may also wonder how the encoder passes its sequence representation to the decoder without an RNN: the encoder's output is fed into the second attention block of each decoder module, where it supplies the Key and Value matrices while the Queries come from the decoder itself.

Text generation with Transformers: Transformers generate text just like the Seq2Seq RNN model does. The decoder is fed a start token and produces an output; that output is fed back into the decoder as input, and the process repeats until a stop token is produced.

Classifying with Transformers: Transformers are not just for generating text. Since they build their own internal understanding of language, the encoder can be used to extract that language representation and classify text with it. For example, BERT is a transformer model that consists of the encoder only, yet it can be used effectively for tasks such as sentiment classification, question answering, and named-entity recognition.

Thank you, I hope you have enjoyed learning a bit about how machines understand language. This is by no means the most detailed explanation of how these models work, but I hope it provides a solid overview of what goes on under the hood. If you are interested in using transformers in code, visit … 2022-05-08 08:11:31
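For the sketch referenced in the positional-encoding paragraph above, here is a minimal NumPy version of the standard sinusoidal encoding; the sequence length and model dimension are arbitrary and the code is an illustration, not taken from the article.

```python
# Sinusoidal positional encoding: sine on even dimensions, cosine on odd ones.
# Assumes d_model is even for simplicity.
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]              # token positions 0..seq_len-1
    i = np.arange(d_model // 2)[None, :]           # index of each sin/cos pair
    angles = pos / np.power(10000, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                   # even dimensions
    pe[:, 1::2] = np.cos(angles)                   # odd dimensions
    return pe

# Added to the token embeddings before they enter the encoder/decoder.
print(positional_encoding(seq_len=10, d_model=16).shape)   # (10, 16)
```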
News BBC News - Home Elections 2022: Boris Johnson's union headache just got worse https://www.bbc.co.uk/news/uk-politics-61367503?at_medium=RSS&at_campaign=KARANGA chris 2022-05-08 08:10:26
Hokkaido Hokkaido Shimbun Chief Cabinet Secretary: MLIT panel on the Shiretoko tour-boat accident to meet within the week to examine the effectiveness of vessel inspections https://www.hokkaido-np.co.jp/article/678171/ Shiretoko Peninsula 2022-05-08 17:32:04
Hokkaido Hokkaido Shimbun "Bikini Day" rally in Kochi calls for a world without nuclear weapons https://www.hokkaido-np.co.jp/article/678176/ a world without nuclear weapons 2022-05-08 17:31:00
Hokkaido Hokkaido Shimbun Olympic team member Maeda wins the women's race at the Sendai International Half Marathon https://www.hokkaido-np.co.jp/article/678175/ Kohshin Rubber 2022-05-08 17:26:00
Hokkaido Hokkaido Shimbun Specialist contractor surveys the hull with an underwater camera in the Shiretoko tour-boat accident https://www.hokkaido-np.co.jp/article/678134/ underwater camera 2022-05-08 17:19:44
Hokkaido Hokkaido Shimbun Another COVID-19 case on the Nippon-Ham farm team: coach Ueda infected https://www.hokkaido-np.co.jp/article/678174/ Yoshinori Ueda 2022-05-08 17:15:00
Hokkaido Hokkaido Shimbun Hiroshima 17-3 D (May 8): Hiroshima hits 5 home runs and scores 17 runs https://www.hokkaido-np.co.jp/article/678173/ home runs 2022-05-08 17:15:00
Hokkaido Hokkaido Shimbun Foreign Minister Hayashi voices concern over the China-Solomon Islands agreement in talks with Palau's president https://www.hokkaido-np.co.jp/article/678159/ country visited 2022-05-08 17:16:06
Hokkaido Hokkaido Shimbun Yamashita claims her first domestic major title on the final day of the Salonpas Cup women's golf https://www.hokkaido-np.co.jp/article/678170/ four major tournaments 2022-05-08 17:14:04
