Posted: 2022-09-15 18:42:22 RSS feed digest for 2022-09-15 18:00 (50 items)

Category	Site name	Article title / trend word	Link URL	Frequent words / summary / search volume	Date registered
IT	気になる、記になる…	Rakuten Mobile also postpones the launch of "Apple Watch Series 8"	https://taisy0.com/2022/09/15/162145.html	applewatchseries	2022-09-15 08:54:53
IT	気になる、記になる…	au to launch "Apple Watch Series 8" on September 17	https://taisy0.com/2022/09/15/162143.html	applewatchseries	2022-09-15 08:51:13
IT	気になる、記になる…	Product images of "Chromecast with Google TV (HD)" leak	https://taisy0.com/2022/09/15/162139.html	chromecast	2022-09-15 08:45:46
IT	InfoQ	Article: The Hows and Whys of Effective Production-Readiness Reviews	https://www.infoq.com/articles/incidents-prr-psychological-safety/?utm_campaign=infoq_content&utm_source=infoq&utm_medium=feed&utm_term=global	Article: The Hows and Whys of Effective Production-Readiness Reviews. At QCon Plus November, Nora Jones, CEO and founder of Jeli, talked about how to build production-readiness reviews (PRR) with emphasis on context and psychological safety. Her talk focused on the particulars of a PRR process that relates to incidents. By Nora Jones	2022-09-15 09:00:00
IT	ITmedia All Articles	[ITmedia Business Online] The time of day when mood is most likely to drop is "11 p.m." Why is that?	https://www.itmedia.co.jp/business/articles/2209/15/news187.html	itmedia	2022-09-15 17:45:00
IT	ITmedia All Articles	[ITmedia Business Online] ANYCOLOR, operator of the VTuber group "Nijisanji", posts solid quarterly results	https://www.itmedia.co.jp/business/articles/2209/15/news184.html	anycolor	2022-09-15 17:43:00
IT	ITmedia All Articles	[ITmedia Mobile] Which payment service registered for Myna Points is the most popular? Oricon survey	https://www.itmedia.co.jp/mobile/articles/2209/15/news165.html	Rakuten Card	2022-09-15 17:30:00
IT	ITmedia All Articles	[ITmedia Mobile] Why the iPhone 14's price cannot necessarily be called "high"	https://www.itmedia.co.jp/mobile/articles/2209/15/news151.html	iphone	2022-09-15 17:15:00
AWS	New posts tagged lambda - Qiita	I tried recording my daily health status with a smart speaker	https://qiita.com/tNakka/items/e0faa21e7855ba0de353	health status	2022-09-15 17:59:16
AWS	New posts tagged lambda - Qiita	Batch processing using EventBridge and Lambda	https://qiita.com/lyd-ryotaro/items/ede118eed0766c7aaf17	autoscaling	2022-09-15 17:20:20
python	New posts tagged Python - Qiita	I tried recording my daily health status with a smart speaker	https://qiita.com/tNakka/items/e0faa21e7855ba0de353	health status	2022-09-15 17:59:16
Ruby	New posts tagged Ruby - Qiita	About Gem::ConflictError	https://qiita.com/tky8522/items/3a701b0eb7e93bb8c2d6	gemconflicterror	2022-09-15 17:22:44
AWS	New posts tagged AWS - Qiita	Solutions Architect exam prep: global networking edition	https://qiita.com/sosat/items/2eb29933216783596f9e	blackbelt	2022-09-15 17:37:54
AWS	New posts tagged AWS - Qiita	Batch processing using EventBridge and Lambda	https://qiita.com/lyd-ryotaro/items/ede118eed0766c7aaf17	autoscaling	2022-09-15 17:20:20
AWS	New posts tagged AWS - Qiita	Solutions Architect exam prep: migration from on-premises edition	https://qiita.com/sosat/items/a996d269cfc615021f2a	awsapplicationmigratio	2022-09-15 17:12:54
Docker	New posts tagged docker - Qiita	[Docker] How to enter a container as a specified user	https://qiita.com/akko_merry/items/1e08a85a0da67fe7fd65	ckercomposeexecuserrootwo	2022-09-15 17:15:11
Docker	New posts tagged docker - Qiita	Tips for connecting to a JMX server (Spring Boot) launched on Kubernetes	https://qiita.com/asmg07/items/c885a6ee0a1dbc3d22b9	jconsole	2022-09-15 17:01:37
Ruby	New posts tagged Rails - Qiita	About Gem::ConflictError	https://qiita.com/tky8522/items/3a701b0eb7e93bb8c2d6	gemconflicterror	2022-09-15 17:22:44
Tech blog	Developers.IO	[Must-read for AWS beginners] The "truly" frequently asked questions received by technical support	https://dev.classmethod.jp/articles/aws-most-frequently-asked-questions/	mhayakawa	2022-09-15 08:11:33
Overseas tech	DEV Community	How to fine-tune your embeddings for better similarity search	https://dev.to/meetkern/how-to-fine-tune-your-embeddings-for-better-similarity-search-445e	How to fine-tune your embeddings for better similarity search. This blog post will share our experience with fine-tuning sentence embeddings on a commonly available dataset using similarity learning. We additionally explore how this could benefit the labeling workflow in the Kern AI refinery. To understand this post, you should know what embeddings are and how they are generated. A rough idea of what fine-tuning is also helps. All the code and data referenced in this post is available on GitHub. What constitutes a better similarity search? We are constantly looking to improve our kern refinery, where labeling plays a central role. There are several ways we can leverage embeddings to enhance the labeling process. One tool we already implemented is similarity search, where you can select any record and look for similar records based on the cosine similarity of their embeddings. [Screenshot of the kern refinery data browser with the options to start similarity search (left) or start a custom labeling session (right) on a record.] This can be combined with a custom "labeling session", which is just a name for the selection of records that you are presented with during manual labeling. That means you can gather the most similar records with similarity search and start labeling them manually. We found that the labeling experience gets much smoother if you have fewer context switches within one labeling session. Therefore, the goal of fine-tuning our embeddings is to get more records of the same class within a similarity labeling session. Why fine-tune your embeddings? Large language models (LLMs) solve a wide variety of tasks like question answering, information extraction, and sentiment analysis. What makes them so good at those tasks is a combination of the right architecture, a well-designed training procedure, and the availability of the whole internet for training data. For example, a more recent LLM from Google called "LaMDA" was trained on trillion words from public forums, tutorials, Wikipedia, web documents, and other sources. Using these vast amounts of available data, an LLM is trained to generalize across several domains, which results in a model that is generally really good but lacks domain-specific expertise. This is where fine-tuning comes into play. Fine-tuning is the process of adjusting your language model to better fit the domain of your data. If you, for example, want to process a lot of legal documents about the building process of an offshore wind farm, you might want to specialize your LLM on these kinds of texts. Before fine-tuning it yourself, though, you should always take a look at the Hugging Face model database and check whether someone has already fine-tuned a model on data that is similar to yours. Similarity learning: The last prerequisite we want to look at before diving into the experiment is "similarity learning". In order to fine-tune embeddings, we need a task to solve. This task could be anything from supervised classification to unsupervised masked token prediction. Since we want better similarity search for our labeling sessions, we will opt for a task that incorporates class information. [Open-source framework for similarity learning; link to GitHub.] We discussed internally what we wanted to try and settled on similarity learning because it is easy to set up, very fast in training, and generally just something new to us that we wanted to check out. Similarity is, in our case, defined by the class labels. That means two records are similar if they carry the same class label, and they are different if they do not. The data: We wanted to take easy-to-understand and widely available data for this use case, so we settled on the "AG News" classification dataset, which has four classes: World, Sports, Business, and Sci/Tech. Although it is already labeled, which helps us in the evaluation later on, we will act as if it is an unlabeled dataset in order to show the full process. Every record has a title, a description, and the associated label. We selected records at random, loaded them into kern refinery, and labeled manually. After creating some labeling functions and active learners, we ran the weak supervision and ended up with weakly supervised records. We filtered for a confidence score larger than a threshold, added the manually labeled data, and ended up with usable records for our fine-tuning pipeline. The remaining records, with their original labels, will be used as a test set in the evaluation later. If you want a closer look at the labeling process or the data itself, you can visit the GitHub repository, where we documented everything. Fine-tuning with Quaterion: Quaterion is able to use different kinds of similarity information in order to fine-tune the embeddings. We could use a similarity score, pre-formed triplets, or similarity groups, where the group is defined by the class. Because the class information is the only similarity measure we have, we make use of SimilarityGroupSamples. Now that we have the data ready, we need a model to train. Remember, the goal is to learn a mapping from one embedding to another. For that, we are going to use a pre-trained LLM as the encoder and add a SkipConnectionHead on top of it (read here why this is preferred over just a linear layer). The Linear layer has as many in-features as it has out-features, which are in our case because we use "all MiniLM L v" as our base model, which produces dimensional embeddings. Normally, for example in classification, you would use a classification head that has as many out-features as there are classes. To get the gradients required for training the network, you could then use an implementation of the cross-entropy loss function. Because we want to learn similarity in the embedding space, we have to employ a different loss function: a triplet loss with cosine distance as the distance metric. [Visualization of what the triplet loss tries to achieve in the embedding space: reducing the distance of an anchor to the positive example and increasing the distance to the negative one.] Most of the training details are covered by Quaterion, which uses PyTorch Lightning under the hood. The optimizer we chose, Adam, is specified in the model itself; we just need to call the fit method of Quaterion and specify the data loaders for training and validation. Evaluation: At the beginning of this blog post, we mentioned that we wanted to improve similarity search in kern refinery. Because this is very difficult to measure, we thought of a metric that captures what we are trying to achieve: increasing the number of records of the same class among the most similar records, which we will refer to as the "top k" metric. Because not everyone is going to label a thousand records in a single session, we can also identify the number of records that have to be labeled before this fine-tuning becomes beneficial. Additionally, we can check whether our fine-tuning also improved classification accuracy on the side. The test data consists of the already labeled records that were not used in the training or validation steps. Top k: For the top k metric, we take random samples from the test data, calculate the closest neighbors for each of them, and inspect what percentage of them have the same label as the original sample. This is then averaged over the samples to obtain the top k metric. The "raw" embeddings refer to the embeddings that are generated when using the same base language model, "all MiniLM L v", but without applying the learned transformation of the embeddings. When looking at the distribution of this metric, we can see that fine-tuning helped a lot. The violin plots show that with the fine-tuned embeddings you are more likely to get the same classes in your similarity-search-guided labeling session, which means less context switching and therefore a smoother experience. When averaging these values, we get same class for the raw embeddings and for the fine-tuned ones, an improvement of close to. Top k: Because labeling sessions are not always drawn out to records, we were curious how the top k metric behaves for different values of k; the other parameters stay equal to the previous experiment. The benefits of fine-tuning your embeddings already seem to have an impact on labeling sessions with only records, which is good news because that is not a lot. From there on, the fine-tuned embeddings consistently perform better than the raw embeddings. Classification: A fine-tuned embedding with class information could also benefit a classifier trained on that data. So, after training a LogisticRegression on the embeddings of our training data, we evaluated their performance on the test data with the classification report from sklearn. Interestingly, the fine-tuning did not make much of a difference. We even lost a tiny amount of performance compared to the raw embeddings, though the loss is not significant. That means that our neighbor-based similarity improved w.r.t. classification, but this linear classification model did not find it easier to separate the classes from one another. We will look into this in more detail in the near future. What to take away from this: By sharing our experience in using similarity learning to fine-tune embeddings, we want to encourage you to try this yourself. Quaterion made it really easy to get started, and they also offer lots of support if you encounter any difficulties. Apply this pipeline to your projects that require a well-tuned similarity. We took a simple classification dataset, but there are many different domains where similarity learning shines, for example e-commerce, where products are mapped into a vector space. Here, a fine-tuned similarity could drastically enhance the user experience. Everything we presented is open source. You can start from your raw data, load it into the open-source kern refinery, label and export it, and then process it in the Quaterion pipeline. Next, better separation in the D space: We are constantly looking for better ways to visualize and label data. Currently, we are looking into annotation methods that include a two-dimensional plot of the embeddings, where the user can label the data by drawing shapes around the points that should be labeled. When using basic PCA, we found that the embeddings are often not separated well in only two dimensions, which makes this kind of annotation process difficult. Therefore, we are currently working on methods to fine-tune embeddings that lead to a better separation of classes in the D space. [Random sample of AG News records after embedding them and reducing their dimensions with PCA. The classes overlap and therefore cannot be annotated easily in this plot.] Keep an eye out for future blog posts, because we will share our experiences with you. You could also join our Discord for discussions or questions about any of these topics, be it NLP, embeddings, LLMs, labeling, or data-centric AI in general.	2022-09-15 08:07:38
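Overseas science	BBC News - Science & Environment	Hundreds spot fireball shooting across night sky	https://www.bbc.co.uk/news/uk-scotland-62891265?at_medium=RSS&at_campaign=KARANGA	space	2022-09-15 08:46:20
Medical	CBnews (medical and nursing care)	Monkeypox may also spread from lesion sites through sexual contact: NIID reports on the situation in Japan and abroad	https://www.cbnews.jp/news/entry/20220915170243	msmmenwhohavesexwith	2022-09-15 17:15:00
Finance	RSS FILE - Japan Securities Dealers Association	Status of margin trading in foreign stocks	https://www.jsda.or.jp/shiryoshitsu/toukei/foreign-shinyo/index.html	margin trading	2022-09-15 10:00:00
Finance	RSS FILE - Japan Securities Dealers Association	Performance report on the handling of complaints about personal information	https://www.jsda.or.jp/shiryoshitsu/toukei/kojn_kujyou.html	personal information	2022-09-15 09:00:00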
Overseas news	Japan Times latest articles	Patagonia founder gives away the company to help fight climate change	https://www.japantimes.co.jp/news/2022/09/15/business/corporate-business/patagonia-founder-company-transfer-climate/	Patagonia founder gives away the company to help fight climate change. Rather than selling the company or taking it public, Yvon Chouinard and his family have transferred their ownership of the billion brand to a	2022-09-15 17:25:08
Overseas news	Japan Times latest articles	Opposition parties split over Abe state funeral as CDP says it won’t attend	https://www.japantimes.co.jp/news/2022/09/15/national/opposition-parties-abe-funeral/	Opposition parties split over Abe state funeral as CDP says it won't attend. The government's failure to address concerns over the funeral's necessity, the decision-making process for the event, and its cost are some of the reasons raised	2022-09-15 17:24:41
News	BBC News - Home	Hundreds spot fireball shooting across night sky	https://www.bbc.co.uk/news/uk-scotland-62891265?at_medium=RSS&at_campaign=KARANGA	space	2022-09-15 08:46:20
News	BBC News - Home	John Lewis customers spend less as inflation bites	https://www.bbc.co.uk/news/business-62911971?at_medium=RSS&at_campaign=KARANGA	items	2022-09-15 08:05:34
News	BBC News - Home	Kwasi Kwarteng considers scrapping bankers’ bonus cap to boost City	https://www.bbc.co.uk/news/business-62906854?at_medium=RSS&at_campaign=KARANGA	bonuses	2022-09-15 08:04:58
News	BBC News - Home	New Zealand bodies in suitcase: Woman arrested in S Korea over children's deaths	https://www.bbc.co.uk/news/world-asia-62910524?at_medium=RSS&at_campaign=KARANGA	charges	2022-09-15 08:02:30
News	BBC News - Home	Overwhelmed mourners in tears at sight of Queen's coffin	https://www.bbc.co.uk/news/uk-62907358?at_medium=RSS&at_campaign=KARANGA	hairs	2022-09-15 08:30:30
News	BBC News - Home	Greg Norman: LIV Golf no longer prepared to negotiate with PGA Tour	https://www.bbc.co.uk/sport/golf/62911559?at_medium=RSS&at_campaign=KARANGA	saudi	2022-09-15 08:40:50
News	BBC News - Home	What time is the Queen's funeral? Who will wear military uniform? And other questions	https://www.bbc.co.uk/news/uk-62844663?at_medium=RSS&at_campaign=KARANGA	daily	2022-09-15 08:20:15
News	BBC News - Home	Queen's funeral guests: Who will - and who won't - attend	https://www.bbc.co.uk/news/uk-62890879?at_medium=RSS&at_campaign=KARANGA	event	2022-09-15 08:52:30
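Hokkaido	Hokkaido Shimbun	New bus terminal at Tokyo Station: stops consolidated, 600 services a day	https://www.hokkaido-np.co.jp/article/731550/	Keio Dentetsu Bus	2022-09-15 17:28:00
Hokkaido	Hokkaido Shimbun	Coupled trucks: 38 sections added, enabling transport between Aomori and Kagoshima	https://www.hokkaido-np.co.jp/article/731549/	Ministry of Land, Infrastructure, Transport and Tourism	2022-09-15 17:27:00
Hokkaido	Hokkaido Shimbun	Japan's national "esports" team gives a demonstration; now an official event at the Asian Games	https://www.hokkaido-np.co.jp/article/731548/	Tokyo Game Show	2022-09-15 17:27:00
Hokkaido	Hokkaido Shimbun	Itochu to strengthen its used-smartphone business, buying devices via Amazon	https://www.hokkaido-np.co.jp/article/731546/	belong	2022-09-15 17:22:00
Hokkaido	Hokkaido Shimbun	Sightseeing train "Futatsuboshi" unveiled; JR Kyushu service to tour Saga and Nagasaki	https://www.hokkaido-np.co.jp/article/731545/	sightseeing train	2022-09-15 17:22:00
Hokkaido	Hokkaido Shimbun	State compensation suit filed over mistaken disposal of asbestos-damage documents; bereaved family in Hyogo seeks 3 million yen	https://www.hokkaido-np.co.jp/article/731544/	health damage	2022-09-15 17:19:00
Hokkaido	Hokkaido Shimbun	BOJ's Kuroda rumored to step down in March, at the same time as deputy governors, out of consideration for his successor	https://www.hokkaido-np.co.jp/article/731540/	end of term	2022-09-15 17:15:00
Hokkaido	Hokkaido Shimbun	Sapporo Snow Festival to be held on-site for the first time in three years	https://www.hokkaido-np.co.jp/article/731534/	holding	2022-09-15 17:12:52
Hokkaido	Hokkaido Shimbun	Signal fires and a musket corps on the anniversary of the Battle of Sekigahara: Gifu event lets visitors experience the Sengoku period	https://www.hokkaido-np.co.jp/article/731538/	decisive battle	2022-09-15 17:11:00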
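IT	Weekly ASCII	JR Hokkaido to expand the "Kitaca" IC card area to a total of 20 stations in the Hakodate and Asahikawa areas in spring 2024	https://weekly.ascii.jp/elem/000/004/105/4105713/	kitaca	2022-09-15 17:50:00
IT	Weekly ASCII	Seki Furniture's gaming chairs "Gundam Model" and "Char's Zaku Model" go on sale at CRASHGATE Kichijoji and PARCO ONLINE STORE	https://weekly.ascii.jp/elem/000/004/105/4105703/	crashgate	2022-09-15 17:45:00
IT	Weekly ASCII	It moves, it moves...! Open-world action RPG "Wuthering Waves" (鳴潮) debuts about 10 minutes of gameplay footage at TGS 2022	https://weekly.ascii.jp/elem/000/004/105/4105742/	guangzhoukurotechnology	2022-09-15 17:45:00
IT	Weekly ASCII	Rakuten NFT to support payment with the crypto asset "Ether" (ETH) from this autumn	https://weekly.ascii.jp/elem/000/004/105/4105700/	metamask	2022-09-15 17:40:00
IT	Weekly ASCII	SteelSeries launches gaming keyboards "Apex 9 Mini" and "Apex 9 TKL", whose newly developed switches achieve an actuation point as short as 1.0 mm and a 0.2 ms response time	https://weekly.ascii.jp/elem/000/004/105/4105692/	SteelSeries launches gaming keyboards "Apex 9 Mini" and "Apex 9 TKL", whose newly developed switches achieve an actuation point as short as 1.0 mm and a 0.2 ms response time. SteelSeries Japan has launched the gaming keyboards "Apex 9 Mini" and "Apex 9 TKL", which achieve an actuation point as short as 1.0 mm and a 0.2 ms response time thanks to the newly developed OptiPoint optical switches.	2022-09-15 17:30:00
IT	Weekly ASCII	BB Softservice begins sales of a new LED ceiling light model that supports Amazon's simple setup (Amazon簡単セットアップ)	https://weekly.ascii.jp/elem/000/004/105/4105647/	sales start	2022-09-15 17:20:00
IT	Weekly ASCII	A large-scale flower event dyed in vivid red! Satoyama Garden holds the "Autumn Satoyama Garden Festa"	https://weekly.ascii.jp/elem/000/004/105/4105669/	vivid	2022-09-15 17:20:00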
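
The DEV Community entry in the table above describes a "top k" metric: for sampled records, find the k nearest neighbors by cosine similarity and average the share of neighbors that carry the same class label. The following is a minimal sketch of that idea in plain Python, not the authors' code. The function names `cosine_similarity` and `top_k_same_class`, the toy vectors, and the toy labels are all assumptions made for illustration.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_same_class(embeddings, labels, k, n_samples, seed=0):
    """Average share of same-label records among the k nearest neighbors.

    Hypothetical helper: samples n_samples records, ranks all other
    records by cosine similarity, and averages the same-label share.
    """
    rng = random.Random(seed)
    sample_ids = rng.sample(range(len(embeddings)), n_samples)
    shares = []
    for i in sample_ids:
        # Rank every other record by similarity to record i, best first.
        neighbours = sorted(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: cosine_similarity(embeddings[i], embeddings[j]),
            reverse=True,
        )[:k]
        same = sum(1 for j in neighbours if labels[j] == labels[i])
        shares.append(same / k)
    return sum(shares) / len(shares)


# Toy data (invented for this sketch): two well-separated classes.
toy_embeddings = [
    [1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
    [0.0, 1.0], [0.1, 0.9], [0.2, 0.8],
]
toy_labels = ["Sports", "Sports", "Sports", "Business", "Business", "Business"]

score = top_k_same_class(toy_embeddings, toy_labels, k=2, n_samples=6)
print(score)
```

Running the same function once on raw embeddings and once on fine-tuned embeddings would give the two averages that the article compares.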
