Posted: 2023-08-07 14:17:31 RSS feed digest for 2023-08-07 14:00 (19 items)


- August 07, 2023

Category	Site name	Article title / trend word	Link URL	Frequent words, summary / search volume	Registered at
IT	ITmedia All Articles	[ITmedia PC USER] PFU revamps its mobile app for ScanSnap	https://www.itmedia.co.jp/pcuser/articles/2308/07/news113.html	itmediapcuserpfu	2023-08-07 13:33:00
IT	ITmedia All Articles	[ITmedia News] Traffic floods the National Museum of Nature and Science's crowdfunding campaign, making all of READYFOR hard to reach	https://www.itmedia.co.jp/news/articles/2308/07/news112.html	itmedia	2023-08-07 13:32:00
AWS	AWS Japan Blog	AWS Entity Resolution: Match and link related records from multiple applications and data stores	https://aws.amazon.com/jp/blogs/news/aws-entity-resolution-match-and-link-related-records-from-multiple-applications-and-data-stores/	awsentityresolution	2023-08-07 04:31:53
Ruby	New posts tagged Ruby - Qiita	On cases where commented-out code affects behavior (magic comments)	https://qiita.com/pe-pe-engineer/items/4621c8a82116d1d95a93	codingutf	2023-08-07 13:31:44
Linux	New posts tagged Ubuntu - Qiita	An unexpected blind spot when the system fails to boot after a kernel upgrade	https://qiita.com/Manyan3/items/8e67913462603e2f2397	ubunturecoveringjournal	2023-08-07 13:33:55
Git	New posts tagged Git - Qiita	git commands	https://qiita.com/ooyy0121/items/1888840cda88c31f6caf	gitinit	2023-08-07 13:22:55
Overseas TECH	DEV Community	20x Faster as the Beginning: Introducing pgvecto.rs extension written in Rust	https://dev.to/gaocegege/20x-faster-as-the-beginning-introducing-pgvectors-extension-written-in-rust-3d2f	20x Faster as the Beginning: Introducing pgvecto.rs extension written in Rust. We are thrilled to announce the release of pgvecto.rs, a powerful Postgres extension for vector similarity search written in Rust. Its HNSW algorithm is 20x faster than pgvector at a given recall. But speed is just the start: pgvecto.rs is architected to easily add new algorithms. We've made it an extensible architecture for contributors to implement new indexes with ease, and we look forward to the open-source community driving pgvecto.rs to new heights. Why Rust? pgvecto.rs is implemented in Rust rather than in C, as many existing Postgres extensions are. It is built on top of the pgrx framework for writing Postgres extensions in Rust. Rust provides many advantages for an extension like pgvecto.rs. Rust's strict compile-time checks guarantee memory safety, which helps avoid entire classes of bugs and security issues that can plague C extensions. Just as importantly, Rust provides modern developer ergonomics with great documentation, package management, and excellent error messages. This makes pgvecto.rs more approachable for developers to use and contribute to compared to sprawling C codebases. The safety and ease of use of Rust make it an ideal language for building the next generation of Postgres extensions like pgvecto.rs on top of pgrx. Extensible Architectures: pgvecto.rs is designed with an extensible architecture that makes it easy to add support for new index types. At the core is a set of traits that define the required behaviors for a vector index, like building, saving, loading, and querying. Implementing a new index is as straightforward as creating a struct for that index type and implementing the required traits. pgvecto.rs currently comes with two built-in index types: HNSW for maximum search speed and ivfflat for quantization-based approximate search. But the doors are open for anyone to create additional indexes like RHNSW, NGT, or custom types tailored to specific use cases. The extensible architecture makes pgvecto.rs adaptable as new vector search algorithms emerge, and it lets you select the right index for your data and performance needs. pgvecto.rs provides the framework for making vector search in Postgres as flexible and future-proof as possible. Speed and Performance: Benchmarks show pgvecto.rs offers massive speed improvements over existing Postgres extensions like pgvector. In tests, its HNSW index demonstrates search performance up to x faster compared to pgvector's ivfflat index. The flexible architecture also allows using different indexing algorithms to optimize for either maximum throughput or precision. We're working on the quantization HNSW now; please stay tuned. Persistence and Management: Previous work, pg_embedding, did a great job implementing HNSW indexes but lacked support for persistence and proper CRUD operations. pgvecto.rs adds those two core functionalities that were missing in pg_embedding. Vector indexes in pgvecto.rs are properly persisted using WAL (write-ahead logging). pgvecto.rs handles saving, loading, rebuilding, and updating indexes automatically behind the scenes. You get durable indexes that don't require external management while fitting cleanly into current Postgres deployments and workflows. Getting Started: Let's assume you've created a table using the following SQL command: CREATE TABLE items (id bigserial PRIMARY KEY, emb vector(n)); Here vector(n) denotes the vector data type, with n representing the dimension of the vector. You can use vector without specifying a dimension, but be aware that you cannot create an index on a vector type without a specified dimension. You can insert data like this anytime: INSERT INTO items (emb) VALUES (…); To create an index on the emb vector column using squared Euclidean distance, you can use a command of the form CREATE INDEX ON items USING vectors (emb l2_ops) WITH (options = …); where the options set capacity, size_ram, and storage_vectors (ram), plus an algorithm.hnsw section with storage (ram), m, and ef. If you want to retrieve the top vectors closest to the origin, you can use the following SQL command: SELECT emb <-> … AS score FROM items ORDER BY emb <-> … LIMIT …; Conclusion: pgvecto.rs represents an exciting step forward for vector search in Postgres. Its implementation in Rust and extensible architecture provide key advantages over existing extensions, like speed, safety, and flexibility. We're thrilled to release pgvecto.rs as an open-source project under the Apache license and can't wait to see what the community builds on top of it. There's ample room for pgvecto.rs to expand: adding new index types and algorithms, optimizing for different data distributions and use cases, and integrating with existing Postgres workflows. We encourage you to try out pgvecto.rs on GitHub, benchmark it against your workloads, and contribute your own indexing innovations. Together, we can make pgvecto.rs the best vector search extension Postgres has ever seen. The potential is vast, and we're just getting started. Please join us on this journey to bring unprecedented vector search capabilities to the Postgres ecosystem. Join our Discord community to connect with the developers and other users working to improve pgvecto.rs. Advertisement Time: The mission of ModelZ is to simplify the process of taking machine learning models into production. With experience from AWS, TikTok, and Kubeflow, our team has extensive expertise in MLOps engineering. So if you have any questions related to putting models into production, please feel free to reach out by joining Discord or through ModelZ support at tensorchord.ai. We're happy to draw on our background building MLOps platforms across companies to provide guidance on any part of the model development-to-deployment workflow. More products with ModelZ: ModelZ, a managed serverless GPU platform to deploy your own models; Mosec, a high-performance serving framework for ML models that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine, a simple and faster alternative to NVIDIA Triton; envd, a command-line tool that helps you create the container-based environment for AI/ML, from development to production (Python is all you need to know to use this tool); modelz-llm, an OpenAI-compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM, and many others).	2023-08-07 04:51:54
Overseas TECH	DEV Community	Do we really need a specialized vector database?	https://dev.to/gaocegege/do-we-really-need-a-specialized-vector-database-5aci	Do we really need a specialized vector database? With the popularity of LLMs (Large Language Models), vector databases have also become a hot topic. With just a few lines of simple Python code, a vector database can act as a cheap but highly effective "external brain" for your LLM. But do we really need a specialized vector database? Why does an LLM need vector search? First, let me briefly introduce why LLMs need vector search technology. Vector search is a problem that has been around for a long time: given an object, it is the process of finding the most similar object in a collection. Text, images, etc. can be converted into a vector representation, so the similarity problem for text and images can be transformed into a vector similarity problem. In the article's example, different words are converted into three-dimensional vectors, so the similarity between words can be displayed intuitively in a 3D space. For example, the similarity between "student" and "school" is higher than the similarity between "student" and "food". Returning to LLMs, the limitation of the context window length is a major challenge. For instance, ChatGPT has a context length limit on the order of thousands of tokens. This poses a significant problem for the LLM's in-context learning ability and negatively impacts the model's user experience. However, vector search provides an elegant solution to this problem: (1) divide the text that exceeds the context length limit into shorter chunks and convert the chunks into vectors (embeddings); (2) before inputting the prompt to the LLM, convert the prompt into a vector (embedding); (3) search with the prompt vector to find the most similar chunk vector; (4) concatenate the text of the most similar chunk with the prompt as the input to the LLM. This is like giving the LLM an external memory, which allows it to search for the most relevant information from this memory. This memory is the ability brought by vector search. If you want to learn more details, the original post links two articles that explain it more clearly. Why is the vector database so popular? In LLM applications, the vector database has become an indispensable part, and one of the most important reasons is its ease of use. Used in conjunction with OpenAI embedding models such as text-embedding-ada-002, it only takes about ten lines of code to convert a prompt query into a vector and perform the entire process of vector search. The function first creates an embedding vector from the user query, then queries the input schema with the vectorized user query: def query(query, collection_name, top_k): embedded_query = openai.Embedding.create(input=query, model=EMBEDDING_MODEL)["data"][0]["embedding"]; near_vector = {"vector": embedded_query}; query_result = client.query.get(collection_name).with_near_vector(near_vector).with_limit(top_k).do(); return query_result. In LLM applications, vector search mainly plays a role in recall. Simply put, recall is finding the most similar objects in a candidate set. Here, the candidate set is all chunks, and the most similar object is the chunk that is most similar to the prompt. In the reasoning process of an LLM, vector search is regarded as the main implementation of recall. It is easy to implement and can use OpenAI embedding models to solve the most troublesome problem of converting text into vectors. The remaining part is an independent and clean vector search problem, which can be handled well by current vector databases. Therefore, the entire process is particularly smooth. As the name suggests, a vector database is a database specifically designed for the special data type of vectors. The similarity calculation of vectors was originally an O(n^2) complexity problem, because it required comparing all vectors in the set pairwise. Therefore, the industry proposed Approximate Nearest Neighbor (ANN) algorithms. With an ANN algorithm, the vector index is constructed by pre-calculation in the vector database, trading space for time, which greatly speeds up the process of similarity calculation. This is similar to the index in traditional databases. Therefore, vector databases not only have strong performance but also excellent ease of use, making them a perfect match for LLMs. Really? Perhaps a general-purpose database would be better. We've talked about the advantages and benefits of vector databases, but what are their limitations? A blog post by SingleStore provides a good answer to this question: "Vectors and vector search are a data type and query processing approach, not a foundation for a new way of processing data. Using a specialty vector database (SVDB) will lead to the usual problems we see (and solve) again and again with our customers who use multiple specialty systems: redundant data, excessive data movement, lack of agreement on data values among distributed components, extra labor expense for specialized skills, extra licensing costs, limited query language power, programmability and extensibility, limited tool integration, and poor data integrity and availability compared with a true DBMS." There are two issues that I think are important. The first is the issue of data consistency. During the prototyping phase, vector databases are very suitable, and ease of use is more important than anything else. However, a vector database is an independent system that is completely decoupled from other data storage systems, such as TP databases and AP data lakes. Therefore, data needs to be synchronized, streamed, and processed between multiple systems. Imagine your data is already stored in an OLTP database such as PostgreSQL. To perform vector search using an independent vector database, you need to first extract the data from the database, then convert each data point into a vector using services such as OpenAI Embedding, and then synchronize it to a dedicated vector database. This adds a lot of complexity. Furthermore, if a user deletes a data point in PostgreSQL but it is not deleted in the vector database, there will be data inconsistency issues. This issue can be very serious in actual production environments. Update the embedding column for the documents table: UPDATE documents SET embedding = openai_embedding(content) WHERE length(embedding) …; Create an index on the embedding column: CREATE INDEX ON documents USING ivfflat (embedding vector_l2_ops) WITH (lists = …); Query the similar embeddings: SELECT * FROM documents ORDER BY embedding <-> openai_embedding('hello world') LIMIT …; On the other hand, if everything is done in a general-purpose database, the user experience may be simpler than with an independent vector database. Vectors are just one data type in a general-purpose database, not an independent system. This way, data consistency is no longer an issue. The second issue is with query language. The query language of vector databases is typically designed specifically for vector search, so there may be many limitations in other types of queries. For example, in metadata filtering scenarios, users need to filter based on certain metadata fields. The filtering operators supported by some vector databases are limited. In addition, the supported data types for metadata are also very limited, usually only including String, Number, List of Strings, and Booleans. This is not friendly for complex metadata queries. If traditional databases can support the vector data type, then the aforementioned issues do not exist. Firstly, data consistency is already taken care of, as TP or AP databases are existing infrastructure in production environments. Secondly, the issue of query language no longer exists, because the vector data type is just one data type in the database, so queries on it can use the native query language of the database, such as SQL. Detailed explanation: However, it is unfair to only list the disadvantages of vector databases. There are several counterpoints to consider. Ease of use: Vector databases are designed with ease of use in mind, and users can easily work with them without worrying about the underlying implementation details. However, integrating them with other data storage systems can be a challenge, as mentioned earlier. Performance: Vector databases have a significant advantage over traditional databases in terms of performance for certain use cases. Their design for vector search allows for fast and efficient similarity searches on large-scale datasets with high-dimensional vectors. Metadata filtering: While metadata filtering capabilities in vector databases may be limited, they can still meet the needs of most business scenarios. However, for more complex metadata queries, a hybrid approach may be needed, where metadata is stored in a separate database or data lake and linked to the vector data in the vector database. How can you address these issues? In the following sections, I will provide my perspective by answering these questions. Vector databases are easy to use: While it is true that vector databases are easy to use, this is not unique to them. The ease of use of vector databases is mainly due to their abstraction of a specific domain, which allows them to be designed for the most commonly used machine learning programming language, Python, and optimized for vector search scenarios. However, if traditional databases could also support the vector data type, they could offer similar ease of use. In addition, traditional databases can provide Python SDKs and other integrated tools to meet the needs of most scenarios, as well as standard SQL interfaces to handle more complex query scenarios. Therefore, it is not necessary to use a vector database solely for its ease of use. Another advantage of vector databases is their distributed design, which allows them to scale horizontally to meet the data volume and QPS requirements of users. However, traditional databases can also meet these requirements through distributed systems. Nevertheless, the decision to use a distributed system should be based on the actual data volume and QPS requirements, as well as the associated costs. In summary, while vector databases have their advantages, traditional databases can also provide similar ease of use and distributed capabilities if they support the vector data type. Therefore, the choice between a vector database and a traditional database should be based on the specific needs of the application and the available resources. Vector databases have better performance: To investigate the performance of vector databases in LLM scenarios, a naive benchmark of vector retrieval was conducted. The benchmark involved N randomly initialized vectors of a fixed dimension, and the query time for the top nearest neighbors was measured for different scales of N. Two different methods were used for the test: NumPy was used to perform real-time calculation, which executed completely accurate, non-precomputed nearest neighbor calculation; Hnswlib was used to precompute approximate nearest neighbors. The benchmark results show that at the million-vector scale, the latency of real-time calculation using NumPy is in the millisecond range. Using this as a baseline, we can compare it with the time spent on LLM inference after completing vector search. For instance, a billion-parameter-scale model requires seconds for inference on Chinese-character input on an Nvidia A-series GPU. Therefore, even if the query time for real-time, accurate similarity calculation over millions of vectors using NumPy is considered, it accounts for only a small fraction of the total latency of end-to-end LLM inference. Thus, in terms of latency, the benefits brought by vector databases may be overshadowed by the latency of the LLM itself in the current LLM scenario. Therefore, we also need to consider throughput. The throughput of LLMs is much lower than that of vector databases. Thus, I do not believe that throughput is the core issue in this scenario. If performance is not the primary concern, what factors will determine the user's choice? I think it is the overall ease of use, including ease of both usage and operation, consistency, and solutions to other database-related issues. Traditional databases have mature solutions for these problems, while vector databases are still in the early stages of development. Metadata filtering can still meet the needs of most business scenarios: When considering metadata filtering, it's important to note that it's not just a matter of the number of supported operators. Consistency of data is also a crucial factor. Metadata in vectors is essentially data in traditional databases, while vectors themselves are indexes of the data. Therefore, it's reasonable to consider storing both vectors and metadata in traditional databases. Traditional databases do have the capability to support vector data types and provide ease of use and distributed capabilities similar to those of vector databases. Furthermore, traditional databases have mature solutions to ensure data consistency and integrity, such as transaction management and data backup and recovery. Vectors in traditional databases: Since we see vectors as a new data type in traditional databases, let's take a look at how to support vector data types in traditional databases, using PostgreSQL as an example. pgvector is an open-source PostgreSQL plugin that supports vector data types. pgvector uses exact calculation by default, but it also supports building an IVFFlat index and precomputing ANN results using the IVFFlat algorithm, sacrificing calculation accuracy for performance. pgvector has done an excellent job of supporting vectors and is used by products such as Supabase. However, the supported index algorithm is limited, with only the simplest IVFFlat algorithm supported, and no quantization or storage optimization is implemented. Moreover, the index algorithm of pgvector is not disk-friendly and is designed for use in memory. Therefore, vector index algorithms designed for disk, such as DiskANN, are also valuable in the traditional database ecosystem. Extending pgvector can be challenging due to its implementation in the C programming language. Despite being open source for two years, pgvector currently has only three contributors. While the implementation of pgvector is not particularly complex, it may be worth considering rewriting it in Rust. Rewriting pgvector in Rust can enable the code to be organized in a more modern and extensible way. Rust's ecosystem is also very rich, with existing Rust bindings such as faiss-rs. As a result, pgvecto.rs was created. pgvecto.rs currently supports exact vector query operations and three distance calculation operators. Work is underway to design and implement index support. In addition to IVFFlat, we also hope to support more indexing algorithms such as DiskANN, SPTAG, and ScaNN. We welcome contributions and feedback from the community. pgvecto.rs offers a modern and extensible codebase with improved performance and concurrency. Its design and implementation allow seamless integration with other machine learning libraries and tools, making it an ideal choice for similarity search scenarios. With ongoing development, pgvecto.rs aims to be a valuable tool for data scientists and machine learning practitioners. Its support for various indexing algorithms and its ease of use make it a promising candidate for large-scale similarity search applications. We look forward to continuing development and contributions from the community. The distance functions are called through operators: squared Euclidean distance, SELECT array[…] <-> array[…]; dot product distance, SELECT array[…] <#> array[…]; cosine distance, SELECT array[…] <=> array[…]; Create a table: CREATE TABLE items (id bigserial PRIMARY KEY, emb numeric[]); Insert values: INSERT INTO items (emb) VALUES (ARRAY[…]), (ARRAY[…]); Query the similar embeddings: SELECT * FROM items ORDER BY emb <-> ARRAY[…] LIMIT …; Query the neighbors within a certain distance: SELECT * FROM items WHERE emb <-> ARRAY[…] < …; Future: As LLMs gradually move into production environments, infrastructure requirements are becoming increasingly demanding. The emergence of vector databases is an important addition to the infrastructure. We do not believe that vector databases will replace traditional databases, but rather that they will each play to their strengths in different scenarios. The emergence of vector databases will also push traditional databases to support vector data types. We hope that pgvecto.rs can become an important component of the Postgres ecosystem, providing better vector support for Postgres. Its implementation in Rust and support for various indexing algorithms make it a promising candidate for large-scale similarity search applications. We believe that its development and contributions from the community will help it become a valuable tool for data scientists and machine learning practitioners.	2023-08-07 04:45:26
Overseas TECH	DEV Community	Differentiating onclick and addEventListener in JavaScript	https://dev.to/brainiacneit/differentiating-onclick-and-addeventlistener-in-javascript-30ke	Differentiating onclick and addEventListener in JavaScript. Overview: This article examines two contrasting approaches to event handling in JavaScript: the familiar onclick and the versatile addEventListener method. By looking at the nuances of these two mechanisms, we uncover the advantages each offers and the scenarios in which each excels. Through examples and practical use cases, we'll dissect the syntax, behavior, and compatibility of both onclick and addEventListener, so that developers can make informed choices when implementing event-driven interactions in their web applications. Whether it's a straightforward click action or a more complex event management requirement, this article equips readers to move between these two event-handling paradigms effectively. Definitions: onclick in HTML: onclick is an HTML attribute used to attach JavaScript code that will execute when a specific element, such as a button or a link, is clicked by the user. This attribute allows developers to define inline event handling directly within the HTML markup. When the element is clicked, the specified JavaScript code is triggered, enabling interactivity and user-initiated actions. While simple to use, onclick is limited to a single event handler and can become cumbersome when managing multiple events on the same element or handling more complex scenarios. addEventListener in JavaScript: addEventListener is a method in JavaScript that allows developers to dynamically attach event handlers to HTML elements. It provides a more flexible and robust approach compared to inline event attributes like onclick. With addEventListener, multiple event listeners can be added to the same element, and event handling can be more organized and maintainable. It offers control over event propagation (capturing and bubbling phases). Additionally, addEventListener accommodates various event types beyond just clicks, expanding its utility for handling a wide range of user interactions and application behaviors. Usage: onclick: <!DOCTYPE html><html><head><title>onclick Example</title></head><body><button id="myButton">Click me</button><script>function handleClick() { alert("Button clicked"); } document.getElementById("myButton").onclick = handleClick;</script></body></html> In this example, the onclick property is used to directly assign a JavaScript function, handleClick, to the button's click event. When the button is clicked, the handleClick function is executed, displaying an alert. addEventListener: <!DOCTYPE html><html><head><title>addEventListener Example</title></head><body><button id="myButton">Click me</button><script>function handleClick() { alert("Button clicked"); } document.getElementById("myButton").addEventListener("click", handleClick);</script></body></html> In this example, the addEventListener method is used to attach the same handleClick function to the button's click event. This method provides more flexibility and allows multiple event listeners to be added to the same element. Differences between addEventListener and onclick: addEventListener: it allows multiple event listeners to be added to a specific element; it can accept a third argument that provides control over event propagation; listeners added using addEventListener can only be attached within <script> elements or in external JavaScript files; compatibility may be limited, as it does not work in older versions of Internet Explorer (IE 8 and earlier), which use attachEvent instead. onclick: it is used to attach a single handler to an element; it is essentially a property and may get overwritten; it offers no capture option, so handlers assigned through onclick always run in the bubbling phase; it can also be added directly as an HTML attribute, offering a simpler integration method; it is widely supported and functions across various browsers. The choice between addEventListener and onclick depends on the complexity of event management required and the compatibility needs of the application. Conclusion: Understanding the distinctions between addEventListener and onclick is essential for effective event handling in JavaScript. While both methods enable interaction and responsiveness, they cater to different levels of complexity and compatibility requirements. addEventListener is a versatile tool, offering the flexibility to attach multiple listeners to a single element. Its capacity to control event propagation and its suitability for structured scripting make it a robust choice for modern applications. However, developers should be cautious of its limited compatibility with older browsers. On the other hand, onclick provides a straightforward means of attaching a single handler to an element, making it a suitable choice for simpler interactions. Its direct integration as an HTML attribute streamlines implementation but may lack the control and scalability offered by addEventListener. In the end, the selection between these methods hinges on the project's scope, desired functionality, and the targeted user base. By grasping the strengths and limitations of each approach, developers can make informed decisions and craft responsive web experiences tailored to their needs.	2023-08-07 04:20:06
Finance	Bank of Japan: RSS	Consumption Activity Index	http://www.boj.or.jp/research/research_data/cai/index.htm	consumption activity	2023-08-07 14:00:00
News	BBC News - Home	Bibby Stockholm: First asylum seekers to board barge	https://www.bbc.co.uk/news/uk-66424923?at_medium=RSS&at_campaign=KARANGA	critics	2023-08-07 04:27:25
News	BBC News - Home	HSBC executive sorry for saying UK 'weak' over China	https://www.bbc.co.uk/news/business-66424948?at_medium=RSS&at_campaign=KARANGA	recent	2023-08-07 04:35:59
News	Newsweek	Chinese military replaces the Rocket Force's top two leaders on a festive Army Day... Is there a link to former Foreign Minister Qin Gang?	https://www.newsweekjapan.jp/stories/world/2023/08/post-102372.php	People's Liberation Army	2023-08-07 13:50:00
Marketing	MarkeZine	[Learn by Listening] Hoshino Resorts' Yoshiharu Hoshino on his "axis for decision-making"	http://markezine.jp/article/detail/43043	decision-making	2023-08-07 13:30:00
IT	Weekly ASCII	Pre-orders open for the download edition of "Dragon Quest Monsters 3"!	https://weekly.ascii.jp/elem/000/004/148/4148776/	nintendo	2023-08-07 13:45:00
IT	Weekly ASCII	New ScanSnap mobile app "ScanSnap Home" debuts	https://weekly.ascii.jp/elem/000/004/148/4148693/	scansnap	2023-08-07 13:30:00
IT	Weekly ASCII	Limited menus too! Tonkatsu restaurant "Bunjiro" and chocolate specialty shop "QUON CHOCOLATE" confirmed as tenants of "Nagasaki Stadium City"	https://weekly.ascii.jp/elem/000/004/148/4148754/	quonchocolate	2023-08-07 13:30:00
Marketing	AdverTimes	Hidetoshi Nishijima hums a tune in a new commercial for Tokio Marine & Nichido Life Insurance. What is the nostalgic song?	https://www.advertimes.com/20230807/article429919/	Tokio Marine & Nichido Life Insurance	2023-08-07 04:41:53
Overseas TECH	reddit	Sure enough, it's ⛵; that yique came to lay flowers	https://www.reddit.com/r/Youmo/comments/15k9r20/果然是_那个yique来献花了/	eekertoryoumolinkcomments	2023-08-07 04:02:24
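
The "Do we really need a specialized vector database?" entry above mentions a benchmark in which nearest neighbors are computed exactly at query time, with no precomputed index. A minimal sketch of that kind of exact top-k search might look like the following. It is not the article's benchmark code: the article used NumPy, while this version uses only the Python standard library, and the names `squared_l2` and `top_k_exact` as well as the sample vectors are hypothetical, chosen purely for illustration.

```python
import heapq
from typing import List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k_exact(
    vectors: Sequence[Sequence[float]],
    query: Sequence[float],
    k: int,
) -> List[Tuple[float, int]]:
    """Return up to k (distance, index) pairs, nearest first.

    Every stored vector is compared with the query, so one query costs
    O(n * d) time for n vectors of dimension d. Ties are broken by index.
    """
    scored = ((squared_l2(v, query), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    # Hypothetical sample data: four 3-dimensional vectors.
    items = [[0, 0, 0], [1, 1, 1], [3, 2, 1], [0.5, 0, 0]]
    # The two vectors closest to the origin.
    print(top_k_exact(items, [0, 0, 0], 2))
```

An approximate index such as HNSW or IVFFlat avoids scanning every vector, at the cost of building the index ahead of time and of occasionally missing a true neighbor; the exact scan above is the baseline such indexes are compared against.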
