Posted: 2024-07-27 23:02:41 RSS feed digest for 2024-07-27 23:00 (3 items)

Category Site name Article title Link URL Frequent words / summary Registered
IT ロボスタ aibo "Kinako" makes its debut, plus a "moving 25th-anniversary aibo" and plenty of other highlights! Photo report from the packed "aibo Fan Meeting 18" https://robotstart.info/2024/07/27/aibo-fanme18-report.html aibo,athepostaibo,firstappearedon 2024-07-27 13:26:09
Program New posts tagged AWS - Qiita AWS Certified DevOps Engineer - Professional: exam experience and study method https://qiita.com/takahiro_fukushima/items/78294f1117647f41366d devops,things I did,engineer 2024-07-27 22:52:36
Overseas TECH Engadget Websites accuse AI startup Anthropic of bypassing their anti-scraping rules and protocol https://www.engadget.com/websites-accuse-ai-startup-anthropic-of-bypassing-their-anti-scraping-rules-and-protocol-133022756.html?src=rss Freelancer has accused Anthropic, the AI startup behind the Claude large language models, of ignoring its "do not crawl" robots.txt protocol to scrape its websites' data. Meanwhile, iFixit CEO Kyle Wiens said Anthropic has ignored the website's policy prohibiting the use of its content for AI model training. Matt Barrie, the chief executive of Freelancer, told The Information that Anthropic's ClaudeBot is "the most aggressive scraper by far." His website allegedly got [...] million visits from the company's crawler within a span of four hours, which is "probably about five times the volume of the number two" AI crawler. Similarly, Wiens posted on X/Twitter that Anthropic's bot hit iFixit's servers a million times in [...] hours. "You're not only taking our content without paying, you're tying up our devops resources," he wrote.

Back in June, Wired accused another AI company, Perplexity, of crawling its website despite the presence of the Robots Exclusion Protocol, or robots.txt. A robots.txt file typically contains instructions for web crawlers on which pages they can and can't access. While compliance is voluntary, it's mostly just been ignored by bad bots. After Wired's piece came out, a startup called TollBit that connects AI firms with content publishers reported that it's not just Perplexity that's bypassing robots.txt signals. While it didn't name names, Business Insider said it learned that OpenAI and Anthropic were ignoring the protocol as well.

Barrie said Freelancer tried to refuse the bot's access requests at first, but it ultimately had to block Anthropic's crawler entirely. "This is egregious scraping which makes the site slower for everyone operating on it and ultimately affects our revenue," he added. As for iFixit, Wiens said the website has set alarms for high traffic, and his people got woken up at [...] AM due to Anthropic's activities. The company's crawler stopped scraping iFixit after it added a line in its robots.txt file that disallows Anthropic's bot in particular.

The AI startup told The Information that it respects robots.txt and that its crawler "respected that signal when iFixit implemented it." It also said that it aims "for minimal disruption by being thoughtful about how quickly it crawls the same domains," which is why it's now investigating the case.

AI firms use crawlers to collect content from websites that they can use to train their generative AI technologies. They've been the target of multiple lawsuits as a result, with publishers accusing them of copyright infringement. To prevent more lawsuits from being filed, companies like OpenAI have been striking deals with publishers and websites. OpenAI's content partners so far include News Corp, Vox Media, the Financial Times and Reddit. iFixit's Wiens seems open to the idea of signing a deal for the how-to-repair website's articles as well, telling Anthropic in a tweet he's willing to have a conversation about licensing content for commercial use.

"If any of those requests accessed our terms of service, they would have told you that use of our content expressly forbidden. But don't ask me, ask Claude. If you want to have a conversation about licensing our content for commercial use, we're right here." pic.twitter.com/CAkOQDnLjD - Kyle Wiens (@kwiens), July [...]

This article originally appeared on Engadget. 2024-07-27 13:30:22
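The Engadget item describes a robots.txt line that disallows one crawler in particular. The following is a minimal sketch of how such a rule could be checked with Python's standard urllib.robotparser module. The file contents, the example.com URL, and the helper name is_allowed are assumptions for illustration; the source does not quote the actual line iFixit added. Only the ClaudeBot name comes from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (assumed, not quoted from any real site):
# one group blocks a single named crawler, the other allows everyone else.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain used only for this sketch.
    for agent in ("ClaudeBot", "SomeOtherBot"):
        print(agent, is_allowed(agent, "https://example.com/guide"))
```

As the article notes, compliance is voluntary: a check like this only has an effect when the crawler itself runs it and honors the result.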
