Posted: 2021-08-06 02:25:19 RSS feed digest for 2021-08-06 02:00 (30 items)

Category Site name Article title / trending word Link URL Frequent words and summary / search volume Date registered
IT 気になる、記になる… Footage of a working "AirPower" prototype https://taisy0.com/2021/08/06/143832.html airpower 2021-08-05 16:54:42
AWS AWS Architecture Blog Catch Important Moments in Sports with 5G and AWS Wavelength https://aws.amazon.com/blogs/architecture/catch-important-moments-in-sports-with-5g-and-aws-wavelength/ To enhance the viewing experience for spectators, fans, and players, the sports industry is continuously evaluating ways to lower video latency. With 5G, networks can now provide high-density radio air interfaces with high bandwidth and reliability. This new technology especially benefits sports broadcasting and player tracking and analytics, which need to be processed at the … 2021-08-05 16:28:38
AWS AWS Government, Education, and Nonprofits Blog Building digital capabilities to withstand future challenges, from cyberattacks to severe weather events https://aws.amazon.com/blogs/publicsector/building-digital-capabilities-withstand-future-challenges-cyberattacks-severe-weather-events/ Recent events, from public sector cyberattacks and severe weather events to the ongoing global COVID-19 pandemic, have revealed that many educational institutions, as well as regional and local governments, are not fully prepared to respond to these incidents. At the same time, large-scale disruptive events illustrate how important it is for public sector organizations to respond rapidly to keep essential services running, as well as to pivot quickly to offer new services. In an IDC survey of U.S. residents conducted in September, respondents said they would like to continue virtual government services as a replacement for in-person interactions, while another IDC survey of U.S. teachers revealed an expectation that growth in hybrid or remote learning will be a lasting change. 2021-08-05 16:54:11
AWS AWS Secure Code Warrior on AWS: Customer Story | Amazon Web Services https://www.youtube.com/watch?v=hyoSsHI7Vgg In this episode of AWS Community Chats, Aley Hammer is joined by Pieter, the CEO and co-founder of Secure Code Warrior. Pieter explains that security is "job zero" at Secure Code Warrior, and describes common pitfalls he sees customers making as well as advice for navigating the journey into the cloud. Pieter also discusses what is most important to him in a cloud provider and the AWS technology they are planning to use that will further enhance the Secure Code Warrior offering. 2021-08-05 16:47:37
python New posts tagged Python - Qiita [90 Typical Competitive Programming Problems] Explanation of 002 (Python) https://qiita.com/wihan23/items/3924ca66b3a7e055d73b Source: the 90 typical competitive programming problems (GitHub). Implementation (answer.py): read the input with N = int(input()); pare is the array that stores the bracket sequences; run a bit brute-force search with for i in range(1 << N), appending one of two characters to the string l depending on each bit (for j in range(N): if (i >> j) & 1 ... else ...); only when the length of the string is N is it scanned from the left. 2021-08-06 01:30:32
python New posts tagged Python - Qiita Optiver Realized Volatility Predict: quick notes https://qiita.com/hamachiburi9/items/a94001c3577f7cb0c603 Wikipedia definition: the name is confusing, but RV is ultimately a statistic, so I think of it as something akin to a mean. Realized volatility in this competition: the above describes the somewhat strict definition of realized volatility used in financial engineering. 2021-08-06 01:13:18
Program New questions for [all tags] | teratail A two-digit integer ends up as two one-digit values https://teratail.com/questions/352961?rss=all I am writing code that records experimental values in an xlsx file. 2021-08-06 01:45:47
Program New questions for [all tags] | teratail Translations are not saved in Polylang's Strings translations https://teratail.com/questions/352960?rss=all Translations entered in Polylang's Strings translations are not saved. 2021-08-06 01:10:30
Program New questions for [all tags] | teratail I want to handle errors such as 404 that occur during HTTP requests with Moya https://teratail.com/questions/352959?rss=all I am building an app in Swift that uses Moya for HTTP communication. 2021-08-06 01:05:00
Ruby New posts tagged Ruby - Qiita [Rails] Displaying flash messages with toastr https://qiita.com/k___na00/items/c0f331bb0e09fa952929 While building my portfolio, I wanted an easy way to make flash messages look stylish. I went looking for a handy gem, found one called toastr, and adopted it. 2021-08-06 01:50:58
Ruby New posts tagged Rails - Qiita [Rails] Displaying flash messages with toastr https://qiita.com/k___na00/items/c0f331bb0e09fa952929 While building my portfolio, I wanted an easy way to make flash messages look stylish. I went looking for a handy gem, found one called toastr, and adopted it. 2021-08-06 01:50:58
Overseas TECH DEV Community How to Compare Arrays in JavaScript Efficiently https://dev.to/doabledanny/how-to-compare-arrays-in-javascript-efficiently-1p0 In this article, I'm going to show you two ways of solving a typical interview-style question. The first solution is more obvious and less efficient. The second introduces a great problem-solving tool, frequency counter objects, which greatly improves efficiency. Here's what you'll gain from reading this article: a framework for approaching problems; a very useful, highly performant problem-solving technique; and an improved ability to analyse functions and improve performance. I also made a YouTube video for those who like video. If you enjoy the video, consider subscribing to my channel.
The problem: "Write a function called squared which takes two arrays. The function should return true if every value in the first array has its value squared in the second array. The frequency of values must be the same." (Your interviewer.) At first, I will show you the "naïve" way of solving the problem, the more obvious way that isn't efficient. I'll then show you an efficient way to solve the problem using frequency counter objects. This is a very handy technique to have in your problem-solving toolbox (your brain).
Understanding the problem: Before we attempt to write a solution, it's very important to understand the problem and to give some examples and the results we expect. We can then use these examples as tests to ensure our solution is working correctly. Examples: squared(…) returns true; squared(…) returns false; squared(…) returns false. Example 1 is true because the square of every value in the first array is found in the second array. Example 2 is false because the square of one value is not in the second array. Example 3 is false because a square that is needed twice appears only once in the second array; the last value would pass, but we won't even get to that check because the function has already returned false.
The "naïve" way: First, we check whether the arrays have different lengths. If they do, we return false and exit the function early, because the frequency of values can't possibly be the same. Next, we loop over each number num in arr1. Inside the loop, we use indexOf to look for the position of num ** 2 in arr2. The value is assigned to the variable foundIndex. If the value was not found, indexOf returns -1, so we check whether foundIndex === -1 and return false if so. If all is good, we move on and remove this value from arr2 using the splice method. This ensures the frequency of values in both arrays is the same. After looping over each number, if all the checks pass, we return true. function squared(arr1, arr2) { if (arr1.length !== arr2.length) return false; for (let num of arr1) { let foundIndex = arr2.indexOf(num ** 2); if (foundIndex === -1) return false; arr2.splice(foundIndex, 1); } return true; }
Performance: This algorithm is Big O(n^2), because we loop over every single item in the first array, and inside this loop we loop over every single item in the second array (with indexOf) in the worst case. If you don't know or have forgotten what Big O is, check out this video: Big O Notation in JavaScript. It's an important topic. If the arrays are of length n, then the number of operations will be n * n = n^2, hence Big O(n^2). This is not quite true, because the second array becomes shorter on each loop, so on average we will only loop over half of the second array (0.5n). The count is then n * 0.5n = 0.5n^2. But Big O looks at the big picture: as the input approaches infinity, the factor of 0.5 becomes insignificant, and so we simplify to Big O(n^2).
A smarter way, frequency counter objects, Big O(n): What are frequency counter objects? Frequency counters are objects that tally things up. Here are two examples of where they would be useful: the number of times a character appears in a string, and the number of times a number appears in an array. Using frequency counters can also significantly improve the performance of an algorithm, as it can often remove the need for nested for loops. In a frequency counter object for an array, each key is a number from the array and each value is its count: if all the numbers appear once apart from one that appears twice, every count is 1 except a single count of 2.
The solution: To create a frequency counter object, we loop over the array in question. We then create a key and give it the current value plus 1, or, if it's the first time we've encountered this number, frequencyCounter[num] will be undefined and so we initialise the value to 1. I used two for…of loops as I felt it was easier to read, but it could also be done with just one for loop. The frequency counter objects can then be compared. We first check whether each key, squared, from frequency counter 1 is a key in frequency counter 2. If not, return false. Next, we check whether the frequencies (values) are equal. If not, return false. If we get through all this unscathed, we reach the bottom and return true. function squared(arr1, arr2) { if (arr1.length !== arr2.length) return false; let frequencyCounter1 = {}; let frequencyCounter2 = {}; /* Create frequencyCounter1 */ for (let num of arr1) { frequencyCounter1[num] = frequencyCounter1[num] + 1 || 1; } /* Create frequencyCounter2 */ for (let num of arr2) { frequencyCounter2[num] = frequencyCounter2[num] + 1 || 1; } /* Compare frequency counters */ for (let key in frequencyCounter1) { if (!((key ** 2) in frequencyCounter2)) return false; if (frequencyCounter1[key] !== frequencyCounter2[key ** 2]) return false; } return true; }
Performance: To create frequencyCounter1, we loop over all the numbers in arr1: n loops. The same goes for frequencyCounter2: n loops. To compare the frequency counters, we loop over all the keys in frequencyCounter1: n loops in the worst case. Total: n + n + n = 3n, resulting in Big O(n), linear time complexity. That is much better than our first effort, with Big O(n^2), quadratic time complexity.
Awesome references: I can attribute almost all of my knowledge of algorithms and data structures to one outstanding course, JavaScript Algorithms and Data Structures Masterclass by Colt Steele. If you prefer books: JavaScript Data Structures and Algorithms: An Introduction to Understanding and Implementing Core Data Structure and Algorithm Fundamentals by Sammie Bae. If you enjoyed this post, consider subscribing to my YouTube channel; it would be much appreciated. Thanks for reading. Have a great day! 2021-08-05 16:15:25
Apple AppleInsider - Frontpage News Twelve South ActionSleeve 2 review: great for the fitness-focused https://appleinsider.com/articles/21/08/05/twelve-south-actionsleeve-2-review-great-for-the-fitness-focused?utm_medium=rss Get the Apple Watch out of the way of your workouts by using the Twelve South ActionSleeve 2, a simple sleeve that places the Apple Watch on your bicep. When worn on your wrist, the Apple Watch is prone to accidental button presses during some exercises, like push-ups. The goal of the ActionSleeve is simply to get the Apple Watch off your wrist and out of the way. 2021-08-05 16:50:00
Apple AppleInsider - Frontpage News Best deals for August 5 - $100 off 14TB hard drive, $50 off Google Mesh wi-fi, more https://appleinsider.com/articles/21/08/05/best-deals-for-august-5---100-off-14tb-hard-drive-50-off-google-mesh-wi-fi-more?utm_medium=rss Thursday's best deals include discounts on Nomad products and an EVGA gaming keyboard, iTunes movie sales, and more. Shopping online for the best discounts and deals can be an annoying and challenging task. So rather than sifting through miles of advertisements, check out this list of sales hand-picked for the AppleInsider audience. 2021-08-05 16:48:17
Apple AppleInsider - Frontpage News New York's updated Excelsior vaccine passport drops Apple Wallet support https://appleinsider.com/articles/21/08/05/new-yorks-updated-excelsior-vaccine-passport-drops-apple-wallet-support?utm_medium=rss New York's new Excelsior Pass, which documents vaccination status and will soon be required to enter many businesses in New York City, has dropped support for Apple Wallet. The state-issued Excelsior passport allows users to prove that they have received a COVID-19 vaccination. The original Excelsior Pass, however, expired six months after a user's vaccination date. Because of that, New Yorkers will need to update to the Excelsior Pass Plus. 2021-08-05 16:26:22
Overseas TECH Engadget Inside the sexual harassment lawsuit at Activision Blizzard https://www.engadget.com/activision-blizzard-lawsuit-discrimination-abuse-video-163056567.html?src=rss When California's fair employment agency sued Activision Blizzard, one of the largest video game studios in the world, in July, it wasn't surprising to hear the allegations of systemic gender discrimination and sexual harassment at the company. It wasn't a shock to read about male executives groping their female colleagues, loudly joking about rape in the office, or completely ignoring women for promotions. What was surprising was that California wanted to investigate Activision Blizzard at all, considering these issues have seemingly been present since its founding. Activision Blizzard is a multibillion-dollar publisher with a roster of legendary franchises including Call of Duty, Overwatch, Diablo, and World of Warcraft. In July, California's Department of Fair Employment and Housing filed a lawsuit against Activision Blizzard alleging that executives had fostered an environment of misogyny and frat-boy rule for years, violating equal pay laws and labor codes along the way. This is about more than dirty jokes in the break room: the lawsuit highlights clear disparities in hiring, compensation, and professional growth between men and women at Activision Blizzard, and it paints a picture of pervasive sexism and outright abuse in the workplace. Here's a rundown of some of the allegations: women make up only a small share of all Activision Blizzard employees; top leadership roles are filled solely by white men; across the company, women are paid less, promoted slower, and fired faster than men; HR and executives fail to take complaints of harassment seriously; women of color in particular are micromanaged and overlooked for promotions; and a pervasive frat-boy culture encourages behavior like "cube crawls", where male employees grope and sexually harass female co-workers at their desks.
It's been a few weeks since the lawsuit was filed, and employees, executives, and players have all had a chance to respond. Meanwhile, additional reports of longstanding harassment and sexism at Activision Blizzard have continued to roll out, including photos and stories of the "Cosby Suite", which was specifically named in the filing. According to the lawsuit, this was a hotel room where male employees would gather to harass women at company events, named after the rapist Bill Cosby. Days after the filing, Kotaku published photos of the supposed Cosby Suite showing male Activision Blizzard developers posing on a bed with a framed photo of Bill Cosby at BlizzCon. Screenshots of conversations among the developers showed them discussing gathering "hot chixx for the Coz" and other insulting, immature things, especially when you remember these are middle-aged men, not middle schoolers. One of the only executives actually named in the suit was Blizzard head J. Allen Brack, and it alleges he routinely ignored systemic harassment and failed to punish abusers. Brack called the allegations "extremely troubling", but this line was thrown back in his face on Twitter when independent developer Nels Anderson compared it to a video out of BlizzCon featuring Brack on the far left. In the video, a young woman asks the panel of World of Warcraft developers, all six of whom are white men, whether they'll ever create a female character that doesn't look like she just stepped out of a Victoria's Secret catalog. The panelists laugh and one responds, "Which catalog would you like them to step out of?" They proceed to essentially dismiss her question. At the end of the exchange, Brack piles on and makes a joke about one of the new characters coming from a sexy cow catalog. On August 3rd, just two weeks after California filed its lawsuit, Brack stepped down from his role as the president of Blizzard. In his place will be GM Mike Ybarra and executive development VP Jen Oneal. Oneal will be the first woman in a president role since Activision's founding; the lawsuit notes that there has never been a non-white president or CEO of Activision Blizzard.
Activision Blizzard's initial response to the lawsuit was tragic, with one leader calling the allegations meritless and distorted. Activision Blizzard CEO Bobby Kotick, who regularly gets into fights with shareholders over the ridiculous fortune he's amassed, published his own response to the lawsuit, in which he essentially promised to listen better. Unsurprisingly, this didn't alleviate many employees' concerns. A petition in support of the lawsuit ended up gathering a large number of employee signatures, and workers organized a walkout just eight days after the filing, calling for systemic change at the studio. Shareholders weren't bolstered by Kotick's response either. Investors filed an additional class action lawsuit against Activision Blizzard on August 3rd, alleging the company failed to raise potential regulatory issues stemming from its discriminatory culture. Blizzard's head of HR, Jesse Meschuk, also left the company in the weeks following the initial lawsuit. Meanwhile, other major game developers have rallied behind the suit, and former Activision Blizzard leaders have shared their support for employees, apologizing for their parts in sustaining a toxic company culture. Chris Metzen (@ChrisMetzen) tweeted in July: "This is later than it should have been. Here's my response."
None of this is new. As evidenced by the photos, videos, stats, and personal stories flowing out of Activision Blizzard, the company has operated on a bro-first basis for decades, and honestly, it's been sustained by an industry that largely functions the same way. A wave of accusations against prominent male developers crashed over the industry, and AAA studios like Ubisoft and Riot Games made headlines for fostering toxic workplace environments. California is currently suing Riot over allegations of sexual harassment and gender discrimination in hiring and pay practices. But even that's not new. Women, non-binary people, and marginalized folks in the video game industry have been speaking up about systemic harassment and discrimination for literal decades. Sexism is apparent in the hiring and pay habits of many major studios, and it's also clear in the games themselves, which feature an overabundance of straight, white, male protagonists. What is surprising this time around is that the lawsuit against Activision Blizzard kind of came out of nowhere. It took a blockbuster media report to make California sue Riot, but the lawsuit against Activision Blizzard appeared on its own after years of quiet investigation by the Department of Fair Employment and Housing. If sexism is systemic in the video game industry, it feels like the system is finally fighting back. 2021-08-05 16:30:56
Overseas TECH Network World Chip shortage has networking vendors scrambling https://www.networkworld.com/article/3628488/chip-shortage-has-networking-vendors-scrambling.html#tk.rss_all High-tech vendors continue to battle supply-chain problems and higher costs brought on by the current semiconductor shortage, according to statements made in the most recent round of earnings calls. As Network World reported in May, COVID-19 triggered an explosion of the global remote workforce, which created extraordinary demand for new tech gear. It also forced the shutdown of processor plants. Restarting those plants and renewing supply chains to their pre-pandemic state will be a lengthy process, industry leaders warn. 2021-08-05 16:21:00
Cisco Cisco Blog Happy to go beyond standards again with the 25km ER-Lite 100G optic https://blogs.cisco.com/sp/happy-to-go-beyond-standards-again-with-the-25km-er-lite-100g-optic cisco 2021-08-05 16:00:52
Overseas Science NYT > Science Climate Crisis Catches Power Companies Unprepared https://www.nytimes.com/2021/07/29/climate/electric-utilities-climate-change.html extreme 2021-08-05 16:20:50
Finance Financial Services Agency website Published "Proportionality in bank regulation and supervision: a joint global survey" by the Basel Committee on Banking Supervision. https://www.fsa.go.jp/inter/bis/20210805/20210805.html bank 2021-08-05 17:00:00
News BBC News - Home Coronavirus: Transport secretary defends travel rule changes https://www.bbc.co.uk/news/uk-58100523 deadline 2021-08-05 16:25:36
News BBC News - Home Covid-19: PM defends travel rules, and the piano prodigy flourishing in the pandemic https://www.bbc.co.uk/news/uk-58102844 coronavirus 2021-08-05 16:52:28
News BBC News - Home Kaylee-Jayde Priest: Mother and boyfriend convicted https://www.bbc.co.uk/news/uk-england-birmingham-58106169 abdominal 2021-08-05 16:47:27
News BBC News - Home Logan Mwangi: Mum and step-dad in court over death of boy, five https://www.bbc.co.uk/news/uk-wales-58053074 logan 2021-08-05 16:55:49
News BBC News - Home Google Maps warns drivers about emission charges https://www.bbc.co.uk/news/technology-58102651 clean 2021-08-05 16:37:50
News BBC News - Home Galahad v Dickens to be live on BBC Radio 5 Live https://www.bbc.co.uk/sport/boxing/58105357 bolotniks 2021-08-05 16:29:23
News BBC News - Home Covid-19 in the UK: How many coronavirus cases are there in my area? https://www.bbc.co.uk/news/uk-51768274 cases 2021-08-05 16:25:01
Business Diamond Online - New articles U.S. to offer "safe haven" to Hong Kong residents in the U.S. under presidential order - from WSJ https://diamond.jp/articles/-/278951 U.S. president 2021-08-06 01:22:00
Hokkaido Hokkaido Shimbun Asahiyama Park, where a bear was sighted, extends closure to prevent intrusion: Chuo Ward, Sapporo https://www.hokkaido-np.co.jp/article/575433/ Sakaigawa, Chuo Ward 2021-08-06 01:02:56
GCP Cloud Blog BigQuery Admin reference guide: Query optimization https://cloud.google.com/blog/topics/developers-practitioners/bigquery-admin-reference-guide-query-optimization/ BigQuery Admin reference guide Query optimizationLast week in the BigQuery reference guide we walked through query execution and how to leverage the query plan This week we re going a bit deeper covering more advanced queries and tactical optimization techniques  Here we ll walk through some query concepts and describe techniques for optimizing related SQL  Filtering dataFrom last week s post you already know that the execution details for a query show us how much time is spent reading data either from persistent storage federated tables or from the memory shuffle and writing data either to the memory shuffle or to persistent storage Limiting the amount of data that is used in the query or returned to the next stage can be instrumental in making the query faster and more efficient  Optimization techniques Necessary columns only Only select the columns necessary especially in inner queries SELECT is cost inefficient and may also hurt performance If the number of columns to return is large consider using SELECT EXCEPT to exclude unneeded columns Auto pruning with partitions and clusters Like we mentioned in our post on BigQuery storage partitions and clusters are used to segment and order the data Using a filter on columns that the data is partitioned or clustered on can drastically reduce the amount of data scanned   Expression order matters BigQuery assumes that the user has provided the best order of expressions in the WHERE clause and does not attempt to reorder expressions Expressions in your WHERE clauses should be ordered with the most selective expression first  The optimized example below is faster because it doesn t execute the expensive LIKE expression on the entire column content but rather only on the content from user anon Order by with limit Writing results for a query with an 
ORDER BY clause can result in Resources Exceeded errors Since the final sorting must be done on a single worker ordering a large result set can overwhelm the slot that is processing the data If you are sorting a large number of values use a LIMIT clause which will filter the amount of data passed onto the final slot  Understanding aggregationIn an aggregation query GROUP BYs are done in individual workers and then shuffled such that key value pairs of the same key are then in the same worker Further aggregation then occurs and is passed into a single worker and served RepartitioningIf too much data ends up on a single worker BigQuery may re partition the data Let s consider the example below The sources start writing to Sink and partitions within the memory shuffle tier Next the shuffle Monitor detects Sink is over the limit Now the partitioning scheme changes and the sources stop writing to Sink and instead start writing to Sink and  Optimizations Late aggregation Aggregate as late and as seldom as possible because aggregation is very costly The exception is if a table can be reduced drastically by aggregation in preparation for a join more on this below For example instead of a query like this where you aggregate in both the subqueries and the final SELECT You should only aggregate once in the outer query Nest repeated data Let s imagine you have a table showing retail transactions If you model one order per row and nest line items in an ARRAY field then you have cases where GROUP BY is no longer required For example looking at the total number of items in an order by using ARRAY LENGTH order id item id item id Understanding joinsOne powerful aspect of BigQuery is the ability to combine data and understand relationships and correlation information from disparate sources Much of the JOIN syntax is about expressing how that data should be combined and how to handle data when information is mismatched However once that relationship is encoded BigQuery still needs to 
execute it  Hash based joinsLet s jump straight into large scale join execution  When joining two tables on a common key BigQuery favors a technique called the hash based join or more simply a hash join With this technique we can process a table using multiple workers rather than moving data through a coordinating node   So what does hashing actually involve When we hash values we re converting the input value into a number that falls in a known range There s many properties of hash functions we care about for hash joins but the important ones are that our function is deterministic the same input always yields the same output value and has uniformity our output values are evenly spread throughout the allowed range of values With an appropriate hashing function we can then use the output to bucket values  For example if our hash function yields an output floating point value between and we can bucket by dividing that key range into N parts where N is the number of buckets we want  Grouping data based on this hash value means our buckets should have roughly the same number of discrete values but even more importantly all duplicate values should end up in the same bucket  Now that you understand what hashing does let s talk through joining To perform the hash join we re going to split up our work into three stages Stage Prepare the first tableIn BigQuery data for a table is typically split into multiple columnar files but within those files there s no sorting guarantee that ensures that the columns that represent the join key are sorted and colocated  So what we do is apply our hashing function to the join key and based on the buckets we desire we can write rows into different shuffle partitions  In the diagram above we have three columnar files in the first table and we ve using our hashing technique to split the data into four buckets color coded  Once the first stage is complete the rows of the first table are effectively split into four file like partitions in 
shuffle with duplicates co located Stage  Prepare the second tableThis is effectively the same work as the first stage but we re processing the other table we ll be joining data against The important thing to note here is that we need to use the same hashing function and therefore the same bucket grouping as we re aligning data In the diagram above the second table had four input files and thus four units of work and the data was written into a second set of shuffle partitions Stage consume the aligned data and perform the joinAfter the first two stages are completed we ve aligned the data in the two tables using a common hash function and bucketing strategy  What this means is that we have a set of paired shuffle partitions that correspond to the same hash range which means that rather than scanning potentially large sets of data we can execute the join in pieces as each worker is provided only the relevant data for doing it s subset of the join It s at this point that we care about the nature of the join operation again depending on the desired join relationship we may yield no rows a single row or many rows for any particular input row from the original input tables Now you can also get a better sense of how important having a good hashing function may be  if the output values are poorly distributed we have problems because we re much more likely to have a single worker that s slower and forced to do the majority of the work  Similarly if we picked our number of buckets poorly we may have issues due to having split the work too finely or too coarsely  Fortunately these are not insurmountable problems as we can leverage dynamic planning to fix this we simply insert query stages to adjust the shuffle partitions Broadcast joinsHash based joins are an incredibly powerful technique for joining lots of data but your data isn t always large enough to warrant it  For cases where one of the tables is small we can avoid all the alignment work altogether Broadcast joins 
work in cases where one table is small  In these instances it s easiest to replicate the small table into shuffle for faster access and then simply provide a reference to that data for each worker that s responsible for processing the other table s input files Optimization techniquesLargest table first  BigQuery best practice is to manually place the largest table first followed by the smallest and then by decreasing size Only under specific table conditions does BigQuery automatically reorder optimize based on table size Filter before joins WHERE clauses should be executed as soon as possible especially within joins so the tables to be joined are as small as possible We recommend reviewing the query execution details to see if filtering is happening as early as possible and either fix the condition or use a subquery to filter in advance of a JOIN Pre aggregate to limit table size As mentioned above aggregating tables before they are joined can help improve performance but only if the amount of data being joined is drastically reduced and tables are aggregated to the same level i e if there is only one row for every join key value Clustering on join keys When you cluster a table based on the key that is used to join the data is already co located which makes it easier for workers to split the data into the necessary partitions within the memory shuffle  A detailed query finding popular libraries in GithubNow that we understand some optimization techniques for filtering aggregating and joining data let s look at a complex query with multiple SQL techniques Walking through the execution details for this query should help you understand how data flows and mutates as it moves through the query plan so that you can apply this knowledge and understand what s happening behind the scenes in your own complex query   Thepublic Githubdata has one table that contains information about source code filenames while another contains the contents of these files  By combining the 
two together, we can filter down to focus on interesting files and analyze them to understand which libraries are frequently used by developers. Here's an example of a query that does this for developers using the Go programming language. It scans files having the appropriate .go extension and looks for statements in the source code for importing libraries, then counts how often those libraries are used and how many distinct code repositories use them. In SQL, it looks like this: [query not shown]. We can see from a casual read that we've got lots of interesting bits here: subqueries, a distributed join (the contents and files tables), array manipulation (CROSS JOIN UNNEST), and powerful features such as regular expression filters and computing distinctness. Detailed stages and steps. First, let's examine the full details of the plan in a graph format. Here we're looking at the low-level details of how this query is run, as a set of stages. Let's work through the query stages in detail. If you want a graphical representation similar to the one we're showing here, check out this code sample. Stage S S: Reading and filtering from the files table. The initial stage (corresponding to the inner subquery of the SQL) begins by processing the files table. We can see the first task is to read the input and immediately filter it to only pass through files with the appropriate suffix. We then group based on the id and repo name, as we're potentially working with many duplicates and we only want to process each distinct pair once. In stage S, we continue the GROUP BY operation: each worker in the first stage only deduplicated the repo/id pairs in its individual input file(s); the aggregate stage here combines those so that we've deduplicated across all input rows in the files table. Stage S: Reading in the contents table. In this stage, we begin reading the source code in the contents table, looking for import statements (the syntax for referencing libraries in the Go language). We collect information about the id,
which will become the join key, and the content that has matches. You can also see that in both this stage and the previous S, the output is split based on a BY HASH operation. This is the first part of starting the hash join, where we begin to align join keys into distinct shuffle partitions. However, any time we're dealing with data where we want to divide the work, we'll be splitting it into shuffle buckets with this operation. Stages S SA: Repartitioning. This query invoked several repartitioning stages. This is an example of the dynamic planner rebalancing data as it's working through the execution graph. Much of the internals of picking appropriate bucketing is based on heuristics, as operations such as filtration can drastically change the amount of data flowing in and out of query stages. In this particular query, the query plan has chosen a non-optimal bucketing strategy and is rebalancing the work as it goes. Also note that this partitioning is happening on both sides of what will become the joined data, because we need to keep the partitioned data aligned as we enter the join. Stage SB: Executing the join. Here's where we begin correlating the data between the two inputs. You can see that in this stage we have two input reads (one for each side of the join) and start computing counts. There's also some overloaded work here: we consume the file contents to yield an array representing each individual library being imported, and make that available to future stages. Stages SC SD: Partial aggregations. These two stages are responsible for computing our top-level statistics: we wanted to count the total number of times each library was referenced, as well as the number of distinct repositories. We end up splitting that into two stages. Stages SE SF: Ordering and limiting. Our query requested only the top libraries, ordered first by distinct repository count and then by total frequency of use. The last two stages are responsible for doing this sorting and reduction to yield the final result. Other
optimization techniques. As a final thought, we'll leave you with a few more optimization techniques that could help improve the performance of your queries. Multiple WITH clauses: The WITH statement in BigQuery is like a macro. At runtime, the contents of the subquery will be inlined every place the alias is referenced. This can lead to query plan explosion, as seen by the plan executing the same query stages multiple times. Instead, try using a TEMP table. String comparisons: REGEXP_CONTAINS can offer more functionality, but it has a slower execution time compared to LIKE. Use LIKE when the full power of regex is not needed (e.g., for wildcard matching, replace a REGEXP_CONTAINS test on dim with dim LIKE '%test%'). First or last record: When trying to calculate the first or last record in a subset of your data, using the ROW_NUMBER() function can fail with Resources Exceeded errors if there are too many elements to ORDER BY in a single partition. Instead, try using ARRAY_AGG(), which runs more efficiently because the ORDER BY is allowed to drop everything except the top record on each GROUP BY. For example, this: [query not shown] becomes this: [query not shown]. See you next week! Thanks again for tuning in this week. Next up is data governance, so be sure to keep an eye out for more in this series by following Leigha on LinkedIn and Twitter. Related Article: BigQuery Admin reference guide: Query processing. BigQuery is capable of some truly impressive feats, be it scanning billions of rows based on a regular expression, joining large tables... Read Article 2021-08-05 16:30:00
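
As a rough illustration of the two join strategies described in the BigQuery article above, here is a minimal Python sketch. It is not BigQuery's implementation: the function names (`partition`, `hash_join`, `broadcast_join`), the use of Python's built-in `hash`, the default of four buckets, and the sample rows are all assumptions made for illustration. The hash join first routes both tables into buckets with the same hash function and then joins only paired buckets; the broadcast join skips that alignment and gives every unit of work a full copy of the small table.

```python
from collections import defaultdict


def partition(rows, key, num_buckets):
    """Route each row to a bucket by hashing its join key (the shuffle step)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[hash(row[key]) % num_buckets].append(row)
    return buckets


def hash_join(left, right, key, num_buckets=4):
    """Inner join that only compares rows inside paired buckets."""
    # Both sides must use the same hash function and bucket count,
    # otherwise matching keys would not land in the same bucket.
    left_buckets = partition(left, key, num_buckets)
    right_buckets = partition(right, key, num_buckets)
    joined = []
    for b in range(num_buckets):  # each pair could go to a separate worker
        lookup = defaultdict(list)
        for r in right_buckets.get(b, []):
            lookup[r[key]].append(r)
        for l in left_buckets.get(b, []):
            for r in lookup.get(l[key], []):
                joined.append({**l, **r})
    return joined


def broadcast_join(large_files, small, key):
    """Inner join where every unit of work sees the whole small table."""
    lookup = defaultdict(list)
    for r in small:
        lookup[r[key]].append(r)
    joined = []
    for rows in large_files:  # one unit of work per input file
        for l in rows:
            for r in lookup.get(l[key], []):
                joined.append({**l, **r})
    return joined


# Hypothetical sample rows, loosely modeled on the files/contents tables.
files = [
    {"id": 1, "repo": "a"},
    {"id": 2, "repo": "b"},
    {"id": 2, "repo": "c"},
]
contents = [
    {"id": 2, "content": "import fmt"},
    {"id": 3, "content": "import os"},
]

via_hash = hash_join(files, contents, "id")
via_broadcast = broadcast_join([files[:2], files[2:]], contents, "id")
```

Both calls produce the same two joined rows (the two repositories that share id 2). The difference is in how the work is divided: the hash join pays for partitioning both inputs, while the broadcast join only makes sense when `small` is cheap to copy to every worker.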
