Posted: 2022-01-30 05:15:41 RSS feed digest for 2022-01-30 05:00 (23 items)

Category Site name Article title / trend word Link URL Frequent words, summary / search volume Registered date
Overseas TECH MakeUseOf Blender UV Mapping: 7 Tips and Tricks for Beginners https://www.makeuseof.com/blender-uv-mapping-tips/ skill 2022-01-29 19:30:22
Overseas TECH MakeUseOf Music Production Glossary: The Definitions You Need to Know https://www.makeuseof.com/music-production-glossary/ Music Production Glossary: The Definitions You Need to Know. Whether you just like messing around with music production in your spare time or want to make it big, it's essential you know these terms. 2022-01-29 19:30:22
Overseas TECH DEV Community Modern data warehouse patterns: ELT with Snowflake variants https://dev.to/biellls/modern-data-warehouse-patterns-elt-with-snowflake-variants-26b4 Modern data warehouse patterns: ELT with Snowflake variants. Leveraging semi-structured data for resilience against schema changes. As data warehouse technologies get cheaper and better, ELT is gaining momentum over ETL. In this article we will show you how to leverage Snowflake's semi-structured data to build integrations that are highly resistant to changes in schema while staying performant. Schema changes are one of the most common things that can break a data pipeline (adding and removing fields, changes in the type or length of the data, etc.), so it is extremely useful to protect yourself against them.
Real-world example: personal information. Let's assume we have a table with basic information about our clients, with the columns name, surname and age and one row each for Anne Houston, John Doe and William Williams. The goal is to load the information into Snowflake unchanged. We would usually create the following table in Snowflake: CREATE TABLE clients (name VARCHAR, surname VARCHAR, age NUMBER); Notice how we don't specify the varchar's length or the number's precision and scale. This is preferable because Snowflake will automatically use the minimum size needed to store the data efficiently, and if the source system changes the length of a varchar or the precision of a number, your flows won't break. An exception is when a number has decimals: then we need to specify a precision and scale. But with a structured table like this, our integration will fail if a field is removed, and if a field is added we won't notice. We are not resilient to schema changes. To solve that, we will instead create a table with just one variant field, where we will load all the data no matter what fields it has: CREATE TABLE clients_raw (src VARIANT); In order to load the data we can dump it as JSON into a stage: CREATE OR REPLACE FILE FORMAT json_format TYPE = JSON; CREATE OR REPLACE STAGE mystage FILE_FORMAT = (FORMAT_NAME = 'json_format'); Let's create a file with some JSON data to load into the table. In a shell, echo one JSON object per client (with the keys name, surname and age) into /tmp/data.json, writing the first object with > and appending the others with >>. Next we run this in Snowflake to upload the data to a stage and then load it into the table: PUT file:///tmp/data.json @mystage; COPY INTO clients_raw FROM @mystage/data.json FILE_FORMAT = (FORMAT_NAME = 'json_format'); We can now query the data as: SELECT MIN(src:age) AS age FROM clients_raw;
Creating a view. It is easy to query the data, but it can be verbose and a little confusing to analysts who have never worked with unstructured data. In order to make it transparent to the end user, we can create a view that turns it into structured data: CREATE VIEW clients AS SELECT src:name::VARCHAR AS name, src:surname::VARCHAR AS surname, src:age::NUMERIC AS age FROM clients_raw; The same query from before would now be: SELECT MIN(age) AS age FROM clients; From the user's point of view, it is now indistinguishable from a structured table.
Removing a column, adding a column. Suppose that a database admin realizes that storing age in a column is not ideal, since it needs to be updated every time a client has a birthday, and instead decides to drop the age column and store each client's birthday as a date. The new table has the columns name, surname and birthday for the same three clients. Let's create the new data the same way as before, echoing one JSON object per client (now with the keys name, surname and birthday) into /tmp/data.json. We would usually append the data, but to keep this tutorial simple we will just replace the old data with the new: PUT file:///tmp/data.json @mystage; TRUNCATE TABLE clients_raw; COPY INTO clients_raw FROM @mystage/data.json FILE_FORMAT = (FORMAT_NAME = 'json_format'); Since we store all available data as a variant, our integration will not break. The view would not break either, but the age would show as null (try SELECT * FROM clients;). The only thing we need to do to take advantage of the new field is to update the view. For backwards compatibility we will still include the age, now as a calculation: CREATE OR REPLACE VIEW clients AS SELECT src:name::VARCHAR AS name, src:surname::VARCHAR AS surname, src:birthday::DATE AS birthday, DATEDIFF(years, src:birthday::DATE, CURRENT_DATE) AS age FROM clients_raw; That's it: our pipelines never broke, and there's no need to change our data flows or source table definitions. If a new field gets added to the table and no one notices, it is still getting staged into Snowflake in the variant, so the moment someone requests the field in the view we'll be able to see it without needing to backfill the data.
Doesn't this take up more space than regular tables? Isn't it slower to query? This excerpt from Snowflake's docs answers the question: "For data that is mostly regular and uses only native types (strings and integers), the storage requirements and query performance for operations on relational data and data in a VARIANT column is very similar. For better pruning and less storage consumption, we recommend flattening your object and key data into separate relational columns if your semi-structured data includes: dates and timestamps, especially non-ISO 8601 dates and timestamps, as string values; numbers within strings; arrays. Non-native values such as dates and timestamps are stored as strings when loaded into a VARIANT column, so operations on these values could be slower and also consume more space than when stored in a relational column with the corresponding data type." So in terms of performance and storage it should be really similar, albeit a little slower. An exception would be if we need to query the birthday, because it is stored as a string, as we will see in the following section.
Improving performance. Because variants store dates as strings, they are not as efficient to filter by. This is only an issue if the table is large and you intend to query the table by that date. Let's see an example of how to improve performance in that case: CREATE OR REPLACE TABLE clients_raw (src VARIANT, birthday DATE); COPY INTO clients_raw FROM (SELECT $1 AS src, $1:birthday::DATE AS birthday FROM @mystage/data.json) FILE_FORMAT = (FORMAT_NAME = 'json_format'); And modify the view so that birthday is read directly from the new column, both on its own and inside the age calculation: CREATE OR REPLACE VIEW clients AS SELECT src:name::VARCHAR AS name, src:surname::VARCHAR AS surname, birthday, DATEDIFF(years, birthday, CURRENT_DATE) AS age FROM clients_raw; Now queries filtering by birthday, or getting MAX(birthday) for example, will be much faster.
What is the best way to load the data? The most efficient way to load the data into a table is by using a COPY command, since Snowflake can optimize a bulk load. It can't do that with insert statements. Here are some of the most popular ways to load the data into Snowflake, each with its advantages and disadvantages. CSV: a gzipped CSV is the fastest way to load structured data into Snowflake. It takes more space than Parquet. It cannot be loaded into a variant as easily as the other formats, although it can be done with the right casting. JSON: can be easily loaded into a variant, but it takes a lot of space in your data lake. Avro: built-in schema, easily loaded into a variant or into a structured table. Takes more space than Parquet. Parquet: columnar storage that has better compression than the other options and can easily be loaded into a structured or unstructured table. It is slower than CSV to load into a structured table.
Bottom line. Loading data into Snowflake using this method is a great way to save yourself a lot of headaches and minimise data pipeline failures. It is a good rule of thumb to always use this method unless you will be loading an extremely large amount of data and need the extra performance that a structured table will give you. If you decide to create a structured table instead of using this method, be aware that the pipelines will break on any schema change.
What are the best tools to load data like this? Any ETL/ELT tool that is flexible enough (for instance Airflow) can be adapted to use this method. You can also check out our ETL tool, which encourages this pattern and other modern best practices for data engineering.
Sources: Snowflake documentation, "Storing Semi-structured Data in a VARIANT Column vs. Flattening the Nested Structure". 2022-01-29 19:02:14
Overseas TECH Engadget Spotify reportedly has a very limited set of COVID content guidelines https://www.engadget.com/spotify-very-limited-covid-content-guidelines-194545334.html?src=rss Spotify reportedly has a very limited set of COVID content guidelines. When Spotify started removing Neil Young's playlist from its service, it defended its practices against misinformation and said that it had already pulled COVID-related podcast episodes. Young threatened to remove his catalog from the service over allegations that Joe Rogan is spreading COVID vaccine misinformation through his podcast. Despite what Spotify said, The Joe Rogan Experience is still available on the platform, and Spotify's COVID content policy, as seen by The Verge, might be able to explain why that's the case.
Apparently, even Spotify's employees are upset with the company's partnership with Rogan due to his views on COVID. Company head of global communications Dustee Jenkins reportedly addressed those concerns on Spotify's Slack and told employees that a team had already reviewed multiple controversial Joe Rogan Experience episodes and found that they "didn't meet the threshold for removal." She called members of the team who did the internal review "some of the best experts in the space" and also said that Spotify is working with third parties to help it evolve its policies. "What Spotify hasn't done is move fast enough to share these policies externally, and are working to address that as soon as possible," she added.
While Spotify has yet to share those policies, The Verge posted a copy of the healthcare guidelines section, which prohibits "Content that promotes dangerous false or deceptive content about healthcare that may cause offline harm and/or pose a direct threat to public health such as: Denying the existence of AIDS or COVID; Encouraging the deliberate contracting of a serious or life threatening disease or illness; Suggesting that consuming bleach can cure various illnesses and diseases; Suggesting that wearing a mask will cause the wearer imminent, life-threatening physical harm; Promoting or suggesting that the vaccines are designed to cause death." There's a lot podcasters can get away with under such a narrow and limited set of rules. In comparison, YouTube makes it clear that any content with claims that contradict local health authorities or the WHO is prohibited on its website. It's not just suggestions that wearing a mask will cause harm that are prohibited on the Google-owned service, but also claims that masking does not help prevent the contraction or transmission of COVID. A podcast host on Spotify can say the latter without repercussions. Spotify also doesn't have a rule prohibiting claims that ivermectin is a safe and effective treatment for the virus.
Back in December, a group of scientists and doctors sent an open letter to Spotify asking it to implement a misinformation policy after Rogan had Dr. Robert Malone as a guest on his show. In the controversial episode, Malone claimed people only believe that COVID vaccines are effective due to "mass formation psychosis." The group also listed several "misleading and false claims" Rogan made on his podcast throughout the pandemic, including the time he said mRNA vaccines are "gene therapy" and another when he promoted the use of ivermectin to treat COVID. 2022-01-29 19:45:45
Overseas TECH CodeProject Latest Articles A Generic Form of the NameValueCollection https://www.codeproject.com/Articles/5323395/A-Generic-Form-of-the-NameValueCollection namevaluecollection 2022-01-29 19:49:00
Overseas Science NYT > Science When Omicron Isn't So Mild https://www.nytimes.com/2022/01/29/health/omicron-chronic-illness.html pandemic 2022-01-29 19:21:24
News BBC News - Home Storm Malik: Boy, 9, dies after tree falls during storm https://www.bbc.co.uk/news/uk-60183035?at_medium=RSS&at_campaign=KARANGA disruption 2022-01-29 19:19:21
News BBC News - Home Sergio Mattarella: At 80, Italy president re-elected on amid successor row https://www.bbc.co.uk/news/world-europe-60183929?at_medium=RSS&at_campaign=KARANGA candidate 2022-01-29 19:24:43
News BBC News - Home Barcelona sign winger Traore on loan from Wolves https://www.bbc.co.uk/sport/football/60185112?at_medium=RSS&at_campaign=KARANGA wolves 2022-01-29 19:26:09
News BBC News - Home Former Great Britain winger Drummond dies aged 63 https://www.bbc.co.uk/sport/rugby-league/60185283?at_medium=RSS&at_campaign=KARANGA drummond 2022-01-29 19:20:20
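Business Diamond Online - New Articles The untold story behind the development of 7-Eleven's hit curry bread: "Toshifumi Suzuki-ism" lives on - The Way of the "Super First-Class" https://diamond.jp/articles/-/294722 restorer of the company's fortunes 2022-01-30 04:55:00
Business Diamond Online - New Articles Kazuhiro Fujihara's rules for turning your insecurities into a "weapon" and living happily after 60 - from AERAdot. https://diamond.jp/articles/-/294205 fromaeradot 2022-01-30 04:50:00
Business Diamond Online - New Articles Enjoy the feeling of travel through spectacular photos! A big feature on World "Natural" Heritage sites [Chikyu no Arukikata] - Chikyu no Arukikata News & Reports https://diamond.jp/articles/-/294203 Enjoy the feeling of travel through spectacular photos! A big feature on World "Natural" Heritage sites [Chikyu no Arukikata] - Chikyu no Arukikata News & Reports. A long-awaited new installment has arrived in Chikyu no Arukikata's [Travel Encyclopedia Series], which is well regarded for letting readers learn about the world through countries, capitals, food, islands and more. It brings together in book form all of the natural heritage sites and mixed heritage sites registered as UNESCO World Heritage. 2022-01-30 04:45:00
Business Diamond Online - New Articles Comparing the acceleration of the Ford Mustang model by model! What are the results, including the latest EV? - Men's Off-Business https://diamond.jp/articles/-/294201 caranddriver 2022-01-30 04:40:00
Business Diamond Online - New Articles What is the impact of celebrities spreading "unhealthy foods" on social media? - HealthDay News https://diamond.jp/articles/-/294207 What is the impact of celebrities spreading "unhealthy foods" on social media? - HealthDay News. A paper reporting that celebrities with a large social impact are spreading information about unhealthy foods on social media, independently of corporate advertising, has been published in JAMA Network Open. 2022-01-30 04:35:00
Business Diamond Online - New Articles New Sake Travelogue: "Zuikan" - New Sake Travelogue https://diamond.jp/articles/-/293770 Seto Inland Sea 2022-01-30 04:30:00
Business Diamond Online - New Articles What is the decisive moment that immediately reveals whether someone is a good leader or a bad leader? - All Great Leaders Are Timid. https://diamond.jp/articles/-/292471 workplace 2022-01-30 04:25:00
Business Diamond Online - New Articles The No. 1 worst "way foolish people spend money" that appalls Hiroyuki - 1% Effort https://diamond.jp/articles/-/294150 youtube 2022-01-30 04:20:00
Business Diamond Online - New Articles When do "retirees and pension recipients" come out ahead by filing a tax return? - Money-Saving Tax Returns and Blue Returns https://diamond.jp/articles/-/294619 When do "retirees and pension recipients" come out ahead by filing a tax return? - Money-Saving Tax Returns and Blue Returns. Among people who have received a retirement allowance or are receiving a pension, some are required to file a tax return and some are not. 2022-01-30 04:15:00
Business Diamond Online - New Articles Is young people's "sense of belonging to an organization" really declining today? - That's Why, This Book. https://diamond.jp/articles/-/293494 When a subordinate comes to you for advice about changing jobs, how would you respond as a manager? 2022-01-30 04:10:00
Business Diamond Online - New Articles Why do the most talented people quit first? The true feelings of those who leave "excellent companies" - Teams Are Reborn Naturally https://diamond.jp/articles/-/293300 Why do the most talented people quit first? The true feelings of those who leave "excellent companies" - Teams Are Reborn Naturally. The dialogue series between 李英俊 and Ryosuke Ishii, the respective authors of "Teams Are Reborn Naturally" and "How to Build Psychological Safety", has finally reached its last installment. 2022-01-30 04:05:00
Business Toyo Keizai Online ASEAN falls into dysfunction over Myanmar's coup: Cambodia and Singapore hold the key going forward, but... | Asian Countries | Toyo Keizai Online https://toyokeizai.net/articles/-/507508?utm_source=rss&utm_medium=http&utm_campaign=link_back asean 2022-01-30 04:50:00
Business Toyo Keizai Online The day I realized "I never need to buy clothes again": escaping the equation "rich = stylish" | A Life Without Buying | Toyo Keizai Online https://toyokeizai.net/articles/-/507524?utm_source=rss&utm_medium=http&utm_campaign=link_back Toyo Keizai Online 2022-01-30 04:30:00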
